Re: About multi-Threads in Lucene

2007-08-06 Thread Patrick Kimber
Hi Kai,
We use the Lucene Index Accessor contribution: http://www.nabble.com/Fwd%3A-Contribution%3A-LuceneIndexAccessor-t17416.html#a47049
Patrick
On 06/08/07, Kai Hu [EMAIL PROTECTED] wrote:
Hi, How do you solve the problems when adding, updating, and deleting documents in multiple threads, use

Re: docFreq takes long time to execute in a multiple index environment

2007-08-06 Thread Daniel Naber
On Monday 06 August 2007 01:40, tierecke wrote:
        Term term=new Term(contents, termstr);
        TermEnum termenum=multireader.terms(term);
        int freq=termenum.docFreq();
IndexReader has a docFreq() method, no need to get a Term enumeration.
regards
Daniel

Re: Re: About multi-Threads in Lucene

2007-08-06 Thread Patrick Kimber
Hi Kai,
We keep a synchronized map of LuceneIndexAccessor instances, one instance per Directory. The map is keyed on the directory path. We then re-use the accessor rather than creating a new one each time.
Patrick
On 06/08/07, Kai Hu [EMAIL PROTECTED] wrote:
Thanks, Patrick, it is useful.
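A minimal sketch of the per-directory map described above, using only the Java standard library. IndexAccessor and AccessorRegistry are hypothetical names standing in for the contributed LuceneIndexAccessor, whose actual API is not shown in this thread, and ConcurrentHashMap is swapped in here for the synchronized map Patrick mentions.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the contributed LuceneIndexAccessor; the real
// class's constructor and methods are not shown in this thread.
class IndexAccessor {
    final String directoryPath;

    IndexAccessor(String directoryPath) {
        this.directoryPath = directoryPath;
    }
}

// Hypothetical registry: one shared accessor per index directory path.
class AccessorRegistry {
    private final Map<String, IndexAccessor> accessors = new ConcurrentHashMap<>();

    IndexAccessor forDirectory(String directoryPath) {
        // computeIfAbsent is atomic on ConcurrentHashMap, so two threads
        // asking for the same path receive the same accessor instance.
        return accessors.computeIfAbsent(directoryPath, IndexAccessor::new);
    }

    int size() {
        return accessors.size();
    }
}
```

Callers would ask the registry for an accessor each time instead of constructing one, so all threads working on the same directory share a single instance.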

You are right but it doesn't make it faster.

2007-08-06 Thread tierecke
Thanks Daniel, you are completely right. I changed the code, but it doesn't make it noticeably faster; behind the scenes it probably still runs over the enum. I added a hash table that keeps each docFreq already read, so if I meet the term again in another document I can retrieve it quickly -
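The hash-table idea above might look roughly like this. It is a sketch only: DocFreqCache is a hypothetical name, and the lookup function stands in for whatever call actually reads the frequency from the index (for example a wrapper around IndexReader.docFreq). The class is not thread-safe as written.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.ToIntFunction;

// Memoizes document frequencies by term text so that each term is looked up
// in the index at most once.
class DocFreqCache {
    private final Map<String, Integer> cache = new HashMap<>();
    // Hypothetical stand-in for the real index lookup.
    private final ToIntFunction<String> lookup;
    private int misses = 0;

    DocFreqCache(ToIntFunction<String> lookup) {
        this.lookup = lookup;
    }

    int docFreq(String term) {
        Integer cached = cache.get(term);
        if (cached == null) {
            // First time this term is seen: ask the index and remember the answer.
            misses++;
            cached = lookup.applyAsInt(term);
            cache.put(term, cached);
        }
        return cached;
    }

    // Number of lookups that actually reached the index.
    int misses() {
        return misses;
    }
}
```

A cache like this only pays off when the same terms recur across documents; for terms seen once it adds a small overhead.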

speedup indexing

2007-08-06 Thread SK R
Hi, I have indexed 5 fields and stored 2 of them (field length is around 1). My index keeps growing and its size is already in the GB range. I need to get search results based on docID only. Scoring, additional sorting, deletes, and updates are never used. Nothing complicated is required. In my testing

RE: speedup indexing

2007-08-06 Thread Chhabra, Kapil
Try going through: http://wiki.apache.org/lucene-java/ImproveIndexingSpeed
Regards,
kapilChhabra
-Original Message-
From: SK R [mailto:[EMAIL PROTECTED]
Sent: Monday, August 06, 2007 5:09 PM
To: java-user@lucene.apache.org
Subject: speedup indexing
Hi, I have indexed 5 fields and

Multiple fields vs one field

2007-08-06 Thread Albert Vila
Hi all,
My data looks like:
Document 1: code, title, content, type, language, date, ...
Document 2: code, title, content, type, language, date, ...
...
Document n: code, title, content, type, language, date, ...
Now

RE: Multiple fields vs one field

2007-08-06 Thread Chhabra, Kapil
Hey Albert, Just to remind you that fields in Lucene are per document and not per index. This means that you can have documents in an index which have different fields altogether. So, in effect, you can add all your document types to your existing index. And guess what, you don't need to change

indexing and searching at the same time

2007-08-06 Thread tierecke
Does Lucene allow searching and indexing simultaneously? Yes. However, an IndexReader only searches the index as of the point in time that it was opened. Any updates to the index, either added or deleted documents, will not be visible until the IndexReader is re-opened. So your application must

Re: Multiple fields vs one field

2007-08-06 Thread Albert Vila
Thank you very much, that is right :)
On 06/08/2007, at 14:23, Chhabra, Kapil wrote:
Hey Albert, Just to remind you, that the fields in Lucene are per document and not per index. This means that you can have documents in an index which have different fields altogether. So, in effect,

Re: You are right but it doesn't make it faster.

2007-08-06 Thread Paul Elschot
Nir, You can speed this up (maybe a lot) by moving the disk head(s) as little as possible. Have a look at the file formats of Lucene to get the idea. In your outer loop iterate over the readers of the multireader. For each reader iterate over the terms in sorted order. And don't access the
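The loop order suggested above could be sketched roughly as follows, using only the Java standard library. PerReaderDocFreq is a hypothetical name, and each ToIntFunction is a stand-in for a per-reader docFreq lookup; how the individual readers behind a MultiReader are obtained is not covered here.

```java
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.function.ToIntFunction;

class PerReaderDocFreq {
    // Sums document frequencies across readers, visiting one reader at a time
    // and walking the terms in sorted order within each reader.
    static Map<String, Integer> totals(List<ToIntFunction<String>> readers,
                                       Collection<String> terms) {
        SortedSet<String> sorted = new TreeSet<>(terms);
        Map<String, Integer> totals = new TreeMap<>();
        for (ToIntFunction<String> reader : readers) {   // outer loop: readers
            for (String term : sorted) {                 // inner loop: sorted terms
                totals.merge(term, reader.applyAsInt(term), Integer::sum);
            }
        }
        return totals;
    }
}
```

The intent is that each reader's term data is read in one ordered pass rather than jumping between readers for every term.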

Mixing SpanQuery and BooleanQuery

2007-08-06 Thread Peter Keegan
I'm trying to create a fairly complex SpanQuery from a binary parse tree. I create SpanOrQueries from SpanTermQueries and combine SpanOrQueries into BooleanQueries. So far, so good. The problem is that I don't see how to create a SpanNotQuery from a BooleanQuery and a SpanTermQuery. I want the

Re: Mixing SpanQuery and BooleanQuery

2007-08-06 Thread Erick Erickson
Isn't a SpanAndQuery the same as a SpanNearQuery? Perhaps with interesting slops...
Erick
On 8/6/07, Peter Keegan [EMAIL PROTECTED] wrote:
I'm trying to create a fairly complex SpanQuery from a binary parse tree. I create SpanOrQueries from SpanTermQueries and combine SpanOrQueries into

Re: Mixing SpanQuery and BooleanQuery

2007-08-06 Thread Peter Keegan
Even without 'interesting' slops, it does appear that SpanNearQuery is a logical AND of all its clauses. I was distracted by the BooleanQuery examples in the javadocs :)
thanks,
Peter
On 8/6/07, Erick Erickson [EMAIL PROTECTED] wrote:
Isn't a SpanAndQuery the same as a SpanNearQuery? Perhaps

Re: speedup indexing

2007-08-06 Thread testn
1. If you search on the docId field only, a database might be a better solution in this case.
2. To improve indexing speed, you can consider using the trunk code which includes LUCENE-834. The indexing speed will be faster by almost an order of magnitude.
SK R wrote:
Hi, I have indexed 5

Re: speedup indexing

2007-08-06 Thread Chris Lu
It seems this issue, LUCENE-834, is about query payloads: https://issues.apache.org/jira/browse/LUCENE-834
Can it help with indexing speed?
--
Chris Lu - Instant Scalable Full-Text Search On Any Database/Application
site: http://www.dbsight.net
demo: http://search.dbsight.com

Re: You are right but it doesn't make it faster.

2007-08-06 Thread testn
Does it mean you already reuse the IndexReader without reopening it? If you haven't done so, please try it out. docFreq() should be really quick.
Thanks Daniel, you are completely right. I changed the code - but it doesn't make it [noticeably faster] - probably behind the scene it does run on the

Re: speedup indexing

2007-08-06 Thread Mike Klaas
On 6-Aug-07, at 5:49 PM, Chris Lu wrote:
Seems this issue, LUCENE-834, is about query payload https://issues.apache.org/jira/browse/LUCENE-834 Can it help on indexing speed?
That should be: https://issues.apache.org/jira/browse/LUCENE-843
On 8/6/07, testn [EMAIL PROTECTED] wrote:
2.

Re: Re: About multi-Threads in Lucene

2007-08-06 Thread Kai Hu
Hi Patrick, I tested using a map to get a single LuceneIndexAccessor and a cached IndexWriter. But after updating, deleting, or adding documents I have to close the IndexWriter to release the lock, and then it throws the exception "this IndexWriter is closed" when other threads execute

Re: Re: About multi-Threads in Lucene

2007-08-06 Thread Kai Hu
By the way, Patrick, have you had a problem where IndexSearcher.search(Query query) doesn't return all the matching hits? It only returns some of them. My test code is:
String key = "title:good";
Directory directory = FSDirectory.getDirectory("d:\\index\\");
IndexSearcher