Hello,

Since there are a lot of Term objects in your Query, your application must spend a lot of time collecting information about those Terms.

1/ Do you use RAMDirectory? Loading the whole Directory into memory will increase speed, though your index must not be too big.

2/ You are probably not using the QueryParser, so when you build the Query you could sort the Term objects inside the BooleanQuery. Sorting the Terms should reduce jumps on disk. I have no benchmarks for this, but logically it should have some positive effect when using FSDirectory. Am I wrong?

3/ There was a patch submitted by Dmitry Serebrennikov (http://www.mail-archive.com/[EMAIL PROTECTED]/msg02762.html) which reduced garbage collection by limiting the creation of temporary Term objects. This patch has not been included in the Lucene code (a bug in it?).

Hope it helps.

Julien

----- Original Message -----
From: "Jie Yang" <[EMAIL PROTECTED]>
To: "Lucene Users List" <[EMAIL PROTECTED]>
Sent: Wednesday, November 12, 2003 10:11 PM
Subject: Poor Performance when searching for 500+ terms

> I know this is rare, but I am building an application
> that submits searches having 500+ search terms. A
> general example would be
>
> field1:w1 OR field1:w2 OR ... OR field1:w500
>
> For 1 million documents, the performance is OK if
> field1 in each document has fewer than 50 terms; I can
> get results in < 1 sec. But if field1 has more than
> 400 terms on average in each document, the performance
> degrades to around 6 secs.
>
> Is there any way to improve this?
>
> My second question is that my query often comes
> with an AND condition on another search word, for
> example:
>
> field2:w AND (field1:w1 OR field1:w2 OR ... OR field1:w500)
>
> field2:w will only return fewer than 1000 records out
> of 1 million. So I thought I could use a
> StringFilter object, i.e. search on field2:w first,
> and thus limit the search for the 500 ORs to only the
> 1000 field2:w results, somewhat like a join in a database. But I
> checked the code and see that IndexSearcher always
> performs the 500 disk searches before calling the
> filter object. Any suggestions on this?
>
> Also, does Lucene cache results in memory? I see the
> performance tends to get better after a few runs,
> especially on searches on fields having a small number
> of terms. If so, can I manipulate the cache size
> somehow to accommodate fields with a large number of
> terms?
>
> Many thanks.
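
P.S. To illustrate suggestion 2/ in the reply above, here is a minimal sketch of the sorting step in plain Java. It is only an illustration under stated assumptions, not tested against an index: FieldTerm is a hypothetical stand-in for Lucene's Term (so the example runs without Lucene on the classpath), sortForLookup is a made-up helper name, and the field-then-text ordering is assumed to match the order in which the index stores its terms. Whether this actually reduces disk seeks is unbenchmarked, as said above.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class SortedTerms {

    /** Hypothetical stand-in for Lucene's Term: a (field, text) pair. */
    public static final class FieldTerm {
        public final String field;
        public final String text;

        public FieldTerm(String field, String text) {
            this.field = field;
            this.text = text;
        }

        @Override
        public String toString() {
            return field + ":" + text;
        }
    }

    /** Returns a copy ordered by field, then by text; the input list is not modified. */
    public static List<FieldTerm> sortForLookup(List<FieldTerm> terms) {
        List<FieldTerm> sorted = new ArrayList<>(terms);
        sorted.sort(Comparator
                .comparing((FieldTerm t) -> t.field)
                .thenComparing(t -> t.text));
        return sorted;
    }

    public static void main(String[] args) {
        List<FieldTerm> terms = new ArrayList<>();
        terms.add(new FieldTerm("field1", "w500"));
        terms.add(new FieldTerm("field1", "w1"));
        terms.add(new FieldTerm("field1", "w2"));

        for (FieldTerm t : sortForLookup(terms)) {
            // In the real application, this is where each term would be
            // wrapped in a TermQuery and added to the BooleanQuery.
            System.out.println(t);
        }
    }
}
```

In the real application you would sort the Term objects themselves before adding the clauses to the BooleanQuery; the only point of the sketch is that the clauses are added in sorted order instead of arbitrary order.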
