Hi All, I have created an 8 GB index of almost 2 million documents. My requirement is to run nearly 0.72 million queries against this index. Each query consists of 200 - 400 words, and I build a BooleanQuery by ORing these words. Each query takes nearly 5 - 10 seconds to execute (2.78 GHz CPU, 1.5 GB RAM). That means the entire batch of 0.72M queries could take more than 70 days to execute. Is this expected, or is there a way to improve the performance? From earlier posts I gathered that complex queries are expected to take more time, but this much?
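For reference, this is roughly how I arrived at that estimate. It is only a back-of-the-envelope sketch, assuming a constant per-query time and a single-threaded run; the class and method names are made up for illustration.

```java
public class BatchEstimate {
    private static final double SECONDS_PER_DAY = 86_400.0;

    // Total wall-clock days for a sequential batch, assuming every
    // query takes the same number of seconds (a simplification).
    public static double daysForBatch(long queryCount, double secondsPerQuery) {
        return queryCount * secondsPerQuery / SECONDS_PER_DAY;
    }

    public static void main(String[] args) {
        long queryCount = 720_000L; // 0.72 million queries
        System.out.printf("At 5 s/query:  %.1f days%n", daysForBatch(queryCount, 5.0));
        System.out.printf("At 10 s/query: %.1f days%n", daysForBatch(queryCount, 10.0));
    }
}
```

So the batch lands somewhere between roughly 42 and 83 days, depending on where in the 5 - 10 second range the average query falls.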
I have tried some of the improvements mentioned in other posts (e.g. increasing the JVM heap space) without much benefit. Please let me know if you can think of any optimization technique, given that my requirement is to execute all those queries in a batch run (additional hardware is not an option for me). Also, I only need the top 150-200 results for each query. Can that be used to speed up the process?

In case I'm doing something wrong, here is how I construct the query, followed by a few lines from the logs:

    IndexSearcher sh = new IndexSearcher("IndexPath");
    for each query {
        BooleanQuery bq = new BooleanQuery();
        for each word in the query text {
            // tktext is the current word; every word becomes an optional (OR) clause
            bq.add(new TermQuery(new Term("text", tktext)), BooleanClause.Occur.SHOULD);
        }
        sh.search(bq);
    }
    sh.close();

Performance log (Query Length = number of words; Time Taken = milliseconds):

    Query Length: 332    Time Taken: 8609
    Query Length: 276    Time Taken: 5172
    Query Length: 345    Time Taken: 9313

Thanks in advance,
Somnath