Some thoughts:

1) Very common words can have a large impact on performance. Use MoreLikeThis to build an optimised boolean query. It costs MoreLikeThis very little to determine how rare or common a term is, compared with the cost of a query having to iterate reader.termDocs for a common word. Try iterating across reader.termDocs(veryCommonTerm) for yourself - I just measured 10ms for a single term in a 250k index. If your query must include a very common term, turn it into a cached filter.

2) Any retrieval of document content in your timed loop will be costly. Even if you are just reading the title field, Lucene may be pulling all of a larger "fullArticleContent" field off disk. Look at the new features in the latest SVN version for partial reads of a document's fields - it looks from your code like you only need the title from the results.
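
To make point 1 concrete, here is a rough plain-Java sketch of the idea. It is not the actual MoreLikeThis code: it only assumes you have already collected a document frequency for each candidate term (for example from IndexReader.docFreq(term)), then drops terms that occur in too large a share of the index and keeps the rarest few. The class RareTermSelector, its method name and its parameters are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: picks the rarest query terms from document-frequency counts. */
final class RareTermSelector {

    /**
     * Returns at most maxTerms terms, rarest first, skipping any term that
     * occurs in more than maxDocFraction of the numDocs documents.
     */
    static List<String> selectRareTerms(Map<String, Integer> docFreq,
                                        int numDocs,
                                        double maxDocFraction,
                                        int maxTerms) {
        List<Map.Entry<String, Integer>> candidates = new ArrayList<>();
        for (Map.Entry<String, Integer> e : docFreq.entrySet()) {
            int df = e.getValue();
            // df == 0 means the term is not in the index and cannot match anything.
            if (df > 0 && df <= maxDocFraction * numDocs) {
                candidates.add(e);
            }
        }
        // Rarest terms first: they have the shortest posting lists to iterate.
        candidates.sort(Map.Entry.comparingByValue());

        List<String> selected = new ArrayList<>();
        for (Map.Entry<String, Integer> e : candidates) {
            if (selected.size() == maxTerms) {
                break;
            }
            selected.add(e.getKey());
        }
        return selected;
    }
}
```

With a 250k index and a maxDocFraction of 0.1 (an arbitrary cut-off chosen for this example), any term found in more than 25,000 documents is left out of the boolean query. If such a term really is required, that is the one to move into a cached filter.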

Cheers
Mark

----- Original Message ----
From: Somnath Banerjee <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Wednesday, 24 January, 2007 5:02:49 AM
Subject: Re: Long Query Performance

Here is the code. Let me know if you need any clarification.

// MaxConcepts is set to 100
long stTime = System.currentTimeMillis();

// bq is the Boolean query constructed out of the title of the query document
TopDocs docs = searcher.search(bq, null, MaxConcepts);

// Store the title of the result documents in a HashTable
for (int i = 0; i < MaxConcepts && i < docs.scoreDocs.length; i++) {
    String title = reader.document(docs.scoreDocs[i].doc).get("title");
    encycloConcepts.put(title, docs.scoreDocs[i].score); // encycloConcepts is a HashTable
}

// clauseCnt is the number of clauses in the query
System.out.println("Query Length: " + clauseCnt + " Time Taken: " + (System.currentTimeMillis() - stTime));

On 1/24/07, Chris Hostetter <[EMAIL PROTECTED]> wrote:
>
> that still doesn't tell us what you are doing -- "query time" can mean a
> lot of things ... are you using the Hits class? are you iterating over
> results? are you pulling out stored fields? are you sorting? are you using
> any Filters?
>
> questions about improving concrete performance can only be answered by
> looking at concrete code -- not vague discussions about the type of
> activity being performed.
>
> -Hoss
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]