Some thoughts:
1) Very common words can have a large impact on performance. Use MoreLikeThis 
to build an optimised boolean query. Working out how rare or common a term is 
costs very little compared with a query having to iterate reader.termDocs for 
a common word. Try iterating across reader.termDocs(veryCommonTerm) for 
yourself - I just measured 10ms for a single term in a 250k index. If your 
query must include a very common term, turn it into a cached filter.
2) Any retrieval of document content in your timed loop will be costly. Even if 
you are just reading the title field, Lucene may be pulling all of a larger 
"fullArticleContent" field off disk. Look at the new features in the latest SVN 
version for partial reads of a document's fields - it looks from your code like 
you only need the title from the results.
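
To make point 1 concrete, here is a minimal sketch in plain Java of the idea of dropping very common terms before building the boolean query. It is not Lucene's MoreLikeThis class and makes no claim about how that class works internally. The class name RareTermSelector, the method selectRareTerms, the 10% cut-off and the document frequencies are all made up for illustration. In a real index the counts would presumably come from something like IndexReader.docFreq(Term).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration only; this is NOT Lucene's MoreLikeThis class.
final class RareTermSelector {

    /**
     * Keeps the query terms whose document frequency is at most
     * maxDocFraction of the index. Terms missing from docFreq are
     * treated as occurring in no documents and are kept.
     */
    static List<String> selectRareTerms(List<String> queryTerms,
                                        Map<String, Integer> docFreq,
                                        int numDocs,
                                        double maxDocFraction) {
        List<String> kept = new ArrayList<>();
        for (String term : queryTerms) {
            int df = docFreq.getOrDefault(term, 0);
            if (df <= maxDocFraction * numDocs) {
                kept.add(term);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Hypothetical document frequencies for a 250,000-document index.
        Map<String, Integer> docFreq = new HashMap<>();
        docFreq.put("the", 240000);
        docFreq.put("history", 60000);
        docFreq.put("lucene", 120);
        docFreq.put("tokenizer", 45);

        List<String> terms = Arrays.asList("the", "history", "lucene", "tokenizer");
        // Drop anything that appears in more than 10% of the documents.
        System.out.println(selectRareTerms(terms, docFreq, 250000, 0.10));
    }
}
```

With these made-up numbers only "lucene" and "tokenizer" survive. The surviving terms would become the clauses of the boolean query, and any common term that really must be matched could go into a cached filter instead.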

Cheers
Mark

----- Original Message ----
From: Somnath Banerjee <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Wednesday, 24 January, 2007 5:02:49 AM
Subject: Re: Long Query Performance

Here is the code. Let me know if you need any clarification

// MaxConcepts is set to 100

long stTime = System.currentTimeMillis();

// bq is the Boolean query constructed out of the title of the query document
TopDocs docs = searcher.search(bq, null, MaxConcepts);

// Store the title of the result documents in a HashTable
for (int i = 0; i < MaxConcepts && i < docs.scoreDocs.length; i++) {
    String title = reader.document(docs.scoreDocs[i].doc).get("title");
    // encycloConcepts is a HashTable
    encycloConcepts.put(title, docs.scoreDocs[i].score);
}

// clauseCnt is the number of clauses in the query
System.out.println("Query Length: " + clauseCnt + " Time Taken: "
        + (System.currentTimeMillis() - stTime));

On 1/24/07, Chris Hostetter <[EMAIL PROTECTED]> wrote:
>
>
> that still doesn't tell us what you are doing -- "query time" can mean a
> lot of things ... are you using the Hits class? are you iterating over
> results? are you pulling out stored fields? are you sorting? are you using
> any Filters?
>
> questions about improving concrete performance can only be answered by
> looking at concrete code -- not vague discussions about the type of
> activity being performed.
>
>
>
>
> -Hoss
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
>
>





                
