[ https://issues.apache.org/jira/browse/LUCENE-893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12558386#action_12558386 ]
Paul Elschot commented on LUCENE-893:
-------------------------------------

I think the different results of 26 May 2007 for conjunction queries and disjunction queries may be caused by the use of TermScorer.skipTo() in conjunctions and TermScorer.next() in disjunctions. That points to different optimal buffer sizes for conjunctions (smaller because of the skipping) and for disjunctions (larger because all postings are going to be needed).

LUCENE-430 is about reducing term buffer size for the case when the buffer is not going to be used completely because of the small number of documents containing the term.

In all, I think it makes sense to allow the (conjunction/disjunction)Scorer to choose the maximum buffer size for the term, and let the term itself choose a lower value when it needs less than that.

Another way to promote sequential reading for disjunction queries is to process all their terms sequentially, i.e. one term at a time. In Lucene this is currently done by Filters for prefix queries and ranges. Unfortunately this cannot be done when the combined frequency of the terms in each document is needed. In that case DisjunctionSumScorer could be used, with larger buffers on the terms that are contained in many documents.

> Increase buffer sizes used during searching
> -------------------------------------------
>
>                 Key: LUCENE-893
>                 URL: https://issues.apache.org/jira/browse/LUCENE-893
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>    Affects Versions: 2.1
>            Reporter: Michael McCandless
>
> Spinoff of LUCENE-888.
> In LUCENE-888 we increased buffer sizes that impact indexing and found
> substantial (10-18%) overall performance gains.
> It's very likely that we can also gain some performance for searching
> by increasing the read buffers in BufferedIndexInput used by
> searching.
> We need to test performance impact to verify and then pick a good
> overall default buffer size, also being careful not to add too much
> overall HEAP RAM usage because a potentially very large number of
> BufferedIndexInput instances are created during searching
> (# segments X # index files per segment).
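
The buffer-size proposal in the comment (the scorer chooses a maximum, and the term lowers it when it needs less) could be sketched roughly as follows. This is only an illustration under stated assumptions: the class BufferSizeSketch, the enum ScorerKind, the method names, and all of the byte constants are hypothetical and are not part of Lucene, and the bytes-per-posting figure is a placeholder rather than a measured value.

```java
// Hypothetical sketch: none of these names or constants exist in Lucene.
final class BufferSizeSketch {

    /** Kind of scorer that will consume the term's postings. */
    enum ScorerKind { CONJUNCTION, DISJUNCTION }

    // Assumed, illustrative limits; real values would have to come from benchmarks.
    static final int CONJUNCTION_MAX_BYTES = 1024;      // smaller: skipTo() jumps over postings
    static final int DISJUNCTION_MAX_BYTES = 16 * 1024; // larger: next() reads every posting
    static final int MIN_BYTES = 64;                    // assumed floor so a buffer is never empty
    static final int ASSUMED_BYTES_PER_POSTING = 4;     // placeholder estimate, not a Lucene figure

    /** The scorer chooses the maximum buffer size for its terms. */
    static int scorerMaxBufferSize(ScorerKind kind) {
        return kind == ScorerKind.CONJUNCTION ? CONJUNCTION_MAX_BYTES : DISJUNCTION_MAX_BYTES;
    }

    /** The term lowers the scorer's maximum when few documents contain it. */
    static int termBufferSize(ScorerKind kind, int docFreq) {
        long needed = (long) docFreq * ASSUMED_BYTES_PER_POSTING;
        long capped = Math.min(needed, scorerMaxBufferSize(kind));
        return (int) Math.max(capped, MIN_BYTES);
    }

    public static void main(String[] args) {
        // A frequent term is capped by the scorer's maximum.
        System.out.println(termBufferSize(ScorerKind.CONJUNCTION, 1_000_000));
        System.out.println(termBufferSize(ScorerKind.DISJUNCTION, 1_000_000));
        // A rare term asks for less than the maximum.
        System.out.println(termBufferSize(ScorerKind.DISJUNCTION, 100));
    }
}
```

With this split, the heap cost for rare terms stays small regardless of query type, which is the concern raised in the issue description about the number of BufferedIndexInput instances.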