[ https://issues.apache.org/jira/browse/LUCENE-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12657466#action_12657466 ]
Doug Cutting commented on LUCENE-1483: -------------------------------------- bq. I would actually be fine with keeping HitCollector, adding a default "setNextReader" method, that either throws UOE or (if we are strongly against exceptions) returns "false" indicating it cannot handle sequential readers. Could we instead add a new HitCollector subclass, that adds the setNextReader, then use 'instanceof' to decide whether to wrap or not? bq. I really don't fully understand BooleanScorer! The original version of BooleanScorer uses a ~16k array to score windows of docs. So it scores docs 0-16k first, then docs 16-32k, etc. For each window it iterates through all query terms and accumulates a score in table[doc%16k]. It also stores in the table a bitmask representing which terms contributed to the score. Non-zero scores are chained in a linked list. At the end of scoring each window it then iterates through the linked list and, if the bitmask matches the boolean constraints, collects a hit. For boolean queries with lots of frequent terms this can be much faster, since it does not need to update a priority queue for each posting, instead performing constant-time operations per posting. The only downside is that it results in hits being delivered out-of-order within the window, which means it cannot be nested within other scorers. But it works well as a top-level scorer. The new BooleanScorer2 implementation instead works by merging priority queues of postings, albeit with some clever tricks. For example, a pure conjunction (all terms required) does not require a priority queue. Instead it sorts the posting streams at the start, then repeatedly skips the first to to the last. If the first ever equals the last, then there's a hit. When some terms are required and some terms are optional, the conjunction can be evaluated first, then the optional terms can all skip to the match and be added to the score. Thus the conjunction can reduce the number of priority queue updates for the optional terms. Does that help any? > Change IndexSearcher to use MultiSearcher semantics for multiple subreaders > --------------------------------------------------------------------------- > > Key: LUCENE-1483 > URL: https://issues.apache.org/jira/browse/LUCENE-1483 > Project: Lucene - Java > Issue Type: Improvement > Affects Versions: 2.9 > Reporter: Mark Miller > Priority: Minor > Attachments: LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, > LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, > LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, > LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch > > > FieldCache and Filters are forced down to a single segment reader, allowing > for individual segment reloading on reopen. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. --------------------------------------------------------------------- To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org