callers should be able to advance()/jump() without fear
-------------------------------------------------------
Key: LUCENE-2917
URL: https://issues.apache.org/jira/browse/LUCENE-2917
Project: Lucene - Java
Issue Type: Improvement
Reporter: Robert Muir
Fix For: Bulk Postings branch
Currently, various places in the code (TermScorer, ExactPhraseScorer) contain
optimizations that assume advance/jump is heavy, so for short doc distances
they next() their way to the target instead.
This sort of logic belongs in the codec instead: jump/advance should always be
fast. It's the codec's responsibility to make this happen, and jump/advance
need not involve using skip data.
For example, in the fixed layout from LUCENE-2905, various forms of
block-skipping can perform this operation without skip data (this is
implemented in its docs and docsAndPositions enums, but not yet in its bulk
postings enums).
Block codecs should always avoid trying to skip if the target is likely
within the current block. If the target is likely only a few blocks away, it
can still be faster not to skip, as skipping out of the block requires several
fills. In the fixed layout we can do these sorts of 'fast scans': in the
DocsEnum case, we keep the freqs buffer one step behind the docs buffer,
skipping it when we pass over it and only filling freqs a single time at the
end. In the DocsAndPositionsEnum case we can do the exact same thing with
positions.
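
To make the 'fast scan' idea concrete, here is a minimal sketch. It is not
the LUCENE-2905 code: BlockDocsSketch, its fields, and the fill counters are
hypothetical names, and plain int arrays stand in for encoded blocks, so a
"fill" is just a counter bump. It only illustrates advancing block by block
without skip data while the freqs buffer lags behind the docs buffer.

```java
/** Hypothetical in-memory stand-in for a fixed-block postings list (not Lucene API). */
final class BlockDocsSketch {
  static final int NO_MORE_DOCS = Integer.MAX_VALUE;

  private final int[][] docBlocks;   // each block: sorted, non-empty doc IDs
  private final int[][] freqBlocks;  // parallel to docBlocks
  private int[] docs;                // "decoded" docs buffer
  private int[] freqs;               // "decoded" freqs buffer, may lag behind docs
  private int block = -1;            // block currently held in the docs buffer
  private int freqBlock = -1;        // block currently held in the freqs buffer
  private int idx = 0;               // position inside the current block
  private int doc = -1;
  int docFills = 0;                  // counters that show how much work was done
  int freqFills = 0;

  BlockDocsSketch(int[][] docBlocks, int[][] freqBlocks) {
    this.docBlocks = docBlocks;
    this.freqBlocks = freqBlocks;
  }

  /** Stands in for decoding the next docs block; freqs are deliberately left alone. */
  private boolean fillDocs() {
    if (block + 1 >= docBlocks.length) {
      return false;
    }
    block++;
    docs = docBlocks[block];
    docFills++;
    idx = 0;
    return true;
  }

  /** Moves to the first doc >= target without skip data; assumes target > current doc. */
  int advance(int target) {
    if (doc == NO_MORE_DOCS) {
      return NO_MORE_DOCS;
    }
    if (docs == null && !fillDocs()) {
      return doc = NO_MORE_DOCS;
    }
    // Pass over whole blocks whose last doc is below the target.
    while (docs[docs.length - 1] < target) {
      if (!fillDocs()) {
        return doc = NO_MORE_DOCS;
      }
    }
    // The target is now known to be inside this block: scan to it.
    while (docs[idx] < target) {
      idx++;
    }
    return doc = docs[idx];
  }

  /** Fills freqs lazily; only valid after advance() returned a real doc. */
  int freq() {
    if (freqBlock != block) {
      freqs = freqBlocks[block];
      freqBlock = block;
      freqFills++;
    }
    return freqs[idx];
  }
}
```

With four blocks of four docs each, advancing from the first block to a doc in
the last block costs four doc fills, but a freq() call afterwards triggers only
one freq fill, since the freq blocks passed over on the way were never decoded.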
As part of this, I think we should tighten the API for the bulk postings
jump: it should require the current doc (the old enums knew this implicitly)
to allow for different jump implementations. For positions, I think it's at
least fair to require the caller to pass in the pending positions count.
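
As a rough illustration of why the current doc and the pending positions
count matter, consider the sketch below. It is not the bulk postings branch
API: JumpApiSketch, Strategy, choose() and positionsToSkip() are hypothetical
names, and blockSize and maxScanBlocks are assumed tuning knobs. Knowing the
current doc lets an implementation pick a strategy from the doc distance, and
knowing the pending positions lets it work out how many positions to discard.

```java
/** Hypothetical helper showing what a jump implementation could do with caller state. */
final class JumpApiSketch {
  enum Strategy { SCAN_IN_BLOCK, SCAN_NEARBY_BLOCKS, USE_SKIP_DATA }

  /**
   * Picks a strategy from the distance between the current doc and the target.
   * docsLeftInBlock counts the docs after the current one in the current block.
   */
  static Strategy choose(int currentDoc, int target, int docsLeftInBlock,
                         int blockSize, int maxScanBlocks) {
    long distance = (long) target - currentDoc;
    // Doc IDs strictly increase, so the last doc in the block is at least
    // currentDoc + docsLeftInBlock: a target this close cannot leave the block.
    if (distance <= docsLeftInBlock) {
      return Strategy.SCAN_IN_BLOCK;
    }
    // Heuristic (an assumption): within a few blocks' worth of doc IDs,
    // scanning block by block is taken to be cheaper than using skip data.
    if (distance <= (long) blockSize * maxScanBlocks) {
      return Strategy.SCAN_NEARBY_BLOCKS;
    }
    return Strategy.USE_SKIP_DATA;
  }

  /**
   * Number of positions to discard when moving from the doc at fromIdx to the
   * doc at toIdx: the unread positions of the current doc plus all positions
   * of the docs strictly in between.
   */
  static long positionsToSkip(int[] freqs, int fromIdx, int toIdx, int pendingPositions) {
    long skip = pendingPositions;
    for (int i = fromIdx + 1; i < toIdx; i++) {
      skip += freqs[i];
    }
    return skip;
  }
}
```

Without the current doc, choose() has nothing to measure the distance from,
and without the pending count, positionsToSkip() cannot tell how much of the
current doc's positions the caller has already consumed.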