[
https://issues.apache.org/jira/browse/LUCENE-6545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14580802#comment-14580802
]
Hoss Man commented on LUCENE-6545:
----------------------------------
Some relevant comments from rmuir in the original issue...
bq. If I disable the ord-sharing optimization in DocTermOrds, all 3 seeds pass.
So I think there is a bug in e.g. the FixedGap/BlockTerms dictionary or something
like that. Maybe BasePostingsFormatTestCase does not adequately exercise
methods like size()/ord()/seek(ord). It should be failing!
bq. Is the problem the "extra" terms introduced by precisionStep? Maybe crank
precisionStep down and see if expected/actual change. Maybe the current
optimization is unsafe in that case and yields a bogus valueCount including the
range terms, which screws things up down the road.
bq. Now we know: it's that this DocTermOrds optimization is conceptually broken
with precisionStep. This just causes problems downstream, but it's not filtering
out the "range terms", and that is the root cause. It cannot return the terms
dict directly; it needs to wrap it with something that filters those out.
Methods like NumericUtils.intTerms()/longTerms() are close, but those currently
do not yet support ord() and seek(ord), which are needed here.
{quote}
1) DocTermOrds has an optimization in case the terms dictionary supports
ord(). It's broken if you are filtering out a subset of the terms, because it
just passes the entire TermsEnum. Note this optimization never happens, except
for a few oddball terms dicts we have, which support ord(). That's why it fails
with them.
2) Those oddball terms dicts are just fine. Nothing is wrong with them; it's
DocTermOrds that does the wrong thing.
3) I do not have an opinion on the optimization. It's probably easy to fix, but
I would just disable it as you suggest for now, since it only impacts tests or
someone who explicitly uses one of these term dictionaries with this
functionality.
{quote}
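To make the "wrap it with something that filters those out" idea concrete, here is a minimal, self-contained sketch. It does not use any Lucene classes: the class FilteredOrdSketch, its methods, and the values/shifts arrays are all hypothetical stand-ins for an ord-capable terms dictionary, with shift == 0 assumed to mark a full-precision term and anything else a "range term". It only illustrates that a filtering wrapper needs its own size() and ord mapping instead of passing the underlying ones through.

```java
import java.util.Arrays;

/**
 * Hypothetical sketch (not Lucene code): a filtered view over an
 * ord-capable terms dictionary that hides the "range terms".
 */
public class FilteredOrdSketch {
    private final long[] values; // underlying terms, indexed by underlying ord
    private final int[] ordMap;  // filtered ord -> underlying ord (ascending)

    /** shifts[i] == 0 is assumed to mean "full-precision term"; others are range terms. */
    public FilteredOrdSketch(long[] values, int[] shifts) {
        if (values.length != shifts.length) {
            throw new IllegalArgumentException("values and shifts must have the same length");
        }
        this.values = values;
        int[] tmp = new int[shifts.length];
        int n = 0;
        for (int ord = 0; ord < shifts.length; ord++) {
            if (shifts[ord] == 0) {
                tmp[n++] = ord; // keep only full-precision terms
            }
        }
        this.ordMap = Arrays.copyOf(tmp, n);
    }

    /** Number of terms visible through the filter (the value count a caller should see). */
    public int size() {
        return ordMap.length;
    }

    /** Analogue of seek(ord): look up a term by its filtered ordinal. */
    public long termAt(int filteredOrd) {
        return values[ordMap[filteredOrd]];
    }

    /** Analogue of ord(): filtered ordinal of an underlying ord, or negative if it is hidden. */
    public int ordOf(int underlyingOrd) {
        return Arrays.binarySearch(ordMap, underlyingOrd);
    }

    public static void main(String[] args) {
        long[] values = {5, 9, 12, 0, 8}; // last two entries play the role of range terms
        int[] shifts = {0, 0, 0, 4, 4};
        FilteredOrdSketch filtered = new FilteredOrdSketch(values, shifts);
        System.out.println("unfiltered size = " + values.length);
        System.out.println("filtered size   = " + filtered.size());
        for (int ord = 0; ord < filtered.size(); ord++) {
            System.out.println("ord " + ord + " -> " + filtered.termAt(ord));
        }
    }
}
```

Passing the unfiltered dictionary straight through corresponds to reporting values.length as the value count, which is the bogus valueCount described above; the wrapper reports only the terms that survive the filter.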
> optimize DocTermOrds in cases where the underlying TermEnum being wrapped
> supports ord()
> ---------------------------------------------------------------------------------------
>
> Key: LUCENE-6545
> URL: https://issues.apache.org/jira/browse/LUCENE-6545
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Hoss Man
>
> Prior to LUCENE-6529, DocTermOrds had an optimization when the TermEnum of
> the field being uninverted already supported ord().
> This optimization was removed in LUCENE-6529 (see r1684704) because it was
> found to produce incorrect results for numeric fields that had a
> precisionStep.
> This issue is to track the possibility of re-adding a correct version of this
> optimization.