[ 
https://issues.apache.org/jira/browse/LUCENE-6545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14580802#comment-14580802
 ] 

Hoss Man commented on LUCENE-6545:
----------------------------------

Some relevant comments from rmuir in the original issue...

bq. If I disable the ord-sharing optimization in DocTermOrds, all 3 seeds pass. 
So I think there is a bug in e.g. the FixedGap/BlockTerms dictionary or something 
like that. Maybe BasePostingsFormatTestCase does not adequately exercise 
methods like size()/ord()/seek(ord). It should be failing!

bq. Is the problem the "extra" terms introduced by precisionStep? Maybe crank 
precisionStep down and see if expected/actual change. Maybe the current 
optimization is unsafe in that case and yields a bogus valueCount including the 
range terms, which screws things up down the road.
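
To make the "bogus valueCount" concern concrete, here is a minimal sketch in plain Java. It is a toy model, not Lucene code: the class name, the `shift:prefix` string encoding, and the helper methods are all hypothetical. It assumes only that a 64-bit value is indexed once per shift 0, precisionStep, 2*precisionStep, ... below 64, so the term dictionary holds more entries than there are distinct values.

```java
import java.util.Set;
import java.util.TreeSet;

// Toy model (not Lucene API): counts the terms a trie-encoded numeric field
// would put in the dictionary versus the number of real values.
final class PrecisionStepTermCount {

    // One term per shift (0, step, 2*step, ... < 64) for every value.
    // The "shift:prefix" string encoding is an illustrative assumption.
    static Set<String> indexedTerms(long[] values, int precisionStep) {
        Set<String> terms = new TreeSet<>();
        for (long v : values) {
            for (int shift = 0; shift < 64; shift += precisionStep) {
                terms.add(shift + ":" + (v >>> shift));
            }
        }
        return terms;
    }

    // Only shift-0 terms carry a full-precision value; the rest are "range terms".
    static long fullPrecisionCount(Set<String> terms) {
        return terms.stream().filter(t -> t.startsWith("0:")).count();
    }

    public static void main(String[] args) {
        Set<String> terms = indexedTerms(new long[] {1L, 2L, 3L}, 16);
        System.out.println("terms in dictionary: " + terms.size());            // 6
        System.out.println("real values:         " + fullPrecisionCount(terms)); // 3
    }
}
```

In this model, anything that reports the size of the unfiltered dictionary as the number of values overcounts by the number of range terms, and lowering precisionStep widens the gap.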

bq. Now we know: it's that this DocTermOrds optimization is conceptually broken 
with precisionStep. This just causes problems downstream, but the root cause is 
that it's not filtering out the "range terms". It cannot return the terms 
dict directly; it needs to wrap it with something that filters those out. 
Methods like NumericUtils.intTerms()/longTerms() are close, but those currently 
do not yet support ord() and seek(ord), which are needed here.
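
As a rough illustration of what "wrap it with something that filters those out" would have to provide, here is a sketch in plain Java. It is not Lucene's TermsEnum API: `FilteredOrdView`, `termAt`, and `ordOf` are hypothetical names, and the dictionary is modelled as a sorted list of strings. The point is that size(), ord(), and seek(ord) must all be answered in the filtered ordinal space, not the underlying one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Toy model (not Lucene API): a filtered view over a sorted term list that
// remaps ordinals so callers never see the filtered-out terms.
final class FilteredOrdView {
    private final List<String> allTerms;   // sorted underlying dictionary
    private final int[] keptToUnderlying;  // filtered ord -> underlying ord

    FilteredOrdView(List<String> sortedTerms, Predicate<String> keep) {
        this.allTerms = sortedTerms;
        List<Integer> kept = new ArrayList<>();
        for (int i = 0; i < sortedTerms.size(); i++) {
            if (keep.test(sortedTerms.get(i))) {
                kept.add(i);
            }
        }
        this.keptToUnderlying = kept.stream().mapToInt(Integer::intValue).toArray();
    }

    // Number of terms visible through the filter (the correct value count).
    int size() {
        return keptToUnderlying.length;
    }

    // Analogue of seek(ord): look up a term by its filtered ordinal.
    String termAt(int ord) {
        return allTerms.get(keptToUnderlying[ord]);
    }

    // Analogue of ord(): filtered ordinal of a term, or -1 if absent or filtered out.
    int ordOf(String term) {
        int lo = 0;
        int hi = keptToUnderlying.length - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            int cmp = allTerms.get(keptToUnderlying[mid]).compareTo(term);
            if (cmp == 0) {
                return mid;
            }
            if (cmp < 0) {
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // "r:0" stands in for a range term that must be hidden.
        List<String> dict = Arrays.asList("a", "r:0", "s", "t");
        FilteredOrdView view = new FilteredOrdView(dict, t -> !t.startsWith("r:"));
        System.out.println(view.size());       // 3, not 4
        System.out.println(view.termAt(1));    // s (underlying ord 2)
        System.out.println(view.ordOf("r:0")); // -1
    }
}
```

Building the ord mapping eagerly costs one pass over the dictionary; whether that is acceptable for a real terms dictionary is a separate question this sketch does not address.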

{quote}
1) DocTermOrds has an optimization in case the terms dictionary supports 
ord(). It's broken if you are filtering out a subset of the terms, because it 
just passes the entire TermsEnum. Note this optimization never happens, except 
for a few oddball terms dicts we have, which support ord(). That's why it fails 
with them.
2) Those oddball terms dicts are just fine. Nothing wrong with them; it's 
DocTermOrds that does the wrong thing.
3) I do not have an opinion on the optimization. It's probably easy to fix, but 
I would just disable it as you suggest for now, since it only impacts tests or 
cases where someone explicitly uses one of these term dictionaries with this 
functionality.
{quote}

> optimize DocTermOrds in cases where the underlying TermEnum being wrapped 
> supports ord()
> ---------------------------------------------------------------------------------------
>
>                 Key: LUCENE-6545
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6545
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Hoss Man
>
> Prior to LUCENE-6529, DocTermOrds had an optimization when the TermEnum of 
> the field being Uninverted already supported ord().
> This optimization was removed in LUCENE-6529 (see r1684704) because it was 
> found to produce incorrect results for numeric fields that had a 
> precisionStep.
> This issue is to track the possibility of re-adding a correct version of this 
> optimization.



