[ https://issues.apache.org/jira/browse/LUCENE-8300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16468695#comment-16468695 ]
Jim Ferenczi commented on LUCENE-8300:
--------------------------------------
The name "distinct" is a bit misleading, because what you check here is that
the intervals don't overlap, not that they are distinct. Maybe something like
unorderedNonOverlapping? That's verbose, but easier to understand ;).
Can you also add tests for cases other than repeating the same term? The
new source should be able to find unordered intervals that don't overlap from
Intervals.unordered(Intervals.phrase("the world cup"), Intervals.term("world"),
Intervals.term("cup")), which is different from just finding duplicate intervals
from different sources.
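
To illustrate the kind of case meant here, below is a rough stand-alone sketch
in plain Java. It is not Lucene code and not the patch: the class
NonOverlappingSketch, its Interval type and the methods phrase(),
unorderedNonOverlapping() and matches() are all hypothetical, and a document is
modelled as whitespace-separated tokens. It only shows that the phrase and the
two terms must be satisfied by positions that don't overlap.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model for illustration only; none of these names are Lucene API.
final class NonOverlappingSketch {

    // Closed range of token positions [start, end].
    static final class Interval {
        final int start;
        final int end;
        Interval(int start, int end) { this.start = start; this.end = end; }
        boolean overlaps(Interval o) { return start <= o.end && o.start <= end; }
    }

    // All places where the token sequence `phrase` occurs in `tokens`.
    // A one-token phrase behaves like a term source.
    static List<Interval> phrase(String[] tokens, String... phrase) {
        List<Interval> out = new ArrayList<>();
        for (int i = 0; i + phrase.length <= tokens.length; i++) {
            if (Arrays.equals(Arrays.copyOfRange(tokens, i, i + phrase.length), phrase)) {
                out.add(new Interval(i, i + phrase.length - 1));
            }
        }
        return out;
    }

    // True if one interval can be picked per source so that no two picks overlap.
    static boolean unorderedNonOverlapping(List<List<Interval>> sources) {
        return pick(sources, 0, new ArrayList<>());
    }

    // Simple backtracking over the sources, in any positional order.
    private static boolean pick(List<List<Interval>> sources, int idx, List<Interval> chosen) {
        if (idx == sources.size()) return true;
        for (Interval candidate : sources.get(idx)) {
            boolean clash = false;
            for (Interval c : chosen) {
                if (candidate.overlaps(c)) { clash = true; break; }
            }
            if (clash) continue;
            chosen.add(candidate);
            if (pick(sources, idx + 1, chosen)) return true;
            chosen.remove(chosen.size() - 1);
        }
        return false;
    }

    // Models the combination of phrase "the world cup", term "world", term "cup".
    static boolean matches(String text) {
        String[] tokens = text.split(" ");
        return unorderedNonOverlapping(List.of(
            phrase(tokens, "the", "world", "cup"),
            phrase(tokens, "world"),
            phrase(tokens, "cup")));
    }

    public static void main(String[] args) {
        // Only the phrase is present; "world" and "cup" lie inside it.
        System.out.println(matches("the world cup"));
        // Extra "world" and "cup" occur outside the phrase.
        System.out.println(matches("the world cup was the first world championship with a cup"));
    }
}
```

In this model the first document does not match, because the only "world" and
"cup" positions fall inside the phrase, while the second one does. A real test
would of course go through the actual IntervalsSource and an index.
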
> Add unordered-distinct IntervalsSource
> --------------------------------------
>
> Key: LUCENE-8300
> URL: https://issues.apache.org/jira/browse/LUCENE-8300
> Project: Lucene - Core
> Issue Type: New Feature
> Reporter: Alan Woodward
> Assignee: Alan Woodward
> Priority: Major
> Attachments: LUCENE-8300.patch
>
>
> [~mattweber] pointed out on LUCENE-8196 that {{Intervals.unordered()}}
> doesn't check to see if its subintervals overlap, which means that for
> example {{Intervals.unordered(Intervals.term("a"), Intervals.term("a"))}}
> would match a document with {{a}} appearing only once. This ticket will
> introduce a new function, {{Intervals.unordered_distinct()}}, that ensures
> that all subintervals within an unordered interval do not overlap.
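
The behaviour described above can be sketched in plain Java as follows. This is
an assumption-laden toy model, not the Lucene implementation: OverlapSketch,
Interval, termIntervals(), matchesUnordered() and matchesNonOverlapping() are
hypothetical names, and each term occurrence is a one-token interval.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model, not Lucene code.
final class OverlapSketch {

    // Closed range of token positions [start, end].
    static final class Interval {
        final int start;
        final int end;
        Interval(int start, int end) { this.start = start; this.end = end; }
        boolean overlaps(Interval o) { return start <= o.end && o.start <= end; }
    }

    // Every occurrence of `term` in `tokens`, as a one-token interval.
    static List<Interval> termIntervals(String[] tokens, String term) {
        List<Interval> out = new ArrayList<>();
        for (int i = 0; i < tokens.length; i++) {
            if (tokens[i].equals(term)) {
                out.add(new Interval(i, i));
            }
        }
        return out;
    }

    // No overlap check: each sub-source only needs to produce some interval.
    static boolean matchesUnordered(List<Interval> a, List<Interval> b) {
        return !a.isEmpty() && !b.isEmpty();
    }

    // With overlap check: needs one interval from each sub-source that don't overlap.
    static boolean matchesNonOverlapping(List<Interval> a, List<Interval> b) {
        for (Interval x : a) {
            for (Interval y : b) {
                if (!x.overlaps(y)) return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] once = {"a", "b"};        // "a" appears once
        String[] twice = {"a", "b", "a"};  // "a" appears twice
        System.out.println(matchesUnordered(termIntervals(once, "a"), termIntervals(once, "a")));
        System.out.println(matchesNonOverlapping(termIntervals(once, "a"), termIntervals(once, "a")));
        System.out.println(matchesNonOverlapping(termIntervals(twice, "a"), termIntervals(twice, "a")));
    }
}
```

With a single "a", both sub-sources yield the same interval, so the unchecked
variant accepts the document and the checked variant rejects it; with two
occurrences the checked variant accepts as well.
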