[ https://issues.apache.org/jira/browse/LUCENE-8477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16602195#comment-16602195 ]
Adrien Grand commented on LUCENE-8477:
--------------------------------------
I'd be tempted to just document this behavior for now. I'm afraid that
introducing non-minimized intervals would bring corner cases similar to the
ones we have with spans and sloppy phrase queries.
Rewriting automatically feels a bit wrong, given that we would be replacing an
IntervalsSource with another IntervalsSource that has different matches.
However, this could be implemented on top of intervals in query parsers, by
keeping an intermediate representation of IntervalsSources and pushing
disjunctions to the top.
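To illustrate what pushing disjunctions to the top could look like in a query
parser, here is a rough sketch in plain Java. Everything in it is hypothetical:
Node, Term, Or, Block and Rewriter are invented for this example and are not
Lucene classes. It only rewrites a tree, and it assumes that BLOCK distributes
over OR, which changes the matches compared to a minimizing inner disjunction.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical intermediate representation; none of these types exist in Lucene.
abstract class Node {
    // Disjunction-free alternatives of this node, with every OR pulled to the top.
    abstract List<Node> alternatives();
}

final class Term extends Node {
    final String text;
    Term(String text) { this.text = text; }
    @Override List<Node> alternatives() { return Collections.singletonList(this); }
    @Override public String toString() { return text; }
}

final class Or extends Node {
    final List<Node> clauses;
    Or(Node... clauses) { this.clauses = Arrays.asList(clauses); }
    @Override List<Node> alternatives() {
        // The alternatives of an OR are the alternatives of all of its clauses.
        List<Node> out = new ArrayList<>();
        for (Node clause : clauses) out.addAll(clause.alternatives());
        return out;
    }
    @Override public String toString() {
        return clauses.stream().map(Object::toString)
                .collect(Collectors.joining(" OR ", "(", ")"));
    }
}

final class Block extends Node {
    final List<Node> parts;
    Block(Node... parts) { this.parts = Arrays.asList(parts); }
    @Override List<Node> alternatives() {
        // Cartesian product: one BLOCK per combination of the parts' alternatives.
        List<List<Node>> combos = new ArrayList<>();
        combos.add(new ArrayList<>());
        for (Node part : parts) {
            List<List<Node>> next = new ArrayList<>();
            for (List<Node> prefix : combos) {
                for (Node alt : part.alternatives()) {
                    List<Node> extended = new ArrayList<>(prefix);
                    extended.add(alt);
                    next.add(extended);
                }
            }
            combos = next;
        }
        List<Node> out = new ArrayList<>();
        for (List<Node> combo : combos) out.add(new Block(combo.toArray(new Node[0])));
        return out;
    }
    @Override public String toString() {
        return parts.stream().map(Object::toString)
                .collect(Collectors.joining(", ", "BLOCK(", ")"));
    }
}

final class Rewriter {
    // Returns an equivalent tree whose only disjunction, if any, is at the root.
    static Node pushDisjunctionsUp(Node node) {
        List<Node> alts = node.alternatives();
        return alts.size() == 1 ? alts.get(0) : new Or(alts.toArray(new Node[0]));
    }
}
```

With this sketch, ((a OR (a b)) BLOCK (c)) is rewritten to
(BLOCK(a, c) OR BLOCK(BLOCK(a, b), c)), so each branch can be turned into its
own IntervalsSource and the disjunction is only applied at the top. Note that
the number of alternatives grows multiplicatively with nested disjunctions.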
> Improve handling of inner disjunctions in intervals
> ---------------------------------------------------
>
> Key: LUCENE-8477
> URL: https://issues.apache.org/jira/browse/LUCENE-8477
> Project: Lucene - Core
> Issue Type: New Feature
> Reporter: Alan Woodward
> Priority: Major
>
> The current implementation of the disjunction interval produced by
> {{Intervals.or}} is a direct implementation of the OR operator from the Vigna
> paper. This produces minimal intervals, meaning that (a) is preferred over
> (a b), and (b) also over (a b). This has advantages when it comes to
> counting intervals for scoring, but also has drawbacks when it comes to
> matching. For example, a phrase query for ((a OR (a b)) BLOCK (c)) will not
> match the document (a b c), because (a) will be preferred over (a b), and (a
> c) does not match.
> This ticket is to discuss the best way of dealing with disjunctions.
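The failure described above can be reproduced with a toy model in plain Java.
This is a sketch, not Lucene code: MinimalOrDemo and its methods are invented
for illustration, an interval is just a {start, end} pair of token positions,
and the minimization rule (drop any interval that contains a shorter one) is a
simplified reading of the OR operator, not the actual implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only; an interval is a {start, end} pair of token positions.
final class MinimalOrDemo {

    // All occurrences of the exact phrase `terms` in `doc`.
    static List<int[]> phrase(String[] doc, String... terms) {
        List<int[]> out = new ArrayList<>();
        for (int i = 0; i + terms.length <= doc.length; i++) {
            boolean match = true;
            for (int j = 0; j < terms.length; j++) {
                if (!doc[i + j].equals(terms[j])) { match = false; break; }
            }
            if (match) out.add(new int[] {i, i + terms.length - 1});
        }
        return out;
    }

    // Minimizing OR: union of both sides, minus intervals that contain a shorter one.
    static List<int[]> minimalOr(List<int[]> left, List<int[]> right) {
        List<int[]> all = new ArrayList<>(left);
        all.addAll(right);
        List<int[]> out = new ArrayList<>();
        for (int[] x : all) {
            boolean containsShorter = false;
            for (int[] y : all) {
                boolean inside = x[0] <= y[0] && y[1] <= x[1];
                boolean shorter = (y[1] - y[0]) < (x[1] - x[0]);
                if (inside && shorter) { containsShorter = true; break; }
            }
            if (!containsShorter) out.add(x);
        }
        return out;
    }

    // BLOCK: some interval of `first` is immediately followed by one of `second`.
    static boolean blockMatches(List<int[]> first, List<int[]> second) {
        for (int[] f : first) {
            for (int[] s : second) {
                if (s[0] == f[1] + 1) return true;
            }
        }
        return false;
    }

    // Evaluates ((a OR (a b)) BLOCK (c)) against `doc`.
    static boolean innerDisjunctionMatches(String[] doc) {
        List<int[]> or = minimalOr(phrase(doc, "a"), phrase(doc, "a", "b"));
        return blockMatches(or, phrase(doc, "c"));
    }

    public static void main(String[] args) {
        System.out.println(innerDisjunctionMatches(new String[] {"a", "b", "c"}));
        System.out.println(innerDisjunctionMatches(new String[] {"a", "c"}));
    }
}
```

On the document (a b c) the disjunction only yields the interval for (a) at
position 0, since the interval for (a b) contains it and is dropped. (c) sits at
position 2, not 1, so the block does not match and the first line printed is
false. On the document (a c) the same query matches and the second line is true.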