[ https://issues.apache.org/jira/browse/LUCENE-8196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16597355#comment-16597355 ]
Martin Hermann commented on LUCENE-8196:
----------------------------------------

[~romseygeek]

1) I agree that this might be a solution, but since it differs from the setting of the paper, it should be done very carefully.

2) Internal slop seems like a great idea! You're right, my example wasn't very good, and {{Intervals.phrase()}} already does that. But still, if you think of a bigger query with, e.g., a slop of one (say, {{"a ("big bad" OR evil) wolf", one additional token allowed somewhere}}), the problem remains. I don't really see how 'internal slop' would differ from 'normal slop' (doesn't it measure the exact same thing?), but it seems rather easy to implement, and like something that would be desirable and would solve this issue.

3) I'm not quite sure I understand that correctly. Do you mean using a gap in the query and rewriting it to something like
{noformat}
"bad wolf" (slop 1) contained by "big GAP wolf" (slop 2)
{noformat}
or adding the gap automatically somewhere down the way? I think in the first case it would still be possible to construct some (maybe slightly more complicated) examples that can't be solved like that and where the minimal-interval behaviour does not match intuition.

Again, while a lot of these queries may seem quite exotic, I think that intervals will get used a lot in various programmatically generated queries (as spans are now), and there pretty much anything can happen.
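To make the concern in point 2 concrete, here is a minimal, self-contained sketch of one reading of the {{"a ("big bad" OR evil) wolf"}} example. It does not use Lucene at all: the class {{SlopSketch}}, its methods, and the two sample documents are hypothetical names invented for illustration, and the matcher is a naive first-occurrence scan rather than the minimal-interval algorithm from the paper. It only shows that when the alternatives have different lengths, a single width bound cannot stand in for "one additional token allowed".

```java
public class SlopSketch {

    /**
     * Naive helper (hypothetical, not Lucene code): returns {start, end} of the
     * first in-order occurrence of terms in doc, or null if there is none.
     * It is not a minimal-interval algorithm and does not enforce adjacency.
     */
    static int[] orderedSpan(String[] doc, String[] terms) {
        int pos = 0;
        int start = -1;
        for (int i = 0; i < terms.length; i++) {
            while (pos < doc.length && !doc[pos].equals(terms[i])) {
                pos++;
            }
            if (pos == doc.length) {
                return null; // term i does not occur after the previous term
            }
            if (i == 0) {
                start = pos;
            }
            pos++;
        }
        return new int[] {start, pos - 1};
    }

    /** Slop-style check: extra tokens = span width minus number of query terms. */
    static boolean matchesWithSlop(String[] doc, String[][] alternatives, int maxExtra) {
        for (String[] terms : alternatives) {
            int[] span = orderedSpan(doc, terms);
            if (span != null && (span[1] - span[0] + 1) - terms.length <= maxExtra) {
                return true;
            }
        }
        return false;
    }

    /** Width-style check: one fixed bound on the span width for every alternative. */
    static boolean matchesWithMaxWidth(String[] doc, String[][] alternatives, int maxWidth) {
        for (String[] terms : alternatives) {
            int[] span = orderedSpan(doc, terms);
            if (span != null && span[1] - span[0] + 1 <= maxWidth) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // The disjunction expanded by hand into its two flat alternatives.
        String[][] alternatives = {
            {"a", "big", "bad", "wolf"},
            {"a", "evil", "wolf"}
        };
        String[] doc1 = {"a", "very", "big", "bad", "wolf"};   // one extra token
        String[] doc2 = {"a", "very", "very", "evil", "wolf"}; // two extra tokens

        System.out.println("slop 1,  doc1: " + matchesWithSlop(doc1, alternatives, 1));
        System.out.println("slop 1,  doc2: " + matchesWithSlop(doc2, alternatives, 1));
        System.out.println("width 5, doc1: " + matchesWithMaxWidth(doc1, alternatives, 5));
        System.out.println("width 5, doc2: " + matchesWithMaxWidth(doc2, alternatives, 5));
        System.out.println("width 4, doc1: " + matchesWithMaxWidth(doc1, alternatives, 4));
        System.out.println("width 4, doc2: " + matchesWithMaxWidth(doc2, alternatives, 4));
    }
}
```

Both sample documents produce a span of five positions, so the width check either accepts both (bound 5) or rejects both (bound 4), whereas the slop check accepts only the first. Whether this is exactly the case meant in the comment is an assumption.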

> Add IntervalQuery and IntervalsSource to expose minimum interval semantics across term fields
> ---------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-8196
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8196
>             Project: Lucene - Core
>          Issue Type: New Feature
>            Reporter: Alan Woodward
>            Assignee: Alan Woodward
>            Priority: Major
>             Fix For: 7.4
>
>         Attachments: LUCENE-8196-debug.patch, LUCENE-8196.patch, LUCENE-8196.patch, LUCENE-8196.patch, LUCENE-8196.patch, LUCENE-8196.patch
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> This ticket proposes an alternative implementation of the SpanQuery family
> that uses minimum-interval semantics from
> [http://vigna.di.unimi.it/ftp/papers/EfficientAlgorithmsMinimalIntervalSemantics.pdf]
> to implement positional queries across term-based fields. Rather than using
> TermQueries to construct the interval operators, as in LUCENE-2878 or the
> current Spans implementation, we instead use a new IntervalsSource object,
> which will produce IntervalIterators over a particular segment and field.
> These are constructed using various static helper methods, and can then be
> passed to a new IntervalQuery which will return documents that contain one or
> more intervals so defined.