[ 
https://issues.apache.org/jira/browse/LUCENE-8196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16597355#comment-16597355
 ] 

Martin Hermann commented on LUCENE-8196:
----------------------------------------

[~romseygeek]

1) I agree that this might be a solution, but as it differs from the setting of 
the paper, it should be done very carefully.
 
2) Internal slop seems like a great idea! You're right, my example wasn't very 
good and {{Intervals.phrase()}} already does that. But still, if you think of a 
bigger query with, e.g., a slop of one (say, {{"a ("big bad" OR evil) wolf", one 
additional token allowed somewhere}}), the problem remains. I don't really see 
how 'internal slop' would differ from 'normal slop' (doesn't it measure the 
exact same thing?), but it seems rather easy to implement and like something 
that would be desirable and would solve this issue.
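
To make the {{"a ("big bad" OR evil) wolf"}} case concrete, here is a minimal, self-contained sketch that does not use Lucene at all. It assumes two hypothetical ways of limiting a match: a bound on the total width of the interval, and a bound on the number of unmatched positions inside it. The class and method names are invented for illustration, and whether either bound corresponds to what 'internal slop' would mean in an actual implementation is an assumption.

```java
public class SlopSketch {

    /** Width of the interval [start, end], counted in token positions. */
    public static int width(int start, int end) {
        return end - start + 1;
    }

    /** Positions inside [start, end] that were not matched by any query term. */
    public static int gaps(int start, int end, int matchedTokens) {
        return width(start, end) - matchedTokens;
    }

    /** Hypothetical width-based limit: accept if the whole interval is short enough. */
    public static boolean withinWidth(int start, int end, int maxWidth) {
        return width(start, end) <= maxWidth;
    }

    /** Hypothetical gap-based limit: accept if few enough positions are unmatched. */
    public static boolean withinGaps(int start, int end, int matchedTokens, int maxGaps) {
        return gaps(start, end, matchedTokens) <= maxGaps;
    }

    public static void main(String[] args) {
        // Doc A: "a big bad grey wolf"  -> interval 0..4, matched: a, big, bad, wolf (4)
        // Doc B: "a evil old grey wolf" -> interval 0..4, matched: a, evil, wolf (3)

        // A width bound of 5 is needed to admit Doc A, but it admits Doc B as well,
        // even though Doc B contains two extra tokens.
        System.out.println("width<=5  A: " + withinWidth(0, 4, 5));
        System.out.println("width<=5  B: " + withinWidth(0, 4, 5));

        // A gap bound of 1 treats both alternatives uniformly.
        System.out.println("gaps<=1   A: " + withinGaps(0, 4, 4, 1));
        System.out.println("gaps<=1   B: " + withinGaps(0, 4, 3, 1));
    }
}
```

Under these assumptions, a single width bound cannot express "one additional token allowed" for both branches of the disjunction, because the branches match different numbers of tokens, while a gap count does not depend on which branch matched.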
 
3) I'm not quite sure I understand that correctly. Do you mean using a gap 
in the query and rewriting it to something like
{noformat}
"bad wolf" (slop 1) contained by "big GAP wolf" (slop 2)
{noformat}
or adding the gap automatically somewhere down the line? I think in the first 
case it'd still be possible to construct some (maybe slightly more 
complicated) examples that can't be solved like that and where the 
minimal-interval behaviour does not match intuition.

Again, while a lot of these queries may seem quite exotic, I think that 
intervals will get used a lot in various programmatically generated queries (as 
spans are now), and there pretty much anything can happen.

> Add IntervalQuery and IntervalsSource to expose minimum interval semantics 
> across term fields
> ---------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-8196
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8196
>             Project: Lucene - Core
>          Issue Type: New Feature
>            Reporter: Alan Woodward
>            Assignee: Alan Woodward
>            Priority: Major
>             Fix For: 7.4
>
>         Attachments: LUCENE-8196-debug.patch, LUCENE-8196.patch, 
> LUCENE-8196.patch, LUCENE-8196.patch, LUCENE-8196.patch, LUCENE-8196.patch
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> This ticket proposes an alternative implementation of the SpanQuery family 
> that uses minimum-interval semantics from 
> [http://vigna.di.unimi.it/ftp/papers/EfficientAlgorithmsMinimalIntervalSemantics.pdf]
>  to implement positional queries across term-based fields.  Rather than using 
> TermQueries to construct the interval operators, as in LUCENE-2878 or the 
> current Spans implementation, we instead use a new IntervalsSource object, 
> which will produce IntervalIterators over a particular segment and field.  
> These are constructed using various static helper methods, and can then be 
> passed to a new IntervalQuery which will return documents that contain one or 
> more intervals so defined.



