[jira] [Commented] (LUCENE-8229) Add a method to Weight to retrieve matches for a single document

Alan Woodward (JIRA) Wed, 28 Mar 2018 12:47:18 -0700

    [ 
https://issues.apache.org/jira/browse/LUCENE-8229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16418007#comment-16418007
 ]


Alan Woodward commented on LUCENE-8229:
---------------------------------------

{quote}A caller might want all fields, or perhaps just some
{quote}
I've done it this way to keep the API as simple as possible.  If we start 
iterating over multiple fields then MatchesIterator becomes a lot more 
complicated, and I don't think it gains us anything.  If consumers want the 
matches on multiple fields, they can call Weight.matches() once per field.
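
To illustrate the "one call per field" pattern, here is a minimal, 
self-contained sketch.  The Weight and MatchesIterator interfaces below are 
hypothetical stand-ins written for this example, not the real Lucene classes 
or the signatures in the patch; termWeight() and collect() are likewise 
invented helpers over a toy in-memory "index".

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins, NOT Lucene's real classes: they only model the
// "one document, one field per call" shape discussed in this comment.
class PerFieldMatchesSketch {

    /** Iterates over match positions in a single field of a single document. */
    interface MatchesIterator {
        boolean next();        // advance to the next match; false when exhausted
        int startPosition();
        int endPosition();
    }

    /** Stand-in for Weight: matches for one doc and one field, or null if none. */
    interface Weight {
        MatchesIterator matches(int doc, String field);
    }

    /** Toy single-term weight over a list of docs, each mapping field -> tokens. */
    static Weight termWeight(List<Map<String, String[]>> docs, String term) {
        return (doc, field) -> {
            String[] tokens = docs.get(doc).get(field);
            if (tokens == null) {
                return null;   // the document has no such field
            }
            List<Integer> positions = new ArrayList<>();
            for (int i = 0; i < tokens.length; i++) {
                if (tokens[i].equals(term)) {
                    positions.add(i);
                }
            }
            if (positions.isEmpty()) {
                return null;   // the term does not occur in this field
            }
            return new MatchesIterator() {
                private int idx = -1;
                @Override public boolean next() { return ++idx < positions.size(); }
                @Override public int startPosition() { return positions.get(idx); }
                @Override public int endPosition() { return positions.get(idx); }
            };
        };
    }

    /** A consumer that wants several fields calls matches() once per field. */
    static Map<String, List<Integer>> collect(Weight w, int doc, String... fields) {
        Map<String, List<Integer>> out = new LinkedHashMap<>();
        for (String field : fields) {
            MatchesIterator it = w.matches(doc, field);
            if (it == null) {
                continue;      // nothing matched in this field
            }
            List<Integer> starts = new ArrayList<>();
            while (it.next()) {
                starts.add(it.startPosition());
            }
            out.put(field, starts);
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String[]> doc0 = Map.of(
            "title", new String[] {"lucene", "matches"},
            "body", new String[] {"weight", "matches", "and", "matches"});
        Weight w = termWeight(List.of(doc0), "matches");
        System.out.println(collect(w, 0, "title", "body", "missing"));
    }
}
```

The iterator itself never has to know about more than one field; the loop 
over fields lives entirely in the caller.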

Re payloads, I think of them as a search-time feature, and not really relevant 
here.  Let's keep this API focussed.

I have tried putting something similar to the MatchesIterator on Scorer, but it 
doesn't really fit.  Scorers are designed to iterate over matching documents 
very efficiently, and lots of them have optimizations which mean that positions 
and/or offsets aren't actually available: for example, things like 
TermInSetQuery or AutomatonQuery get rewritten to bitsets, disjunctions can use 
bulk scorers, and the query cache can intercept things.  Weight, on the other 
hand, already has explain(), which has similar semantics to this: useful 
information that you might sometimes want for your TopDocs, but not something 
you want to run against every matching document.  And if anything, there are 
more Scorer implementations than Weights, so it would be a more invasive change.

> Add a method to Weight to retrieve matches for a single document
> ----------------------------------------------------------------
>
>                 Key: LUCENE-8229
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8229
>             Project: Lucene - Core
>          Issue Type: New Feature
>            Reporter: Alan Woodward
>            Assignee: Alan Woodward
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> The ability to find out exactly what a query has matched on is a fairly 
> frequent feature request, and would also make highlighters much easier to 
> implement.  There have been a few attempts at doing this, including adding 
> positions to Scorers, or re-writing queries as Spans, but these all either 
> compromise general performance or involve up-front knowledge of all queries.
> Instead, I propose adding a method to Weight that exposes an iterator over 
> matches in a particular document and field.  It should be used in a similar 
> manner to explain(), i.e. just for TopDocs, not as part of the scoring loop, 
> which relieves some of the pressure on performance.



