[jira] [Commented] (LUCENE-8229) Add a method to Weight to retrieve matches for a single document

David Smiley (JIRA) Thu, 05 Apr 2018 09:17:31 -0700

    [ 
https://issues.apache.org/jira/browse/LUCENE-8229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16427185#comment-16427185
 ]


David Smiley commented on LUCENE-8229:
--------------------------------------

It's really looking great, Alan.  I looked over your patch a bit more...

* I wonder if "Matches" sounds too generic; perhaps "PositionMatches" to 
emphasize it has position information and not simply matching document IDs?
* It's a shame that every Weight must implement this (no default impl) because 
even a no-match response requires knowledge of the field.  Is it important for 
a no-match response to know the field?  I suppose it might be useful for 
figuring out generically which fields a query references... but no, not 
really, because you have to execute it on a matching document first to even 
figure that out with this API.
* Matcher.EMPTY (an empty version of MatchesIterator) should perhaps be moved to 
MatchesIterator?  Come to think of it, maybe MatchesIterator could be 
Matches.Iterator (inner class of Matches)?  (avoids polluting the busy .search 
namespace).
* RE payloads: I appreciate you want to keep things simple for now.  I've heard 
of putting OCR document offset information in them, for example, and a 
highlighter might want this.  A highlighter might want whatever metadata is 
being put in a payload, even if it is relevancy oriented -- consider a 
relevancy debugger tool that could show you what's in the payload.  This might 
not even be a "highlighter" per se.
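
To make the nesting suggestion concrete, here is a rough, self-contained 
sketch.  It does not use the classes from the patch: PositionMatches, 
SingleFieldMatches, and every method name below are hypothetical stand-ins, 
and the real API may well look different.  It only shows the iterator as an 
inner type that carries its own EMPTY constant, and a no-match response 
returning that constant.

```java
/** Hypothetical stand-in for the patch's Matches class; not the actual Lucene API. */
interface PositionMatches {

  /** Returns the matches in the given field, or Iterator.EMPTY if there are none. */
  Iterator iterator(String field);

  /** Nested so that it does not take up a top-level name in the package. */
  interface Iterator {
    /** Advances to the next match; returns false once exhausted. */
    boolean next();

    /** Start position of the current match; only valid after next() returned true. */
    int startPosition();

    /** End position of the current match; only valid after next() returned true. */
    int endPosition();

    /** Shared empty iterator that never positions on a match. */
    Iterator EMPTY = new Iterator() {
      @Override public boolean next() { return false; }
      @Override public int startPosition() { throw new IllegalStateException("no current match"); }
      @Override public int endPosition() { throw new IllegalStateException("no current match"); }
    };
  }
}

/** Toy implementation (an assumption for illustration) holding matches for one field. */
final class SingleFieldMatches implements PositionMatches {
  private final String field;
  private final int[][] spans; // each entry is {startPosition, endPosition}

  SingleFieldMatches(String field, int[][] spans) {
    this.field = field;
    this.spans = spans;
  }

  @Override
  public PositionMatches.Iterator iterator(String requestedField) {
    // A no-match response is just the shared constant.
    if (!field.equals(requestedField) || spans.length == 0) {
      return PositionMatches.Iterator.EMPTY;
    }
    return new PositionMatches.Iterator() {
      private int idx = -1; // -1 means "not yet positioned"

      @Override public boolean next() {
        if (idx < spans.length) {
          idx++;
        }
        return idx < spans.length;
      }
      @Override public int startPosition() { return spans[idx][0]; }
      @Override public int endPosition() { return spans[idx][1]; }
    };
  }
}
```

A payload accessor could be added to the inner interface later without 
touching the outer type, which is part of why the nesting appeals to me.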

> Add a method to Weight to retrieve matches for a single document
> ----------------------------------------------------------------
>
>                 Key: LUCENE-8229
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8229
>             Project: Lucene - Core
>          Issue Type: New Feature
>            Reporter: Alan Woodward
>            Assignee: Alan Woodward
>            Priority: Major
>         Attachments: LUCENE-8229.patch
>
>          Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> The ability to find out exactly what a query has matched on is a fairly 
> frequent feature request, and would also make highlighters much easier to 
> implement.  There have been a few attempts at doing this, including adding 
> positions to Scorers, or re-writing queries as Spans, but these all either 
> compromise general performance or involve up-front knowledge of all queries.
> Instead, I propose adding a method to Weight that exposes an iterator over 
> matches in a particular document and field.  It should be used in a similar 
> manner to explain() - i.e., just for TopDocs, not as part of the scoring loop, 
> which relieves some of the pressure on performance.



