[ https://issues.apache.org/jira/browse/LUCENE-8229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16417849#comment-16417849 ]
David Smiley commented on LUCENE-8229:
--------------------------------------

This is really interesting [~romseygeek]!  Here's your proposed signature:
{{public MatchesIterator matches(LeafReaderContext context, int doc, String field) throws IOException}}

* I'm unsure about this new {{matches}} method requiring a field reference, thus insisting that all fields in the query match the field in this argument.  A caller might want all fields, or perhaps just some.  The argument could easily be converted to a {{Predicate<String>}} that matches the field.
* Add payloads to {{MatchesIterator}}.
* Perhaps {{matches}} should take an int for the PostingsEnum flags.  This way the caller could choose to ask for offsets and/or payloads.  Or maybe just always get both to keep the API simpler, assuming the perf difference is negligible for practical uses of this feature (which sounds plausible to me).  The flags could be added later if desired.  Yeah, let's not do that now, then.

Have you considered a very different approach of modifying Scorer to expose more information about the matches in a document?  I'm just thinking out loud here; might be a bad idea ;-).  Maybe I'm saying the same thing as "adding positions to Scorers" as you reference in the description, but maybe it could hang off indirectly using the {{MatchesIterator}} you developed here.  Your proposed {{Weight.matches(...)}} is a visitor-like thing and we already have Scorer doing that.  Lots of Weight classes would have to be modified; I wonder if it's less invasive at the Scorer?  Hmm.
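
To make the {{Predicate<String>}} suggestion concrete, here is a minimal, self-contained sketch. It is not Lucene code: {{FieldFilterSketch}} and {{matchingFields}} are hypothetical names made up for illustration, and the real {{matches}} signature may well end up different. It only shows how one predicate argument could cover the "one field", "all fields", and "some fields" cases.

```java
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical stand-in, not Lucene code: shows how a Predicate<String>
// could generalize the single "String field" argument.
class FieldFilterSketch {

  // Returns the query's fields accepted by the filter, in their original order.
  static List<String> matchingFields(List<String> queryFields, Predicate<String> fieldFilter) {
    return queryFields.stream().filter(fieldFilter).collect(Collectors.toList());
  }

  public static void main(String[] args) {
    List<String> queryFields = List.of("title", "body", "tags");

    // One field: equivalent to passing a single field name.
    System.out.println(matchingFields(queryFields, "body"::equals));
    // All fields.
    System.out.println(matchingFields(queryFields, f -> true));
    // Just some fields.
    System.out.println(matchingFields(queryFields, Set.of("title", "tags")::contains));
  }
}
```

A caller that only cares about one field loses nothing, since {{"body"::equals}} behaves like the current {{String field}} argument.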

> Add a method to Weight to retrieve matches for a single document
> ----------------------------------------------------------------
>
>                 Key: LUCENE-8229
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8229
>             Project: Lucene - Core
>          Issue Type: New Feature
>            Reporter: Alan Woodward
>            Assignee: Alan Woodward
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> The ability to find out exactly what a query has matched on is a fairly
> frequent feature request, and would also make highlighters much easier to
> implement.  There have been a few attempts at doing this, including adding
> positions to Scorers, or re-writing queries as Spans, but these all either
> compromise general performance or involve up-front knowledge of all queries.
> Instead, I propose adding a method to Weight that exposes an iterator over
> matches in a particular document and field.  It should be used in a similar
> manner to explain() - ie, just for TopDocs, not as part of the scoring loop,
> which relieves some of the pressure on performance.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)