[ https://issues.apache.org/jira/browse/LUCENE-8229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16418708#comment-16418708 ]
Jim Ferenczi commented on LUCENE-8229:
--------------------------------------

I like the proposal here. For simple queries it makes the extraction of matched positions trivial. However, I wonder how complex queries would handle this. For instance, AutomatonQuery cannot just return an enum over all matching terms; highlighters already handle this query specially to avoid term explosion. What is your current plan for handling this query? Should it return null for simplicity, or should it try to expand the automaton with a limit on the number of terms? I prefer the former, which is safe. If users want to check the matching of a complex automaton, they can use a MemoryIndex for each top document and change the query to use the rewrite method that builds a boolean query.

> Add a method to Weight to retrieve matches for a single document
> ----------------------------------------------------------------
>
>                 Key: LUCENE-8229
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8229
>             Project: Lucene - Core
>          Issue Type: New Feature
>            Reporter: Alan Woodward
>            Assignee: Alan Woodward
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> The ability to find out exactly what a query has matched on is a fairly
> frequent feature request, and would also make highlighters much easier to
> implement. There have been a few attempts at doing this, including adding
> positions to Scorers, or re-writing queries as Spans, but these all either
> compromise general performance or involve up-front knowledge of all queries.
> Instead, I propose adding a method to Weight that exposes an iterator over
> matches in a particular document and field. It should be used in a similar
> manner to explain() - ie, just for TopDocs, not as part of the scoring loop,
> which relieves some of the pressure on performance.
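
To make the two options raised in the comment concrete, here is a rough sketch of "expand with a limit on the number of terms, otherwise return null". It deliberately does not use any Lucene classes: the term dictionary is a plain sorted set, a `java.util.regex.Pattern` stands in for the automaton, and the names `BoundedExpansion`, `expand` and `maxTerms` are made up for illustration. It is not how Lucene or its highlighters implement this.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;
import java.util.regex.Pattern;

// Hypothetical illustration only; none of these names exist in Lucene.
final class BoundedExpansion {

    /**
     * Collects the dictionary terms accepted by the pattern.
     * Returns null as soon as more than maxTerms terms match, which mirrors
     * the "return null for simplicity" fallback discussed in the comment.
     */
    static List<String> expand(SortedSet<String> termDictionary, Pattern pattern, int maxTerms) {
        List<String> matched = new ArrayList<>();
        for (String term : termDictionary) {
            if (pattern.matcher(term).matches()) {
                if (matched.size() == maxTerms) {
                    return null; // too many terms: give up instead of exploding
                }
                matched.add(term);
            }
        }
        return matched;
    }

    public static void main(String[] args) {
        SortedSet<String> dict = new TreeSet<>(Arrays.asList("lucene", "lucid", "luck", "solr"));
        // Three terms match "lu.*", so a limit of 2 yields null and a limit of 3 yields the list.
        System.out.println(expand(dict, Pattern.compile("lu.*"), 2));
        System.out.println(expand(dict, Pattern.compile("lu.*"), 3));
    }
}
```

A caller receiving null would then know that match positions are unavailable for that query and could fall back to another strategy, such as the per-document MemoryIndex approach mentioned in the comment.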
--
This message was sent by Atlassian JIRA (v7.6.3#76005)