[jira] [Commented] (LUCENE-4825) PostingsHighlighter support for positional queries

Robert Muir (JIRA) Tue, 12 Mar 2013 13:01:14 -0700

    [ 
https://issues.apache.org/jira/browse/LUCENE-4825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13600404#comment-13600404
 ]


Robert Muir commented on LUCENE-4825:
-------------------------------------

I think it supports positional queries, just in a different way. 

I don't really like the way the standard Highlighter does this myself. I would 
prefer that this highlighter avoid the slow work the others do (because we 
already have other ones that do that). This one instead puts more effort into 
summarizing the document with respect to the query terms (which is faster and, 
for some cases, a better use of CPU time).

I think a good improvement would be to let the proximity of terms within 
passages influence the scoring. It's not necessary to actually gather anything 
about the query to do this, it wouldn't be confusing, and it would still 
support all queries that support extractTerms().
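
To make the proximity idea concrete, here is a rough sketch. Everything in it 
is an assumption for illustration: the class name, the method, and the 1/gap 
formula are hypothetical and are not PostingsHighlighter code. It only looks at 
which query term matched at which position inside a passage, so it needs 
nothing from the query beyond its terms.

```java
/** Hypothetical sketch; not part of Lucene's PostingsHighlighter. */
final class ProximityScoreSketch {

    /**
     * Computes a multiplier for a passage score from term proximity.
     *
     * @param termIds   index of the query term matched at each hit, in position order
     * @param positions token position of each hit within the passage, ascending
     * @return a boost >= 1.0 that grows when different query terms sit close together
     */
    static double proximityBoost(int[] termIds, int[] positions) {
        if (termIds.length != positions.length) {
            throw new IllegalArgumentException("termIds and positions must have the same length");
        }
        double boost = 1.0;
        for (int i = 1; i < termIds.length; i++) {
            // Only adjacent hits of *different* terms count; repeats of one term add nothing.
            if (termIds[i] != termIds[i - 1]) {
                int gap = positions[i] - positions[i - 1];
                // Assumed formula: closer pairs contribute more, capped at 1.0 per pair.
                boost += 1.0 / Math.max(1, gap);
            }
        }
        return boost;
    }
}
```

A passage scorer could multiply its existing term-based score by this boost, so 
a passage where two query terms are adjacent ranks above one where they are ten 
tokens apart, without knowing whether the query was a phrase, span, or boolean 
query.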

On the other hand, we can always create variants of this highlighter that do as 
you suggest, so that users have more choices. But I would just prefer we don't 
try to force PostingsHighlighter to work like the other highlighters, for the 
reasons I mentioned.

                
> PostingsHighlighter support for positional queries
> --------------------------------------------------
>
>                 Key: LUCENE-4825
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4825
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/highlighter
>    Affects Versions: 4.2
>            Reporter: Luca Cavanna
>
> I've been playing around with the brand new PostingsHighlighter. I'm really 
> happy with the result in terms of quality of the snippets and performance.
> On the other hand, I noticed it doesn't support positional queries. If you 
> make a span query, for example, all the single terms will be highlighted, 
> even though they haven't contributed to the match. That reminds me of the 
> difference between the QueryTermScorer and the QueryScorer (using the 
> standard Highlighter).
> I've been trying to adapt what the QueryScorer does, especially the 
> extraction of the query terms together with their positions (what 
> WeightedSpanTermExtractor does). Next step would be to take that information 
> into account within the formatter and highlight only the terms that actually 
> contributed to the match. I'm not quite ready yet with a patch to contribute 
> this back, but I certainly intend to do so. That's why I opened the issue and 
> in the meantime I would like to hear what you guys think about it and  
> discuss how best we can fix it. I think it would be a big improvement for 
> this new highlighter, which is already great!

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

