[
https://issues.apache.org/jira/browse/LUCENE-4825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13600918#comment-13600918
]
Luca Cavanna commented on LUCENE-4825:
--------------------------------------
Hey Robert,
sorry, but I don't quite understand why it would become an orange. :)
I mean, the PostingsHighlighter does (among other things) two great things:
1) it reads offsets from the postings list, as its name says
2) it summarizes the content, giving nice sentences as output
I think these two features are a great improvement and pretty much what
everybody would like to have!
I'm proposing to add support for positional queries as a third, optional
feature. We would need to read the spans from the positional queries in order
to highlight only the terms that are part of a match; otherwise the output is
wrong from a user's perspective. Would this make it that much slower? I don't
mean reanalyzing the text...
Don't get me wrong, you must be right, but I would like to understand more.
Are you saying that instead of adding 3) to 1) and 2) we should have another
highlighter that does 1), 2) and 3)?
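
To make the difference concrete, here is a minimal sketch of term-only
highlighting versus position-aware highlighting. It is plain Java and does not
use any Lucene API: the class PositionalHighlightSketch, its methods, and the
"within maxDistance positions" rule are all hypothetical stand-ins, and the
proximity check is a simplification rather than the actual span or slop
semantics.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration only; none of these names exist in Lucene.
final class PositionalHighlightSketch {

    /** Term-only highlighting: marks every occurrence of either term, ignoring positions. */
    static String highlightAllTerms(String[] tokens, String first, String second) {
        Set<Integer> positions = new TreeSet<>();
        for (int i = 0; i < tokens.length; i++) {
            if (tokens[i].equals(first) || tokens[i].equals(second)) {
                positions.add(i);
            }
        }
        return render(tokens, positions);
    }

    /**
     * Position-aware highlighting: marks an occurrence only if the other term
     * occurs within maxDistance token positions of it (in either order).
     * This is a simplified proximity rule, assumed for the example.
     */
    static String highlightNearMatches(String[] tokens, String first, String second, int maxDistance) {
        Set<Integer> positions = new TreeSet<>();
        for (int i = 0; i < tokens.length; i++) {
            if (!tokens[i].equals(first)) {
                continue;
            }
            for (int j = 0; j < tokens.length; j++) {
                if (i != j && tokens[j].equals(second) && Math.abs(i - j) <= maxDistance) {
                    positions.add(i);
                    positions.add(j);
                }
            }
        }
        return render(tokens, positions);
    }

    /** Joins the tokens with spaces, wrapping the selected positions in bold tags. */
    private static String render(String[] tokens, Set<Integer> positions) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < tokens.length; i++) {
            if (i > 0) {
                out.append(' ');
            }
            if (positions.contains(i)) {
                out.append("<b>").append(tokens[i]).append("</b>");
            } else {
                out.append(tokens[i]);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String[] tokens = "the quick fox ran and later a quick dog saw a fox".split(" ");
        System.out.println(highlightAllTerms(tokens, "quick", "fox"));
        System.out.println(highlightNearMatches(tokens, "quick", "fox", 1));
    }
}
```

For the sample text, the first call marks both occurrences of "quick" and both
occurrences of "fox", while the second marks only the adjacent "quick fox"
pair. The second behaviour is what I would expect a user to want from a
positional query.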
> PostingsHighlighter support for positional queries
> --------------------------------------------------
>
> Key: LUCENE-4825
> URL: https://issues.apache.org/jira/browse/LUCENE-4825
> Project: Lucene - Core
> Issue Type: Improvement
> Components: modules/highlighter
> Affects Versions: 4.2
> Reporter: Luca Cavanna
>
> I've been playing around with the brand new PostingsHighlighter. I'm really
> happy with the result in terms of quality of the snippets and performance.
> On the other hand, I noticed it doesn't support positional queries. If you
> run a span query, for example, every occurrence of each query term is
> highlighted, even those that didn't contribute to the match. That reminds me
> of the difference between the QueryTermScorer and the QueryScorer (using the
> standard Highlighter).
> I've been trying to adapt what the QueryScorer does, especially the
> extraction of the query terms together with their positions (what
> WeightedSpanTermExtractor does). Next step would be to take that information
> into account within the formatter and highlight only the terms that actually
> contributed to the match. I'm not quite ready yet with a patch to contribute
> this back, but I certainly intend to do so. That's why I opened the issue and
> in the meantime I would like to hear what you guys think about it and
> discuss how best we can fix it. I think it would be a big improvement for
> this new highlighter, which is already great!