[
https://issues.apache.org/jira/browse/LUCENE-794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12611453#action_12611453
]
Tavi Nathanson commented on LUCENE-794:
---------------------------------------
Hey everyone,
I'm having some trouble getting SpanScorer to act the way I'd like for proper
highlighting, and I'm wondering if anyone has any suggestions.
I have two fields: text_raw and text_stemmed. text_raw, as the name suggests,
stores unstemmed (tokenized) text while text_stemmed stores stemmed (tokenized)
text.
I have queries that search across both fields. For example, I may have the query
+(text_raw:"apple sauce" text_stemmed:orange). This query matches "apple sauce
oranges" but it does not match "apples sauces orange" (because "apple sauce" is
not stemmed). I'd like to be able to highlight accordingly: I want "apple,"
"sauce," and "oranges" to all be highlighted.
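To make the desired output concrete, here is a minimal sketch in plain Java. It
does not use the Highlighter or SpanScorer API at all: the names toyStem,
tokensToHighlight, and highlight are hypothetical, and the trailing-"s" stemmer
is only an assumed stand-in for whatever analyzer produces text_stemmed. It
simply marks raw tokens that are part of an exact phrase occurrence, plus raw
tokens whose stem equals the stemmed term.

```java
import java.util.Arrays;
import java.util.List;

class MultiFieldHighlightSketch {

    // Assumption: a toy stemmer that lower-cases and strips one trailing "s".
    // A real setup would reuse the analyzer behind text_stemmed instead.
    static String toyStem(String token) {
        String t = token.toLowerCase();
        return (t.length() > 1 && t.endsWith("s")) ? t.substring(0, t.length() - 1) : t;
    }

    // Flags each raw token that should be highlighted:
    //  - tokens inside an exact, unstemmed occurrence of rawPhrase (the text_raw clause)
    //  - tokens whose toy stem equals stemmedTerm (the text_stemmed clause)
    static boolean[] tokensToHighlight(List<String> tokens, List<String> rawPhrase, String stemmedTerm) {
        boolean[] hit = new boolean[tokens.size()];
        int n = rawPhrase.size();
        for (int i = 0; n > 0 && i + n <= tokens.size(); i++) {
            if (tokens.subList(i, i + n).equals(rawPhrase)) {
                for (int j = 0; j < n; j++) {
                    hit[i + j] = true;
                }
            }
        }
        for (int i = 0; i < tokens.size(); i++) {
            if (toyStem(tokens.get(i)).equals(stemmedTerm)) {
                hit[i] = true;
            }
        }
        return hit;
    }

    // Splits the raw text on whitespace and wraps every flagged token in <B>...</B>.
    static String highlight(String rawText, List<String> rawPhrase, String stemmedTerm) {
        List<String> tokens = Arrays.asList(rawText.split("\\s+"));
        boolean[] hit = tokensToHighlight(tokens, rawPhrase, stemmedTerm);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < tokens.size(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(hit[i] ? "<B>" + tokens.get(i) + "</B>" : tokens.get(i));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(highlight("apple sauce oranges", Arrays.asList("apple", "sauce"), "orange"));
    }
}
```

The point of the sketch is that the decision about which tokens to mark draws on
both fields' clauses, while the markup is always applied to the raw text.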
So, even though it is in fact the raw text that ends up getting highlighted,
I'm looking for a way to build SpanScorer such that I don't need to limit
myself to one field ("field" is one of the arguments to the constructor).
Thanks!
Tavi
> Extend contrib Highlighter to properly support PhraseQuery, SpanQuery,
> ConstantScoreRangeQuery
> -----------------------------------------------------------------------------------------------
>
> Key: LUCENE-794
> URL: https://issues.apache.org/jira/browse/LUCENE-794
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Other
> Reporter: Mark Miller
> Priority: Minor
> Fix For: 2.3.2
>
> Attachments: MultiPhraseQueryExtraction.patch,
> SpanHighlighter-01-26-2008.patch, SpanHighlighter-01-28-2008.patch,
> SpanHighlighter-02-10-2008.patch, SpanHighlighter-RemovSysOut.patch,
> spanhighlighter.patch, spanhighlighter10.patch, spanhighlighter11.patch,
> spanhighlighter12.patch, spanhighlighter2.patch, spanhighlighter3.patch,
> spanhighlighter5.patch, spanhighlighter6.patch, spanhighlighter7.patch,
> spanhighlighter8.patch, spanhighlighter9.patch,
> spanhighlighter_24_January_2008.patch, spanhighlighter_patch_4.zip
>
>
> This patch adds a new Scorer class (SpanQueryScorer) to the Highlighter
> package that scores just like QueryScorer, but scores a 0 for Terms that did
> not cause the Query hit. This gives 'actual' hit highlighting for the range
> of SpanQuerys, PhraseQuery, and ConstantScoreRangeQuery. New Query types are
> easy to add. There is also a new Fragmenter that attempts to fragment without
> breaking up Spans.
> See http://issues.apache.org/jira/browse/LUCENE-403 for some background.
> There is a dependency on MemoryIndex.