[ https://issues.apache.org/jira/browse/LUCENE-794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12611453#action_12611453 ]
Tavi Nathanson commented on LUCENE-794:
---------------------------------------

Hey everyone,

I'm having some trouble getting SpanScorer to act the way I'd like for proper highlighting, and I'm wondering if anyone has any suggestions.

I have two fields: text_raw and text_stemmed. text_raw, as the name suggests, stores unstemmed (tokenized) text, while text_stemmed stores stemmed (tokenized) text. I have queries that look over both fields. For example, I may have the query +(text_raw:"apple sauce" text_stemmed:orange). This query matches "apple sauce oranges" but it does not match "apples sauces orange" (because "apple sauce" is not stemmed).

I'd like to be able to highlight accordingly: I want "apple," "sauce," and "oranges" to all be highlighted. So, even though it is in fact the raw text that ends up getting highlighted, I'm looking for a way to build SpanScorer such that I don't need to limit myself to one field ("field" is one of the arguments to the constructor).

Thanks!
Tavi

> Extend contrib Highlighter to properly support PhraseQuery, SpanQuery,
> ConstantScoreRangeQuery
> -----------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-794
>                 URL: https://issues.apache.org/jira/browse/LUCENE-794
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Other
>            Reporter: Mark Miller
>            Priority: Minor
>             Fix For: 2.3.2
>
>         Attachments: MultiPhraseQueryExtraction.patch,
> SpanHighlighter-01-26-2008.patch, SpanHighlighter-01-28-2008.patch,
> SpanHighlighter-02-10-2008.patch, SpanHighlighter-RemovSysOut.patch,
> spanhighlighter.patch, spanhighlighter10.patch, spanhighlighter11.patch,
> spanhighlighter12.patch, spanhighlighter2.patch, spanhighlighter3.patch,
> spanhighlighter5.patch, spanhighlighter6.patch, spanhighlighter7.patch,
> spanhighlighter8.patch, spanhighlighter9.patch,
> spanhighlighter_24_January_2008.patch, spanhighlighter_patch_4.zip
>
>
> This patch adds a new Scorer class (SpanQueryScorer) to the
> Highlighter package that scores just like QueryScorer, but scores a 0 for
> Terms that did not cause the Query hit. This gives 'actual' hit highlighting
> for the range of SpanQuerys, PhraseQuery, and ConstantScoreRangeQuery. New
> Query types are easy to add. There is also a new Fragmenter that attempts to
> fragment without breaking up Spans.
> See http://issues.apache.org/jira/browse/LUCENE-403 for some background.
> There is a dependency on MemoryIndex.
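
For illustration only, the highlighting asked for in the comment can be modelled without any Lucene classes. The sketch below is a toy, not the SpanScorer API: the class name, the method names, the whitespace tokenizer, and the strip-a-trailing-"s" stemmer are all hypothetical stand-ins. It also assumes that text_raw and text_stemmed are built from the same text with the same tokenization, so that token positions line up across the two fields. Under those assumptions, the tokens to highlight are the union of the raw positions covered by the phrase clause and the raw positions whose stemmed form equals the stemmed term.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Toy model only: no Lucene classes are used, and every name here is hypothetical.
class TwoFieldHighlightSketch {

    // Stand-in for a real stemmer (assumption): strips a single trailing "s".
    static String toyStem(String token) {
        return token.length() > 1 && token.endsWith("s")
                ? token.substring(0, token.length() - 1)
                : token;
    }

    // Stand-in for the analyzer shared by both fields (assumption): lowercase + whitespace split.
    static List<String> tokenize(String text) {
        return Arrays.asList(text.trim().toLowerCase().split("\\s+"));
    }

    // Positions of raw tokens covered by an exact, in-order phrase match (the text_raw clause).
    static TreeSet<Integer> phrasePositions(List<String> raw, List<String> phrase) {
        TreeSet<Integer> hits = new TreeSet<>();
        for (int start = 0; start + phrase.size() <= raw.size(); start++) {
            if (raw.subList(start, start + phrase.size()).equals(phrase)) {
                for (int i = 0; i < phrase.size(); i++) {
                    hits.add(start + i);
                }
            }
        }
        return hits;
    }

    // Positions of raw tokens whose stemmed form equals the query term (the text_stemmed clause).
    static TreeSet<Integer> stemmedTermPositions(List<String> raw, String stemmedTerm) {
        TreeSet<Integer> hits = new TreeSet<>();
        for (int i = 0; i < raw.size(); i++) {
            if (toyStem(raw.get(i)).equals(stemmedTerm)) {
                hits.add(i);
            }
        }
        return hits;
    }

    // Raw tokens to highlight for the example clauses text_raw:"apple sauce" and text_stemmed:orange.
    static List<String> highlight(String text) {
        List<String> raw = tokenize(text);
        TreeSet<Integer> positions = phrasePositions(raw, Arrays.asList("apple", "sauce"));
        positions.addAll(stemmedTermPositions(raw, "orange"));
        List<String> highlighted = new ArrayList<>();
        for (int position : positions) {
            highlighted.add(raw.get(position));
        }
        return highlighted;
    }

    public static void main(String[] args) {
        System.out.println(highlight("apple sauce oranges"));
        System.out.println(highlight("apples sauces orange"));
    }
}
```

For "apple sauce oranges" the phrase clause contributes "apple" and "sauce" and the stemmed clause contributes "oranges". For "apples sauces orange" the phrase clause contributes nothing, because the raw tokens are not stemmed. A real solution would still have to obtain these positions from Lucene per field; this sketch only shows the merge step.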