[ 
https://issues.apache.org/jira/browse/SOLR-553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12596350#action_12596350
 ] 

Bojan Smid commented on SOLR-553:
---------------------------------

I am playing around with LUCENE-794 integration into Solr. I have two options:

1) Add the LUCENE-794 code to the current implementation in 
DefaultSolrHighlighter. A client that wants the new functionality would pass a 
request parameter (say useSpanScorer); without the parameter, the client would 
get the old functionality.

or

2) Provide LUCENE-794 highlighting in a new SolrHighlighter, for instance in a 
class named PhraseQuerySolrHighlighter.

I would appreciate any comments on this.
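
To make option 1 concrete, here is a minimal sketch of the parameter check. It 
is not real Solr code: the class, the enum, and the plain Map standing in for 
the request parameters are all hypothetical, and useSpanScorer is only the 
parameter name suggested above.

```java
import java.util.Map;

class HighlighterChoice {
    // Hypothetical parameter name, taken from the "say useSpanScorer" suggestion.
    static final String USE_SPAN_SCORER = "useSpanScorer";

    // Hypothetical labels for the two code paths inside one highlighter.
    enum ScorerKind { TERM_SCORER, SPAN_SCORER }

    // A missing parameter, or any value other than "true", keeps the old behavior.
    static ScorerKind chooseScorer(Map<String, String> requestParams) {
        return Boolean.parseBoolean(requestParams.get(USE_SPAN_SCORER))
                ? ScorerKind.SPAN_SCORER
                : ScorerKind.TERM_SCORER;
    }

    public static void main(String[] args) {
        System.out.println(chooseScorer(Map.of()));
        System.out.println(chooseScorer(Map.of(USE_SPAN_SCORER, "true")));
    }
}
```

With this shape existing clients see no change unless they opt in, which is the 
main argument for option 1 over a separate highlighter class.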

Also, having already tested some of this code, I noticed that we still wouldn't 
get the exact behavior from the issue description. For instance, in the text  ax bx cx dx ax bx

for the phrase query "ax bx cx"

the result is: <span>ax</span><span>bx</span><span>cx</span> dx ax bx

This means that we fixed part of the problem (words from unrelated snippets 
are no longer highlighted), but we still wouldn't get the whole phrase 
highlighted inside a single tag.
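
One possible workaround, sketched below, is to post-process the highlighted 
string and collapse adjacent tags into one. This is not something LUCENE-794 
does; the class and method names are hypothetical, and the sketch assumes the 
pre/post tags are <span> and </span> (as in the issue description) and that 
highlighted tokens are separated only by whitespace.

```java
import java.util.regex.Pattern;

class SpanMerger {
    // Matches a closing tag, optional whitespace, then an opening tag.
    private static final Pattern ADJACENT = Pattern.compile("</span>(\\s*)<span>");

    // Collapses runs of adjacent <span> elements into one element,
    // keeping the whitespace that separated the tokens.
    static String mergeAdjacent(String highlighted) {
        return ADJACENT.matcher(highlighted).replaceAll("$1");
    }

    public static void main(String[] args) {
        String in = "<span>ax</span> <span>bx</span> <span>cx</span> dx ax bx";
        // prints: <span>ax bx cx</span> dx ax bx
        System.out.println(mergeAdjacent(in));
    }
}
```

Note that this merges any neighboring highlighted terms, whether or not they 
belong to the same phrase match, so it only makes sense once unrelated terms are 
no longer highlighted.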

> Highlighter does not match phrase queries correctly
> ---------------------------------------------------
>
>                 Key: SOLR-553
>                 URL: https://issues.apache.org/jira/browse/SOLR-553
>             Project: Solr
>          Issue Type: New Feature
>          Components: highlighter
>    Affects Versions: 1.2
>         Environment: all
>            Reporter: Brian Whitman
>         Attachments: highlighttest.xml
>
>
> http://www.nabble.com/highlighting-pt2%3A-returning-tokens-out-of-order-from-PhraseQuery-to16156718.html
> Say we search for the band "I Love You But I've Chosen Darkness"
> .../select?rows=100&q=%22I%20Love%20You%20But%20I\'ve%20Chosen%20Darkness%22&fq=type:html&hl=true&hl.fl=content&hl.fragsize=500&hl.snippets=5&hl.simple.pre=%3Cspan%3E&hl.simple.post=%3C/span%3E
> The highlight returns a snippet that does have the name altogether:
> Lights (Live) : <span>I</span> <span>Love</span> <span>You</span> But 
> <span>I've</span> <span>Chosen</span> <span>Darkness</span> :
> But also returns unrelated snips from the same page:
> Black Francis Shop "<span>I</span> Think <span>I</span> <span>Love</span> 
> <span>You</span>"
> A correct highlighter should only return
> Lights (Live) : <span>I Love You But I've Chosen Darkness</span>
> And no snippets that do not match the phrase exactly.
> LUCENE-794 (not yet committed, but seems to be ready) fixes up the problem 
> from the Lucene end. Solr should get it too.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.