[ 
https://issues.apache.org/jira/browse/SOLR-556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Kotthoff updated SOLR-556:
-------------------------------

    Attachment: solr-highlight-multivalued.patch

Patch against SVN HEAD that treats multi-valued fields like single-valued 
fields when highlighting: it loops over the field values and accumulates the 
highlighted snippets.

This corrects the behaviour I've described and simplifies the code. The 
downside is that it may impose a performance penalty for large numbers of 
snippets. The code breaks out of the loop as soon as enough snippets have been 
found, without considering the remaining values of the field, which means that 
the returned snippets may not be the best ones.
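
To illustrate the approach, here is a minimal sketch of the per-value loop. It 
is not the code from the attached patch: the class name, the method names and 
the naive substring matcher in highlightValue are hypothetical stand-ins for 
the real highlighter, and maxSnippets stands in for whatever snippet limit the 
request specifies. It only shows that each value is highlighted on its own, so 
a snippet can never span two values, and that the loop stops early once enough 
snippets have been collected.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class MultiValuedHighlightSketch {

    // Hypothetical stand-in for the real highlighter: wraps every occurrence
    // of the term in <em> tags and returns the whole value as one snippet,
    // or null if the value does not contain the term.
    static String highlightValue(String value, String term) {
        if (term.isEmpty() || !value.contains(term)) {
            return null;
        }
        return value.replace(term, "<em>" + term + "</em>");
    }

    // Highlights each value of a multi-valued field separately and
    // accumulates the resulting snippets.
    static List<String> highlight(List<String> values, String term, int maxSnippets) {
        List<String> snippets = new ArrayList<>();
        for (String value : values) {
            String snippet = highlightValue(value, term);
            if (snippet != null) {
                snippets.add(snippet);
            }
            // Early exit: later values are never examined, so the snippets
            // returned are the first matches, not necessarily the best ones.
            if (snippets.size() >= maxSnippets) {
                break;
            }
        }
        return snippets;
    }

    public static void main(String[] args) {
        // Values "foo" and "bar", search term "ba": only "bar" matches.
        System.out.println(highlight(Arrays.asList("foo", "bar"), "ba", 10));
        // With a limit of 2, the third value "baz" is never considered.
        System.out.println(highlight(Arrays.asList("bar", "bat", "baz"), "ba", 2));
    }
}
```

With the values "foo" and "bar" and the search term "ba", this sketch yields 
the single snippet "<em>ba</em>r" rather than "foo<em>ba</em>r".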

> Highlighting of multi-valued fields returns snippets which span multiple 
> different values
> -----------------------------------------------------------------------------------------
>
>                 Key: SOLR-556
>                 URL: https://issues.apache.org/jira/browse/SOLR-556
>             Project: Solr
>          Issue Type: Bug
>          Components: highlighter
>    Affects Versions: 1.3
>         Environment: Tomcat 5.5
>            Reporter: Lars Kotthoff
>            Priority: Minor
>         Attachments: solr-highlight-multivalued.patch
>
>
> When highlighting multi-valued fields, the highlighter sometimes returns 
> snippets which span multiple values, e.g. with values "foo" and "bar" and 
> search term "ba" the highlighter will create the snippet "foo<em>ba</em>r". 
> Furthermore it sometimes returns smaller snippets than it should, e.g. with 
> value "foobar" and search term "oo" it will create the snippet "<em>oo</em>" 
> regardless of hl.fragsize.
> I have been unable to determine the real cause for this, or indeed what 
> actually goes on at all. To reproduce the problem, I've used the following 
> steps:
> * create an index with multi-valued fields; at least one document should 
> have 3 or more values for these fields (in my case strings of length between 
> 5 and 15 Japanese characters -- as far as I can tell plain old ASCII should 
> produce the same effect though)
> * search for part of a value in such a field with highlighting enabled; the 
> additional parameters I use are hl.fragsize=70, hl.requireFieldMatch=true, 
> hl.mergeContiguous=true (changing the parameters does not seem to have any 
> effect on the result though)
> * the highlighted snippets should show the effects described above

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
