Tim Retout created SOLR-11516:
---------------------------------

             Summary: Unified highlighter with word separator never gives context to the left
                 Key: SOLR-11516
                 URL: https://issues.apache.org/jira/browse/SOLR-11516
             Project: Solr
          Issue Type: Bug
      Security Level: Public (Default Security Level. Issues are Public)
          Components: highlighter
    Affects Versions: 7.1, 6.4
            Reporter: Tim Retout


When using the unified highlighter with hl.bs.type=WORD, I am not able to get 
context to the left of the matches returned; only words to the right of each 
match are shown.  I see this behaviour on both Solr 6.4 and Solr 7.1.

Without context to the left of a match, the highlighted snippets are much less 
useful for understanding where the match appears in a document.

As an example, using the techproducts data with Solr 7.1, given a search for 
"apple", highlighting the "features" field:

http://localhost:8983/solr/techproducts/select?hl.fl=features&hl=on&q=apple&hl.bs.type=WORD&hl.fragsize=30&hl.method=unified

I see this snippet:

"<em>Apple</em> Lossless, H.264 video"

Note that "Apple" sits at the left edge of the snippet, with no preceding 
context.  Compare with the original highlighter:

http://localhost:8983/solr/techproducts/select?hl.fl=features&hl=on&q=apple&hl.fragsize=30

Here the match has context on either side:

", Audible, <em>Apple</em> Lossless, H.264 video"
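
The comparison above can be scripted. What follows is a minimal Python sketch, assuming the techproducts example is running on localhost:8983 as in the URLs above. The helper names (select_url, left_context) are illustrative only and are not part of Solr. It rebuilds the two request URLs and measures how much text precedes the first highlighted term in each of the two snippets quoted in this report.

```python
from urllib.parse import urlencode

# Assumption: the techproducts example core on the default local port.
BASE = "http://localhost:8983/solr/techproducts/select"

# Parameters shared by both requests in this report.
COMMON = {"hl.fl": "features", "hl": "on", "q": "apple", "hl.fragsize": "30"}


def select_url(extra=None):
    """Build a /select URL from the shared parameters plus any extras."""
    params = dict(COMMON)
    params.update(extra or {})
    return BASE + "?" + urlencode(params)


def left_context(snippet, open_tag="<em>"):
    """Return the text before the first highlighted term, or None if
    the snippet contains no highlight tag."""
    idx = snippet.find(open_tag)
    return None if idx < 0 else snippet[:idx]


unified_url = select_url({"hl.bs.type": "WORD", "hl.method": "unified"})
original_url = select_url()

# The two snippets quoted in this report.
unified_snippet = "<em>Apple</em> Lossless, H.264 video"
original_snippet = ", Audible, <em>Apple</em> Lossless, H.264 video"

print(repr(left_context(unified_snippet)))
print(repr(left_context(original_snippet)))
```

An empty left context for the unified snippet, against a non-empty one for the original highlighter, is the difference this issue is about.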

(To complicate this, in general I am not sure that the unified highlighter is 
respecting the hl.fragsize parameter, although [SOLR-9935] suggests support was 
added.  I included the hl.fragsize param in the unified URL too, but it makes 
no difference unless it is set to 0.)
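
One way to check the hl.fragsize observation is to request the same highlight at several sizes and compare the snippets. The sketch below is hedged: it assumes the same local techproducts core, adds wt=json so that both 6.4 and 7.1 return JSON, and assumes the usual response layout in which "highlighting" maps document id to field name to a list of snippets. The function names and the chosen sizes are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the techproducts example core on the default local port.
BASE = "http://localhost:8983/solr/techproducts/select"


def unified_url(fragsize):
    """Build the unified-highlighter request for a given hl.fragsize."""
    params = {
        "q": "apple",
        "hl": "on",
        "hl.fl": "features",
        "hl.method": "unified",
        "hl.bs.type": "WORD",
        "hl.fragsize": str(fragsize),
        "wt": "json",
    }
    return BASE + "?" + urlencode(params)


def snippets(response, field="features"):
    """Collect every snippet for `field` from a parsed JSON response."""
    found = []
    for per_doc in response.get("highlighting", {}).values():
        found.extend(per_doc.get(field, []))
    return found


def sweep(fragsizes=(0, 10, 30, 100)):
    """Fetch snippets for each fragsize; needs a running Solr instance."""
    results = {}
    for size in fragsizes:
        with urlopen(unified_url(size)) as resp:
            results[size] = snippets(json.load(resp))
    return results
```

Calling sweep() against a live instance and finding identical snippets for every non-zero size would match the behaviour described above.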



