[ 
https://issues.apache.org/jira/browse/SOLR-11516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16213063#comment-16213063
 ] 

David Smiley commented on SOLR-11516:
-------------------------------------

Ok.  At least with respect to the issue title... this issue can probably be 
won't-fix or perhaps outright remove WORD & CHARACTER options since they are 
bad options.  (FWIW I didn't don't recall including them; they were probably 
inherited options from similar code for the FVH).

Can you try simply increasing the hl.fragsize a bunch more?  And then if the 
result is too long then trimming client-side?

FWIW there is an already coded option on the Lucene end of this to have 
hl.fragsize be a target/average such that the snippet will break on the side 
closest to the target (either ahead or before).  There is no Solr option to 
enable this; it's a TODO.  The current setting picks the earliest always, even 
if the next break is only a word beyond the target.

Snippeting is hard to satisfy everyone with.  There are many ways to skin this 
cat.

> Unified highlighter with word separator never gives context to the left
> -----------------------------------------------------------------------
>
>                 Key: SOLR-11516
>                 URL: https://issues.apache.org/jira/browse/SOLR-11516
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: highlighter
>    Affects Versions: 6.4, 7.1
>            Reporter: Tim Retout
>
> When using the unified highlighter with hl.bs.type=WORD, I am not able to get 
> context to the left of the matches returned; only words to the right of each 
> match are shown.  I see this behaviour on both Solr 6.4 and Solr 7.1.
> Without context to the left of a match, the highlighted snippets are much 
> less useful for understanding where the match appears in a document.
> As an example, using the techproducts data with Solr 7.1, given a search for 
> "apple", highlighting the "features" field:
> http://localhost:8983/solr/techproducts/select?hl.fl=features&hl=on&q=apple&hl.bs.type=WORD&hl.fragsize=30&hl.method=unified
> I see this snippet:
> "<em>Apple</em> Lossless, H.264 video"
> Note that "Apple" is anchored to the left.  Compare with the original 
> highlighter:
> http://localhost:8983/solr/techproducts/select?hl.fl=features&hl=on&q=apple&hl.fragsize=30
> And the match has context either side:
> ", Audible, <em>Apple</em> Lossless, H.264 video"
> (To complicate this, in general I am not sure that the unified highlighter is 
> respecting the hl.fragsize parameter, although [SOLR-9935] suggests support 
> was added.  I included the hl.fragsize param in the unified URL too, but it's 
> making no difference unless set to 0.)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to