HTMLStripReader improvement - padding corrected for hexadecimal entities, 
option not to emit padding at all added
-----------------------------------------------------------------------------------------------------------------

                 Key: SOLR-882
                 URL: https://issues.apache.org/jira/browse/SOLR-882
             Project: Solr
          Issue Type: Improvement
            Reporter: Dawid Weiss
            Priority: Trivial
         Attachments: patch

Improvements to HTMLStripReader:

- fix padding of hexadecimal entities (currently off by 1)
- add an option not to emit padding at all. In certain applications, padding 
emitted after entities such as &oacute; (ó) may split words that are in fact 
single terms.
- add all-uppercase entity names that browsers recognize.
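
To illustrate the padding behaviour discussed above, here is a minimal sketch. It is not the code from the attached patch: the class name EntityPaddingSketch, the method decode and the flag emitPadding are hypothetical, and the sketch handles only numeric entities. It assumes that padding means filling with spaces so that the output keeps the same length as the input, which keeps character offsets aligned with the original text.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not Solr's HTMLStripReader.
class EntityPaddingSketch {
    // Matches decimal (&#243;) and hexadecimal (&#xF3;) character references.
    private static final Pattern NUMERIC_ENTITY =
        Pattern.compile("&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));");

    // Replaces each numeric entity with its character. When emitPadding is
    // true, spaces are appended so the replacement is as long as the entity.
    // Assumes every entity encodes a valid Unicode code point.
    static String decode(String input, boolean emitPadding) {
        Matcher m = NUMERIC_ENTITY.matcher(input);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            out.append(input, last, m.start());
            int codePoint = (m.group(1) != null)
                ? Integer.parseInt(m.group(1), 16)   // hexadecimal form
                : Integer.parseInt(m.group(2), 10);  // decimal form
            String decoded = new String(Character.toChars(codePoint));
            out.append(decoded);
            if (emitPadding) {
                // Measure the whole match, including the 'x' of a hex
                // entity, so both forms are padded to their full length.
                int pad = (m.end() - m.start()) - decoded.length();
                for (int i = 0; i < pad; i++) {
                    out.append(' ');
                }
            }
            last = m.end();
        }
        out.append(input, last, input.length());
        return out.toString();
    }
}
```

With emitPadding set to false, an input such as "s&#xF3;l" decodes to the single term "sól". With padding enabled, the spaces after the decoded character would cause a whitespace tokenizer to split the word in two.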
