Holger Floerke created SOLR-4686:
------------------------------------

             Summary: HTMLStripCharFilter and Highlighter generates invalid HTML
                 Key: SOLR-4686
                 URL: https://issues.apache.org/jira/browse/SOLR-4686
             Project: Solr
          Issue Type: Bug
          Components: highlighter
    Affects Versions: 4.1
            Reporter: Holger Floerke


Using the HTMLStripCharFilter can lead to invalid HTML in the highlighted output.

The HTMLStripCharFilter treats inline elements (e.g. "a", "b", ...) 
specially. For these elements the CharFilter ignores the tag and does not 
insert any split character.

If you index
"""
<a>xxx</a>
"""
you get the word "xxx" with start offset 3 and end offset 10(!), so the 
reported span also covers the closing "</a>" tag.

If you highlight a search on "xxx", you will get
"""
<a><em>xxx</a></em>
"""
which is invalid HTML.
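
The effect can be reproduced without Solr. The following is only a minimal 
sketch, not the actual highlighter code: the class HighlightOffsetsDemo and 
its wrap method are hypothetical names made up for illustration. It assumes 
the highlighter simply inserts its pre and post tags at the start and end 
offsets reported for the term.

```java
public class HighlightOffsetsDemo {

    // Hypothetical helper: wraps text[start, end) in the given pre/post tags,
    // the way a highlighter would if it trusts the reported offsets.
    static String wrap(String text, int start, int end, String pre, String post) {
        return text.substring(0, start)
                + pre
                + text.substring(start, end)
                + post
                + text.substring(end);
    }

    public static void main(String[] args) {
        String stored = "<a>xxx</a>";

        // Offsets as described in this report: start 3, end 10.
        // The span includes the closing tag, so the tags end up mis-nested.
        System.out.println(wrap(stored, 3, 10, "<em>", "</em>"));

        // For comparison: an end offset of 6 covers only "xxx"
        // and keeps the markup properly nested.
        System.out.println(wrap(stored, 3, 6, "<em>", "</em>"));
    }
}
```

With the end offset 10 the closing "</a>" lands inside the highlighted span; 
with an end offset of 6 the "em" element stays inside the "a" element.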

