wrong handling of ignorableWhitespace/characters in SafeContentHandler and 
WriteoutContentHandler
-------------------------------------------------------------------------------------------------

                 Key: TIKA-190
                 URL: https://issues.apache.org/jira/browse/TIKA-190
             Project: Tika
          Issue Type: Bug
    Affects Versions: 0.3
            Reporter: Uwe Schindler


While investigating TIKA-189, I found the following:
The patch for TIKA-188 produces correct output, but the internal handling is 
incorrect. XHTMLContentHandler emits the tabs and newlines as 
ignorableWhitespace events, but its superclass SafeContentHandler has a bug: 
it forwards ignorableWhitespace() to the decorator's characters() event 
(a copy-and-paste error). Once this is fixed, the tests fail, because 
WriteoutContentHandler has no ignorableWhitespace() implementation and 
therefore removes all whitespace.
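
The interaction described above can be reproduced with a minimal sketch that 
uses only the JDK's SAX classes. The classes ForwardingHandler and TextSink 
are hypothetical stand-ins for SafeContentHandler and WriteoutContentHandler, 
not the actual Tika code, and the sketch assumes the sink inherits the no-op 
ignorableWhitespace() of DefaultHandler:

```java
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

final class WhitespaceForwardingSketch {

    /** Hypothetical stand-in for a decorator such as SafeContentHandler. */
    static class ForwardingHandler extends DefaultHandler {
        private final ContentHandler delegate;
        private final boolean buggy;

        ForwardingHandler(ContentHandler delegate, boolean buggy) {
            this.delegate = delegate;
            this.buggy = buggy;
        }

        @Override
        public void characters(char[] ch, int start, int length) throws SAXException {
            delegate.characters(ch, start, length);
        }

        @Override
        public void ignorableWhitespace(char[] ch, int start, int length) throws SAXException {
            if (buggy) {
                // The reported copy-and-paste error: the wrong event is forwarded.
                delegate.characters(ch, start, length);
            } else {
                delegate.ignorableWhitespace(ch, start, length);
            }
        }
    }

    /** Hypothetical stand-in for a text sink such as WriteoutContentHandler. */
    static class TextSink extends DefaultHandler {
        private final StringBuilder out = new StringBuilder();
        private final boolean handlesIgnorableWhitespace;

        TextSink(boolean handlesIgnorableWhitespace) {
            this.handlesIgnorableWhitespace = handlesIgnorableWhitespace;
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            out.append(ch, start, length);
        }

        @Override
        public void ignorableWhitespace(char[] ch, int start, int length) {
            // Without this branch the event is silently dropped, which is
            // what the inherited DefaultHandler method does.
            if (handlesIgnorableWhitespace) {
                out.append(ch, start, length);
            }
        }

        String text() {
            return out.toString();
        }
    }

    /** Sends "a", an ignorable newline, and "b" through the handler chain. */
    static String run(boolean buggyDecorator, boolean sinkHandlesWhitespace) {
        TextSink sink = new TextSink(sinkHandlesWhitespace);
        ContentHandler handler = new ForwardingHandler(sink, buggyDecorator);
        char[] a = {'a'};
        char[] newline = {'\n'};
        char[] b = {'b'};
        try {
            handler.characters(a, 0, a.length);
            handler.ignorableWhitespace(newline, 0, newline.length);
            handler.characters(b, 0, b.length);
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
        return sink.text();
    }

    public static void main(String[] args) {
        System.out.println(run(true, false).equals("a\nb"));  // bug hides the missing method
        System.out.println(run(false, false).equals("ab"));   // decorator fixed, whitespace lost
        System.out.println(run(false, true).equals("a\nb"));  // both handlers fixed
    }
}
```

In this sketch the buggy decorator masks the missing method in the sink, so 
the output only stays correct once both handlers treat ignorableWhitespace() 
properly.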

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
