[ https://issues.apache.org/jira/browse/TIKA-303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jukka Zitting resolved TIKA-303.
--------------------------------

    Resolution: Invalid
      Assignee: Jukka Zitting

The XHTMLContentHandler is not meant to be used the way you use it in the test case. The purpose of the XHTMLContentHandler wrapper is to make it easier for Tika Parser implementations to generate valid XHTML output. There's no need for code that calls the Parser interface to use XHTMLContentHandler, as the parse() method is already guaranteed to produce valid XHTML.

> XHTMLContentHandler mishandles headers
> --------------------------------------
>
>                 Key: TIKA-303
>                 URL: https://issues.apache.org/jira/browse/TIKA-303
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 0.4, 1.0
>            Reporter: Benson Margulies
>            Assignee: Jukka Zitting
>         Attachments: tika-tc.patch
>
>
> XHTMLContentHandler.startDocument does not note that it has been called. So
> then lazyStartDocument will happen and embed an extra layer of
> head/title/body processing.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
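
For illustration only, here is a rough sketch of why wrapping the caller's handler in a second XHTML-generating wrapper produces the extra head/title/body layer described in the report. It uses only the JDK's SAX classes and is not Tika's code: LazyPrefixHandler is a hypothetical stand-in for a parser-side helper such as XHTMLContentHandler, fakeParse is a hypothetical stand-in for a Parser's parse() method, and the exact element sequence is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

class WrapperSketch {
    static final Attributes NONE = new AttributesImpl();

    /** Caller-side handler: records the names of the elements that get opened. */
    static class Recorder extends DefaultHandler {
        final List<String> opened = new ArrayList<>();

        @Override
        public void startElement(String uri, String local, String qName, Attributes atts) {
            opened.add(local);
        }
    }

    /** Hypothetical parser-side helper: lazily emits an html/head/title/body prefix. */
    static class LazyPrefixHandler extends DefaultHandler {
        private final ContentHandler out;
        private boolean started = false;

        LazyPrefixHandler(ContentHandler out) {
            this.out = out;
        }

        private void lazyStart() throws SAXException {
            if (started) {
                return;
            }
            started = true;
            out.startElement("", "html", "html", NONE);
            out.startElement("", "head", "head", NONE);
            out.startElement("", "title", "title", NONE);
            out.endElement("", "title", "title");
            out.endElement("", "head", "head");
            out.startElement("", "body", "body", NONE);
        }

        @Override
        public void startDocument() throws SAXException {
            out.startDocument();
        }

        @Override
        public void startElement(String uri, String local, String qName, Attributes atts)
                throws SAXException {
            lazyStart();
            out.startElement(uri, local, qName, atts);
        }

        @Override
        public void endElement(String uri, String local, String qName) throws SAXException {
            out.endElement(uri, local, qName);
        }

        @Override
        public void endDocument() throws SAXException {
            lazyStart();
            out.endElement("", "body", "body");
            out.endElement("", "html", "html");
            out.endDocument();
        }
    }

    /** Hypothetical parser: uses the helper internally, so its output is already complete. */
    static void fakeParse(ContentHandler callerHandler) throws SAXException {
        LazyPrefixHandler xhtml = new LazyPrefixHandler(callerHandler);
        xhtml.startDocument();
        xhtml.startElement("", "p", "p", NONE);
        xhtml.endElement("", "p", "p");
        xhtml.endDocument();
    }

    /** Runs the fake parser, optionally wrapping the caller's handler a second time. */
    static List<String> openedElements(boolean wrapAgain) {
        Recorder recorder = new Recorder();
        try {
            fakeParse(wrapAgain ? new LazyPrefixHandler(recorder) : recorder);
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
        return recorder.opened;
    }

    public static void main(String[] args) {
        // Plain handler passed to the parser: one document skeleton.
        System.out.println(openedElements(false)); // [html, head, title, body, p]
        // Handler wrapped again by the caller: the skeleton is emitted twice.
        System.out.println(openedElements(true));
        // [html, head, title, body, html, head, title, body, p]
    }
}
```

In this toy model the caller gets a single html/head/title/body skeleton when it hands a plain ContentHandler to the parser, and a nested one when it wraps that handler again, which matches the advice in the resolution that calling code does not need the wrapper.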