[ https://issues.apache.org/jira/browse/TIKA-303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Benson Margulies updated TIKA-303:
----------------------------------

    Attachment: tika-tc.patch

Patch to 1.5 that adds test case for this issue.

> XHTMLContentHandler mishandles headers
> --------------------------------------
>
>                 Key: TIKA-303
>                 URL: https://issues.apache.org/jira/browse/TIKA-303
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 0.4, 1.0
>            Reporter: Benson Margulies
>         Attachments: tika-tc.patch
>
>
> XHTMLContentHandler.startDocument does not note that it has been called. So
> then lazyStartDocument will happen and embed an extra layer of
> head/title/body processing.
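
The failure mode described in the issue can be illustrated with a minimal sketch. This is not Tika's actual code: the class LazyStartSketch, its recordStart switch, and the string event log are all hypothetical stand-ins, assumed only to show how a missing "already started" flag would let a lazy start emit the head/title/body prologue a second time.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in for a handler that lazily emits an XHTML prologue. */
class LazyStartSketch {
    /** Events recorded in order, standing in for SAX callbacks. */
    final List<String> events = new ArrayList<>();

    /** When false, startDocument leaves no trace that it ran (the reported behaviour). */
    private final boolean recordStart;
    private boolean started = false;

    LazyStartSketch(boolean recordStart) {
        this.recordStart = recordStart;
    }

    /** Explicit start: emits the prologue and, if enabled, remembers that it did. */
    void startDocument() {
        emitPrologue();
        if (recordStart) {
            started = true;
        }
    }

    /** Any content callback first makes sure the document has been started. */
    void characters(String text) {
        lazyStartDocument();
        events.add("text:" + text);
    }

    /** Emits the prologue only if nothing has marked the document as started. */
    private void lazyStartDocument() {
        if (!started) {
            started = true;
            emitPrologue();
        }
    }

    private void emitPrologue() {
        events.add("head");
        events.add("title");
        events.add("body");
    }

    /** Runs startDocument followed by one characters call and counts "body" events. */
    static int bodyCount(boolean recordStart) {
        LazyStartSketch handler = new LazyStartSketch(recordStart);
        handler.startDocument();
        handler.characters("hello");
        return Collections.frequency(handler.events, "body");
    }

    public static void main(String[] args) {
        System.out.println("without flag: " + bodyCount(false)); // prints "without flag: 2"
        System.out.println("with flag: " + bodyCount(true));     // prints "with flag: 1"
    }
}
```

In the sketch, the only difference between the two runs is whether startDocument sets the flag that lazyStartDocument checks. Whether the real fix belongs in startDocument or elsewhere is for the attached patch and its test case to settle.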