Error parsing HTML partial with AutoDetect parser
-------------------------------------------------

                 Key: TIKA-377
                 URL: https://issues.apache.org/jira/browse/TIKA-377
             Project: Tika
          Issue Type: Bug
          Components: parser
    Affects Versions: 0.6
            Reporter: Brett S.
         Attachments: test.html

I get the following error when parsing an HTML file that contains a partial HTML document:

TIKA-237: Illegal SAXException from org.apache.tika.parser.xml.dcxmlpar...@3a43af

Both of the following conditions need to hold in the file for the error to be thrown:
+ An HTML comment appears before any HTML tags
+ There is more than one top-level HTML tag

I will attach a test file to reproduce.
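
For readers without access to the attachment, here is a minimal sketch of input that meets both conditions. It is not the attached test.html; the sample text, the class name PartialHtmlRepro, and the helper parsesAsXml are all hypothetical. The sketch uses only the JDK's own SAX parser, not Tika. It rests on an assumption: since the error names a class under org.apache.tika.parser.xml, the file may have been handed to an XML parser, and a strict XML parser rejects a document with more than one root element. The report does not confirm that this is the actual cause.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class PartialHtmlRepro {
    // Hypothetical stand-in for the attachment: a comment before any tag,
    // followed by two top-level elements.
    static final String SAMPLE =
        "<!-- a comment before any tags -->\n"
        + "<p>first fragment</p>\n"
        + "<p>second fragment</p>\n";

    // Returns true if the JDK's SAX parser accepts the text as well-formed XML,
    // false if it raises a SAXException.
    static boolean parsesAsXml(String text) throws Exception {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        try {
            SAXParserFactory.newInstance()
                .newSAXParser()
                .parse(new ByteArrayInputStream(bytes), new DefaultHandler());
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // Comment plus two top-level elements: both conditions from the report.
        System.out.println("two top-level tags: " + parsesAsXml(SAMPLE));
        // Comment plus a single top-level element, for comparison.
        System.out.println("one top-level tag:  "
            + parsesAsXml("<!-- comment -->\n<p>only fragment</p>\n"));
    }
}
```

If the assumption holds, the first input should be rejected and the second accepted, which would match the report that both conditions are needed. Whether Tika's type detection really routes such a file to the XML parser would need to be checked against the attached test.html.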