John Gibson created TIKA-1000:
---------------------------------
Summary: secure-processing not supported by some JAXP
implementations and causes mime type detection to fail
Key: TIKA-1000
URL: https://issues.apache.org/jira/browse/TIKA-1000
Project: Tika
Issue Type: Bug
Components: parser
Affects Versions: 1.2
Environment: Android 2.3.6
Reporter: John Gibson
The XmlRootExtractor class tries to set the secure-processing feature, which JAXP
requires all parser implementations to support. Unfortunately the parser on
Android (and presumably some other parsers) does not support the feature.
Setting it causes
the following exception: "org.xml.sax.SAXNotRecognizedException: Feature
'http://javax.xml.XMLConstants/feature/secure-processing' is not recognized."
However, this exception is swallowed and ignored by XmlRootExtractor, which
returns null. When org.apache.tika.mime.MimeTypes sees that no root element
was found, it assumes that the file is not valid XML and downgrades the result
to text/plain.
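
A minimal sketch of the tolerant behaviour described above, assuming the fix is
simply to treat secure-processing as optional. This is not the actual Tika
code or patch: the class name TolerantRootExtractor and its methods are
hypothetical, and only standard JAXP/SAX calls are used. The point is that a
failure to set the feature is handled separately from a failure to parse, so
an unsupported feature no longer looks like "not XML".

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.XMLConstants;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.SAXNotRecognizedException;
import org.xml.sax.SAXNotSupportedException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration; not the real org.apache.tika XmlRootExtractor.
public class TolerantRootExtractor {

    // Records the first element seen, then aborts the parse.
    private static final class RootHandler extends DefaultHandler {
        String rootName;

        @Override
        public void startElement(String uri, String localName, String qName,
                                 Attributes attributes) throws SAXException {
            rootName = localName;
            throw new SAXException("root element seen, stop parsing");
        }
    }

    static SAXParserFactory newFactory() {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        factory.setValidating(false);
        try {
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        } catch (ParserConfigurationException
                 | SAXNotRecognizedException
                 | SAXNotSupportedException e) {
            // The parser does not know the feature (as reported on Android).
            // Carry on without it instead of giving up on detection.
        }
        return factory;
    }

    // Returns the local name of the root element, or null if the data
    // could not be parsed far enough to find one.
    public static String extractRootName(byte[] data) {
        RootHandler handler = new RootHandler();
        try {
            SAXParser parser = newFactory().newSAXParser();
            parser.parse(new ByteArrayInputStream(data), handler);
        } catch (Exception e) {
            // Either the deliberate stop from RootHandler or a genuine parse
            // error; in the latter case rootName is still null.
        }
        return handler.rootName;
    }

    public static void main(String[] args) {
        byte[] xml = "<feed><entry/></feed>".getBytes(StandardCharsets.UTF_8);
        System.out.println(extractRootName(xml));
    }
}
```

With this shape, a null result means the input really could not be parsed as
XML, which is the assumption MimeTypes makes before falling back to
text/plain.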
This was fixed long ago by TIKA-271, but as Michael Pisula points out, commit
1004050 broke it again. I'd simply reopen that issue, but I don't have
permission to do that.