David Hara created TIKA-1141:
--------------------------------
Summary: JavaScript files that contain "<html" are detected as text/html
Key: TIKA-1141
URL: https://issues.apache.org/jira/browse/TIKA-1141
Project: Tika
Issue Type: Bug
Components: mime
Affects Versions: 1.2
Reporter: David Hara
Priority: Minor
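The MimeTypes detector returns text/html as the MIME type for any JavaScript
file that contains the string "<html" near its start. I believe this is caused
by the rule
<match value="<html" type="string" offset="0:8192"/>
in the tika-mimetypes.xml file, which appears to match that string at any
offset from 0 to 8192 without checking whether the file is really HTML.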
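
To illustrate why such a rule would misfire, here is a minimal sketch of a
string match over an offset range, written in plain Python. It is not Tika's
implementation: the function names, the single-rule detector, and the fallback
type are all hypothetical, and the reading of offset="0:8192" as "the match may
begin at any offset from 0 through 8192" is an assumption.

```python
def matches_string_rule(data: bytes, value: bytes, start: int, end: int) -> bool:
    """Return True if `value` begins at some offset in [start, end] of `data`."""
    # bytes.find confines the whole match to data[start:stop], so the stop is
    # extended by len(value) to allow a match that begins exactly at `end`.
    return data.find(value, start, end + len(value)) != -1


def naive_detect(data: bytes) -> str:
    # Hypothetical detector holding only the "<html" rule quoted in this report.
    if matches_string_rule(data, b"<html", 0, 8192):
        return "text/html"
    # Placeholder fallback chosen for this sketch only.
    return "application/octet-stream"


# A JavaScript file that merely embeds an HTML string literal.
js_source = b'var page = "<html><body>hello</body></html>";\n'
print(naive_detect(js_source))  # prints "text/html"
```

Because the rule looks only for the byte sequence and not for its context, a
script that builds markup in a string literal is indistinguishable from a real
HTML document under this check.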