Tim Allison created TIKA-2936:
---------------------------------
Summary: The stricter StandardHtmlDetector extracts some header
charsets where our legacy detector doesn't
Key: TIKA-2936
URL: https://issues.apache.org/jira/browse/TIKA-2936
Project: Tika
Issue Type: Improvement
Reporter: Tim Allison
In a small handful of cases, the standards-based charset detector is able to
extract a header charset while our default HTML detector is not.
Let's look into these cases to improve our HTML detector.
List of files incoming...
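
One way to collect such cases is to run both detectors over the same documents and keep the ones where only the stricter detector returns a charset. The following is a minimal, self-contained sketch of that comparison. The class `DetectorDiff`, the method names, and the two regex-based detectors are hypothetical stand-ins for illustration only; they do not reproduce how either Tika detector works. In Tika itself, the two functions would presumably wrap the real detector implementations.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DetectorDiff {

    /** Returns the names of documents where `first` finds a charset and `second` does not. */
    public static List<String> onlyFirstDetects(
            Map<String, byte[]> docs,
            Function<byte[], Optional<Charset>> first,
            Function<byte[], Optional<Charset>> second) {
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, byte[]> e : docs.entrySet()) {
            byte[] bytes = e.getValue();
            if (first.apply(bytes).isPresent() && !second.apply(bytes).isPresent()) {
                hits.add(e.getKey());
            }
        }
        return hits;
    }

    /** Toy detector built from a regex (assumption: NOT the logic of either Tika detector). */
    public static Function<byte[], Optional<Charset>> regexDetector(Pattern pattern) {
        return bytes -> {
            // ISO-8859-1 maps every byte to exactly one char, so decoding never fails.
            String head = new String(bytes, StandardCharsets.ISO_8859_1);
            Matcher m = pattern.matcher(head);
            if (m.find()) {
                try {
                    String name = m.group(1);
                    if (Charset.isSupported(name)) {
                        return Optional.of(Charset.forName(name));
                    }
                } catch (IllegalArgumentException ignored) {
                    // Illegal charset name: treat as "nothing detected".
                }
            }
            return Optional.empty();
        };
    }

    public static void main(String[] args) {
        // Hypothetical stand-ins: the "strict" one accepts any case and quoting style,
        // the "legacy" one only accepts lowercase charset="...".
        Function<byte[], Optional<Charset>> strict = regexDetector(
                Pattern.compile("charset\\s*=\\s*[\"']?([\\w-]+)", Pattern.CASE_INSENSITIVE));
        Function<byte[], Optional<Charset>> legacy = regexDetector(
                Pattern.compile("charset=\"([\\w-]+)\""));

        Map<String, byte[]> docs = new LinkedHashMap<>();
        docs.put("a.html", "<meta charset=\"utf-8\">".getBytes(StandardCharsets.US_ASCII));
        docs.put("b.html", "<META CHARSET='iso-8859-1'>".getBytes(StandardCharsets.US_ASCII));
        docs.put("c.html", "<p>no declaration</p>".getBytes(StandardCharsets.US_ASCII));

        System.out.println(onlyFirstDetects(docs, strict, legacy)); // prints [b.html]
    }
}
```

Passing the detectors in as functions keeps the comparison independent of any particular implementation, so the same loop could be reused once the real detectors and the incoming file list are plugged in.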