Tim Allison created TIKA-2933:
---------------------------------
Summary: Revisit "replacement" encoding mappings in
StandardHtmlEncodingDetector.
Key: TIKA-2933
URL: https://issues.apache.org/jira/browse/TIKA-2933
Project: Tika
Issue Type: Bug
Reporter: Tim Allison
I'm finally getting around to running the comparisons between our legacy
HtmlEncodingDetector and the newer StandardHtmlEncodingDetector. More analysis
is required, but the newer one is generally much better. One area for
improvement/explanation, though, is the "replacement" encoding.