[
https://issues.apache.org/jira/browse/TIKA-2592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Andreas Meier updated TIKA-2592:
Attachment: StandardCharsets_unsupported_by_IANA.txt
> HTML with charset unicode handled as utf-16 instead utf-8
Ken Krugler updated TIKA-2592:
Attachment: IANA Charset names.txt
> HTML with charset unicode handled as utf-16 instead utf-8
Ken Krugler updated TIKA-2592:
Priority: Minor (was: Major)
> HTML with charset unicode handled as utf-16 instead utf-8
Ken Krugler updated TIKA-2592:
Issue Type: Improvement (was: Bug)
> HTML with charset unicode handled as utf-16 instead utf-8
Andreas Meier updated TIKA-2592:
Attachment: TestHTMLCharsetCP1256.html
TestHTMLCharsetArabicCP1256.html
> HTML with charset unicode handled as utf-16 instead utf-8
Andreas Meier updated TIKA-2592:
Attachment: fix-for-TIKA2592-contributed-by-Andreas-Meier.patch
> HTML with charset unicode handled as utf-16 instead utf-8