[ https://issues.apache.org/jira/browse/TIKA-3516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tim Allison resolved TIKA-3516.
-------------------------------
Fix Version/s: 2.1.0
Assignee: Tim Allison
Resolution: Fixed
I've gotten no feedback on the original issue or on this one about how to solve
the problem. Please re-open if you would prefer a different solution.
> Unexpected charset IBM424_rtl detected for utf_8 file by CharsetDetector
> --------------------------------------------------------------------------
>
> Key: TIKA-3516
> URL: https://issues.apache.org/jira/browse/TIKA-3516
> Project: Tika
> Issue Type: Bug
> Components: detector, parser
> Reporter: Chaitra Rajappa
> Assignee: Tim Allison
> Priority: Major
> Fix For: 2.1.0
>
>
> Hi,
> The CharsetDetector detects the wrong charset, IBM424_rtl, for a UTF-8 file,
> which results in this exception:
> *_java.nio.charset.UnsupportedCharsetException: IBM424_rtl 17 at
> java.nio.charset.Charset.forName(Charset.java:531)_*
> There is also an existing ticket for the same issue that has not been
> fixed:
> https://issues.apache.org/jira/browse/TIKA-2396
> Please suggest the changes needed to fix this.
> Versions being used:
> apache-core - 1.20
> apache-parsers-1.20
> Thanks
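
For readers hitting the same UnsupportedCharsetException, here is a minimal
caller-side sketch of one way to avoid the crash. It is not the change that
went into 2.1.0, and it does not address the misdetection itself. It assumes
you already have the detected charset name as a String; the class
SafeCharsetLookup and the method resolveOrDefault are hypothetical names,
not part of Tika. Only java.nio.charset calls from the JDK are used.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;
import java.nio.charset.StandardCharsets;

public class SafeCharsetLookup {

    /**
     * Hypothetical helper (not part of Tika): returns the JVM charset for
     * detectedName, or fallback when the name is null, syntactically
     * illegal, or not supported by this JVM.
     */
    public static Charset resolveOrDefault(String detectedName, Charset fallback) {
        if (detectedName == null) {
            return fallback;
        }
        try {
            // isSupported returns false for well-formed but unknown names,
            // so Charset.forName is only called when it cannot throw
            // UnsupportedCharsetException.
            if (Charset.isSupported(detectedName)) {
                return Charset.forName(detectedName);
            }
        } catch (IllegalCharsetNameException e) {
            // The name contains characters that are not legal in a charset
            // name; treat it the same as an unsupported charset.
        }
        return fallback;
    }

    public static void main(String[] args) {
        // The name from the report falls back when the JVM does not know it.
        System.out.println(resolveOrDefault("IBM424_rtl", StandardCharsets.UTF_8));
        System.out.println(resolveOrDefault("UTF-8", StandardCharsets.UTF_8));
    }
}
```

Falling back to UTF-8 is an assumption that fits the file described in this
report. A different default, or skipping to the detector's next-best match,
may suit other inputs better.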