[
https://issues.apache.org/jira/browse/TIKA-3516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Chaitra Rajappa updated TIKA-3516:
----------------------------------
Description:
Hi,
The CharsetDetector wrongly detects the charset of a UTF-8 file as IBM424_rtl,
resulting in this exception:
*_java.nio.charset.UnsupportedCharsetException: IBM424_rtl 17 at
java.nio.charset.Charset.forName(Charset.java:531)_*
I see there is also an existing ticket for the same issue that has not been fixed:
https://issues.apache.org/jira/browse/TIKA-2396
Please suggest the changes to fix this.
Versions being used:
apache-core - 1.20
apache-parsers - 1.20
Thanks
was:
Hi,
The CharsetDetector wrongly detects the charset of a UTF-8 file as IBM424_rtl,
resulting in this exception:
*_java.nio.charset.UnsupportedCharsetException: IBM424_rtl 17 at
java.nio.charset.Charset.forName(Charset.java:531)_*
I see there is also an existing ticket for the same issue that has not been fixed:
https://issues.apache.org/jira/browse/TIKA-2396
Please suggest the changes to fix this.
Thanks
> Unexpected charset IBM424_rtl detected for utf_8 file by CharsetDetector
> --------------------------------------------------------------------------
>
> Key: TIKA-3516
> URL: https://issues.apache.org/jira/browse/TIKA-3516
> Project: Tika
> Issue Type: Bug
> Components: detector, parser
> Reporter: Chaitra Rajappa
> Priority: Major
>
> Hi,
> The CharsetDetector wrongly detects the charset of a UTF-8 file as IBM424_rtl,
> resulting in this exception:
> *_java.nio.charset.UnsupportedCharsetException: IBM424_rtl 17 at
> java.nio.charset.Charset.forName(Charset.java:531)_*
> I see there is also an existing ticket for the same issue that has not been
> fixed:
> https://issues.apache.org/jira/browse/TIKA-2396
> Please suggest the changes to fix this.
> Versions being used:
> apache-core - 1.20
> apache-parsers - 1.20
> Thanks
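
A possible caller-side workaround, sketched under assumptions rather than taken from the Tika code base: the exception above is thrown by Charset.forName when the JVM has no charset registered under the detected name, so the lookup can be guarded with Charset.isSupported and fall back to a default. The class SafeCharsetLookup and its resolve method are hypothetical names introduced for illustration. The detected name is assumed to be whatever string the detector reports for its best match. This only avoids the exception; it does not correct the misdetection itself.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Tika: guards Charset.forName against
// names the running JVM cannot resolve (such as "IBM424_rtl").
final class SafeCharsetLookup {

    private SafeCharsetLookup() {
    }

    // Returns the charset registered under detectedName, or fallback when
    // the name is null, syntactically illegal, or unsupported by this JVM.
    static Charset resolve(String detectedName, Charset fallback) {
        if (detectedName == null) {
            return fallback;
        }
        try {
            if (Charset.isSupported(detectedName)) {
                return Charset.forName(detectedName);
            }
        } catch (IllegalCharsetNameException e) {
            // The name contains characters that are not legal in a charset name.
        }
        return fallback;
    }

    public static void main(String[] args) {
        // Assumption: the detector reported "IBM424_rtl" for a UTF-8 file.
        Charset charset = resolve("IBM424_rtl", StandardCharsets.UTF_8);
        System.out.println("Decoding with: " + charset.name());
    }
}
```

Falling back to UTF-8 is itself an assumption that suits this report, where the input is known to be UTF-8; other inputs may need a different default.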
--
This message was sent by Atlassian Jira
(v8.3.4#803005)