[ 
https://issues.apache.org/jira/browse/TIKA-3516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17395000#comment-17395000
 ] 

Tim Allison commented on TIKA-3516:
-----------------------------------

Do you want us to fallback to utf8 if the detected charset isn't supported?

See also: https://issues.apache.org/jira/browse/TIKA-1236

> Unexpected charset IBM424_rtl detected for  utf_8  file by CharsetDetector
> --------------------------------------------------------------------------
>
>                 Key: TIKA-3516
>                 URL: https://issues.apache.org/jira/browse/TIKA-3516
>             Project: Tika
>          Issue Type: Bug
>          Components: detector, parser
>            Reporter: Chaitra Rajappa
>            Priority: Major
>
> Hi,
>  The CharsetDetector detects the wrong charset for a file as IBM424_rtl. 
>  Resulting in exception 
> *_java.nio.charset.UnsupportedCharsetException: IBM424_rtl 17 at 
> java.nio.charset.Charset.forName(Charset.java:531)_*
> I see there is also an existing ticket with the same issue thats not been 
> fixed.
> https://issues.apache.org/jira/browse/TIKA-2396
>  Please suggest the changes to fix this. 
> Versions being used:
> apache-core - 1.20
> apache-parsers-1.20
> Thanks



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to