[ https://issues.apache.org/jira/browse/TIKA-856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14729416#comment-14729416 ]
Ken Krugler commented on TIKA-856:
----------------------------------
The language-detector project has support for Japanese, Korean, Simplified
Chinese and Traditional Chinese.
> Support CJK (Chinese, Japanese and Korean) language detection
> -------------------------------------------------------------
>
> Key: TIKA-856
> URL: https://issues.apache.org/jira/browse/TIKA-856
> Project: Tika
> Issue Type: New Feature
> Components: languageidentifier
> Affects Versions: 1.0
> Environment: All
> Reporter: James Sullivan
> Assignee: Ken Krugler
> Labels: Chinese, Japanese
> Attachments: ja.ngp
>
>
> Support language detection of CJK (Chinese, Japanese and Korean).
> Some estimates have Chinese users overtaking English users on the Internet,
> so it is important that these languages, used by a large number of people, be
> supported.
> See TIKA-855