[ https://issues.apache.org/jira/browse/TIKA-492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14729427#comment-14729427 ]
Ken Krugler commented on TIKA-492:
----------------------------------
Currently the language-detector library I'm integrating (see TIKA-1723) doesn't
support any of the three Sami languages. I'd open an issue at that project (see
https://github.com/optimaize/language-detector/). So I'm closing this issue unless
somebody wants to (a) port the current built-in Tika detector to the new
architecture, (b) follow up with Jan about getting training text, and (c)
add the new profiles. I'll wait a few days.
> Add language identification support for North Sami, Lule Sami and South Sami
> ----------------------------------------------------------------------------
>
> Key: TIKA-492
> URL: https://issues.apache.org/jira/browse/TIKA-492
> Project: Tika
> Issue Type: New Feature
> Components: languageidentifier
> Affects Versions: 0.7
> Reporter: Jan Høydahl
> Assignee: Ken Krugler
> Priority: Minor
>
> We need to add support for the Sami languages.
> According to the document "Requirements for support for Sami languages in data
> processing" (http://www.samit.no/01-850-51.pdf) Tika will get "Basic Level"
> support by detecting North Sami, Lule Sami and South Sami.