[ https://issues.apache.org/jira/browse/TIKA-3850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17603498#comment-17603498 ]
Lenne Hendrickx commented on TIKA-3850:
---------------------------------------
If I append more Spanish text to the original sentence, the language is
correctly identified as Spanish (es):
{noformat}
Hola! Donde puedo contactar para una garantía? Me pongo el bolso en el hombro.
Arreglo mi cabello y tomo un gran respiro. Camino tranquilamente por el
callejón, de regreso a la calle.{noformat}
{noformat}
language: es
score = 0.999996
{noformat}
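A possible explanation, offered as an assumption rather than as a description of the Tika or Optimaize code: as far as I understand, the detector scores character n-grams, so a single short sentence gives it little evidence, and much of that evidence may be shared by closely related languages such as Spanish and Galician. The sketch below only counts distinct character trigrams to show how much more material the longer input provides. The NgramEvidence class and its distinctTrigrams method are hypothetical names made up for this illustration; they are not part of any library.

```java
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

/** Illustrative only: counts distinct character trigrams. Not Tika or Optimaize code. */
final class NgramEvidence {

    private NgramEvidence() {}

    /** Returns the distinct lower-cased character trigrams of the text, in first-seen order. */
    static Set<String> distinctTrigrams(String text) {
        String normalized = text.toLowerCase(Locale.ROOT);
        Set<String> trigrams = new LinkedHashSet<>();
        // Slide a 3-character window over the text; inputs shorter than 3 yield no trigrams.
        for (int i = 0; i + 3 <= normalized.length(); i++) {
            trigrams.add(normalized.substring(i, i + 3));
        }
        return trigrams;
    }

    public static void main(String[] args) {
        // The two inputs from this issue: the original sentence and the extended text.
        String shortText = "Hola! Donde puedo contactar para una garantía?";
        String longText = shortText
                + " Me pongo el bolso en el hombro."
                + " Arreglo mi cabello y tomo un gran respiro."
                + " Camino tranquilamente por el callejón, de regreso a la calle.";
        System.out.println("short: " + distinctTrigrams(shortText).size() + " distinct trigrams");
        System.out.println("long:  " + distinctTrigrams(longText).size() + " distinct trigrams");
    }
}
```

The counts say nothing about which language wins; they only show that the extended text gives a statistical detector several times more features to work with than the original sentence.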
> Spanish text is incorrectly detected as Galician
> ------------------------------------------------
>
> Key: TIKA-3850
> URL: https://issues.apache.org/jira/browse/TIKA-3850
> Project: Tika
> Issue Type: Bug
> Components: languageidentifier
> Affects Versions: 2.4.1
> Environment: org.apache.tika:tika-langdetect-optimaize:2.4.1
> org.apache.tika:tika-core:2.4.1
> Reporter: Lenne Hendrickx
> Priority: Minor
>
> The following Spanish text is incorrectly detected as Galician.
> {noformat}
> Hola! Donde puedo contactar para una garantía?{noformat}
> The es and gl models are loaded into the language detector.
> Language result:
> {noformat}
> language: gl
> score: 0.999995{noformat}