Zaheer Beig created TIKA-1405:
---------------------------------

             Summary: German content detected as French
                 Key: TIKA-1405
                 URL: https://issues.apache.org/jira/browse/TIKA-1405
             Project: Tika
          Issue Type: Bug
          Components: languageidentifier
    Affects Versions: 1.4
         Environment: Linux
            Reporter: Zaheer Beig


Hi,
We are using Apache Tika 1.4 for document conversion to text and language 
detection in one of our projects. We are facing the following issues with 
language detection:

1. When the text is in all UPPER CASE, it gets detected as Estonian even 
though the language is English.
2. Much of our German content gets detected as French (though this is not 
the case for all German content).
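
One possible workaround for issue 1, not verified against Tika 1.4, would be 
to lower-case input that is entirely upper case before handing it to the 
detector. The sketch below uses only the JDK. The class and method names 
(CaseNormalizer, isAllUpperCase, normalizeCase) are hypothetical and not part 
of Tika; the comment in main shows where Tika's LanguageIdentifier would be 
called.

```java
import java.util.Locale;

public class CaseNormalizer {

    // Hypothetical helper: true when the text contains at least one
    // upper-case letter and no lower-case letters.
    static boolean isAllUpperCase(String text) {
        boolean sawUpper = false;
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (Character.isLowerCase(cp)) {
                return false;
            }
            if (Character.isUpperCase(cp)) {
                sawUpper = true;
            }
            i += Character.charCount(cp);
        }
        return sawUpper;
    }

    // Lower-cases all-upper-case text; leaves mixed-case text unchanged.
    static String normalizeCase(String text) {
        return isAllUpperCase(text) ? text.toLowerCase(Locale.ROOT) : text;
    }

    public static void main(String[] args) {
        String normalized = normalizeCase("THIS IS AN ENGLISH SENTENCE.");
        System.out.println(normalized);
        // With Tika on the classpath, the normalized text could be passed on:
        // new org.apache.tika.language.LanguageIdentifier(normalized).getLanguage()
    }
}
```

Whether this changes the detection result is an assumption that would need 
to be checked; it does not address issue 2.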

Any update on this would be very helpful.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
