Improve encoding detection speed and accuracy
---------------------------------------------

                 Key: TIKA-322
                 URL: https://issues.apache.org/jira/browse/TIKA-322
             Project: Tika
          Issue Type: Improvement
          Components: mime
            Reporter: Jukka Zitting
            Priority: Minor


The encoding detection code we took from ICU4J is not very efficient and 
sometimes produces odd results when more than one encoding matches the given 
input data. It would be good to refactor the code to be faster for 
easy-to-detect encodings and to have better heuristics in case multiple matches 
are found.
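
As a rough illustration of a fast path for easy-to-detect encodings, the sketch
below tries a few cheap, near-conclusive checks (byte order marks, 7-bit-only
input, strict UTF-8 validation) and returns null when none of them settles the
question, so that a caller could fall back to the slower statistical detector.
The class QuickEncodingCheck and its detect method are hypothetical names made
up for this example; they are not part of Tika or ICU4J, and the choice and
order of checks is an assumption, not the project's design.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not Tika or ICU4J code.
final class QuickEncodingCheck {

    private QuickEncodingCheck() {}

    /** Returns a charset name if a cheap check is conclusive, otherwise null. */
    static String detect(byte[] data) {
        if (data.length == 0) {
            return null; // nothing to go on
        }

        // 1. Byte order marks. UTF-32 is tested before UTF-16 because the
        //    UTF-32LE mark begins with the UTF-16LE mark.
        if (startsWith(data, 0xEF, 0xBB, 0xBF)) return "UTF-8";
        if (startsWith(data, 0x00, 0x00, 0xFE, 0xFF)) return "UTF-32BE";
        if (startsWith(data, 0xFF, 0xFE, 0x00, 0x00)) return "UTF-32LE";
        if (startsWith(data, 0xFE, 0xFF)) return "UTF-16BE";
        if (startsWith(data, 0xFF, 0xFE)) return "UTF-16LE";

        // 2. Only 7-bit bytes and no ESC (0x1B, used by ISO-2022 variants).
        boolean sevenBit = true;
        for (byte b : data) {
            if (b < 0 || b == 0x1B) {
                sevenBit = false;
                break;
            }
        }
        if (sevenBit) return "US-ASCII";

        // 3. Strict UTF-8 validation: malformed input is reported, not replaced.
        //    Note that a sample cut off mid-character also fails this check.
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(data));
            return "UTF-8";
        } catch (CharacterCodingException e) {
            return null; // inconclusive; defer to the full detector
        }
    }

    // True if data begins with the given unsigned byte values.
    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) {
                return false;
            }
        }
        return true;
    }
}
```

A caller would run the full detector only when detect returns null, which also
narrows the cases in which several candidate encodings have to be ranked
against each other.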

