On 16/11/2017 at 11:25, Torsten Bonde Christiansen via Lazarus wrote:
Hi List.

I am reading text from some .csv files, but the encoding of the files is not always the same. In fact, it may vary greatly across many European encodings, UTF-8 and Asian encodings.


Hello,

Some years ago I wrote some code that guesses both the encoding and the language of a text. The drawbacks are that it must first be trained on large texts and, of course, that the results are only statistical: they are only reasonably good on fairly large inputs (around 1000 characters or more), so it is not suitable for single sentences.

If you are interested, I can dig through my old code to find it.
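
To give a rough idea of what a trained statistical guesser of this kind can look like, here is a minimal sketch in Python. It is not the old code mentioned above, only an illustration under my own assumptions: the function names (train, score, guess), the character-bigram model and the candidate encoding list are all made up for the example, and the training texts are toy-sized, whereas a real model needs large texts per language.

```python
import math
from collections import Counter

def bigrams(text):
    """Return the lowercase character bigrams of a string."""
    text = text.lower()
    return [text[i:i + 2] for i in range(len(text) - 1)]

def train(samples):
    """Hypothetical trainer: {language: text} -> {language: bigram counts}."""
    return {lang: Counter(bigrams(text)) for lang, text in samples.items()}

def score(text, counts, alpha=0.01):
    """Average log-score per bigram; alpha keeps unseen bigrams above zero."""
    grams = bigrams(text)
    if not grams:
        return float("-inf")
    denom = sum(counts.values()) + alpha * (len(counts) + 1)
    return sum(math.log((counts[g] + alpha) / denom) for g in grams) / len(grams)

def guess(data, models, encodings):
    """Try every (encoding, language) pair and keep the best-scoring one."""
    best = None
    for enc in encodings:
        try:
            text = data.decode(enc)
        except UnicodeDecodeError:
            continue  # these bytes are not valid in this encoding
        for lang, counts in models.items():
            s = score(text, counts)
            if best is None or s > best[0]:
                best = (s, enc, lang)
    return best  # (score, encoding, language), or None if nothing decoded

# Toy training data; real training texts should be far larger.
samples = {
    "danish": "jeg læser nogle tekster fra filer, og kodningen er ikke "
              "altid den samme. æbler, øl og åer på øen.",
    "english": "i am reading some text from files, and the encoding is not "
               "always the same. apples, beer and rivers on the island.",
}
models = train(samples)

data = "jeg læser på øen".encode("cp1252")  # bytes of unknown origin
result = guess(data, models, ["utf-8", "cp1252", "cp1251"])
print(result[1], result[2])
```

The guess is only as good as the training data, and candidate encodings that decode the same bytes to the same characters (for example cp1252 and ISO-8859-1 for most Western text) cannot be told apart this way.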


--
_______________________________________________
Lazarus mailing list
[email protected]
https://lists.lazarus-ide.org/listinfo/lazarus
