----- Original Message -----
From: "Philippe Verdy" <[EMAIL PROTECTED]>



> On Monday, July 14, 2003 10:14 PM, [EMAIL PROTECTED]
> <[EMAIL PROTECTED]> wrote:
>
> > Are there any libraries out there (open-source or otherwise) that can
> > be used to detect the character encoding of a file or data stream?
>
> Yes, but these libraries actually try to detect the actual encoded
> language, based on strict validity rules to discriminate first the
> possible encodings, then statistic rules to try matching the
> languages with their various encoded byte sequences, then with
> the help of common words.

I know of one such library (http://quebec.alis.com/castil/essai_silc.cgi), and
it does not use the three-step approach you outline above, but a single
step.
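
For readers who want something concrete, the three-step approach described in
the quoted message could be sketched roughly as below. This is only an
illustration under my own assumptions: the candidate encodings, the tiny word
lists, the scoring, and all function names are hypothetical, and it is not how
the library mentioned above (or any other particular library) works.

```python
# Hypothetical sketch: candidate list, word lists and scoring are assumptions.
CANDIDATES = ["utf-8", "iso-8859-1", "cp1252"]

COMMON_WORDS = {
    "fr": {"le", "la", "les", "et", "des", "est", "en", "un", "une"},
    "en": {"the", "and", "of", "to", "is", "in", "a", "that"},
}


def valid_decodings(data):
    """Step 1: keep only the encodings under which the bytes decode strictly."""
    result = {}
    for enc in CANDIDATES:
        try:
            result[enc] = data.decode(enc, errors="strict")
        except UnicodeDecodeError:
            pass
    return result


def letter_ratio(text):
    """Step 2: a crude statistic, the share of letters and whitespace."""
    if not text:
        return 0.0
    good = sum(1 for ch in text if ch.isalpha() or ch.isspace())
    return good / len(text)


def common_word_hits(text):
    """Step 3: the language whose common words occur most often in the text."""
    words = text.lower().split()
    best_lang, best_hits = None, 0
    for lang, vocab in COMMON_WORDS.items():
        hits = sum(1 for w in words if w in vocab)
        if hits > best_hits:
            best_lang, best_hits = lang, hits
    return best_lang, best_hits


def guess(data):
    """Return (encoding, language), or (None, None) if nothing decodes."""
    decoded = valid_decodings(data)
    if not decoded:
        return None, None
    # Rank surviving candidates by common-word hits, then by letter ratio.
    ranked = sorted(
        decoded.items(),
        key=lambda kv: (common_word_hits(kv[1])[1], letter_ratio(kv[1])),
        reverse=True,
    )
    enc, text = ranked[0]
    return enc, common_word_hits(text)[0]
```

Note that ISO-8859-1 accepts every byte sequence, so step 1 alone can never
rule it out; whatever discrimination happens for such encodings has to come
from the later steps, which is one of the limitations at issue here.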

In any case, I believe Peter already has an idea of how these libraries work
and of their limitations; he is simply looking for one, limitations and all.

P. Andries
- o -  0 - o -
Textes Unicode en français
http://pages.infinit.net/hapax


