----- Original Message -----
From: "Philippe Verdy" <[EMAIL PROTECTED]>
> On Monday, July 14, 2003 10:14 PM, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
>
> > Are there any libraries out there (open-source or otherwise) that can
> > be used to detect the character encoding of a file or data stream?
>
> Yes, but these libraries actually try to detect the actual encoded
> language, based on strict validity rules to discriminate first the
> possible encodings, then statistic rules to try matching the
> languages with their various encoded byte sequences, then with
> the help of common words.

I know one such library (http://quebec.alis.com/castil/essai_silc.cgi), and it does not use a three-step approach as you outline above, but a single step. In any case, I believe Peter already has an idea of how these libraries work and what their limitations are; he is simply looking for one, limitations included.

P. Andries

- o - 0 - o -
Textes Unicode en français
http://pages.infinit.net/hapax
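
For readers who want to see what the three-step approach described in the quoted message might look like, here is a minimal Python sketch. It is not how the library linked above works, nor any other real library; the candidate list, the word lists and every function name are assumptions made up for illustration. Step 1 keeps only the encodings under which the bytes decode strictly, step 2 uses a very crude statistic (the share of letters and whitespace) as a stand-in for real byte-sequence statistics, and step 3 counts common words per language. Candidates are ranked by common-word hits first, with the statistic as a tie-breaker.

```python
# Illustrative sketch only: candidates, word lists and names are assumptions.
CANDIDATES = ["utf-8", "cp1252", "iso-8859-5"]

COMMON_WORDS = {
    "fr": {"le", "la", "les", "de", "des", "et", "en", "est", "à", "très"},
    "en": {"the", "of", "and", "to", "in", "is", "that", "it", "for"},
}


def strictly_valid(data, encodings):
    """Step 1: keep the encodings under which the bytes decode without error."""
    decoded = {}
    for enc in encodings:
        try:
            decoded[enc] = data.decode(enc, errors="strict")
        except UnicodeDecodeError:
            pass  # byte sequence is illegal in this encoding, discard it
    return decoded


def letter_ratio(text):
    """Step 2 (crude stand-in for statistics): share of letters and whitespace."""
    if not text:
        return 0.0
    return sum(ch.isalpha() or ch.isspace() for ch in text) / len(text)


def common_word_hits(text):
    """Step 3: return (language, hits) for the language with the most common words."""
    words = text.lower().split()
    best_lang, best_hits = None, 0
    for lang, vocabulary in COMMON_WORDS.items():
        hits = sum(word in vocabulary for word in words)
        if hits > best_hits:
            best_lang, best_hits = lang, hits
    return best_lang, best_hits


def guess_encoding(data):
    """Return (encoding, language) for the best candidate, or None if none is valid."""
    scored = []
    for enc, text in strictly_valid(data, CANDIDATES).items():
        lang, hits = common_word_hits(text)
        scored.append((hits, letter_ratio(text), enc, lang))
    if not scored:
        return None
    hits, ratio, enc, lang = max(scored, key=lambda s: (s[0], s[1]))
    return enc, lang


if __name__ == "__main__":
    sample = "Le café est très bon à Montréal".encode("cp1252")
    print(guess_encoding(sample))
```

The sketch also shows the limitation mentioned above: short texts, or texts whose common words are pure ASCII, give the later steps nothing to discriminate on, so the guess falls back to whichever valid candidate comes first.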

