On Tuesday 20 February 2007 16:54, D Bera wrote:
> From what I know it is terribly hard to detect encodings i.e.
> differentiate between an iso-* encoding and utf8 encoding. Any
> document with any iso* encoding is also a valid utf8 encoded document.

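As far as I can tell, the easy cases can be separated without statistics: strict UTF-8 decoding rejects most byte sequences that contain ISO-8859 accented characters, so a failed decode is a strong hint that the file is not UTF-8. Below is a minimal sketch of that check, using only the Python standard library. The function name guess_encoding and the iso-8859-1 fallback are my own assumptions for illustration; this is not the chardet algorithm, and it cannot tell the iso-* encodings apart from each other.

```python
def guess_encoding(data: bytes, fallback: str = "iso-8859-1") -> str:
    """Return "ascii", "utf-8", or the fallback name (hypothetical helper)."""
    try:
        data.decode("ascii")
        return "ascii"          # pure 7-bit data is valid in all of these encodings
    except UnicodeDecodeError:
        pass
    try:
        data.decode("utf-8")    # decoding is strict by default
        return "utf-8"
    except UnicodeDecodeError:
        # Not valid UTF-8: assume a single-byte encoding (an assumption, not a detection).
        return fallback


latin1_sample = "caf\u00e9".encode("iso-8859-1")  # bytes: 63 61 66 e9
utf8_sample = "caf\u00e9".encode("utf-8")         # bytes: 63 61 66 c3 a9

print(guess_encoding(latin1_sample))
print(guess_encoding(utf8_sample))
```

The hard part is what remains after this check: choosing between the many single-byte and multi-byte legacy encodings, which is where a statistical detector is needed.
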
I have found chardet (http://chardet.feedparser.org/), a Python program that detects the encoding of files using statistical methods. It is an adaptation of the detection method used in Netscape browsers. This would be very useful for Beagle, so I wonder: will Beagle implement this algorithm (for a description of it, see http://www.mozilla.org/projects/intl/UniversalCharsetDetection.html), or should I propose it to the Mono guys? I can start working on it, though you shouldn't expect much, as I'm not a CS guy.

Regards
