Hey folks,

We are having a bit of trouble deciding (*) how to deal with files in an encoding different from the system encoding. By default, we use UTF-8 everywhere and assume everything is UTF-8. Some file formats and data sources specify their encoding (emails, HTML files, office documents, etc.), so those are not a problem.
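To make the trade-off concrete, here is a minimal Python sketch of the "try UTF-8 first, then fall back to a configured encoding" approach. The BEAGLE_LANG variable is the one proposed below and is hypothetical; the sample bytes are purely illustrative:

```python
import os

def decode_with_fallback(data: bytes) -> str:
    """Try UTF-8 first; fall back to the encoding named in BEAGLE_LANG, then ISO-8859-1."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        pass
    # Hypothetical override, e.g. BEAGLE_LANG=iso8859-2
    fallback = os.environ.get("BEAGLE_LANG")
    if fallback:
        try:
            return data.decode(fallback)
        except (UnicodeDecodeError, LookupError):
            pass
    # Last resort: ISO-8859-1 never fails, since every byte maps to a code point,
    # but it may of course produce the wrong characters.
    return data.decode("iso8859-1")

# b"\xe9" is "é" in ISO-8859-1 but is an invalid standalone byte in UTF-8.
print(decode_with_fallback(b"caf\xe9"))  # -> café
```

The catch, as noted below, is that a byte sequence can be valid in several encodings at once, so "decoded without error" does not mean "decoded correctly".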
If non-UTF-8 is used for filenames and the like, a lot of non-Beagle things also break; we are trying to use MONO_EXTERNAL_ENCODINGS to deal with this case (**). For other files, depending on the file format, either UTF-8 or the platform encoding is used. It's really a clumsy affair.

Apparently Windows XP has a system setting, "how should I handle non-Unicode programs", where it is possible to assign, say, an ISO-8859-1 codepage. I have no idea how it determines whether data is in a non-UTF-8 encoding. So, even though someone could have one system encoding, a completely different encoding could be used for file data and metadata. It's a perfect encoding mess :-/.

I know it's not always possible to determine the right encoding. We could have a BEAGLE_LANG variable which, if set, would specify the encoding to use while extracting data, regardless of the system encoding. Probably most apps would fail while displaying that data, but being an indexer, how far should Beagle push its indexing ability?

Any suggestions on how to get as close as possible to the right encoding?

- dBera

(*) http://bugzilla.gnome.org/show_bug.cgi?id=524077
(**) "non UTF8 folders are not indexed" - in progress - http://bugzilla.gnome.org/show_bug.cgi?id=440458

--
-----------------------------------------------------
Debajyoti Bera @ http://dtecht.blogspot.com
beagle / KDE / Mandriva / Inspiron-1100

_______________________________________________
Dashboard-hackers mailing list
Dashboard-hackers@gnome.org
http://mail.gnome.org/mailman/listinfo/dashboard-hackers