Carl Zwanzig writes:

 > > I choose Mnogosearch because of its ability to support russian character
 > > encoding
 >
 > OK, any decent search engine should support UTF, which will pretty much
 > include all encodings.

That's not really the way this works. No UTF even includes Latin-1 *as an encoding*, although all do include it in their *repertoire* (and as integers the code points are the same). I would guess most of the decent ones convert to Unicode internally, but you still have the problem of recognizing the external encoding.

Thing is, the Russian environment is like the Japanese one: a plethora of legacy encodings, still in common use. (At least that's so for Japanese. I don't know Russian, but I'd bet you still see at least UTF-8, ISO-8859-5, KOI8-R, and Alternativnyj (CP866), especially in archival applications.) In mail especially they tend to get all mixed together, not necessarily with a proper MIME charset declaration (although rarely within any one message ;-).

So a lot of software is written to do auto-detection of encodings (which is language specific) and to convert on the fly, either in the main storage or for the indexes only. I don't know if that's true for Mnogosearch, but at the very least we'd need to ask the OP if that's part of his requirement.

Steve
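
P.S. To make the detection problem concrete, here is a minimal Python sketch of a language-specific guesser. It is only an illustration, not what Mnogosearch or any other search engine does: guess_encoding(), CANDIDATES, and COMMON are names I made up, the letter-frequency list is a rough assumption, and I'm assuming cp866 is the right codec for the DOS "alternative" code page. Real detectors use much better statistics.

```python
# Illustrative sketch only: guess_encoding(), CANDIDATES and COMMON are
# hypothetical names, not part of Mnogosearch, Mailman, or any library.

# Candidate encodings. UTF-8 is tried first because legacy 8-bit Russian
# text is very rarely valid UTF-8 by accident.
CANDIDATES = ("utf-8", "koi8-r", "iso8859-5", "cp866")

# Ten frequent lowercase Russian letters (an assumed, rough ranking).
COMMON = set("оеаинтсрвл")


def guess_encoding(data: bytes) -> str:
    """Return the candidate whose decoding looks most like Russian."""
    try:
        data.decode("utf-8")  # strict decoding: raises on invalid input
        return "utf-8"
    except UnicodeDecodeError:
        pass

    best, best_score = "", -1
    for enc in CANDIDATES[1:]:
        # Every byte string decodes in these 8-bit charsets, so success
        # proves nothing; score how "Russian" the result looks instead.
        text = data.decode(enc, errors="replace")
        score = sum(ch in COMMON for ch in text)
        if score > best_score:
            best, best_score = enc, score
    return best


# Same code point, different bytes: Latin-1 is in the Unicode repertoire,
# but Latin-1 bytes are not UTF-8 bytes.
assert ord("é") == 0xE9
assert "é".encode("latin-1") == b"\xe9"
assert "é".encode("utf-8") == b"\xc3\xa9"
```

For example, guess_encoding("Привет, мир! Это обычный русский текст.".encode("koi8-r")) picks "koi8-r", because the other 8-bit decodings of those bytes contain far fewer of the common letters. The same bytes scored against a Japanese or Greek letter list would be useless, which is the sense in which detection is language specific.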