Carl Zwanzig writes:

 > > I choose Mnogosearch because of its ability to support russian character
 > > encoding
 > 
 > OK, any decent search engine should support UTF, which will pretty much 
 > include all encodings.

That's not really the way this works.  No UTF even includes Latin-1
*as an encoding*, although all do include it in their *repertoire*
(and as integers the code points are the same).  I would guess most of
the decent ones convert to Unicode internally, but you still have the
problem of recognizing the external encoding.
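
To make the repertoire-versus-encoding distinction concrete, here is a
minimal Python sketch.  The helper name `encode_all` is just for
illustration; the codec names are the standard ones shipped with Python.
The code point of "é" is the same integer (0xE9) in Latin-1 and in
Unicode, but none of the UTFs serializes it as the single Latin-1 byte:

```python
def encode_all(ch):
    """Return the byte form of ch in Latin-1 and in several UTFs."""
    names = ("latin-1", "utf-8", "utf-16-le", "utf-32-le")
    return {name: ch.encode(name) for name in names}

forms = encode_all("\u00e9")  # LATIN SMALL LETTER E WITH ACUTE

# Same code point number, but the UTF byte sequences all differ
# from the one-byte Latin-1 form.
for name, raw in forms.items():
    print(f"{name:10} {raw.hex(' ')}")
```

So a search engine that "supports UTF-8" still has to be told, or has to
guess, that a given input is Latin-1 before it can convert it.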

Thing is, the Russian environment is like the Japanese one: a plethora
of legacy encodings still in common use (at least for Japanese; I don't
know Russian, but I'd bet you still see at least UTF-8, ISO-8859-5,
KOI8-R, and Alternativny (CP866), especially in archival applications).
Especially in mail they tend to get all mixed together (although rarely
within any one message ;-), not necessarily with a proper MIME charset
label.  So a lot of software is written to auto-detect the encoding
(which is language-specific) and convert on the fly, either in the main
storage or for the indexes only.
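
As a rough illustration of what language-specific auto-detection can
look like, here is a deliberately naive Python sketch.  It is not how
Mnogosearch or any particular tool does it.  The function names, the
candidate list, and the scoring rule (count lowercase Cyrillic letters
after a trial decode) are all assumptions made for the example; real
detectors use much better statistics.

```python
# Hypothetical candidate list: the legacy Russian encodings named above.
LEGACY_CANDIDATES = ("koi8_r", "iso8859_5", "cp866")

def cyrillic_score(text):
    """Count lowercase Cyrillic letters; running Russian text is mostly these."""
    return sum(1 for ch in text if "а" <= ch <= "я" or ch == "ё")

def guess_russian_encoding(data):
    """Guess the encoding of a byte string assumed to hold Russian text."""
    # UTF-8 is self-validating, so try it strictly first.
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        pass
    # Otherwise pick the legacy codec whose decoding looks most like Russian.
    return max(
        LEGACY_CANDIDATES,
        key=lambda enc: cyrillic_score(data.decode(enc, errors="replace")),
    )

sample = "поиск по архиву почтовой рассылки"
for enc in ("utf-8",) + LEGACY_CANDIDATES:
    print(enc, "->", guess_russian_encoding(sample.encode(enc)))
```

The heuristic leans on the fact that the legacy code pages put lowercase
Cyrillic in different byte ranges, so a wrong decode tends to produce
uppercase letters or box-drawing characters instead.  It will misfire on
very short or all-caps input, which is part of why detection is hard.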

I don't know if that's true for Mnogosearch, but at the very least
we'd need to ask the OP if that's part of his requirement.

Steve
------------------------------------------------------
Mailman-Users mailing list -- mailman-users@python.org
To unsubscribe send an email to mailman-users-le...@python.org
https://mail.python.org/mailman3/lists/mailman-users.python.org/
Mailman FAQ: http://wiki.list.org/x/AgA3
Security Policy: http://wiki.list.org/x/QIA9
Searchable Archives: https://www.mail-archive.com/mailman-users@python.org/
    https://mail.python.org/archives/list/mailman-users@python.org/
