On July 19, 2006 at 18:44, Andrew Shirrayev wrote:

> one BIG letter
> $ ls -l
> -rw-r--r-- 1 andrews andrews 29197818 Jul 19 18:42 mbox.200410.one
> $ wc mbox.200410.one
> 1199719 2958850 29197818 mbox.200410.one
>
> 1st way:
>
> <TextEncode>
> utf-8; MHonArc::UTF8::to_utf8; MHonArc/UTF8.pm
> </TextEncode>

Did you also set:

  <!-- With data translated to UTF-8, it simplifies CHARSETCONVERTERS -->
  <CharsetConverters override>
  default; mhonarc::htmlize
  </CharsetConverters>

  <!-- Need to also register UTF-8-aware text clipping function -->
  <TextClipFunc>
  MHonArc::UTF8::clip; MHonArc/UTF8.pm
  </TextClipFunc>

If you use TEXTENCODE, the CHARSETCONVERTERS setting above lets you
avoid dealing with MHonArc::CharEnt. Without it, MHonArc will convert
all non-ASCII UTF-8 sequences into entity references.

In general, if you use TEXTENCODE, you should also redefine
CHARSETCONVERTERS appropriately.

--ewh
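
To make the difference concrete, here is a small Python sketch of the two output styles discussed above. It is not MHonArc code and does not call any MHonArc function; the names `escape_markup_only` and `escape_to_entities` are hypothetical, and the assumption is that an htmlize-style converter escapes only the HTML-special characters while an entity-style converter also rewrites every non-ASCII character as a character reference.

```python
import html


def escape_markup_only(text: str) -> str:
    """Escape HTML-special characters and leave non-ASCII text untouched.

    Roughly the behavior one would expect once the data is already UTF-8
    and only markup characters need protecting (assumption, see lead-in).
    """
    return html.escape(text, quote=True)


def escape_to_entities(text: str) -> str:
    """Escape HTML-special characters, then turn every non-ASCII character
    into a numeric character reference so the result is pure ASCII."""
    escaped = html.escape(text, quote=True)
    return escaped.encode("ascii", "xmlcharrefreplace").decode("ascii")


if __name__ == "__main__":
    sample = "caf\u00e9 <b>"
    print(escape_markup_only(sample))
    print(escape_to_entities(sample))
```

The first style keeps the page readable as UTF-8 text; the second inflates every non-ASCII character into several ASCII bytes, which matters for an archive made up mostly of non-Latin text.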
