???????? wrote:
>
> > If they're supposed to be UTF-8 and aren't, then certainly normal
> > tools shouldn't have to deal with malformed sequences.  If you write
> > a special tool to fix malformed sequences somehow (e.g., delete files
> > with malformed sequences), then of course you're going to be dealing
> > with the byte level and not (just) the character level.
>
> If normal tools completely wet the bed at the sight of malformed sequences,
> then they are poorly designed.
Some of them are following specifications (e.g., the specifications
that say certain UTF-8 readers (and XML processors, maybe?) should, for
security reasons, either reject malformed sequences or reject any input
that contains them).

Daniel
-- 
Daniel Barclay
[EMAIL PROTECTED]

--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
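
To make the "reject malformed sequences" behaviour concrete, here is a
minimal sketch in Python of a reader that refuses an entire input if any
part of it is not well-formed UTF-8. The helper names decode_strict and
is_well_formed_utf8 are made up for illustration; the only library
behaviour relied on is that bytes.decode with errors="strict" raises
UnicodeDecodeError on malformed input. It is not taken from any of the
specifications mentioned above.

```python
def decode_strict(raw: bytes) -> str:
    """Decode raw as UTF-8; raise UnicodeDecodeError on any malformed sequence.

    Hypothetical helper: it rejects the whole input instead of trying to
    repair or skip the bad bytes.
    """
    return raw.decode("utf-8", errors="strict")


def is_well_formed_utf8(raw: bytes) -> bool:
    """Return True only if every byte of raw is part of well-formed UTF-8."""
    try:
        decode_strict(raw)
    except UnicodeDecodeError:
        return False
    return True


# Sample inputs (illustrative only).
valid = b"caf\xc3\xa9"          # "cafe" with e-acute, well-formed
overlong_slash = b"\xc0\xaf"    # overlong (2-byte) encoding of "/"
truncated = b"\xe2\x82"         # first two bytes of a 3-byte sequence
surrogate = b"\xed\xa0\x80"     # encoded surrogate U+D800, not allowed in UTF-8

if __name__ == "__main__":
    for sample in (valid, overlong_slash, truncated, surrogate):
        print(sample, is_well_formed_utf8(sample))
```

The overlong case is the usual security argument: a check that scans the
raw bytes for "/" would not see one in b"\xc0\xaf", but a lenient decoder
that accepted the sequence would produce "/" afterwards. Rejecting the
input outright avoids that mismatch.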