Re: [Nmh-workers] I like neither green eggs and ham nor MIME

Ken Hornstein Fri, 18 Jul 2014 06:54:07 -0700

>I am not at all secure about how the standard GNU utilities will handle
>non-ascii characters. For example, 'wc -c', just counts bytes. True,
>the man page talks about bytes, not characters, but I am still left
>uncomfortable.  Then there are the dozens of bash, python, and perl
>scripts that I have accumulated over the years.


My experience has been that a modern system handles 8-bit characters just
fine.

Now, where things get a little tricky is with multibyte character sets
like UTF-8.  Not everyone has broken from the paradigm that 1 byte == 1
character, like you noted (we had to do a bunch of work in the format
engine to fix that).  But since UTF-8 has the excellent property that
non-ASCII characters look like just 8-bit characters but won't ever
be mistaken for ASCII (not a surprise, since it was designed by two
of the original Unix geeks) I haven't come across a program where it
truly breaks.  I don't write in Python, but Perl support for UTF-8 is
excellent and I would be shocked if the situation for Python wasn't the
same.
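
To make the byte-versus-character distinction concrete, here is a minimal
sketch in Python 3.  The sample string and both helper names are made up
for illustration; nothing here comes from nmh itself.  It shows that a
byte count (what 'wc -c' reports) differs from a character count once
non-ASCII text is involved, and that every byte of a multibyte UTF-8
sequence has its high bit set, so it cannot be confused with ASCII.

```python
def utf8_counts(text):
    """Return (character count, UTF-8 byte count) for a str."""
    return len(text), len(text.encode("utf-8"))


def ascii_bytes_in_utf8(text):
    """Return only the bytes of the UTF-8 encoding that are below 0x80."""
    return bytes(b for b in text.encode("utf-8") if b < 0x80)


# Hypothetical sample: "naive cafe" with i-diaeresis and e-acute,
# written with escapes so the source file itself stays pure ASCII.
sample = "na\u00efve caf\u00e9"

chars, nbytes = utf8_counts(sample)
print(chars, nbytes)                 # character count, then byte count

# The two accented letters take two bytes each, and none of those bytes
# fall in the ASCII range, so only the plain letters and the space remain.
print(ascii_bytes_in_utf8(sample))
```

For this sample the character count is 10 and the byte count is 12, which
is the kind of gap a script that assumes 1 byte == 1 character will trip
over.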

I jumped whole-hog into UTF-8 a few years ago, and I haven't regretted
it one bit.

--Ken
