>If we were to use $LANG/$LC_CTYPE to convert incoming data to UTF-8
>in the same manner, and process (and store!) everything internally
>as UTF-8, all of this nonsense would go away. Similarly, we could
>convert from UTF-8 -> $LANG/$LC_CTYPE on the way out. And we could ship
>everything off-site with one of only two character sets: ascii, or utf8.
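
For concreteness, here is a rough sketch of the input-side conversion proposed in the quoted text. It is written in Python for brevity (nmh itself is C), the helper name `to_internal_utf8` is hypothetical, and passing the charset as a string is an assumption standing in for whatever $LANG/$LC_CTYPE would select. It only illustrates what happens to 8-bit input under different locale charsets; it is not how nmh does anything today.

```python
def to_internal_utf8(raw: bytes, locale_charset: str) -> bytes:
    """Hypothetical helper: decode input per the locale's charset, re-encode as UTF-8."""
    return raw.decode(locale_charset).encode("utf-8")


# 8-bit input: "caf" followed by byte 0xE9 (e-acute in ISO-8859-1).
raw = b"caf\xe9"

# When the locale charset matches the input, the conversion succeeds.
converted = to_internal_utf8(raw, "iso-8859-1")

# When the locale is US-ASCII, the same 8-bit input cannot be decoded at all.
try:
    to_internal_utf8(raw, "ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False
```

In this sketch the conversion step itself is where a US-ASCII locale rejects the 8-bit byte, before any internal UTF-8 processing gets a chance to run.
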
I ... do not think this would solve this particular problem. The issue
here seems to be a) nmh programs were given 8 bit characters, and b) the
locale was set to US-ASCII. If you are going to assume that all INPUT is
unconditionally UTF-8, then yes, that would solve this problem. But you
say above you want to use LANG/LC_CTYPE to convert to UTF-8 on input;
that would have failed given the problem as stated.

And like I've said before: I think this effort would a) require a new
library dependency (for UTF-8 processing, since we couldn't use the
locale functions anymore) and b) result in no gain in functionality.
Like, I'm squinting really hard here, and I can't see how it would have
changed anything.

And last time we discussed this, people screamed at the thought of
assuming UTF-8 for input; I interpreted that suggestion as a non-starter.

--Ken

_______________________________________________
Nmh-workers mailing list
[email protected]
https://lists.nongnu.org/mailman/listinfo/nmh-workers
