As I've said in previous messages, I've been working on the "replyfilter" Perl script to improve the functionality of replying to MIME messages. So far I am pretty happy with the results (check out the latest version if you're interested; it's in $(srcdir)/docs/contrib/replyfilter), but I have run into one annoying wrinkle.
Right now the script uses "par" to format long text in the reply message. But I have discovered that in some cases par mangles the output when dealing with UTF-8. Specifically, if the to-be-quoted text contains a non-breaking space (U+00A0), which is encoded in UTF-8 as 0xc2 0xa0, I guess that par sees the 0xa0 as a space and replaces it with a 0x20, which results in an invalid UTF-8 sequence. So far that's the only problem I've run into; other UTF-8 sequences work fine.
My simple solution is to replace any occurrences of U+00A0 with a space, and that seems to solve the problem. But I am thinking that it is only a matter of time before I run into other UTF-8 that par handles poorly. I was wondering if anyone knows of any par-like utilities that are UTF-8 aware?
Before people mention it ... yes, I am aware that there is an i18n patch for par. I tried that, but it did not help (a brief look at it leads me to think that the core problem is that even with that patch par is calling isspace(), where it should be calling iswspace()).
--Ken
_______________________________________________
Nmh-workers mailing list
https://lists.nongnu.org/mailman/listinfo/nmh-workers
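For illustration only, here is a minimal sketch of the suspected failure and of the workaround described above. It is written in Python, not the Perl that replyfilter uses, and it does not call par at all: the helper `byte_oriented_space_fix` is a hypothetical stand-in that models the guessed behavior (a lone 0xa0 byte being rewritten to 0x20). The names `is_valid_utf8` and `replace_nbsp` are likewise made up for this example.

```python
NBSP = "\u00a0"  # non-breaking space, encoded in UTF-8 as 0xc2 0xa0


def byte_oriented_space_fix(data: bytes) -> bytes:
    """Hypothetical model of a byte-oriented formatter that treats
    0xa0 as whitespace and rewrites it to an ASCII space (0x20)."""
    return data.replace(b"\xa0", b"\x20")


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def replace_nbsp(text: str) -> str:
    """The workaround: swap U+00A0 for an ordinary space on the
    decoded text, before any byte-oriented tool sees it."""
    return text.replace(NBSP, " ")


original = "10" + NBSP + "km"
encoded = original.encode("utf-8")

# Without the workaround, the 0xc2 lead byte is left without its
# continuation byte, so the result is no longer valid UTF-8.
mangled = byte_oriented_space_fix(encoded)

# With the workaround, no 0xa0 byte remains for the tool to touch.
safe = byte_oriented_space_fix(replace_nbsp(original).encode("utf-8"))
```

The point of the sketch is that the replacement has to happen on decoded characters rather than raw bytes. Whether par really behaves like the model above is only the guess stated in the message.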
