Re: Bug reported regarding Unicode handling in email address

2021-06-12 Thread Ken Hornstein
>Last i looked they use a gigantic chunk of memory in mbstate_t or
>so (128 byte?).
128 bytes is considered 'gigantic'? :-)
While I am not a huge fan of the POSIX locale functions, thankfully we can mostly get by without them. Basically we use iconv() to convert from the source character set to
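As a rough illustration of the convert-from-the-source-character-set idea mentioned above, here is a minimal Python sketch. It is not nmh's code: it uses Python's built-in codecs in place of iconv(), the function name convert_charset is hypothetical, and the UTF-8 default target is an assumption made only for the example (the snippet above is cut off before it names the destination).

```python
def convert_charset(data: bytes, source: str, target: str = "utf-8") -> bytes:
    """Re-encode bytes from one character set to another.

    Hypothetical helper for illustration only; Python's codecs stand in
    for iconv(). Characters the target cannot represent become '?'.
    """
    text = data.decode(source)                    # source charset -> Unicode
    return text.encode(target, errors="replace")  # Unicode -> target charset


# "café" encoded as ISO-8859-1 is four bytes; as UTF-8 it is five.
latin1 = b"caf\xe9"
as_utf8 = convert_charset(latin1, "iso-8859-1")
print(as_utf8)

# Converting to a smaller character set substitutes a replacement character.
as_ascii = convert_charset(as_utf8, "utf-8", "ascii")
print(as_ascii)
```

The two-step shape (decode to an intermediate form, then encode) is only one way to show the idea; iconv() itself converts directly between the two named character sets.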

Re: Bug reported regarding Unicode handling in email address

2021-06-12 Thread Ken Hornstein
>> What sorry excuse for an MUA are you using over there? :-)
>
>That would be exmh.
Hey, don't drag us fellow exmh users into YOUR mix-up! :-) I'm puzzled as to the process you use to compose the reply. Because if it was being run through mhbuild, there is NO way it should have ever encoded a

Re: Bug reported regarding Unicode handling in email address

2021-06-12 Thread Valdis Klētnieks
On Sat, 12 Jun 2021 10:04:36 +0100, Ralph Corderoy said:
> What sorry excuse for an MUA are you using over there? :-)
That would be exmh.
> And why doesn't it complain at you when it spots the attempt to send
> these transgressions onto the wire?
That's a very good question - I *thought* I

Re: Bug reported regarding Unicode handling in email address

2021-06-12 Thread Steffen Nurpmeso
Ralph Corderoy wrote in <20210612103715.a572c21...@orac.inputplus.co.uk>:
|>> I am aware that some people, for reasons I cannot comprehend, want
|>> to run in the "C" locale
|>
|> I do that, not so much because I want to, but because that's what
|> happens when no LC_* env variables (nor

Re: Bug reported regarding Unicode handling in email address

2021-06-12 Thread Ralph Corderoy
Hi kre,
> If the draft contained Content-Type, right from the beginning (either
> auto set as part of repl or comp processing, or manually inserted),
> then we wouldn't need to be guessing what charset it was using, would
> we?
Yes, we would need to guess because the Content-Type only describes

Re: Bug reported regarding Unicode handling in email address

2021-06-12 Thread Ralph Corderoy
Hi Ken,
> Probably the best way to do that is using mhbuild directives.
> That is, you can today do stuff like:
>
> # [... utf-8 text here ...]
> # [... iso-8859-1 text here ...]
> # [... HTML text here ...]
The input to mhbuild can be that, it's true, though a text editor might only handle it

Re: Bug reported regarding Unicode handling in email address

2021-06-12 Thread Ralph Corderoy
Hi Valdis,
Your email was interesting. Ken wrote ¯\_(ツ)_/¯ which in UTF-8 is
    $ hd <<<'¯\_(ツ)_/¯'
    c2 af 5c 5f 28 e3 83 84 29 5f 2f c2 af 0a |..\_(...)_/...|
    000e
    $
and in Unicode is
    $ iconv -f utf-8 -t ucs-4le <<<'¯\_(ツ)_/¯' |
    > hexdump -ve '8/4
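The UTF-8 bytes in the hd output above can be cross-checked without the shell tools. The following is a small Python sketch, not part of the original message; the variable names are made up for the example. The trailing newline is included because the here-string (<<<) appends one, which accounts for the final 0a byte and the 000e (14) byte count.

```python
# The string Ken wrote, plus the newline that the shell here-string appends.
text = "¯\\_(ツ)_/¯\n"

# UTF-8 view: the same bytes that hd displays.
utf8_bytes = text.encode("utf-8")
hex_dump = utf8_bytes.hex(" ")
print(hex_dump)
print(len(utf8_bytes))

# Unicode view: one code point per character.
code_points = [f"U+{ord(ch):04X}" for ch in text]
print(" ".join(code_points))

# Equivalent of iconv -t ucs-4le: four little-endian bytes per code point, no BOM.
ucs4le = text.encode("utf-32-le")
print(len(ucs4le))
```

The two macrons (U+00AF) take two bytes each in UTF-8 and the katakana ツ (U+30C4) takes three, while the ASCII characters take one each, so ten code points become fourteen bytes.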

Re: Bug reported regarding Unicode handling in email address

2021-06-12 Thread Ralph Corderoy
Hi Ken,
> > Complain precisely
>
> Well ... I am not sure this feeling is universal:
>
> https://lists.nongnu.org/archive/html/nmh-workers/2014-04/msg00213.html
> https://lists.nongnu.org/archive/html/nmh-workers/2015-03/msg00045.html
They're about emails which were faulty before they reached

Re: Bug reported regarding Unicode handling in email address

2021-06-12 Thread Valdis Klētnieks
On Fri, 11 Jun 2021 14:04:36 -0400, Ken Hornstein said:
> character. This obviously works best if your local character set is
> UTF-8. I am aware that some people, for reasons I cannot comprehend,
> want to run in the "C" locale but PRETEND that their character set
> is UTF-8 and this approach