Hi, i fear this is getting a bit out of hand...

Stefan Sperling wrote on Thu, Sep 21, 2023 at 02:12:50PM +0200:
> On Thu, Sep 21, 2023 at 01:25:01PM +0200, Walter Alejandro Iglesias wrote:

>> I corrected many of the things you pointed me, but not all.  The
>> function I use to check utf8 is mine, I use it in a pair of little
>> programs which I've *hardly* checked for memory leacks.  I know my
>> function looks BIG :-), but I know for sure that it does the job.

> We already have code in libc that does this, see the function
> _citrus_utf8_ctype_mbrtowc in lib/libc/citrus/citrus_utf8.c.
> Please use the libc interface if at all possible, it is best to
> have just one place to fix when a UTF-8 parser bug is found.

In general, the tool for checking the validity of UTF-8 strings
is a simple loop around mblen(3) if you want to report the precise
positions of errors found, or simply mbstowcs(3) with a NULL pwcs
argument if you are content with a one-bit "valid" or "invalid"
answer.

But checking the validity of UTF-8 is probably beyond the scope
of a simple tool like mail(1), i think.  All i suggested was checking
the validity of US-ASCII when that encoding is selected - in a
separate patch to be considered *after* support for the MIME
headers has gone in.

As Stefan says, adding a hand-written UTF-8 parser to mail(1)
is clearly not acceptable.

Yours,
  Ingo
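
P.S.
For illustration only, here is a minimal sketch of the two libc-based
approaches mentioned above.  The function names mbs_valid() and
mbs_first_error() are made up for this example and do not exist in
mail(1) or in libc.  Both functions assume that the caller has already
selected a UTF-8 locale with setlocale(3), and mbs_valid() relies on
the POSIX behaviour of mbstowcs(3) when pwcs is NULL.

```c
#include <assert.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: one-bit answer.
 * Return 1 if s is valid in the encoding of the current locale,
 * 0 otherwise.  With a NULL pwcs argument, nothing is stored
 * and the third argument is ignored.
 */
static int
mbs_valid(const char *s)
{
	return mbstowcs(NULL, s, 0) != (size_t)-1;
}

/*
 * Hypothetical helper: report the position of the first error.
 * Return the byte offset of the first invalid or incomplete
 * sequence in s, or -1 if the whole string is valid.
 */
static long
mbs_first_error(const char *s)
{
	const char	*p;
	size_t		 left;
	int		 len;

	assert(s != NULL);
	p = s;
	left = strlen(s);
	mblen(NULL, 0);			/* reset the conversion state */
	while (left > 0) {
		len = mblen(p, left);
		if (len == -1) {
			mblen(NULL, 0);	/* do not leave bad state behind */
			return (long)(p - s);
		}
		if (len == 0)		/* NUL; not reached, left excludes it */
			break;
		p += len;
		left -= len;
	}
	return -1;
}
```

A caller would typically run setlocale(LC_CTYPE, "") or select a
specific UTF-8 locale once at startup, then call mbs_valid() for a
yes/no answer or mbs_first_error() to learn where the problem is.
To report every error rather than only the first, the loop could
skip one byte after a failure and continue instead of returning.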