Hi,

i fear this is getting a bit out of hand...

Stefan Sperling wrote on Thu, Sep 21, 2023 at 02:12:50PM +0200:
> On Thu, Sep 21, 2023 at 01:25:01PM +0200, Walter Alejandro Iglesias wrote:

>> I corrected many of the things you pointed out to me, but not all.  The
>> function I use to check utf8 is mine, I use it in a pair of little
>> programs which I've *hardly* checked for memory leaks.  I know my
>> function looks BIG :-), but I know for sure that it does the job.

> We already have code in libc that does this, see the function
> _citrus_utf8_ctype_mbrtowc in lib/libc/citrus/citrus_utf8.c.
> Please use the libc interface if at all possible, it is best to
> have just one place to fix when a UTF-8 parser bug is found.

In general, the tool for checking the validity of UTF-8 strings
is a simple loop around mblen(3) if you want to report the precise
positions of errors found, or simply mbstowcs(3) with a NULL pwcs
argument if you are content with a one-bit "valid" or "invalid" answer.
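
For illustration, here is a minimal sketch of both approaches.  It is
not taken from any patch in this thread; the function names
(use_utf8_locale, mbs_is_valid, mbs_first_error) are made up for the
example, and the locale names tried are an assumption about what the
system provides.  Both libc functions validate against the current
LC_CTYPE encoding, so a UTF-8 locale has to be selected first.

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: select some UTF-8 LC_CTYPE locale.
 * The names tried here are assumptions; adjust for your system.
 * Returns 1 on success, 0 if none of them is available.
 */
static int
use_utf8_locale(void)
{
	return setlocale(LC_CTYPE, "C.UTF-8") != NULL ||
	    setlocale(LC_CTYPE, "en_US.UTF-8") != NULL;
}

/*
 * One-bit answer: 1 if s is valid in the current encoding.
 * With a NULL pwcs argument, mbstowcs(3) converts nothing and
 * only reports the length, or (size_t)-1 on an invalid sequence.
 */
static int
mbs_is_valid(const char *s)
{
	return mbstowcs(NULL, s, 0) != (size_t)-1;
}

/*
 * Precise answer: byte offset of the first invalid sequence in s,
 * or -1 if the whole string is valid.
 */
static long
mbs_first_error(const char *s)
{
	size_t off = 0, len = strlen(s);
	int n;

	(void)mblen(NULL, 0);		/* reset the conversion state */
	while (off < len) {
		n = mblen(s + off, len - off);
		if (n == -1)
			return (long)off;
		off += (size_t)n;	/* n > 0 here: no NUL before len */
	}
	return -1;
}
```

A caller would run use_utf8_locale() once at startup and then call
either function on each string.  To report every error rather than
just the first, the loop could skip one byte after a failure, reset
the state with mblen(NULL, 0) again, and continue.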

But checking the validity of UTF-8 is probably beyond the scope of a
simple tool like mail(1), i think.  All i suggested was checking the
validity of US-ASCII when that encoding is selected - in a separate
patch to be considered *after* support for the MIME headers has gone in.
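
By contrast, a US-ASCII check needs no parser at all.  A minimal
sketch, assuming the text is available as a byte buffer with a known
length (is_us_ascii is a hypothetical name, not existing mail(1) code):

```c
#include <stddef.h>

/*
 * Return 1 if every byte in buf is in the 7-bit US-ASCII range,
 * 0 as soon as a byte with the high bit set is found.
 */
static int
is_us_ascii(const char *buf, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++)
		if ((unsigned char)buf[i] > 0x7f)
			return 0;
	return 1;
}
```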

As Stefan says, adding a hand-written UTF-8 parser to mail(1) is
clearly not acceptable.

Yours,
  Ingo
