Very informative. Thanks.

----- Original message -----
From: Steffen Daode Nurpmeso <[email protected]>
To: Yarin <[email protected]>
Cc: [email protected]
Subject: Re: mbstowcs() null termination
Date: Fri, 30 Mar 2012 18:59:56 +0200

Hi,

Yarin wrote [2012-03-30 17:01+0200]:
> Hello,
> 
> Is mbstowcs() supposed to null-terminate? I ask because, on OpenBSD 4.9
> (generic, no patches), it never null-terminates,
> even though the C99 draft seems to imply that it should when there's enough
> room.
> 
> http://www.open-std.org/jtc1/sc22/wg14/www/docs/n869/
> 
> I'm unsure of the expected behavior here, and wasn't able to find anything on 
> the web, which is why I ask this, rather than submit a bug report.
> 
> Thanks

I like the POSIX stuff better from within the browser:
http://pubs.opengroup.org/onlinepubs/9699919799/functions/mbstowcs.html
(or http://pubs.opengroup.org/onlinepubs/9699919799/ to get the
full frameset and then search for the function).

The relevant OpenBSD code seems to be in
src/lib/libc/citrus/citrus_{none,utf8}.c and behaves correctly for
a small test like this linked against -current:

  #include <stdio.h>
  #include <stdlib.h>
  #include <wchar.h>

  int
  main(void)
  {
    wchar_t dst[16];
    size_t res;

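    /* Room to spare: a terminator is expected at dst[res]. */
    (void)wmemset(dst, (wchar_t)-1, 16);
    res = mbstowcs(dst, "CuCaRaZa", 16);
    printf("len=%zu, TERM=%d\n", res, dst[res] == 0);

    /* Limit equals the string length: no room for a terminator. */
    (void)wmemset(dst, (wchar_t)-1, 16);
    res = mbstowcs(dst, "CuCaRaZa", 8);
    printf("len=%zu, TERM=%d\n", res, dst[res] == 0);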

    return (0);
  }

  ?0%1[steffen@obsdc src]$ cc -o zt test.c
  ?0%1[steffen@obsdc src]$ ./zt                                 
  len=8, TERM=1
  len=8, TERM=0
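
The second line is what the specification describes: when the limit
is reached before the end of the string, no terminator is written.
A caller who always wants a terminated result can reserve one slot
for it. A minimal sketch, assuming that truncating the converted
string is acceptable; mbstowcs_term() is a hypothetical name, not a
libc function:

```c
#include <assert.h>
#include <stdlib.h>
#include <wchar.h>

/*
 * mbstowcs_term() is a hypothetical helper, not part of libc.
 * It converts at most n - 1 wide characters and then always writes
 * a terminator, so dst holds a valid wide string afterwards.
 * Returns the number of wide characters stored (not counting the
 * terminator), or (size_t)-1 on an invalid multibyte sequence.
 */
static size_t
mbstowcs_term(wchar_t *dst, const char *src, size_t n)
{
  size_t res;

  /* The caller must supply room for at least the terminator. */
  assert(dst != NULL && src != NULL && n > 0);

  res = mbstowcs(dst, src, n - 1);
  if (res == (size_t)-1) {
    dst[0] = L'\0';     /* leave an empty string on error */
    return (res);
  }
  dst[res] = L'\0';     /* res <= n - 1, so this is in bounds */
  return (res);
}
```

Called with the same string and a limit of 8, this stores seven
characters plus the terminator instead of eight unterminated ones.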

It seems that the UTF-8 version received a fix after the 4.9 tag
had been applied, and that fix has not been backported to 4.9.
From what I see at a glance, invalid-sequence errors could have
occurred because the wrong length parameter was used.

But say - UTF-8 locales were pretty cutting edge then .. ?

--steffen
Forza Figa!
