Very informative. Thanks.

----- Original message -----
From: Steffen Daode Nurpmeso <[email protected]>
To: Yarin <[email protected]>
Cc: [email protected]
Subject: Re: mbstowcs() null termination
Date: Fri, 30 Mar 2012 18:59:56 +0200
Hi,

Yarin wrote [2012-03-30 17:01+0200]:
> Hello,
>
> Is mbstowcs() supposed to null-terminate? I ask because, on OpenBSD 4.9
> (generic, no patches), it never null-terminates.
> Even though the C99 draft seems to imply that it should when there's enough
> room.
>
> http://www.open-std.org/jtc1/sc22/wg14/www/docs/n869/
>
> I'm unsure of the expected behavior here, and wasn't able to find anything on
> the web, which is why I ask this, rather than submit a bug report.
>
> Thanks

I like the POSIX stuff better from within the browser:

http://pubs.opengroup.org/onlinepubs/9699919799/functions/mbstowcs.html

(or http://pubs.opengroup.org/onlinepubs/9699919799/ to get the full
frameset and then search for the function).

The relevant OpenBSD code seems to be in
src/lib/libc/citrus/citrus_{none,utf8}.c and behaves correctly for a
small test like this linked against -current:

    #include <stdio.h>
    #include <stdlib.h>
    #include <wchar.h>

    int
    main(void)
    {
        wchar_t dst[16];
        size_t res;

        (void)wmemset(dst, (wchar_t)-1, 16);
        res = mbstowcs(dst, "CuCaRaZa", 16);
        printf("len=%zu, TERM=%d\n", res, dst[res] == 0);

        (void)wmemset(dst, (wchar_t)-1, 16);
        res = mbstowcs(dst, "CuCaRaZa", 8);
        printf("len=%zu, TERM=%d\n", res, dst[res] == 0);

        return (0);
    }

    ?0%1[steffen@obsdc src]$ cc -o zt test.c
    ?0%1[steffen@obsdc src]$ ./zt
    len=8, TERM=1
    len=8, TERM=0

The second call shows the expected behaviour: with room for only 8 wide
characters, all 8 are converted and no terminator is written.

It seems that the UTF-8 version had a fix after tag 4.9 had been
applied, and that has not been backported into 4.9.
From what I see at a glance, invalid sequence errors could have occurred
because the wrong length parameter was used.
But say, UTF-8 locales were pretty cutting edge then .. ?

--steffen
Forza Figa!
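
For callers who always want a terminated result regardless of how much
room there is, the rule the test demonstrates (a terminator is stored
only when the whole string, terminator included, fits into n wide
characters) can be wrapped as below. This is only a minimal sketch: the
name mbstowcs_terminated() is made up for illustration and is not part
of libc or of the OpenBSD sources, and truncating the output instead of
reporting an error is an assumed design choice.

```c
#include <assert.h>
#include <stdlib.h>
#include <wchar.h>

/*
 * Hypothetical helper (not a libc function): convert like mbstowcs(),
 * but always leave dst null-terminated.  n is the capacity of dst in
 * wide characters, including the slot for the terminator.
 * Returns the number of wide characters stored, not counting the
 * terminator, or (size_t)-1 on an invalid multibyte sequence.
 */
size_t
mbstowcs_terminated(wchar_t *dst, const char *src, size_t n)
{
	size_t res;

	assert(dst != NULL && src != NULL && n > 0);

	/* Convert at most n - 1 characters so that one slot stays free. */
	res = mbstowcs(dst, src, n - 1);
	if (res == (size_t)-1) {
		/* Invalid sequence: hand back an empty string. */
		dst[0] = L'\0';
		return (res);
	}

	/*
	 * res <= n - 1 here.  If res == n - 1, mbstowcs() ran out of
	 * room and stored no terminator, so store one unconditionally.
	 */
	dst[res] = L'\0';
	return (res);
}
```

With the 16-element buffer from the test above, passing n = 16 stores
all 8 characters plus the terminator, while passing n = 8 stores the
first 7 characters and terminates at dst[7], so the output is truncated
rather than left unterminated.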
