On 08/15/2014 18:44, Garrett D'Amore wrote:
I don't know why the iconv module is dumping core, but recognize that you can't
just wchar_t "Hello".
The wide characters have to be initialized properly; it's not generally
possible to do this from constant values directly. Instead you have to
convert to the wide characters first, from another format. I recommend, if
you are running in a UTF-8 locale (such as ru_RU.UTF-8), that you use
mbstowcs() to convert a UTF-8 string to wchar_t's. You should then be able
to convert from UCS-4 to UTF-8. (Note carefully though, the fact that the
wchar_t's are UCS-4 is *not an interface*. The encoding of wchar_t's is a
platform implementation detail.)
Here's an example:
...
wchar_t wcs[32];
char utf8[32];
char *inp, *outp;
size_t inlen, outlen;
iconv_t hdl;
setlocale(LC_ALL, "ru_RU.UTF-8");
mbstowcs(wcs, "спасибо болшой", sizeof (wcs) / sizeof (wcs[0]));
// wcs now contains UCS-4 version of Russian thank you very much
...
inp = (char *)wcs;
inlen = wcslen(wcs) * sizeof (wchar_t);
outp = utf8;
outlen = sizeof (utf8) - 1;	// leave room for the terminating NUL
hdl = iconv_open("UTF-8", "UCS-4");
// iconv() takes the addresses of the buffer pointers and advances them
iconv(hdl, &inp, &inlen, &outp, &outlen);
*outp = '\0';
// utf8 now contains "спасибо болшой"
Let's try...
char out[1024];
iconv_t cd;
int ret;
wchar_t in[1024];
size_t inlen;
size_t outsz=sizeof(out);
setlocale(LC_ALL,"ru_RU.UTF-8");
mbstowcs(in,"Привет!",sizeof (in) / sizeof (in[0]));
inlen=wcslen(in) * sizeof (wchar_t);
cd = iconv_open("UTF-8","UCS-4");
if (cd == (iconv_t)-1) {
(void) fprintf(stderr, "iconv_open failed\n");
return (1);
}
iconv(cd,&in,&inlen,&out,&outsz);
$ ./test_utf8_mbchar
Segmentation Fault (core dumped)
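A likely cause of this crash (an assumption, not confirmed in this thread): iconv() expects the address of a char * that points into the buffer, but &in and &out are the addresses of the arrays themselves, so iconv() ends up treating the first bytes of the array contents as a pointer. A minimal sketch of the call with explicit cursor pointers follows. The helper name wcs_to_utf8 and the byte-order probe that selects "UCS-4LE" or "UCS-4BE" are illustrative assumptions, and the code still relies on wchar_t holding UCS-4, which is a platform detail rather than an interface.

```c
#include <iconv.h>
#include <stddef.h>
#include <stdint.h>
#include <wchar.h>

/*
 * Hypothetical helper: convert a NUL-terminated wchar_t string to UTF-8.
 * Assumes wchar_t is 4 bytes wide and holds UCS-4 in native byte order
 * (a platform detail, not a guaranteed interface).
 * Returns the number of UTF-8 bytes written (excluding the NUL), or -1.
 */
static long
wcs_to_utf8(const wchar_t *in, char *out, size_t outsz)
{
	const uint16_t probe = 1;
	/* Name the byte order explicitly so no BOM guessing is involved. */
	const char *from = (*(const unsigned char *)&probe == 1) ?
	    "UCS-4LE" : "UCS-4BE";
	char *inp = (char *)in;		/* cursor; iconv() advances it */
	char *outp = out;		/* cursor; iconv() advances it */
	size_t inleft = wcslen(in) * sizeof (wchar_t);
	size_t outleft;
	size_t rc;
	iconv_t cd;

	if (sizeof (wchar_t) != 4 || outsz == 0)
		return (-1);
	outleft = outsz - 1;		/* keep room for the terminating NUL */

	cd = iconv_open("UTF-8", from);
	if (cd == (iconv_t)-1)
		return (-1);

	/* Pass the addresses of the cursors, not of the arrays. */
	rc = iconv(cd, &inp, &inleft, &outp, &outleft);
	(void) iconv_close(cd);
	if (rc == (size_t)-1)
		return (-1);

	*outp = '\0';
	return ((long)(outp - out));
}
```

With the buffers from the test program, the call would be wcs_to_utf8(in, out, sizeof (out)); a return value of -1 means the conversion failed or the platform's wchar_t does not match the assumption.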
Note that the above is most definitely *not* the recommended way to get to
UCS-4. The only formally correct way to get to UCS-4 from UTF-8 is to use
iconv() to convert from UTF-8. The only APIs that you should formally be
sending wchar_t's to are the wide character routines (e.g. wcslen()).
Passing wchar_t's directly to iconv as I've done above is technically
incorrect, although I believe in the case above it will work.
Note that this will not work in the "C" locale.
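For comparison, the formally correct route mentioned above, going from UTF-8 to UCS-4 through iconv() instead of mbstowcs(), might look roughly like the sketch below. The helper name utf8_to_ucs4, the fixed 256-byte scratch buffer, and the choice of "UCS-4BE" (to pin down the output byte order) are assumptions made for illustration. The code points are stored in uint32_t rather than wchar_t, and nothing here depends on setlocale().

```c
#include <iconv.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: decode a NUL-terminated UTF-8 string into UCS-4
 * code points. Returns the number of code points stored, or -1 on error
 * (including input that does not fit the scratch buffer).
 */
static long
utf8_to_ucs4(const char *utf8, uint32_t *cps, size_t maxcps)
{
	unsigned char raw[256];		/* big-endian UCS-4 bytes */
	char *inp = (char *)utf8;	/* cursor; iconv() advances it */
	char *outp = (char *)raw;	/* cursor; iconv() advances it */
	size_t inleft = strlen(utf8);
	size_t outleft = sizeof (raw);
	size_t i, n, rc;
	iconv_t cd;

	/* "UCS-4BE" fixes the byte order of the output. */
	cd = iconv_open("UCS-4BE", "UTF-8");
	if (cd == (iconv_t)-1)
		return (-1);
	rc = iconv(cd, &inp, &inleft, &outp, &outleft);
	(void) iconv_close(cd);
	if (rc == (size_t)-1)
		return (-1);

	n = (sizeof (raw) - outleft) / 4;
	if (n > maxcps)
		return (-1);

	/* Assemble each code point from its four big-endian bytes. */
	for (i = 0; i < n; i++) {
		const unsigned char *p = raw + 4 * i;
		cps[i] = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
		    ((uint32_t)p[2] << 8) | (uint32_t)p[3];
	}
	return ((long)n);
}
```

Because the source and target encodings are named explicitly, this should behave the same in the "C" locale as in ru_RU.UTF-8.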
And what is the recommended way of converting wchar_t * to UTF-8 char *?
--
Best regards,
Alexander Pyhalov,
system administrator of Computer Center of Southern Federal University