On Thursday, June 23, 2005 1:21 AM Roger Leigh wrote:
> In this case, you can output wide strings to narrow streams, and
> narrow strings to wide streams. In order to be able to do this, I
> assume that the C runtime must know something of the execution
> charsets in order to do the conversion, otherwise you wouldn't get
> readable output. Additionally, when you output a wide string with
> wprintf(), it must be recoded to the narrow representation for
> output??.
>
> The above link is wrong.

Sorry, I cannot understand what you are trying to say. What is the "above link"?

At any rate, the behaviour of the wide stdio functions depends on glibc, not on gcc. Gcc's job ends once it has encoded the wchar_t[] inside the binary.

Next, the external representation of a wide stream is left open (from the C standard's point of view). Assuming only UCS is in sight, I can read the Standard as allowing the C runtime to store everything as UTF-8 (so fwprintf will do a UTF-32 to UTF-8 conversion on each flush, and fgetwc will convert each UTF-8 sequence on the fly into a UCS-4 wchar_t); or else I can read the Standard as allowing the C runtime to store a wide stream as UTF-32 (that is what Microsoft does, by the way, where you use the wide stdio functions to deal with "Unicode" text files). It is also my understanding that the Committee intended to leave this unspecified. I am open to corrections on this, since it is something I never sorted out clearly while discussing it with the committee members involved.

> I thought that given the C runtime's
> knowledge of the execution charsets, it would recode the output into
> the locale charset. This does not appear to be the case, however.

While it could be an option, as I read it above, I believe that from a performance point of view it is not the "best" option (cf. the other thread, and Anders' point about not trying to do "the right thing" unless the programmer explicitly asks for it). So I understand that the libc writers, when they have the choice, take the short path and encode wide streams as UTF-32.

Antoine

--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
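
P.S. To make the first reading above concrete, here is a minimal sketch in C of the UTF-32 to UTF-8 step that such a runtime would have to perform for each wide character it writes. The helper name utf32_to_utf8 is hypothetical. This is not how glibc or any other libc implements it, only an illustration of the conversion itself, assuming that wchar_t values are UCS code points.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of glibc or any C library: encode one
 * UCS code point (a UTF-32 value) as UTF-8.  Writes up to 4 bytes into
 * out and returns the number of bytes written, or 0 if cp is not a
 * valid scalar value (a surrogate, or above U+10FFFF). */
size_t utf32_to_utf8(uint32_t cp, unsigned char out[4])
{
    if (cp < 0x80) {                        /* 1 byte: 0xxxxxxx */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                       /* 2 bytes: 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp >= 0xD800 && cp <= 0xDFFF)       /* surrogates are not encodable */
        return 0;
    if (cp < 0x10000) {                     /* 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                   /* 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;                               /* beyond the Unicode range */
}
```

Under the second reading no such step is needed: the runtime would write the wchar_t values to the file unchanged, which is why I call it the short path.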
