Re: C source and execution encodings

Roger Leigh Thu, 23 Jun 2005 04:06:06 -0700

On Thu, Jun 23, 2005 at 12:54:34AM +0100, Simos Xenitellis wrote:
> Roger Leigh wrote:
> 
> >A while back, I made the useful discovery that GCC accepts UTF-8
> >encoded C source by default, and in the generated object code uses
> >UTF-8 for narrow (char) strings, and UTF-32/UCS-4 for wide (wchar_t)
> >strings.
> >
> Googling for "gcc utf-8" brings up a discussion from this list (Dec 
> 2004) which references the GCC documentation.
> The archive of that discussion starts at
> http://mail.nl.linux.org/linux-utf8/2004-11/index.html#00008
> 
> The behaviour of the compiler regarding Unicode strings can be 
> controlled with preprocessor options.
> The page for this is
> http://gcc.gnu.org/onlinedocs/gcc-4.0.0/gcc/Preprocessor-Options.html#Preprocessor-Options


That's interesting, thanks.  I'm still a little confused, though.

#include <locale.h>
#include <stdio.h>
#include <wchar.h>

int
main (void)
{
  setlocale (LC_ALL, "");
  printf("‘Name1’\n");
  printf("%ls\n", L"‘Name2’");
  fwide(stderr, 1);
  fwprintf(stderr, L"‘Name3’\n");
  fwprintf(stderr, L"%s\n", "‘Name4’");
  printf("‘Name5’\n");
  return 0;
}

Try running this in a C locale!

$ ./test
'Name3'
‘Name1’
‘Name5’

Only Name3 printed as I would have expected (with ASCII quotes).
Name1 and Name5 came out as raw UTF-8, and Name2 and Name4 did not
print at all.


Regards,
Roger

-- 
Roger Leigh

                Printing on GNU/Linux?  http://gimp-print.sourceforge.net/
                GPG Public Key: 0x25BFB848.  Please sign and encrypt your mail.

--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
