On Fri, 13 Jul 2001, Bruno Haible wrote:
> > So if you assume that the source file is in UTF-8 normal string
> > literals should be UTF-8.
>
> Yes, but only if the compiler is gcc, and no "coding:" marker is at
> the top of the file, and no overruling command line option has been
> given.

Why does gcc not simply follow the standard POSIX rules and use
nl_langinfo(CODESET) to determine the encoding of source code from the
locale? Strictly portable C source code should be in ASCII only,
and good POSIX implementations do not support ASCII-incompatible
locales, so there are no dangers added by locale-dependency. If people
want to use anything beyond ASCII, the locale is the single central
switch of choice for designating the encoding used. The shell already
provides a standard per-invocation syntax,

  LC_CTYPE=en_US.UTF-8 gcc ...

which is just as convenient as any non-standard command line option.

About the ugly "coding:" marker convention: Will iconv/recode have
functions to update these markers as the encoding of the file gets
converted? Will email software, too? Otherwise, how do these markers
interact with MIME when source files get sent around? I really don't
think these GNU-proprietary character encoding markers are a good idea,
and I hope they can be disabled in favour of locale-dependency and won't
catch on. We really don't need a completely independent emacs-specific
character encoding marker scheme.

Markus

-- 
Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK
Email: mkuhn at acm.org,  WWW: <http://www.cl.cam.ac.uk/~mgk25/>

-
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/