RE: UTF16 and GCC

Nuesser, Wilhelm Thu, 12 Jul 2001 04:33:49 -0700
Hi,

some comments on this topic from the originators of this patch ;-)

> 
> On Wed, 11 Jul 2001, Roozbeh Pournader wrote:
> 
> > Take a look at:
> >
> >     ftp://ftp.sap.com/pub/i18N/utf16/ugcc-2.95.2
> >
> > which proposes UTF-16 support in GCC. I believe they are after getting it
> > into gcc and then put it on standard track somewhere.
> 

Not quite, more the other way round. We have always tried to combine these
two things: it is already a proposal for a new C/C++ standard extension ...
and we want it in gcc now. 

The reasoning behind this approach is detailed in and based on a proposal 
of the Unicode consortium, see
        http://wwwold.dkuug.dk/jtc1/sc22/wg20/docs/n830-utf-16-c.txt
and 
        http://www.unicode.org/unicode/members/L2001/01220.htm

(Sorry, the latter is only for members of the Unicode consortium)

To summarize: 
for our sort of application (i.e. high memory load, cross-platform, 
many, many strings in memory, networked, etc.) a UTF-16 based encoding 
is the most efficient internal representation, i.e. the one 
with the highest median information density for the 
great majority of characters.
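
As a rough illustration of the density argument, the sketch below counts
the bytes each Unicode encoding form needs for a single code point. It is
not taken from the patch; the function names are hypothetical and chosen
only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers, for illustration only.
 * Each expects a Unicode code point in the range 0..0x10FFFF. */

/* Bytes needed to encode one code point in UTF-8. */
size_t utf8_bytes(uint32_t cp)
{
    assert(cp <= 0x10FFFF);
    if (cp < 0x80)    return 1;   /* ASCII */
    if (cp < 0x800)   return 2;   /* e.g. Latin-1 supplement, Greek, Cyrillic */
    if (cp < 0x10000) return 3;   /* rest of the BMP, e.g. CJK */
    return 4;                     /* supplementary planes */
}

/* Bytes needed in UTF-16: one 16-bit unit inside the BMP,
 * a surrogate pair (two units) above it. */
size_t utf16_bytes(uint32_t cp)
{
    assert(cp <= 0x10FFFF);
    return cp < 0x10000 ? 2 : 4;
}

/* UTF-32 always needs four bytes per code point. */
size_t utf32_bytes(uint32_t cp)
{
    assert(cp <= 0x10FFFF);
    return 4;
}
```

For U+0041 ('A') this gives 1, 2 and 4 bytes; for a CJK ideograph such as
U+4E2D it gives 3, 2 and 4. Which form is densest overall therefore depends
on the mix of characters an application actually holds in memory.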

> Well, producing a patch against an ancient release rather than mainline
> CVS - the preprocessor has been rewritten since 2.95 - is a waste of time,

OK, let's talk about history:
we posted a proposal for a UTF16-based approach to linux-utf8 et al.
around ONE year ago. Around that time we needed a working compiler 
able to deal with utf16 literals, and the production GNU compiler 
at that time was gcc 2.95.2. So we created a patch for 2.95.2. 
We have been using this compiler since then and - btw - we are quite 
happy with it ;-)
So, certainly we don't want to deal with 2.95.2 now, we are thinking more of
gcc 3.1 ...

> and since there is no documentation I can't tell what the patch is
> supposed to do.

There is documentation:
ftp://ftp.sap.com/pub/i18N/utf16/ugcc-2.95.2/U_literal_in_GCC.doc
Please have a look at it, although it is MS Word ....

> Systems for string literals in specified character sets have been
> discussed on the WG14 reflector, but AFAICT without any working papers yet
> even in the WG14 document register, so actually adding such a feature to
> GCC would be very premature, but the authors of that patch still ought to
> read and follow http://gcc.gnu.org/contribute.html if they want any GCC
> developers to comment meaningfully on it.  For now if they want to do
> anything actually *useful* on i18n support in GCC they'd be better
> discussing with the preprocessor maintainers what the plans are for fixing
> the issues discussed at 
> http://gcc.gnu.org/projects/cpplib.html#charset

Oops, no, we _don't_ want to write arbitrary char literals in our code. We
do not write non-ASCII chars in our code, we will stick to pure ASCII!
But we need another _internal_ representation of strings in memory at
runtime, for example for comparing a user-given string with other
information inside our application.
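
To make this concrete, here is a minimal sketch of the idea: the source text
stays pure ASCII, but the literal is stored as 16-bit code units and compared
as such at runtime. It assumes the `u"..."` literal syntax and `char16_t`
type that were later standardized in C11 as a stand-in; the syntax actually
used by the 2.95.2 patch is described in the document mentioned above and may
differ. The names `u16_len`, `u16_equal` and `kCommand` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <uchar.h>   /* char16_t (C11) */

/* Hypothetical helper: length in 16-bit code units, excluding the
 * terminating zero unit. */
size_t u16_len(const char16_t *s)
{
    assert(s != NULL);
    size_t n = 0;
    while (s[n] != 0)
        n++;
    return n;
}

/* Hypothetical helper: code-unit-wise equality of two zero-terminated
 * 16-bit strings (no normalization, no case folding). */
bool u16_equal(const char16_t *a, const char16_t *b)
{
    assert(a != NULL && b != NULL);
    while (*a != 0 && *a == *b) {
        a++;
        b++;
    }
    return *a == *b;
}

/* The source text is plain ASCII; the compiler emits 16-bit units. */
const char16_t kCommand[] = u"LOGIN";
```

A caller would pass a user-supplied 16-bit string as the second argument,
for example `u16_equal(kCommand, input)`, without converting either side at
runtime.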

> (taking very careful account of some WG14 reflector discussions and the
> differences between C and C++).
>

That's OK for me; before we actually start contributing to gcc we will check
the rules, and if we miss something please tell us.

Best regards

Willi Nüßer
SAP 



-
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/