On Thu, 12 Jul 2001, Nuesser, Wilhelm wrote:
> There is documentation:
> ftp://ftp.sap.com/pub/i18N/utf16/ugcc-2.95.2/U_literal_in_GCC.doc
> Please have a look at it, although it is MS Word ....
MS Word is not a reasonable format for free software documentation. In
this case, documentation should be in the form of patches to GCC's Texinfo
manual, included in the patch itself and distributable (as GCC's manual
is) under the GNU Free Documentation License.
Read through all the discussions in the past couple of months on
gcc-patches about Apple's attempt to contribute support for Pascal string
literals to GCC. This should give some idea of the issues you need to
address in the documentation and testcases.
As far as I am concerned, all GCC patches should come with thorough
documentation and with testcases that cover every line of code added or
changed as far as reasonably practicable, and should fully follow the GNU
Coding Standards, the GCC coding conventions and the other instructions
for contributing. If they don't, I will preferentially comment on the lack
of these rather than on substantive issues of design and implementation,
since these guidelines are designed to make code easier to read and
understand, and its substance easier to comment on.
> Oops, no, we _don't_ want to write arbitrary char literals in our
> code. We do not write NON-Ascii chars in our code, we will stick to
> pure ascii! But we need another _internal_ presentation of strings in
> memory during runtime, for example for comparing a user given string
> with other information inside our application.
You would still, at the very least, need to ensure that UCNs (\u and \U)
in your UTF-16 strings end up appropriately encoded in the binary (as
single UTF-16 characters or as surrogate pairs, depending on the value
specified).
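
As a rough illustration of that encoding rule (a sketch only: the name
utf16_encode is hypothetical and is not taken from GCC or from the
proposed patch), the value given by a UCN could be mapped to UTF-16 code
units along these lines:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, for illustration only: encode one Unicode code
   point as UTF-16.  Returns the number of 16-bit units written to OUT
   (1 or 2), or 0 if CP cannot be encoded.  */
static size_t
utf16_encode (uint32_t cp, uint16_t out[2])
{
  /* Surrogate code points and values above U+10FFFF are not valid
     Unicode scalar values.  */
  if ((cp >= 0xD800 && cp <= 0xDFFF) || cp > 0x10FFFF)
    return 0;

  /* Values in the Basic Multilingual Plane are a single unit.  */
  if (cp < 0x10000)
    {
      out[0] = (uint16_t) cp;
      return 1;
    }

  /* Everything else becomes a surrogate pair: the high ten bits go in
     the first unit, the low ten bits in the second.  */
  cp -= 0x10000;
  out[0] = (uint16_t) (0xD800 + (cp >> 10));
  out[1] = (uint16_t) (0xDC00 + (cp & 0x3FF));
  return 2;
}
```

With this sketch, \u20AC would be stored as the single unit 0x20AC, while
\U0001D11E would be stored as the pair 0xD834 0xDD1E.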
--
Joseph S. Myers
[EMAIL PROTECTED]
-
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/linux-utf8/