----- Original Message -----
From: "Hallvard B Furuseth" <[EMAIL PROTECTED]>
To: "Philippe Verdy" <[EMAIL PROTECTED]>
Cc: "Clark Cox" <[EMAIL PROTECTED]>; "Unicode Mailing List" <[EMAIL PROTECTED]>
Sent: Sunday, January 11, 2004 8:18 PM
Subject: Re: [OT] ASCII support in C/C++ (was: doubt)
> Philippe Verdy writes:
> >From: "Clark Cox" <[EMAIL PROTECTED]>
> >> Actually, both the C and C++ standards require that the char type be
> >> at least 8-bits. that is, the signed char type must be able to
> >> represent the values in the range [-127, 127], and the unsigned char
> >> type must be able to represent the values in the range [0, 255]. Any C
> >> or C++ compiler that cannot meet those requirements is non-conformant.
> >
> > Yes of course (however this depends on which standard you discuss here...
>
> No, it doesn't.
>
> > The language itself does not require it, just the implementation guidelines
> > for applications on generic OS.
>
> The C and C++ languages are defined by the C and C++ standards. As
> Clark says, the standards do require this. See for example ISO C
> section 5.2.4.2.1 (Sizes of integer types <limits.h>).
>
> > If you look at some C compilers created for microcontrolers or hardware
> > devices, you'll see that it supports the full core language,
>
> If it does, it has 8-bit 'char' or wider. Otherwise it is not a C
> compiler, however much it might claim to be. It is a compiler for a
> language _ressembling_ C.

All this relates to the language as it was standardized quite late, first by ANSI and then by ISO (in collaboration with the original designers, Kernighan and Ritchie, who designed the language to write Unix). There is still a lot of code that needs the K&R C language, which is a de facto (rather than de jure) standard, as specified in the first edition of "The C Programming Language" by Brian Kernighan & Dennis Ritchie (Prentice-Hall, 1978) and translated into other languages (1983 for the French edition).

There are still many systems which ONLY support a K&R C compliant compiler (without "void", "signed char", "long long", and function prototypes) but not the American ANSI C standard or the later ISO C standard. And most of these systems lack what is required to support POSIX.
Many C++ compilers were also written and used long before the ISO C standard was published, and still do not implement the full ANSI C standard. Nor do all platforms fully support IEEE-compliant floating-point operations (because there is no FPU, and implementing it fully in software would cost too much performance). So the POSIX and ISO C requirements cannot be applied to these systems. Note that even on PC systems the FPU is not always fully IEEE-compliant; the deficiencies are worked around by the mathematical libraries, or by the underlying OS if it can "patch" the code on the fly, changing how some instructions are computed through emulation.

Look at the initial question posted to this list by "Deepak Chand Rathore" yesterday: it is wide open, and it asks how a C compiler could affect the supposedly complete support of ASCII on various platforms in their default working locale (not all environments support multiple locales; only a default locale is assumed to be present, and this default locale does not necessarily mean "ASCII supported" and "US English" as it does on POSIX systems, which define the "C" locale).

So the question is about portability. Portability is possible only across platforms that support at least the same minimum standard. This affects the way software is written to handle characters and strings. Adapting the software to earlier versions of the standard, or even to the widely deployed K&R 1978 de facto standard, is not a stupid question. The question is then to know, before writing the software, what kinds of problems can be expected in the representation of datatypes across the various systems one wishes to support with software written in C. It would be nice if every system really had a compiler available that supports the minimum standard needed to compile the C source.
Too many C programs, however, simply do not state which C standard (de jure, like ANSI C or ISO C, or de facto, like K&R 1978) the code was written for. Many programmers just say "it's C language", but in fact there are several C languages, one per specification (I count the K&R C 1978 version as a language in its own right, with an effective specification).

