So I've been wrong as far as POSIX is concerned: since I was using MS compilers, I did not realize that they violate the POSIX spec on this point.
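
A small probe along the following lines could show how a given C library behaves on this point. It is only a sketch: the function names count_print_cntrl_overlap and tab_is_printable are made up for illustration, and it only looks at single-byte codes 0..UCHAR_MAX in whatever LC_CTYPE locale is current.

```c
#include <ctype.h>
#include <limits.h>

/* Hypothetical helper: count the single-byte codes that the current
 * LC_CTYPE locale classifies as both printable and control.
 * The POSIX text quoted in this thread says the POSIX locale must
 * not put any cntrl character in class print. */
int count_print_cntrl_overlap(void)
{
    int c, n = 0;

    for (c = 0; c <= UCHAR_MAX; c++) {
        if (isprint(c) && iscntrl(c))
            n++;
    }
    return n;
}

/* Hypothetical helper: 1 if the current locale treats TAB as
 * printable, 0 otherwise. */
int tab_is_printable(void)
{
    return isprint('\t') != 0;
}
```

Calling setlocale(LC_CTYPE, "") before the probe makes it use the locale selected by the environment; without that call the program stays in the "C" locale. A nonzero count means the library classifies some code as both printable and control.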

How, though, can the C/C++ standards adapt to this situation? After all, POSIX is old, no longer maintained (it is violated on many systems), and not strictly a standard for C and C++ themselves. There are at least several profiles for "POSIX" locales:

- DOS, Windows and OS/2, as defined by Microsoft (and IBM), but also in compilers for these systems from other vendors
- Unix and Linux (where gcc has been ported)
- VMS, which has its own specificities, ...

IBM is a bit smarter because it allows selecting the compatibility layers for emulating various OSes (including IBM versions of Unix, or other OSes made and maintained by IBM, so that their applications can be ported to Windows; IBM versions of Unix also provide locales emulating DOS and Windows). I can't remember the tricky details of how to select them by a locale ID, or with some other environment variables at compile time or run time.

Many programs in fact can't rely only on POSIX locales and provide their own compatibility layer, based on detection of the target OS on which the program will run. In other words, these profiles in fact depend on the OS family. Even the libraries for gcc on Windows use the MS definitions of Windows locales, and interact correctly with other Windows programs (in fact, programs compiled on Windows, even with gcc, never run in the POSIX locale, which Microsoft implements only as an option that is not installed by default and is in practice not maintained, in the old package for the "Unix compatibility POSIX subsystem" for NT).

I do think that these old unmaintained POSIX properties should effectively be replaced by better properties based on the Unicode standard (leave POSIX in limbo now, it has never been portable), and that the C/C++ standards should evolve to use Unicode properties rather than POSIX properties (**except** on systems running in **their** own locales, locally called "POSIX", with their specificities).
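
As a rough illustration of what such a program-side compatibility layer might look like, here is a minimal sketch. The names compat_isprint and compat_isdigit are hypothetical, and the policy they encode (never treat a control as printable, accept only ASCII digits) is just one possible choice, picked to cover the two MS behaviours mentioned in this thread.

```c
#include <ctype.h>

/* Hypothetical wrapper: printable according to the C library, minus
 * anything the same library also classifies as a control character.
 * The argument is expected to be a character value, not EOF. */
int compat_isprint(int c)
{
    unsigned char uc = (unsigned char)c;

    return isprint(uc) && !iscntrl(uc);
}

/* Hypothetical wrapper: accept only the ten digits '0'..'9', whatever
 * the locale's isdigit() says (for instance about superscript digits).
 * The C standard guarantees that '0'..'9' have consecutive values. */
int compat_isdigit(int c)
{
    return c >= '0' && c <= '9';
}
```

A program would then call compat_isprint() and compat_isdigit() wherever it currently calls isprint() and isdigit(), so that its results no longer depend on which vendor's locale tables it was built against.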

2013/11/6 Karl Williamson <[email protected]>

> On 11/06/2013 03:43 AM, Steffen Daode Nurpmeso wrote:
>
>> Philippe Verdy <[email protected]> wrote:
>>  |2013/11/5 Steffen Daode <[email protected]>
>>  |> (The problem i'm facing is that _PRINT and _GRAPH cannot be set
>>  |> for some properties from PropList.txt, say, _PRINT can't be set
>>  |> for U+0009, CHARACTER TABULATION (ht), since it's a Cc, but in
>>  |
>>  |TAB is "printable" (for the isprint() macro in standard C librries) because
>>  |it has a whitespace property, even if its general category is very weakly
>>
>> Nope according to POSIX, Vol. 1: Base Definitions, 7.3.1. LC_CTYPE ([1]):
>>
>>   print
>>     Define characters to be classified as printable characters,
>>     including the <space>.
>>
>>     In the POSIX locale, all characters in class graph shall be
>>     included; no characters in class cntrl shall be included.
>>
>>     In a locale definition file, characters specified for the
>>     keywords upper, lower, alpha, digit, xdigit, punct, graph, and
>>     the <space> are automatically included in this class. No
>>     character specified for the keyword cntrl shall be specified.
>>
>>   [1] <http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap07.html#tag_07_03_01>
>>
>> Verifieable under LC_ALL=en_GB.UTF-8 in Mac OS X Snow Leopard
>> (which admittedly uses very old Citrus data, i always wonder why all
>> those Gigabytes of «Software Update»s don't tweak that, not to
>> talk about GNU make 3.81 and all the other buggy or non-compliant
>> stuff, but that is a different story):
>>
>>   #include <stdio.h>
>>   #include <ctype.h>
>>   #include <wctype.h>
>>   int main(void) {
>>      printf("%d %d\n",isprint('\t'), wcwidth(L'\t'));
>>      return 0;
>>   }
>>
>>   ?0[steffen@sherwood tmp]$ cc -o zt t.c && ./zt
>>   0 -1
>>
>>  |The character mapping for the isprint() macro is defined by an expression
>>  |based on existing Unicode properties. Most C libraries optimize this
>>
>> But i agree that POSIX has to move towards Unicode definitions,
>> and more byte- than bitwise.
>>
>> --steffen
>>
>
> The only vendor I'm aware of that makes TAB a printable is Microsoft. Thus
> Philippe is wrong about this except for MS products.
>
> MS makes TAB also a control, violating the Posix standard by having it be
> both printable and a control. This is true in all locales I've seen under
> MS except the C locale. (MS also has other Posix violations, such as
> having isdigit() match superscript numbers.)

