Folks, I always believed that Ç was in the GSM 7-bit alphabet but ç was not (which is stupid, but that's beside the point).

But I was recently pointed to this document:

http://www.unicode.org/Public/MAPPINGS/ETSI/GSM0338.TXT

Excerpts:

# This table contains the data the Unicode Consortium has on how
# ETSI GSM 03.38 7-bit default alphabet characters map into Unicode.
# This mapping is based on ETSI TS 100 900 V7.2.0 (1999-07), with
# a correction of 0x09 to *small* c-cedilla, instead of *capital*
# C-cedilla.
# (...)
#
# The ETSI GSM 03.38 specification shows an uppercase C-cedilla
# glyph at 0x09. This may be the result of limited display
# capabilities for handling characters with descenders. However, the
# language coverage intent is clearly for the lowercase c-cedilla, as shown
# in the mapping below. The mapping for uppercase C-cedilla is shown
# in a commented line in the mapping table.

I believe this is relevant to Kannel because Kannel of course converts to and from the GSM 7-bit alphabet for MO/MT transmissions.

In the Kannel implementation, the seemingly relevant excerpts from gateway-1.4.3/gwlib/latin1_to_gsm.h are:

    /* 0xc7 */ 0x09, /* pc: NON PRINTABLE */    (Ç)

and

    /* 0xe7 */ NRP, /* pc: NON PRINTABLE */    (ç)

What do you think? Should both of these characters map to 0x09 instead? Have you ever seen a phone display ç for 0x09 in a GSM 7-bit message (I never have)?

--
Guillaume Cottenceau
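
P.S. To make the question concrete, here is a minimal sketch of what "both map to 0x09" could look like. It is not Kannel code: the names sketch_latin1_to_gsm, sketch_gsm_to_latin1 and SKETCH_NRP are hypothetical, the '?' placeholder value is an assumption (the real NRP definition is not quoted above), and only the two characters under discussion are handled.

```c
#include <assert.h>

/* Hypothetical placeholder for "no representation"; the value '?' is an
 * assumption, not taken from Kannel's NRP definition. */
#define SKETCH_NRP '?'

/* GSM 03.38 default alphabet position under discussion. */
#define GSM_C_CEDILLA 0x09

/* Latin-1 -> GSM 7-bit, restricted to the two characters in question.
 * Every other input returns SKETCH_NRP, which a full table would not do. */
static int sketch_latin1_to_gsm(unsigned char latin1)
{
    switch (latin1) {
    case 0xC7: /* capital C-cedilla */
    case 0xE7: /* small c-cedilla */
        return GSM_C_CEDILLA;
    default:
        return SKETCH_NRP;
    }
}

/* GSM 7-bit -> Latin-1 for 0x09 only, following the Unicode Consortium
 * reading of 0x09 as small c-cedilla. */
static int sketch_gsm_to_latin1(unsigned char gsm)
{
    assert(gsm < 0x80); /* input must fit in 7 bits */
    return gsm == GSM_C_CEDILLA ? 0xE7 : SKETCH_NRP;
}
```

With such a mapping, sketch_latin1_to_gsm(0xC7) and sketch_latin1_to_gsm(0xE7) both give 0x09, and sketch_gsm_to_latin1(0x09) gives 0xE7. The round trip is therefore lossy for Ç, which comes back as ç; whether that is preferable to sending NRP for ç is exactly what I am asking.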
