"Philippe Verdy" writes: > Nulls are legal Unicode characters, also for use in plain text and since > ever in ASCII, and all ISO 8-bit charset standards. Why do you want that a > legal Unicode string containing NULL (U+0000) *characters* become illegal > when converted to C strings?

Why do you need a NUL? NULs aren't exactly legal characters in plain text; I know of no program that does anything constructive with them in plain text. A file with arbitrary control characters in it is generally not a plain text file. An escape code certainly has no fixed meaning, and where it does have meaning it does things, like underlining and highlighting, that aren't exactly plain text.

> A null *CHARACTER* is valid in C string, because C does not mandate the
> string encoding (which varies according to locale conventions at run-time).

That's specious. The string encoding in C since time immemorial has generally been a variety of ASCII or EBCDIC, both of which make the null character the null byte.

> Using pure UTF-8 in C strings would not be conforming to either Unicode or C
> conventions because it will illegitimately restrict the legal embedding of
> U+0000 in strings...

That's nothing new; C has restricted the embedding of U+0000 in strings since the very first compiler. ASCII is no different from UTF-8 here. I've never seen code that makes C strings hold nulls, and I've never seen anybody use that as a reason that Java or any other language was better than C. The fact that you can't put NUL in a C string is both true and seemingly moot. Java's solution for emitting it to a C string is creative and probably useful for the situation, but the result should never have been written to disk.
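
For anyone who wants to see the restriction concretely, here is a minimal C sketch. The helper names are made up for illustration. The second function assumes that the Java workaround under discussion is the two-byte overlong form 0xC0 0x80 that Java's modified UTF-8 uses for U+0000; if the thread means something else, treat it as one possible approach only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Length as the C library sees it: strlen stops at the first zero byte,
 * so anything after an embedded NUL is invisible to string functions. */
size_t visible_length(const char *buf)
{
    return strlen(buf);
}

/* Hypothetical helper: copy n bytes from src to dst, rewriting each zero
 * byte as the overlong pair 0xC0 0x80 (assumed here to be the Java-style
 * workaround). dst must have room for 2*n + 1 bytes. Returns the number
 * of bytes written, not counting the terminating zero byte. */
size_t encode_nul_overlong(const unsigned char *src, size_t n,
                           unsigned char *dst)
{
    assert(src != NULL && dst != NULL);

    size_t out = 0;
    for (size_t i = 0; i < n; i++) {
        if (src[i] == 0x00) {
            dst[out++] = 0xC0;   /* overlong lead byte */
            dst[out++] = 0x80;   /* continuation byte  */
        } else {
            dst[out++] = src[i];
        }
    }
    dst[out] = 0x00;             /* ordinary C terminator */
    return out;
}
```

With the five data bytes of "ab\0cd", visible_length reports 2, while the encoded copy is six bytes long and survives strlen intact. Note that 0xC0 0x80 is an overlong sequence and not valid standard UTF-8, which is the reason it makes sense in memory but not in a file on disk.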

