"Philippe Verdy" writes:

> Nulls are legal Unicode characters, also for use in plain text and since 
> ever in ASCII, and all ISO 8-bit charset standards. Why do you want that a 
> legal Unicode string containing NULL (U+0000) *characters* become illegal 
> when converted to C strings? 

Why do you need a nul? They're not exactly legal characters in plain text;
I know of no program that would do anything constructive with them in
plain text. A file with arbitrary control characters in it is generally
not a plain text file; an escape code certainly has no fixed meaning, and
where it does have meaning it does things, such as underlining and
highlighting, that aren't exactly plain text.
 
> A null *CHARACTER* is valid in C string, because C does not mandate the 
> string encoding (which varies according to locale conventions at run-time). 

That's specious. The string encoding in C since time immemorial has generally
been a variety of ASCII or EBCDIC, both of which make the null character
the null byte.

> Using pure UTF-8 in C strings would not be conforming to either Unicode or C 
> conventions because it will illegitimately restrict the legal embedding of 
> U+0000 in strings... 

That's nothing new; C has restricted the embedding of U+0000 in strings since
the very first compiler. ASCII is no different from UTF-8 here.
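
To make that concrete, here is a minimal sketch. The helper name
c_visible_length and the sample buffer are made up for illustration; only
strlen and memchr are real library calls. The bytes of the sample are the
same whether you call the encoding ASCII or UTF-8, and in both cases the
string functions stop at the first zero byte.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: how much of an n-byte buffer the C string
   functions can "see", i.e. the number of bytes before the first
   0x00 byte (or n if the buffer contains none). */
size_t c_visible_length(const char *buf, size_t n)
{
    assert(buf != NULL);
    const char *nul = memchr(buf, 0, n);
    return nul ? (size_t)(nul - buf) : n;
}

/* Seven bytes of payload with U+0000 in the middle, plus the
   terminator the compiler appends: sizeof sample is 8.
   The byte sequence is identical in ASCII and in UTF-8. */
static const char sample[] = "abc\0def";
```

Calling strlen(sample) gives 3 even though the array holds 8 bytes; anything
after the embedded nul is invisible to the str* functions and has to be
carried with an explicit length instead.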

I've never seen code to make strings in C that hold nulls; I've never seen
anybody use that as a reason that Java or any other language was better than
C. The fact that you can't put NUL in a C string is both true and seemingly
moot. Java's solution for emitting it to a C string is creative and probably
useful for the situation, but should never have been written to disk.
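
For reference, the Java trick in question is its "modified UTF-8", which
writes U+0000 as the two bytes 0xC0 0x80 so that the result contains no zero
byte. Below is a rough sketch of that encoding for a single UTF-16 code
unit. The function name mutf8_encode_unit is my own invention, not anything
from the JDK or JNI, and this is an illustration of the byte layout rather
than Java's actual implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode one UTF-16 code unit in the style of
   Java's modified UTF-8. Writes 1 to 3 bytes into out and returns
   the count. The only difference from ordinary UTF-8 in this range
   is that U+0000 takes the two-byte form C0 80. */
size_t mutf8_encode_unit(uint16_t unit, unsigned char out[3])
{
    assert(out != NULL);
    if (unit != 0 && unit < 0x80) {
        /* U+0001..U+007F: one byte, same as ASCII. */
        out[0] = (unsigned char)unit;
        return 1;
    }
    if (unit < 0x800) {
        /* U+0080..U+07FF, and U+0000 as the special case C0 80. */
        out[0] = (unsigned char)(0xC0 | (unit >> 6));
        out[1] = (unsigned char)(0x80 | (unit & 0x3F));
        return 2;
    }
    /* U+0800..U+FFFF, surrogates included: three bytes. */
    out[0] = (unsigned char)(0xE0 | (unit >> 12));
    out[1] = (unsigned char)(0x80 | ((unit >> 6) & 0x3F));
    out[2] = (unsigned char)(0x80 | (unit & 0x3F));
    return 3;
}
```

C0 80 is an overlong form, which a strict UTF-8 decoder rejects. That is
fine for handing a string to C in memory, and it is the reason such bytes
cause trouble once they land in a file that other programs expect to be
UTF-8.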
