On 15/11/2004 05:48, Doug Ewell wrote:

...

Peter Kirk <peterkirk at qaya dot org> wrote:

...

Otherwise what would happen? Would it be acceptable for Java programs
to crash, or even throw error messages, if presented with Unicode
strings including U+0000?



Peter, what do you think? Is that what I said? I said it should signal
the end of the string, as it does in C.



In another message, Doug wrote:

I'd still like to know what practical, real-world TEXT-related benefits
would derive from allowing U+0000 in strings of TEXT in a C program.



The practical situation I have in mind is hypothetical (it is not important to me personally, as I do very little programming; I am making this point more for the general good). Suppose I am writing a program in C, or Java, or whatever, to process an arbitrary string of Unicode characters, perhaps received from the Internet, before handing it on to a higher-level processor. My program works fine until someone, for whatever (possibly malicious) reason, sends a string containing U+0000. At that point my program crashes, or does something I did not intend, which may be a security risk. The risk is real if the task of my program is to scan the string for security issues: it stops scanning at U+0000, finds nothing, and passes on the whole Unicode string, including U+0000 and whatever follows it.
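To make the scenario concrete, here is a minimal sketch in Java. Everything in it is hypothetical: the class name, the "<script" check, and the sample input are invented for illustration, and cStyleView merely imitates how a C routine that treats U+0000 as a terminator would see the string.

```java
import java.util.Locale;

final class NulScanDemo {
    // Imitates the C convention: everything from the first U+0000 onward
    // is invisible to the caller.
    static String cStyleView(String s) {
        int nul = s.indexOf('\u0000');
        return nul < 0 ? s : s.substring(0, nul);
    }

    // Hypothetical security check, standing in for whatever the real
    // scanner would look for.
    static boolean looksDangerous(String visible) {
        return visible.toLowerCase(Locale.ROOT).contains("<script");
    }

    public static void main(String[] args) {
        String input = "hello\u0000<script>alert(1)</script>";

        // The C-style scanner sees only "hello" and reports nothing.
        System.out.println(looksDangerous(cStyleView(input))); // false

        // A scanner that respects the full length finds the payload.
        System.out.println(looksDangerous(input)); // true
    }
}
```

If the first result is trusted and the full string is then forwarded, the part after U+0000 reaches the higher-level processor without ever having been checked.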

What should my program have done? It could have flagged U+0000 as an illegal character, but it is not illegal; there might be a good reason for it being in the string, and it is not the business of my program to interpret such things. If I am going to use string handling at all, I need some kind of escape mechanism to stop this legal U+0000 being misinterpreted as a terminator. For better or for worse, Java provides a mechanism for this situation.
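As an illustration, and assuming the mechanism meant here is Java's "modified UTF-8" (the encoding used by DataOutputStream.writeUTF and DataInputStream.readUTF), the sketch below shows U+0000 being written as the two bytes C0 80 rather than a single zero byte, so the encoded data contains no embedded NUL. The class and helper names are my own, invented for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class ModifiedUtf8Demo {
    // Encodes with writeUTF: a 2-byte length prefix, then modified UTF-8.
    static byte[] writeModified(String s) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(buf)) {
                out.writeUTF(s);
            }
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Decodes bytes produced by writeModified.
    static String readModified(byte[] framed) {
        try (DataInputStream in =
                 new DataInputStream(new ByteArrayInputStream(framed))) {
            return in.readUTF();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Drops the 2-byte length prefix, leaving only the encoded characters.
    static byte[] payload(byte[] framed) {
        return Arrays.copyOfRange(framed, 2, framed.length);
    }

    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String s = "a\u0000b";
        byte[] framed = writeModified(s);

        System.out.println(hex(payload(framed)));                       // 61 C0 80 62
        System.out.println(hex(s.getBytes(StandardCharsets.UTF_8)));    // 61 00 62
        System.out.println(readModified(framed).equals(s));             // true
    }
}
```

The round trip recovers the original string, U+0000 included, while a consumer that treats a zero byte as end-of-string never meets one in the encoded form. The cost is that C0 80 is not valid standard UTF-8, so the two encodings must not be confused.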

--
Peter Kirk
[EMAIL PROTECTED] (personal)
[EMAIL PROTECTED] (work)
http://www.qaya.org/




