On Fri, Jan 4, 2013 at 6:08 PM, Stephan Stiller <[email protected]> wrote:
> Is there a most general sense in which there are constraints beyond all
> characters being from within the range U+0000 ... U+10FFFF? If one is
> concerned with computer security, oddities that are absolute should raise a
> flag; somebody could be messing with my system.
>

If you are concerned with computer security, then I suggest you read
http://www.unicode.org/reports/tr36/ "Unicode Security Considerations".

> For example, the original C datatype named "string", as it is understood
> and manipulated by the C standard library, has an *absolute* prohibition
> against U+0000 anywhere inside.
>

That's not so much a prohibition as an artifact of NUL-termination of
strings. In more modern libraries, the string contents and its explicit
length are stored together, and you can store a 00 byte just fine, for
example in a C++ string.

markus
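
P.S. To illustrate the point about NUL-termination versus an explicit
length, here is a minimal sketch in C++. The helper names
(make_with_embedded_nul, c_length, demo) are made up for this example and
are not part of any library; only std::string and std::strlen are
standard.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: builds a std::string holding the three bytes
// 'a', 0x00, 'b'. The (pointer, length) constructor copies exactly
// 3 bytes and does not stop at the embedded 00 byte.
std::string make_with_embedded_nul() {
    return std::string("a\0b", 3);
}

// Hypothetical helper: the length as the C standard library sees it.
// std::strlen scans for the first 00 byte, so it stops early.
std::size_t c_length(const std::string& s) {
    return std::strlen(s.c_str());
}

// Hypothetical demo: the two views of the same data disagree.
void demo() {
    const std::string s = make_with_embedded_nul();
    assert(s.size() == 3);     // explicit length counts the 00 byte
    assert(s[1] == '\0');      // the 00 byte is stored like any other
    assert(c_length(s) == 1);  // NUL-terminated view ends after 'a'
}
```

The data is the same in both cases; only the convention for finding the
end of the string differs, which is why passing such a string to a C API
silently truncates it.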

