----- Original Message -----
From: "Theodore H. Smith" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Saturday, August 02, 2003 12:32 PM
Subject: Questions on ZWNBS
> Hi list,
>
> I have some questions on the ZWNBS. While I don't actually need this
> myself, someone I know needs this.
>
> > Where? Specifically, where does it say FEFF shouldn't be in a string?

It does not say that.

> > Certainly, FEFF shouldn't be considered a BOM anywhere but at the start
> > of a string, but does it say you just can't use that value? And if so,
> > how are you supposed to use a ZWNBSP?!

> I'm thinking that 0xFEFF shouldn't be in a UTF16BE string, except at
> the start right?

Wrong! U+FEFF has two different uses: ZWNBS and BOM.
In a UTF-16BE string (and also in a UTF-16LE string) it is _always_ a
ZERO WIDTH NO-BREAK SPACE, and _never_ a BOM, regardless of whether it is
at the beginning of the file or not.
Not that there is much use for a ZWNBS at the beginning of a file, but
suppose that you have a routine that removes BOMs at the beginning of
files. Then it should _not_ remove a ZWNBS at the beginning of a UTF-16BE
text, even though a ZWNBS there makes no sense.

> For other kinds of UTF, I'm not sure if it is allowed or not. I know it
> is allowed in UTF16LE. although discouraged.
>
> Instead of "can't use ZWNBS", I think that char is discouraged. Where
> is the rule that discourages it?

The use of U+FEFF as ZWNBS is afaik not discouraged.
As for the use of UTF-16 with a BOM, I cannot cite a rule which
discourages it, but it is something I would expect to be discouraged.
Using UTF-16BE or UTF-16LE instead is much simpler.
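
To make the point about a BOM-removing routine concrete, here is a rough sketch in Python. The function name decode_utf16 and its label argument are made up for illustration and are not taken from any library. It assumes the caller knows which encoding label the text came with, and that unlabelled-byte-order "UTF-16" without a BOM is read as big-endian. A leading FEFF is dropped only under the plain "UTF-16" label; under "UTF-16BE" or "UTF-16LE" it is kept as a ZWNBS.

```python
import codecs


def decode_utf16(data: bytes, label: str) -> str:
    """Decode UTF-16 bytes; strip a leading BOM only for the plain 'UTF-16' label.

    Hypothetical helper for illustration, not a library API.
    """
    label = label.upper().replace("_", "-")

    # Explicit byte order: a leading U+FEFF is a ZWNBS and must be kept.
    if label == "UTF-16BE":
        return data.decode("utf-16-be")
    if label == "UTF-16LE":
        return data.decode("utf-16-le")

    # Plain UTF-16: a leading FE FF or FF FE is a BOM; it selects the
    # byte order and is not part of the text.
    if label == "UTF-16":
        if data.startswith(codecs.BOM_UTF16_BE):
            return data[2:].decode("utf-16-be")
        if data.startswith(codecs.BOM_UTF16_LE):
            return data[2:].decode("utf-16-le")
        # Assumption: no BOM means big-endian.
        return data.decode("utf-16-be")

    raise ValueError("unsupported label: " + label)


sample = b"\xfe\xff\x00A"                # FE FF followed by 'A', big-endian
print(repr(decode_utf16(sample, "UTF-16BE")))  # ZWNBS kept in front of 'A'
print(repr(decode_utf16(sample, "UTF-16")))    # BOM removed, only 'A' left
```

The same bytes give two different strings depending on the label, which is exactly why the routine cannot decide from the bytes alone whether FEFF is a BOM.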