On 15 Jul 2008, at 18:22, Igor Bukanov wrote:

> The currently proposed rule for byte-order-mark (BOM) characters in
> ES4 sources is to replace them with whitespace outside of tokens. But
> what exactly are the tokens in a case like -<bom>-?
>
> AFAICS it would be treated as - - turning cases like:
>  -<bom>-a;
> into
>  - -a;
> versus
>  --a;
> as it would be with current ES3 implementations.
>
> Regards, Igor

Hmmm. According to the UnicodeCheck app on my Mac (and thus to one version
or other of the Unicode spec), a BOM (U+FEFF) is 'ZERO WIDTH NO-BREAK
SPACE':

•       NamesList:
                = BYTE ORDER MARK (BOM), ZWNBSP
                • may be used to detect byte order by contrast with the noncharacter code point FFFE
                • use as an indication of non-breaking is deprecated; see 2060 instead
                → (zero width space - 200B)
                → (word joiner - 2060)
                → (<not a character> - FFFE)
•       Designated in Unicode 1.1

I'd say that a BOM should be treated just like any ordinary whitespace
char - namely that it should be invalid in spaces. And beyond that, why is
any conversion needed, since it's a valid Unicode character...
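
To make the -<bom>-a case from the quoted message concrete, here is a rough
sketch in JavaScript. It is only an illustration: tokenize, bomToSpace and
stripBom are hypothetical helper names, and the tokenizer is a toy that knows
just "--", "-", ";" and identifiers, not a real ES3 or ES4 lexer. It assumes
"replace by whitespace" means substituting one space, and that the ES3
behaviour Igor mentions amounts to dropping the BOM.

```javascript
// Toy illustration only; all names here are hypothetical.
const BOM = "\uFEFF";

// Policy A (the proposed ES4 rule as described in the thread):
// every BOM outside a token becomes a single space.
function bomToSpace(src) {
  return src.split(BOM).join(" ");
}

// Policy B (assumed reading of the ES3 behaviour in the thread):
// the BOM is simply dropped from the source text.
function stripBom(src) {
  return src.split(BOM).join("");
}

// Minimal tokenizer: longest match first, whitespace is skipped.
function tokenize(src) {
  const tokens = [];
  const re = /--|-|;|[A-Za-z_$][\w$]*|\s+/g;
  let m;
  while ((m = re.exec(src)) !== null) {
    if (!/^\s+$/.test(m[0])) tokens.push(m[0]);
  }
  return tokens;
}

const source = "-" + BOM + "-a;";
const asSpace = tokenize(bomToSpace(source)); // ["-", "-", "a", ";"]
const stripped = tokenize(stripBom(source));  // ["--", "a", ";"]
```

With the space substitution the two minus signs stay separate tokens (a double
negation), while dropping the BOM lets them fuse into a single decrement
operator, which is the difference the quoted message points out.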

-ash
_______________________________________________
Es4-discuss mailing list
[email protected]
https://mail.mozilla.org/listinfo/es4-discuss
