Re: BOM inside tokens

Ash Berlin Tue, 15 Jul 2008 10:42:31 -0700

On 15 Jul 2008, at 18:39, Ash Berlin wrote:

>
> On 15 Jul 2008, at 18:22, Igor Bukanov wrote:
>
>> The currently proposed rule for byte-order-mark (BOM) characters in
>> ES4 sources is to replace them by whitespace outside of tokens. But
>> what exactly are the tokens in a case like -<bom>-?
>>
>> AFAICS it would be treated as - - turning cases like:
>> -<bom>-a;
>> into
>> - -a;
>> versus
>> --a;
>> as it would be with current ES3 implementations.
>>
>> Regards, Igor
>
> Hmmm. According to the UnicodeCheck app on my Mac (and thus to one
> version or other of the Unicode spec), a BOM (U+FEFF) is 'ZERO WIDTH
> NO-BREAK SPACE'
>
> • NamesList:
>     = BYTE ORDER MARK (BOM), ZWNBSP
>     • may be used to detect byte order by contrast with the
>       noncharacter code point FFFE
>     • use as an indication of non-breaking is deprecated; see 2060
>       instead
>     → (zero width space - 200B)
>     → (word joiner - 2060)
>     → (<not a character> - FFFE)
> • Designated in Unicode 1.1
>
> I'd say that a BOM should be treated just like any ordinary whitespace
> char - namely that it should be invalid in spaces, and beyond that why
> is any conversion needed, since it's a valid Unicode character...
>


Correction: that should read "invalid in *identifiers*", not "in spaces".
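
For anyone following along, here is a rough sketch of the difference Igor
describes. It is only an illustration, not spec text: stripBom, bomToSpace
and run are made-up helper names. The "strip" variant assumes the ES3-style
behaviour Igor mentions, where the BOM disappears before tokenizing; the
"space" variant assumes the proposed ES4 rule of replacing it with
whitespace outside tokens.

```javascript
// Hypothetical helpers for illustration only; none of these names come
// from the ES3 or ES4 specifications.
const BOM = "\uFEFF";

// Assumed ES3-style treatment: the BOM is dropped before tokenizing.
function stripBom(src) {
  return src.split(BOM).join("");
}

// Assumed proposed ES4 treatment: the BOM becomes ordinary whitespace.
function bomToSpace(src) {
  return src.split(BOM).join(" ");
}

// Evaluate an expression statement with a = 5 and report both the value
// of the expression and the final value of a.
function run(code) {
  const body = "const result = " + code + " return { result: result, a: a };";
  return new Function("a", body)(5);
}

const source = "-" + BOM + "-a;";
const stripped = stripBom(source);  // "--a;"  (pre-decrement)
const spaced = bomToSpace(source);  // "- -a;" (double negation)

console.log(JSON.stringify(stripped), run(stripped));
console.log(JSON.stringify(spaced), run(spaced));
```

With a starting at 5, the stripped form decrements a and yields 4, while
the spaced form is a double negation that yields 5 and leaves a untouched.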


_______________________________________________
Es4-discuss mailing list
[email protected]
https://mail.mozilla.org/listinfo/es4-discuss