> Mark Davis <[EMAIL PROTECTED]> wrote:
> 
> > - when one of the BOM-allowing UTFs starts with a BOM, you know the
> > encoding*, and you strip off the BOM when you get the content.
> >
> > *assuming that no UTF-16 file has U+0000 as the first character.
> 
> In the real world, this is a pretty good assumption -- almost as good,

A simple test page of UTF-16 encoded U+0000...U+00FF comes to mind.  But yes,
I'm being mean.

> in fact, as the one I've been stating for years:  that no Unicode file
> will have a zero-width no-break space (intended as such) as the first
> character.
> 
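To make the footnote concrete, here is a minimal Python sketch of BOM sniffing along the lines Mark describes. It is an illustration under assumptions, not a reference implementation: the names `sniff_bom` and `_BOMS` are hypothetical, and only the BOM constants from the standard `codecs` module are real. The UTF-32LE BOM (FF FE 00 00) is byte-for-byte the UTF-16LE BOM (FF FE) followed by U+0000, so the test page mentioned above is exactly the input that gets misidentified.

```python
import codecs

# Hypothetical lookup table. The 4-byte UTF-32 BOMs are checked before
# the 2-byte UTF-16 ones, because the UTF-32LE BOM (FF FE 00 00)
# begins with the UTF-16LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data: bytes):
    """Return (encoding, payload with the BOM stripped), or (None, data)."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name, data[len(bom):]
    return None, data


# The "mean" test page: a UTF-16LE BOM followed by U+0000..U+00FF.
page = codecs.BOM_UTF16_LE + "".join(map(chr, range(0x100))).encode("utf-16-le")

encoding, payload = sniff_bom(page)
print(encoding)  # utf-32-le: the first four bytes are FF FE 00 00
```

Checking the UTF-16 BOMs first would read this page correctly but would then misread every genuine UTF-32LE file, so a sniffer has to pick one side of the assumption; in practice the UTF-32-first order is the one the footnote relies on.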
