> Mark Davis <[EMAIL PROTECTED]> wrote:
>
> > - when one of the BOM-allowing UTFs starts with a BOM, you know the
> > encoding*, and you strip off the BOM when you get the content.
> >
> > *assuming that no UTF-16 file has U+0000 as the first character.
>
> In the real world, this is a pretty good assumption -- almost as good,

A simple test page of UTF-16 encoded U+0000...U+00FF comes to mind.
But yes, I'm being mean.

> in fact, as the one I've been stating for years: that no Unicode file
> will have a zero-width no-break space (intended as such) as the first
> character.
>
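
For anyone who wants to see the U+0000 footnote bite, here is a minimal Python sketch of the "sniff the BOM, then strip it" rule quoted above. The names sniff_bom and BOMS are made up for illustration and come from nobody's actual implementation; only the BOM constants from the standard codecs module are real. The sketch assumes the longer UTF-32 signatures are tested first, which is exactly why a little-endian UTF-16 file whose first character is U+0000 gets misread.

```python
import codecs

# Hypothetical lookup table. Order matters: the UTF-32LE BOM (FF FE 00 00)
# begins with the UTF-16LE BOM (FF FE), so longer signatures are tested first.
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data: bytes):
    """Return (encoding, payload without BOM), or (None, data) if no BOM is found."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name, data[len(bom):]
    return None, data


# Ordinary case: the BOM identifies the encoding and is stripped.
normal = codecs.BOM_UTF16_LE + "hi".encode("utf-16-le")
encoding, payload = sniff_bom(normal)
assert encoding == "utf-16-le"
assert payload.decode(encoding) == "hi"

# The mean case: UTF-16LE text whose first character is U+0000.
# The bytes are FF FE 00 00 41 00, which start with the UTF-32LE signature.
tricky = codecs.BOM_UTF16_LE + "\x00A".encode("utf-16-le")
assert sniff_bom(tricky)[0] == "utf-32-le"
```

Testing UTF-16 before UTF-32 instead would only move the problem: every genuine UTF-32LE file would then be taken for UTF-16LE starting with U+0000. The assumption in the footnote is what makes the longest-match order safe in practice.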

