James,

NCRs *are* markup. And validating that the encoding matches
the declaration (e.g. that a file declared as UTF-8 contains
no ill-formed sequences) has nothing whatsoever to do with
content; it has everything to do with verifying that the
file conforms to the HTML specification.
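
To make that concrete, here is a minimal sketch in Python of
the kind of check I mean. The function name is made up for
illustration, and this is not how any particular validator
does it; it only asks whether the raw bytes decode cleanly
under the declared encoding.

```python
def is_well_formed(data: bytes, declared_encoding: str = "utf-8") -> bool:
    """Return True if the raw bytes decode cleanly under the declared encoding.

    Hypothetical helper for illustration only; it never looks at what the
    text says, only at whether the byte sequence is legal.
    """
    try:
        data.decode(declared_encoding, errors="strict")
    except (UnicodeDecodeError, LookupError):
        # UnicodeDecodeError: ill-formed byte sequence for this encoding.
        # LookupError: the declared encoding name is unknown to Python.
        return False
    return True


# An NCR is plain ASCII markup, so it is well-formed in any ASCII superset.
ncr_form = b"caf&#233;"
# The same character as raw UTF-8 bytes.
utf8_form = "caf\u00e9".encode("utf-8")
# A lone 0xE9 byte is Latin-1, not UTF-8.
latin1_form = b"caf\xe9"
```

Note that the check never inspects the words themselves,
which is why it belongs with validation and not with
spell checking.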

All this is completely different from spelling and grammar
checking.

The thread started when someone complained that a validator
was unable to understand UTF-8 encoded files. Once you go
from HTML to XHTML or XML, those 'validators' are
themselves invalid, as XML requires all parsers to support
UTF-8 (and UTF-16).

Again, the HTML validation service from W3C is able to deal
with UTF-8 and will even warn about the 'UTF-8 BOM' issue.
You should reasonably be able to expect that other tools
that call themselves validators match the functionality
of that service - or get out of that business.
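
As a rough illustration of how little is needed to spot a
BOM (this is a sketch with a made-up function name, not the
W3C service's actual code), a tool only has to look at the
first three bytes of the file:

```python
import codecs


def has_utf8_bom(data: bytes) -> bool:
    """Return True if the data starts with the UTF-8 byte order mark.

    Hypothetical helper for illustration; codecs.BOM_UTF8 is the
    three-byte sequence EF BB BF.
    """
    return data.startswith(codecs.BOM_UTF8)


with_bom = codecs.BOM_UTF8 + b"<html></html>"
without_bom = b"<html></html>"
```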

A./


