James, NCRs (numeric character references) *are* markup. And validating that the encoding matches the declaration (e.g. that a file declared as UTF-8 contains no ill-formed byte sequences) has nothing whatsoever to do with content, and everything to do with verifying that the file conforms to the HTML specification.
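To make the point concrete, here is a minimal sketch of the kind of encoding check I mean. It is not how any particular validator implements it; the function name `check_utf8` and the shape of its result are made up for illustration. It only asks whether the bytes are well-formed UTF-8 and whether they start with a BOM, and never looks at what the text says.

```python
import codecs


def check_utf8(data: bytes) -> dict:
    """Hypothetical helper: report UTF-8 well-formedness and BOM presence."""
    # A UTF-8 BOM is the three bytes EF BB BF at the very start of the file.
    has_bom = data.startswith(codecs.BOM_UTF8)
    try:
        # Strict decoding rejects ill-formed sequences such as truncated
        # multi-byte characters, stray continuation bytes, and overlong forms.
        data.decode("utf-8")
        well_formed = True
    except UnicodeDecodeError:
        well_formed = False
    return {"well_formed": well_formed, "has_bom": has_bom}


if __name__ == "__main__":
    print(check_utf8("<p>caf\u00e9</p>".encode("utf-8")))
    print(check_utf8(codecs.BOM_UTF8 + b"<p>hello</p>"))
    print(check_utf8(b"<p>caf\xe9</p>"))  # Latin-1 bytes mislabelled as UTF-8
```

A real validator would also compare the result against the declared encoding (HTTP header, `meta` element, or XML declaration); that part is left out here.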
All this is completely different from spelling and grammar checking. The thread started when someone complained that a validator was unable to understand UTF-8 encoded files. Once you go from HTML to XHTML or XML, those 'validators' are themselves invalid, as XML requires all parsers to support UTF-8. Again, the W3C HTML validation service is able to deal with UTF-8 and will even warn about the UTF-8 BOM issue. You can reasonably expect other tools that call themselves validators to match the functionality of that service, or get out of that business.
A./