"Tab Atkins Jr." <jackalm...@gmail.com>, 2012-08-12 15:43 -0700:

> What Dimitri said, but to address your comment directly, DTD-based
> validation is long-dead, at least when applied to HTML.  A DTD can't
> capture the validity requirements that the HTML spec already imposes,
> so it's irrelevant if it also can't validate a document containing
> custom elements.  The current validator used by the W3C is a
> combination of (iirc) constrains expressed in Schematron and custom
> Java code.

The core of the backend for the W3C Nu Markup Validator
(http://validator.w3.org/nu/) and validator.nu is James Clark's Jing, a
Relax NG implementation. The backend doesn't actually use Schematron, for
performance reasons. Instead it has some Java code that performs the
equivalent of the assertion-based checking that Schematron provides but
that can't be done with grammar-based checking alone (whether with Relax NG
or anything else). No grammar-based schema language is capable of
expressing all the constraints in the HTML spec. Things like checking the
data types (microsyntaxes) of attribute values require custom code --
especially if you want to report useful messages for errors (something
regexp-based checking is totally useless for). Also, more to the point
here, things like the fact that arbitrary attribute names prefixed with
"data-" are valid -- grammar-based checkers can't handle that at all. So
the validator.nu backend has some custom code that Henri wrote that drops
those data-* attributes -- basically, filters them out -- before the Jing
part of the toolchain even sees them.
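
To make that filtering step concrete, here is a rough Python sketch of the
idea. It is not the actual validator.nu code (which is Java and works on
the parse-event stream); the names ALLOWED, strip_data_attrs, and
check_attrs are made up for the example, and the fixed ALLOWED set merely
stands in for the finite list of attribute names a grammar can enumerate.

```python
# Hypothetical stand-in for the fixed list of attribute names that a
# grammar-based schema can enumerate. Not taken from any real schema.
ALLOWED = {"id", "class", "title", "lang"}

PREFIX = "data-"


def strip_data_attrs(attrs):
    """Return a copy of attrs with the data-* attributes dropped.

    Only names with at least one character after "data-" are dropped;
    any further naming rules are ignored in this sketch.
    """
    return {
        name: value
        for name, value in attrs.items()
        if not (name.startswith(PREFIX) and len(name) > len(PREFIX))
    }


def check_attrs(attrs):
    """Return, sorted, the attribute names the fixed list does not allow."""
    return sorted(name for name in attrs if name not in ALLOWED)


attrs = {"id": "x", "data-user-id": "42", "data-foo": "bar", "bogus": "1"}

# Without the filter, every data-* attribute is reported as an error,
# because the fixed list cannot name them all in advance.
print(check_attrs(attrs))

# With the filter, only the genuinely unknown attribute is left.
print(check_attrs(strip_data_attrs(attrs)))
```

The point of doing it as a separate pass is that the schema never has to
know about data-* at all; the open-ended part of the rule lives entirely in
the filter.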


Michael[tm] Smith http://people.w3.org/mike
