Should the FeedValidator catch improperly escaped XHTML?

Sam Ruby Thu, 31 Aug 2006 05:08:45 -0700


M. David Peterson wrote:

via http://www.oreillynet.com/xml/blog/2006/08/and_the_winner_of_the_best_ind_1.html#comment-75533
    There's one other small problem though: they put XHTML as CDATA in
    "html" text constructs, while they're supposed to contain HTML 4.
    And since it's XHTML, they should embed it directly in "xhtml"
    constructs...
Anthony brings out a good point <http://www.oreillynet.com/xml/blog/2006/08/and_the_winner_of_the_best_ind_1.html#comment-75822>,
    Odd that the validator isn't saying anything about this.

Should it, or is this an edge case that can be difficult, at best, to catch?


At the moment, the HTML content is passed through the following:

http://docs.python.org/lib/module-HTMLParser.html

Note that this parser includes a handle_startendtag method, which is not a part of the HTML standard. Given the rather loose nature of HTML, this only tends to catch things like unmatched angle brackets and quotes.
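
As a rough illustration of what a handle_startendtag hook sees, here is a minimal sketch. It is not the FeedValidator's actual code: it assumes Python 3, where the module lives at html.parser, and the names XhtmlSniffer and find_self_closed_tags are hypothetical. It only records tags written in the XHTML empty-element form (such as <br/>), which a validator could treat as a hint that XHTML was placed in an "html" construct.

```python
from html.parser import HTMLParser


class XhtmlSniffer(HTMLParser):
    """Hypothetical helper: collects tags written as <tag ... />."""

    def __init__(self):
        super().__init__()
        self.self_closed = []

    def handle_startendtag(self, tag, attrs):
        # Called only for the XHTML-style empty-element syntax.
        self.self_closed.append(tag)


def find_self_closed_tags(html_text):
    """Return the names of all tags that used the <tag/> form."""
    sniffer = XhtmlSniffer()
    sniffer.feed(html_text)
    sniffer.close()
    return sniffer.self_closed


if __name__ == "__main__":
    sample = '<p>one<br/>two<img src="a.png" /></p>'
    print(find_self_closed_tags(sample))  # ['br', 'img']
```

A non-empty result is only a heuristic: plenty of HTML in the wild uses <br/> as well, so this would suit a warning better than an error.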

Also, there are a number of tools that attempt to produce well-formed XHTML, but don't do so consistently enough to drop the content into an Atom feed in such a manner.
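
One way a producer might cope with this, sketched under assumptions: try to parse the fragment as XML, and only use an "xhtml" construct when that succeeds, falling back to escaped "html" otherwise. The function name pick_construct_type is hypothetical, and wrapping the fragment in a namespaced div simply mirrors how Atom carries inline XHTML; no particular tool is known to work this way.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def pick_construct_type(fragment):
    """Return "xhtml" if the fragment is well-formed XML, else "html"."""
    # Wrap in a single namespaced div so the fragment has one root element.
    wrapped = '<div xmlns="%s">%s</div>' % (XHTML_NS, fragment)
    try:
        ET.fromstring(wrapped)
    except ET.ParseError:
        # Unclosed tags, undefined entities such as &nbsp;, and so on.
        return "html"
    return "xhtml"


if __name__ == "__main__":
    print(pick_construct_type("<p>one<br/>two</p>"))
    print(pick_construct_type("<p>one<br>two</p>"))
```

This checks well-formedness only, not validity against any XHTML DTD.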


- Sam Ruby
