On Saturday 06 August 2005 13:41, Bjoern Hoehrmann wrote:
> * Gábor Szabó wrote:
> >Reading the blog of Geoff about the OSCON session
> >http://www.onlamp.com/pub/wlg/7523
> >I just remembered an open issue for me.
> >
> >How do you test if an HTML page is in one of the w3 standards ?
> >There is the w3 validator online at http://validator.w3.org/ but I cannot
> > use that for my ongoing tests. I need something command line or better
> > yet something like Test::W3 ?
>
> Well, the W3C Markup Validator is just a "thin" wrapper around the
> OpenSP SGML processor (the onsgmls command line tool to be precise),
> it just does some character encoding detection and deals with mime
> types and doctypes, other than that it's just a HTML formatter. With
> my (experimental) HTML::Encoding, HTML::Doctype, SGML::Parser::OpenSP
> and the (experimental, only in CVS) OpenSP version 1.5.2 you could
> write a command line tool for that in < 100 lines, see e.g. the script
> at <http://lists.w3.org/Archives/Public/public-qa-dev/2004Nov/0002>.

Interesting. In any case, there's also html-tidy which is more self-contained:

http://tidy.sourceforge.net/

It has a Perl interface on CPAN:

http://search.cpan.org/dist/HTML-Tidy/

(there seem to be more related modules in the search results).

It's nice, but I recall that, given the same input file, it missed some
problems that the W3C Validator did report. (I don't recall which file it
was, sorry.)
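
For a command-line check, something along these lines might do. It is only a
minimal sketch, assuming HTML::Tidy and the underlying tidy library are
installed; the tidy_messages() helper is a name made up for this example, not
part of the module. It reports what tidy itself complains about, which may
differ from what the W3C Validator would say.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use HTML::Tidy;

# Hypothetical helper: return tidy's messages for one document as strings.
# $name is only used to label the messages.
sub tidy_messages {
    my ( $name, $html ) = @_;
    my $tidy = HTML::Tidy->new;
    $tidy->parse( $name, $html );
    return map { $_->as_string } $tidy->messages;
}

# Check every file named on the command line.
my $problems = 0;
for my $file (@ARGV) {
    open my $fh, '<', $file or die "Cannot open $file: $!";
    my $html = do { local $/; <$fh> };
    close $fh;

    for my $message ( tidy_messages( $file, $html ) ) {
        print "$message\n";
        $problems++;
    }
}

# Non-zero exit status if tidy reported anything, so that it can be
# used from a Makefile or a test script.
exit 1 if $problems;
```

Saved under some name of your choosing, it would be run with the HTML files
as arguments, and the exit status tells a calling script whether anything
was reported.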

Regards,

        Shlomi Fish

---------------------------------------------------------------------
Shlomi Fish      [EMAIL PROTECTED]
Homepage:        http://www.shlomifish.org/

Tcl is LISP on drugs. Using strings instead of S-expressions for closures
is Evil with one of those gigantic E's you can find at the beginning of 
paragraphs.
