On Tuesday, January 7, 2003, at 04:26 AM, Eric Jain wrote:
<http://jakarta.apache.org/ant/external.html#Anteater>
<http://jakarta.apache.org/commons/latka/index.html>
<http://webtest.canoo.com/webtest/manual/WebTestHome.html>

Thanks. What I was looking for was a task to do validation in the sense of
checking the syntax for conformance to the official W3C standards (something
like http://validator.w3.org/), which doesn't seem to be supported directly by
any of those tools.

I created a Lucene <index> task to parse HTML using JTidy and index the contents. I ran it on Ant's documentation and Javadocs many times and corrected a lot of the errors it kicked out (and there were a lot of them!).
It would be pretty easy to take what I've done and turn it into a general-purpose <htmlvalidate> task, and even have it fail the build when errors are detected, if that's what you want.
Have a look at the lucene-sandbox repository for contributions/ant:
http://cvs.apache.org/viewcvs/jakarta-lucene-sandbox/contributions/ant/
The HtmlDocument class is the one you're looking for.
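As a rough illustration of the "fail the build if errors are detected" idea, here is a minimal sketch. It is not the code from the sandbox and does not use JTidy or Ant: the JDK's own XML parser stands in for JTidy, so it only checks that XHTML-style markup is well-formed, which is much weaker than W3C validation. The class and method names (HtmlCheckSketch, collectErrors, failOnErrors) are made up for the example, and IllegalStateException stands in for whatever exception a real Ant task would throw to stop the build. The input is assumed to have no DOCTYPE, so no DTD is fetched.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical sketch: collect parse errors for a page, then fail if any exist.
public class HtmlCheckSketch {

    // Returns one message per problem; an empty list means the markup parsed cleanly.
    public static List<String> collectErrors(String markup) throws Exception {
        final List<String> errors = new ArrayList<>();
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();

        // Record errors instead of letting the default handler print them to stderr.
        builder.setErrorHandler(new ErrorHandler() {
            @Override public void warning(SAXParseException e) { /* ignored here */ }
            @Override public void error(SAXParseException e) { errors.add(describe(e)); }
            @Override public void fatalError(SAXParseException e) { errors.add(describe(e)); }
        });

        try {
            builder.parse(new InputSource(new StringReader(markup)));
        } catch (SAXException e) {
            // The parser stops after a fatal error; keep a message if none was recorded.
            if (errors.isEmpty()) {
                errors.add(String.valueOf(e.getMessage()));
            }
        }
        return errors;
    }

    private static String describe(SAXParseException e) {
        return "line " + e.getLineNumber() + ": " + e.getMessage();
    }

    // Stand-in for failing the build: throw as soon as a page has any errors.
    public static void failOnErrors(String name, String markup) throws Exception {
        List<String> errors = collectErrors(markup);
        if (!errors.isEmpty()) {
            throw new IllegalStateException(name + ": " + errors.size()
                    + " markup error(s), first: " + errors.get(0));
        }
    }

    public static void main(String[] args) throws Exception {
        failOnErrors("good.html", "<html><body><p>ok</p></body></html>");
        System.out.println("good.html passed");
        try {
            failOnErrors("bad.html", "<html><body><p>unclosed</body></html>");
        } catch (IllegalStateException e) {
            System.out.println("build would fail: " + e.getMessage());
        }
    }
}
```

A real <htmlvalidate> task would presumably iterate over a fileset and report every error before failing, rather than stopping at the first bad page.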
Erik