Hi,

The issue of Web pages whose HTML is fouled up to the point of
impluckability (add this to Merriam-Webster!) comes up over and over
again.

The standard solution would be to use wget with the right options to
download all that's needed, then run tidy on the file(s) in question,
and then pluck the local files.  This is quite cumbersome, and one
loses the original URL in the plucked PDB.

How about adding an option to plucker-build for filtering each
downloaded file through tidy?

This should only be a minor hack: the tidying occurs at the right
place in the pipeline, and it increases plucker-build's practical
usability without placing any additional burden on the user.
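
To make the idea concrete, here is a rough sketch of what such a filter
step might look like. It is not taken from the Plucker sources: the
function name tidy_filter, the chosen tidy options, and the point where
plucker-build would call it are all assumptions for illustration. It
pipes one downloaded page through tidy and falls back to the unmodified
page if tidy is missing or produces nothing usable.

```python
import shutil
import subprocess

# Assumed tidy invocation: quiet mode, XHTML output, and output forced
# even when tidy reports errors. Adjust the options to taste.
TIDY_CMD = ("tidy", "-q", "-asxhtml", "--force-output", "yes")


def tidy_filter(html: bytes, cmd=TIDY_CMD) -> bytes:
    """Pipe one downloaded document through an external filter (hypothetical helper).

    Returns the filtered bytes, or the original bytes unchanged if the
    filter is not installed, cannot be run, or yields no usable output.
    """
    if shutil.which(cmd[0]) is None:
        return html  # filter not installed: pass the page through untouched
    try:
        result = subprocess.run(
            list(cmd),
            input=html,
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        return html
    # tidy exits with 0 (clean), 1 (warnings) or 2 (errors). With
    # --force-output it still writes markup in the error case, so all
    # three codes are accepted as long as something was written.
    if result.returncode not in (0, 1, 2) or not result.stdout:
        return html
    return result.stdout
```

Since the filter works on the bytes already fetched for a given URL, the
original URL would stay attached to the document, which is the point of
doing this inside plucker-build rather than on local copies.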

Justus

-- 
Justus H. Piater, Ph.D.         http://www.montefiore.ulg.ac.be/~piater/
Institut Montefiore, B28        Phone: +32-4-366-2279
Université de Liège, Belgium    Fax:   +32-4-366-2620

_______________________________________________
plucker-list mailing list
[email protected]
http://lists.rubberchicken.org/mailman/listinfo/plucker-list
