Hi,

The issue of Web pages whose HTML is fouled up to the point of impluckability (add this to Merriam-Webster!) comes up over and over again.
The standard solution would be to use wget with the right options to download all that's needed, then run tidy on the file(s) in question, and then pluck the local files. This is quite cumbersome, and one loses the original URL in the plucked PDB.

How about adding an option to plucker-build for filtering each downloaded file through tidy? This should only be a minor hack, the tidying occurs in the right place in the pipeline, and it increases plucker-build's practical usability without placing additional burden on the user.

Justus

-- 
Justus H. Piater, Ph.D.          http://www.montefiore.ulg.ac.be/~piater/
Institut Montefiore, B28         Phone: +32-4-366-2279
Université de Liège, Belgium     Fax:   +32-4-366-2620
_______________________________________________
plucker-list mailing list
[email protected]
http://lists.rubberchicken.org/mailman/listinfo/plucker-list
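
To make the proposed filtering step concrete, here is a rough sketch in Python of what "filter each downloaded file through tidy" could look like. It is only an illustration under assumptions: the function name tidy_filter and its command parameter are hypothetical and not part of plucker-build, and the sketch assumes the tidy executable is on the PATH. If tidy is missing, fails, or produces nothing, the original bytes are passed through untouched.

```python
import shutil
import subprocess

# Hypothetical helper, not an existing plucker-build function.
# Assumed default: HTML Tidy reading from stdin and writing to stdout,
# quiet, with warnings suppressed and output forced even on errors.
DEFAULT_TIDY = ("tidy", "-q", "--show-warnings", "no", "--force-output", "yes")


def tidy_filter(html: bytes, command=DEFAULT_TIDY) -> bytes:
    """Pipe raw HTML through an external cleanup command.

    Returns the cleaned bytes, or the input unchanged if the command
    is unavailable, cannot be run, times out, or emits no output.
    """
    if shutil.which(command[0]) is None:
        return html  # filter not installed: fall back to the raw page
    try:
        result = subprocess.run(
            list(command),
            input=html,
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        return html
    # tidy exits with 1 on warnings and 2 on errors, so the exit status
    # is not treated as fatal here; only empty output is rejected.
    return result.stdout if result.stdout.strip() else html
```

Because only the document bytes are rewritten, the caller would keep associating them with the URL they were fetched from, which is the point of doing the tidying inside the pipeline rather than on local copies.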

