> Would anyone know of some utility that would work like Plucker in
> that regard, but would only d/l the HTML? Or is there a way to get
> the distiller/desktop interface to d/l the requested sites, but
> -not- turn them into the plucker doc? I'm working on some projects,
> and I'd love to be able to d/l large numbers of pages with the
> exactness of Plucker's d/l capabilities, but I need to keep them
> in HTML so I can work with them on my computer instead of my Palm
> device.
wget, pavuk, sitescooper, LWP, harvest-ng, HTTP Fetcher, Metis, etc.
Hit Freshmeat and search for 'spider' or something like that to find
the others.
I highly recommend the LWP or sitescooper approach, since you can also
extend it to pre-process files as they come in via other Perl modules
(HTML::LinkExtor, XML::Parser, etc.). The second-best in my opinion is
of course pavuk, which can do cookies, authentication, JavaScript, and
thousands of other things.
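The LWP route above is Perl, but the underlying idea, feeding each
fetched page through a link extractor so you can decide what to crawl
or rewrite next, is easy to sketch in any language. Here is a rough
Python equivalent of what HTML::LinkExtor does; the class name and the
sample page are mine, not from any of the tools listed.

```python
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collect link targets from <a href> and <img src> tags,
    roughly the job HTML::LinkExtor does in the LWP pipeline."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and "href" in attrs:
            self.links.append(attrs["href"])
        elif tag == "img" and "src" in attrs:
            self.links.append(attrs["src"])

# A made-up page standing in for something the spider just downloaded.
page = '<html><body><a href="http://example.com/a.html">a</a>' \
       '<img src="pix.gif"></body></html>'
p = LinkExtractor()
p.feed(page)
print(p.links)  # ['http://example.com/a.html', 'pix.gif']
```

From there you would resolve each relative link against the page's URL
and push it onto your fetch queue, which is essentially what the
heavier tools (pavuk, wget -r) automate for you.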
d.

