This is kind of an RTFM question, but I know that Sean always wants
chances to show off his modules, and I'm weak on package subclassing
and such.

Here's my question: I'd like to go through a site (remotely) and search for
some tags (for now, it's sufficient to simply make a list of the URLs of
pages that contain the tags). I know that I could use lwp-rget to download
the whole site and then grep every page, but that's horrible overkill, since
I just want to parse each page instead of saving it.

The pseudocode is something like

get: 
 grab url
 parse HTML
 foreach internal link, get link (remember depth!)
 if specialtag found, push url onto foundlist
 
print foundlist
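
To make the pseudocode concrete, here is a rough sketch of the same loop. It
is in Python with only the standard library, purely as an illustration of the
shape and not LWP or HTML::Filter code. The names PageScanner, fetch,
find_pages_with_tag, and get_page are made up for this example, and "internal
link" is assumed to mean "same host as the start URL". It walks the site
breadth-first with a depth limit, keeps nothing on disk, and returns the list
of URLs whose HTML contains the wanted tag.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class PageScanner(HTMLParser):
    """Collects href values from <a> tags and notes whether the wanted tag appears."""

    def __init__(self, wanted_tag):
        super().__init__()
        self.wanted_tag = wanted_tag.lower()
        self.links = []
        self.found = False

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names before calling this.
        if tag == self.wanted_tag:
            self.found = True
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def fetch(url):
    """Return the page body as text, or None if it cannot be fetched or is not HTML."""
    try:
        with urlopen(url, timeout=10) as resp:
            if resp.headers.get_content_type() != "text/html":
                return None
            return resp.read().decode("utf-8", errors="replace")
    except (OSError, ValueError):
        return None


def find_pages_with_tag(start_url, wanted_tag, max_depth=2, get_page=fetch):
    """Breadth-first walk of one site; returns URLs whose HTML contains wanted_tag."""
    site = urlparse(start_url).netloc      # assumption: "internal" means same host
    seen = {start_url}                     # avoid fetching the same URL twice
    queue = deque([(start_url, 0)])        # (url, depth) pairs still to visit
    found = []
    while queue:
        url, depth = queue.popleft()
        html = get_page(url)
        if html is None:
            continue
        scanner = PageScanner(wanted_tag)
        scanner.feed(html)
        scanner.close()
        if scanner.found:
            found.append(url)
        if depth >= max_depth:
            continue                       # remember depth: do not follow links further
        for href in scanner.links:
            # Resolve relative links against the current page and drop #fragments.
            link, _fragment = urldefrag(urljoin(url, href))
            if urlparse(link).netloc == site and link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return found
```

Calling find_pages_with_tag("http://www.example.com/", "blink") (a placeholder
URL and tag) would return the found list. The get_page argument exists only so
the fetching step can be swapped out, for example for a dictionary lookup when
trying the loop without touching the network.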


Is the right approach to copy lwp-rget and splice in some HTML::Filter
code, or is there a better way?
