I don't know anything about HTML::Filter, but for that
I would just pull each page into memory and use a
regular expression to look for that tag. Perl has some
of the best regex support in the business, and in my
experience matching in memory is usually faster than
saving every page and running grep over the files.
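
Something like this is what I have in mind. It is only a
rough sketch: has_tag() is a name I made up, "specialtag"
is the placeholder from your pseudocode, and %pages stands
in for pages you have already fetched. A plain regex is not
a real HTML parse, so it can be fooled by tags inside
comments or scripts.

```perl
use strict;
use warnings;

# Return 1 if $html contains an opening tag named $tag (case-insensitive),
# 0 otherwise. has_tag() is a hypothetical helper, not part of any module.
sub has_tag {
    my ($html, $tag) = @_;
    # \Q...\E quotes the tag name; \b stops "specialtag" matching "specialtagx".
    return $html =~ /<\Q$tag\E\b[^>]*>/i ? 1 : 0;
}

# Assumed sample data: URL => page content already held in memory.
my %pages = (
    'http://example.com/a.html' =>
        '<html><body><specialtag id="x">hi</specialtag></body></html>',
    'http://example.com/b.html' =>
        '<html><body><p>nothing here</p></body></html>',
);

# Build the list of URLs whose content contains the tag.
my @found = grep { has_tag($pages{$_}, 'specialtag') } sort keys %pages;
print "$_\n" for @found;
```

With the sample data above, only the a.html URL ends up in
@found.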

Ashley Jones

--- Brad Johnson <[EMAIL PROTECTED]> wrote:
> This is kind of a RTFM question, but I know that Sean always wants
> chances to show off his modules and I'm weak on package subclassing
> and the such.
> 
> Here's my question: I'd like to go through a site (remotely) and search for
> some tags (for now, it's sufficient to simply make a list of URLs that
> contain the tags). I know that I could use lwp-rget to download the whole
> site and then grep every page, but that's horrible overkill, since I just
> want to parse each page instead of saving it.
> 
> The pseudocode is something like
> 
> get: 
>  grab url
>  parse HTML
>  foreach internal link, get link (remember depth!)
>  if specialtag found, push url onto foundlist
>  
> print foundlist
> 
> 
> Is the right thing to do to copy lwp-rget and splice in some HTML::Filter
> code, or is there a better approach?
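
For what it's worth, the pseudocode above could be sketched
without copying lwp-rget at all. This is an untested-against-
your-site sketch, not a finished tool: crawl() and its $fetch
hook are names I invented, the depth limit of 2 is arbitrary,
and "internal link" is assumed to mean "same host as the start
URL". It uses HTML::LinkExtor for the links and the same regex
idea for the tag.

```perl
use strict;
use warnings;
use HTML::LinkExtor;
use URI;

# Crawl from $start, following same-host <a href> links up to $max_depth
# hops, and return the URLs whose HTML contains <$tag ...>.
# $fetch is a code ref: URL in, page HTML (or undef on failure) out.
# It is a hypothetical hook so the network layer can be swapped out.
sub crawl {
    my ($start, $max_depth, $tag, $fetch) = @_;
    my $host  = URI->new($start)->host;
    my %seen  = ($start => 1);
    my @queue = ([$start, 0]);      # breadth-first: [url, depth]
    my @found;

    while (my $item = shift @queue) {
        my ($url, $depth) = @$item;
        my $html = $fetch->($url);
        next unless defined $html;

        push @found, $url if $html =~ /<\Q$tag\E\b[^>]*>/i;
        next if $depth >= $max_depth;   # remember depth!

        # Given a base URL, HTML::LinkExtor returns absolute URIs.
        my $parser = HTML::LinkExtor->new(undef, $url);
        $parser->parse($html);
        $parser->eof;
        for my $link ($parser->links) {
            my ($elem, %attr) = @$link;
            next unless $elem eq 'a' && defined $attr{href};
            my $next = URI->new($attr{href});
            next unless $next->scheme && $next->scheme =~ /^https?$/;
            next unless $next->host eq $host;   # internal links only
            $next->fragment(undef);             # drop "#anchor" parts
            push @queue, ["$next", $depth + 1] unless $seen{"$next"}++;
        }
    }
    return @found;
}

# Only touch the network when a start URL is given on the command line.
if (@ARGV) {
    require LWP::UserAgent;
    my $ua    = LWP::UserAgent->new(timeout => 15);
    my $fetch = sub {
        my $res = $ua->get($_[0]);
        return unless $res->is_success && $res->content_type eq 'text/html';
        return $res->decoded_content;
    };
    print "$_\n" for crawl($ARGV[0], 2, 'specialtag', $fetch);
}
```

Each page is held in memory only long enough to check it and
pull out its links, so nothing gets saved to disk. Passing the
fetcher in as a code ref also means you can try the crawl logic
against a hash of canned pages before pointing it at a real
site.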

