I don't know anything about HTML::Filter,
but for that I would just pull each page into memory
and use a regular expression to look for that tag.
Perl has some of the best regex support in the business,
and matching in memory saves you from downloading the
whole site just to grep it.
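
Roughly what I have in mind is below. Take it as a sketch under
assumptions, not tested advice: the sub names (page_has_tag,
urls_with_tag, fetch_with_lwp) and the $fetch callback are made up
for the example, and it only checks a list of URLs you hand it; it
does not follow links or track depth. A regex is also only a
heuristic here, since it will happily match a tag inside a comment
or a script.

```perl
use strict;
use warnings;

# Return 1 if $html contains an opening <$tag ...> element, else 0.
# The match is case-insensitive; the lookahead keeps "blink" from
# matching "<blinker>".
sub page_has_tag {
    my ($html, $tag) = @_;
    return $html =~ /<\Q$tag\E(?=[\s>\/])/i ? 1 : 0;
}

# Return the URLs from @$urls whose page text contains $tag.
# $fetch is a hypothetical callback: URL in, page text (or undef) out.
# It defaults to fetch_with_lwp below.
sub urls_with_tag {
    my ($urls, $tag, $fetch) = @_;
    $fetch ||= \&fetch_with_lwp;
    my @found;
    for my $url (@$urls) {
        my $html = $fetch->($url);
        next unless defined $html;    # skip pages that failed to load
        push @found, $url if page_has_tag($html, $tag);
    }
    return @found;
}

# Default fetcher: holds the page in memory only, nothing is saved.
# LWP::UserAgent is loaded lazily so the subs above work without it.
sub fetch_with_lwp {
    my ($url) = @_;
    require LWP::UserAgent;
    my $res = LWP::UserAgent->new(timeout => 10)->get($url);
    return $res->is_success ? $res->decoded_content : undef;
}
```

Calling urls_with_tag(\@urls, 'blink') with no third argument
fetches over the network with LWP; passing your own sub as the
third argument lets you try it against canned pages first.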
Ashley Jones
--- Brad Johnson <[EMAIL PROTECTED]> wrote:
> This is kind of an RTFM question, but I know that Sean always wants
> chances to show off his modules, and I'm weak on package subclassing
> and such.
>
> Here's my question: I'd like to go through a site (remotely) and search for
> some tags (for now, it's sufficient to simply make a list of URLs that
> contain the tags). I know that I could use lwp-rget to download the whole
> site and then grep every page, but that's horrible overkill, since I just
> want to parse each page instead of saving it.
>
> The pseudocode is something like
>
> get:
>     grab url
>     parse HTML
>     foreach internal link, get link (remember depth!)
>     if specialtag found, push url onto foundlist
>
> print foundlist
>
>
> Is the right thing to do to copy lwp-rget and splice in some HTML::Filter
> code, or is there a better approach?