Re: [UPHPU] Web site scraping

Daniel C. Thu, 25 Sep 2008 07:47:59 -0700

You'll probably want to start with this:
http://us3.php.net/manual/en/book.http.php

Then, if it were me, I'd just use regular expressions combined with
sanity checks to grab the info you're looking for off the page(s).
There's no reason to parse the entire page into a DOM structure, then
fiddle with it some more to get what you want out of it, when you
could just use a regex (they're pretty dang quick).
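
To make the "regex plus sanity checks" idea concrete, here is a minimal
sketch. The function name scrape_title, the choice of the <title> tag as
the target, and the 200-character limit are all made up for
illustration; swap in whatever pattern and checks fit the page you are
actually scraping.

```php
<?php
// Hypothetical helper (not from any library): pull the <title> text out
// of an HTML string with a regex, then sanity-check it before use.
function scrape_title(string $html): ?string
{
    // i = case-insensitive, s = let "." match newlines; (.*?) is non-greedy
    // so the match stops at the first closing tag.
    if (preg_match('/<title[^>]*>(.*?)<\/title>/is', $html, $m) !== 1) {
        return null; // no match, or the regex engine reported an error
    }

    // Drop any stray markup, decode entities, and trim whitespace.
    $title = trim(html_entity_decode(strip_tags($m[1]), ENT_QUOTES, 'UTF-8'));

    // Sanity checks (assumed thresholds): reject empty or implausibly long values.
    if ($title === '' || strlen($title) > 200) {
        return null;
    }

    return $title;
}

// Example input is an inline string so the sketch runs without network access.
$html = '<html><head><title> Example &amp; Co. </title></head><body></body></html>';
var_dump(scrape_title($html));
```

Returning null on anything unexpected means a change in the page layout
shows up as a missing value instead of silently bad data.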

Hard to give anything more specific without knowing what you're looking for.

Dan

On Thu, Sep 25, 2008 at 8:52 AM, Nathan Lane <[EMAIL PROTECTED]> wrote:
> I want to make what in effect is a website scraper using PHP, but it isn't
> obvious how this would best be done. I've tried using DOMDocument and I'm
> not sure if that's the best option or not. I'd really like to use something
> where I could use XPath to get the elements out that I want. Recently I
> wrote a similar program in C# that I call HttpAnalyzer. Could I just use
> that with PHP (i.e. call it from PHP) to get what I'm looking for? Any
> suggestions?
>
> --
> Nathan Lane
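
For reference, the DOMDocument plus XPath route asked about in the
quoted message could be sketched roughly like this. The function name
extract_links and the choice of anchor tags as the target are
assumptions for illustration only; DOMDocument, DOMXPath and the libxml
error functions are the standard PHP ones.

```php
<?php
// Hypothetical helper: collect every link (href and text) from an HTML
// string using DOMDocument and an XPath query.
function extract_links(string $html): array
{
    $doc = new DOMDocument();

    // Real-world HTML is rarely valid; collect parser warnings internally
    // instead of emitting them, then restore the previous setting.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $links = [];

    // Select every <a> element that carries an href attribute.
    foreach ($xpath->query('//a[@href]') as $a) {
        $links[] = [
            'href' => $a->getAttribute('href'),
            'text' => trim($a->textContent),
        ];
    }

    return $links;
}

// Inline sample markup so the sketch runs without fetching anything.
$html = '<html><body><p><a href="/one"> First </a> and <a name="x">no href</a></p></body></html>';
print_r(extract_links($html));
```

This costs a full parse of the page, but the query stays readable when
the elements you want are easier to describe by structure than by text
pattern.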

_______________________________________________

UPHPU mailing list
[email protected]
http://uphpu.org/mailman/listinfo/uphpu
IRC: #uphpu on irc.freenode.net
