Re: [PHP] Re: Static and/or Dynamic site scraping using PHP

2009-05-04 Thread haliphax
On Thu, Apr 30, 2009 at 8:03 AM, 9el  wrote:
> On Thu, Apr 30, 2009 at 3:33 AM, 9el  wrote:
>> I just got a PHP project that involves scraping the body content from
>> static or plain HTML sites.
>> Could you experts please suggest some quick resources?
>>
>> I have to make a WP plugin with the data as well.
>
> Any experts there yet? I am looking for urgent advice on accomplishing the
> task.

http://www.regular-expressions.info and preg_match are your best friend(s).
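
A minimal sketch of the preg_match() approach, assuming the page is already
in a string. The sample markup and the extract_body() helper name are made
up for illustration; in practice the string might come from
file_get_contents(). A regex like this is fragile on malformed HTML, so
treat it as a starting point rather than a robust parser.

```php
<?php
// Hypothetical sample page standing in for a fetched document.
$html = <<<HTML
<html>
<head><title>Sample</title></head>
<body class="home">
<p>Hello, world.</p>
</body>
</html>
HTML;

// Return whatever sits between <body ...> and </body>, or null if no match.
// Modifiers: "s" lets "." match newlines, "i" ignores tag case.
function extract_body(string $html): ?string
{
    if (preg_match('~<body\b[^>]*>(.*?)</body>~is', $html, $matches) === 1) {
        return trim($matches[1]);
    }
    return null;
}

echo extract_body($html), PHP_EOL;
```

The non-greedy (.*?) stops at the first closing </body>, which is usually
what you want for a single page.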


-- 
// Todd

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php



Re: Re: [PHP] Re: Static and/or Dynamic site scraping using PHP

2009-05-03 Thread Ashley Sheridan
On Sat, 2009-05-02 at 13:08 -0400, Paul M Foster wrote:
> On Sat, May 02, 2009 at 10:40:04PM +0600, Lenin wrote:
> 
> > On Sat, May 2, 2009 at 10:01 PM,
> > wrote:
> > 
> > > Je suis actuellement absent du bureau aussi TEST !
> > >
> > > I don't get why I get this automated mail every time I send a message to
> > > this thread.  :-/
> 
> My French is rusty, but it looks like it says something like "I'm out of
> the office". So it would appear this person has an autoreply
> going.
> 
> Paul
> 
> -- 
> Paul M. Foster
> 
I've got it for every email I send to the list as well! It's annoying,
but the TEST bit just makes it funny!


Ash
www.ashleysheridan.co.uk





Re: Re: [PHP] Re: Static and/or Dynamic site scraping using PHP

2009-05-02 Thread Paul M Foster
On Sat, May 02, 2009 at 10:40:04PM +0600, Lenin wrote:

> On Sat, May 2, 2009 at 10:01 PM,
> wrote:
> 
> > Je suis actuellement absent du bureau aussi TEST !
> >
> > I don't get why I get this automated mail every time I send a message to
> > this thread.  :-/

My French is rusty, but it looks like it says something like "I'm out of
the office". So it would appear this person has an autoreply
going.

Paul

-- 
Paul M. Foster




Re: Re: [PHP] Re: Static and/or Dynamic site scraping using PHP

2009-05-02 Thread Lenin
On Sat, May 2, 2009 at 10:01 PM,
wrote:

> Je suis actuellement absent du bureau aussi TEST !
>
> I don't get why I get this automated mail every time I send a message to
> this thread.  :-/


Re: [PHP] Re: Static and/or Dynamic site scraping using PHP

2009-05-02 Thread Lenin
I was hoping more experts would give me some insight into scraping
methods.

I want to grab the body content of pages, say WordPress pages, but not
through RSS. I will assume the pages are static, and I want to scrape the
body content while skipping the sidebar, footer, header, etc.

I tried the DOM extension and it's fun, but I would like to hear from
people with experience of this specific problem.

Thanks in advance.


[PHP] Re: Static and/or Dynamic site scraping using PHP

2009-04-30 Thread Shawn McKenzie
9el wrote:
> On Thu, Apr 30, 2009 at 3:33 AM, 9el  wrote:
>> I just got a PHP project that involves scraping the body content from
>> static or plain HTML sites.
>> Could you experts please suggest some quick resources?
>>
>> I have to make a WP plugin with the data as well.
>
> Any experts there yet? I am looking for urgent advice on accomplishing the
> task.
> 
> Thanks
> 
> Lenin
> 
> www.twitter.com/nine_L

If you're just capturing and using the body, then load the page with
file_get_contents() and use preg_match() to select the body or
individual tags, etc. For more control, maybe try this:

$doc = new DOMDocument();
$doc->loadHTMLFile('http://example.com/page.html');

Then use:  http://php.net/manual/book.dom.php
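
As a rough sketch of where the DOM route can go, the example below pulls
only the paragraphs inside a main content container and so skips the
header, sidebar and footer. The markup, the ids ("content", "sidebar",
etc.) and the extract_content() helper are assumptions for illustration;
a real theme will use its own ids or classes. It parses an inline string
with loadHTML() so it runs without network access; loadHTMLFile() should
work the same way for a URL.

```php
<?php
// Hypothetical page layout; the ids are assumptions, not a WordPress standard.
$html = <<<HTML
<html><body>
<div id="header">Site title</div>
<div id="content"><p>First post.</p><p>Second post.</p></div>
<div id="sidebar">Links</div>
<div id="footer">Copyright</div>
</body></html>
HTML;

// Return the trimmed text of every <p> inside the div with the given id.
function extract_content(string $html, string $contentId): array
{
    $doc = new DOMDocument();

    // Real-world HTML is rarely valid; collect parser warnings quietly
    // instead of printing them, then restore the previous setting.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $paragraphs = [];
    // Note: $contentId is trusted here; do not pass user input unescaped.
    foreach ($xpath->query("//div[@id='" . $contentId . "']//p") as $node) {
        $paragraphs[] = trim($node->textContent);
    }
    return $paragraphs;
}

print_r(extract_content($html, 'content'));
```

Selecting the wanted container is usually simpler than trying to remove
every unwanted one, since themes differ most in their sidebars and footers.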

-- 
Thanks!
-Shawn
http://www.spidean.com




[PHP] Re: Static and/or Dynamic site scraping using PHP

2009-04-30 Thread 9el
On Thu, Apr 30, 2009 at 3:33 AM, 9el  wrote:
> I just got a PHP project that involves scraping the body content from
> static or plain HTML sites.
> Could you experts please suggest some quick resources?
>
> I have to make a WP plugin with the data as well.

Any experts there yet? I am looking for urgent advice on accomplishing the task.

Thanks

Lenin

www.twitter.com/nine_L
