Re: [Fwd: Re: [PHP] error while reading google-search-results]

2003-03-08 Thread Jens Lehmann
James Holden wrote:


Welcome to a mine field of problems :-)

1. The URL you have entered is invalid.  That's usually a good first 
check to make.  Try /search?q=test to get that bit sorted.

Ok, this was just a typo. :)

2. Google blocks known user agents from accessing its content, as it 
believes you are acting as a spider or a search engine stealing their 
content.  To counter this you need to use a different URL-fetching method 
and explicitly set the name of the user agent.  Use 'MSIE'.  You can use 
curl, LWP etc. to do this kind of thing.
Curl is excellent: fast, highly configurable, and very good with secure 
connections. 

I've heard of curl, but not of LWP. I can't find documentation for 
LWP on php.net. I'll have to check whether I can use one of these.

3. I suggest you don't do this - they prevent it for a reason.

It's part of a larger project and not specifically related to Google. The 
script should be able to read in any website and do some processing. One 
of its functions should (later) be to determine search engine positions. 
Is there a simple way to do this without reading in the search results 
of Google (or other search engines)?

Jens



--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php


Re: [Fwd: Re: [PHP] error while reading google-search-results]

2003-03-08 Thread James
LWP is a Perl thing.  Curl is probably the best thing to use.
Have you tried using Google's PHP API, which they provide for free?
http://www.google.com/apis/
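
For the curl route, here is a minimal sketch of fetching a page with an
explicit user agent. It assumes PHP is built with the curl extension. The
function names build_fetch_options() and fetch_page(), the example URL,
the 10-second timeout and the full user-agent string are illustrative
assumptions and not something from this thread (only 'MSIE' was suggested
above).

```php
<?php
// Hypothetical helper: collect the cURL options for one fetch.
// Keeping them in an array makes the user agent easy to inspect or change.
function build_fetch_options($url, $userAgent)
{
    return array(
        CURLOPT_URL            => $url,        // page to request
        CURLOPT_USERAGENT      => $userAgent,  // sent as the User-Agent header
        CURLOPT_RETURNTRANSFER => true,        // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,        // follow HTTP redirects
        CURLOPT_TIMEOUT        => 10,          // assumed limit, in seconds
    );
}

// Hypothetical helper: returns the response body, or false on failure.
function fetch_page($url, $userAgent)
{
    $ch = curl_init();
    curl_setopt_array($ch, build_fetch_options($url, $userAgent));
    $body = curl_exec($ch);
    if ($body === false) {
        // curl_error() gives a readable reason for the failure.
        trigger_error('Fetch failed: ' . curl_error($ch), E_USER_WARNING);
    }
    return $body;
}

// Example call (commented out so nothing is fetched on include):
// $html = fetch_page('http://www.example.com/search?q=test',
//                    'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)');
```

The options are built separately from the request so the same array can be
reused for any site the script needs to read, not just a search engine.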

Jens Lehmann [EMAIL PROTECTED] wrote in message
news:[EMAIL PROTECTED]
 James Holden wrote:
 
 
  Welcome to a mine field of problems :-)
 
  1. The URL you have entered is invalid.  That's usually a good first
  check to make.  Try /search?q=test to get that bit sorted.

 Ok, this was just a typo. :)

  2. Google blocks known user agents from accessing its content, as it
  believes you are acting as a spider or a search engine stealing their
  content.  To counter this you need to use a different URL-fetching method
  and explicitly set the name of the user agent.  Use 'MSIE'.  You can use
  curl, LWP etc. to do this kind of thing.
  Curl is excellent: fast, highly configurable, and very good with secure
  connections.

 I've heard of curl, but not of LWP. I can't find documentation for
 LWP on php.net. I'll have to check whether I can use one of these.

  3. I suggest you don't do this - they prevent it for a reason.

 It's part of a larger project and not specifically related to Google. The
 script should be able to read in any website and do some processing. One
 of its functions should (later) be to determine search engine positions.
 Is there a simple way to do this without reading in the search results
 of Google (or other search engines)?

 Jens





