Re: [PHP] CURL cannot connect to URL - IP address - after successful connection

2010-04-29 Thread Gary .
On 4/29/10, ioan...@btinternet.com wrote:
> On 2010/04/29 19:46, Gary . wrote:
>> Failed to connect to host is a pretty strange error if they're doing
>> anything regarding cookies and so on, IMO - I think I'd expect at
>> least a connection to be established before they decide they don't
>> like you. Have you used curl's --trace & --trace-ascii options?
>
> Is that debug_backtrace() in php

Not sure :-P

> as I am not using the command line
> (can't work out how to get the window up having downloaded curl, I am
> not up to 'building libraries' that seems to be needed).

Windows? See http://curl.haxx.se/download.html. *n*x variants should allow
installing via their package management systems.
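
For anyone calling cURL from a PHP script rather than the command line: the closest equivalent to --trace-ascii that I know of is CURLOPT_VERBOSE with CURLOPT_STDERR pointed at a stream. A minimal sketch, assuming the curl extension is loaded. The function name fetch_with_trace() and the timeout values are made up for illustration, and the verbose log is less detailed than a full --trace dump.

```php
<?php
// Hypothetical helper: fetch a URL and capture libcurl's verbose log.
function fetch_with_trace($url)
{
    // In-memory stream that receives the verbose output.
    $log = fopen('php://temp', 'w+');

    $ch = curl_init($url);
    curl_setopt_array($ch, array(
        CURLOPT_RETURNTRANSFER => true,  // return the body instead of printing it
        CURLOPT_VERBOSE        => true,  // log connection and header details
        CURLOPT_STDERR         => $log,  // send that log to our stream
        CURLOPT_CONNECTTIMEOUT => 10,    // assumed value, in seconds
        CURLOPT_TIMEOUT        => 30,    // assumed value, in seconds
    ));

    $body = curl_exec($ch);              // false on failure
    $result = array(
        'body'  => $body,
        'errno' => curl_errno($ch),      // 0 means no error
        'error' => curl_error($ch),      // human-readable message
        'info'  => curl_getinfo($ch),    // timings, HTTP code, etc.
    );

    // Release the handle before reading and closing the log stream.
    unset($ch);

    rewind($log);
    $result['trace'] = stream_get_contents($log);
    fclose($log);

    return $result;
}
```

Calling $r = fetch_with_trace('http://192.0.2.10/page'); and printing $r['errno'], $r['error'] and $r['trace'] should show how far the connection got. The address is a placeholder, not the real target.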

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP] CURL cannot connect to URL - IP address - after successful connection

2010-04-29 Thread ioan...@btinternet.com



On 2010/04/29 19:46, Gary . wrote:
> On 4/25/10, ioan...@btinternet.com wrote:
>> I can return a target page - once, but then on refresh within a few
>> hours the script curl_error is that it cannot connect to the host and
>> return is empty.
>
> Failed to connect to host is a pretty strange error if they're doing
> anything regarding cookies and so on, IMO - I think I'd expect at
> least a connection to be established before they decide they don't
> like you. Have you used curl's --trace & --trace-ascii options?

Is that debug_backtrace() in PHP? I am not using the command line
(I can't work out how to get the window up having downloaded curl, and I am
not up to the 'building libraries' that seems to be needed).


debug_backtrace() does not give any useful information other than saying 
the target link fails to connect (this is after it connects once, and 
then on refresh and for several hours does not connect). I guess there 
is some program that notes the calling IP address and if it is in a 
range it does not like, adds it to a list and refuses subsequent 
connections to the same address for a while.  Cookies are not required 
when using the browser directly.


John




Re: [PHP] CURL cannot connect to URL - IP address - after successful connection

2010-04-29 Thread Gary .
On 4/25/10, ioan...@btinternet.com wrote:
> I can return a target page - once, but then on refresh within a few
> hours the script curl_error is that it cannot connect to the host and
> return is empty.

Failed to connect to host is a pretty strange error if they're doing
anything regarding cookies and so on, IMO - I think I'd expect at
least a connection to be established before they decide they don't
like you. Have you used curl's --trace & --trace-ascii options?
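
The point above is that the curl error itself says at which layer the request died. A rough sketch of that distinction, using libcurl's numeric error codes (6, 7 and 28 are CURLE_COULDNT_RESOLVE_HOST, CURLE_COULDNT_CONNECT and CURLE_OPERATION_TIMEDOUT). The function name and the wording of the labels are invented here, and the mapping is a rule of thumb, not a diagnosis.

```php
<?php
// Hypothetical helper: say roughly where a cURL request failed.
// $errno is the value of curl_errno(); $httpCode is
// curl_getinfo($ch, CURLINFO_HTTP_CODE), which is 0 if no response arrived.
function describe_failure($errno, $httpCode)
{
    switch ($errno) {
        case 6:  // CURLE_COULDNT_RESOLVE_HOST
            return 'dns: the host name could not be resolved';
        case 7:  // CURLE_COULDNT_CONNECT
            return 'connect: no TCP connection, so no HTTP request was sent';
        case 28: // CURLE_OPERATION_TIMEDOUT
            return 'timeout: the connection or transfer took too long';
    }
    if ($errno !== 0) {
        return 'other: curl error ' . $errno;
    }
    if ($httpCode >= 400) {
        return 'http: connected, but the server answered ' . $httpCode;
    }
    return 'ok';
}
```

If the result is 'connect', cookies and user agents never reached the target, so the refusal is more likely a firewall or IP-level block than anything that inspects the request.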




RE: [PHP] CURL cannot connect to URL - IP address - after successful connection

2010-04-28 Thread Tommy Pham
> -----Original Message-----
> From: ioan...@btinternet.com [mailto:ioan...@btinternet.com]
> Sent: Wednesday, April 28, 2010 7:03 AM
> To: 'PHP'
> Subject: Re: [PHP] CURL cannot connect to URL - IP address - after
> successful connection
> 
> I think the answer is: ISPs have a different range of addresses from host
> providers, so it is possible to block requests from host servers, so from
> scripts.
> 
> John
> 

That's possible but very unlikely in your case.  Since you were able to
get the necessary information on the 1st request but failed on subsequent
ones, that suggests the target has an anti-bot mechanism in place.  Bypassing
anti-bot requires in-depth reverse-engineering of the targeted site.

Regards,
Tommy





Re: [PHP] CURL cannot connect to URL - IP address - after successful connection

2010-04-28 Thread ioan...@btinternet.com
I think the answer is: ISPs have a different range of addresses from 
host providers, so it is possible to block requests from host servers, 
so from scripts.


John




RE: [PHP] CURL cannot connect to URL - IP address - after successful connection

2010-04-26 Thread Tommy Pham
> -----Original Message-----
> From: ioan...@btinternet.com [mailto:ioan...@btinternet.com]
> Sent: Monday, April 26, 2010 7:10 AM
> To: Tommy Pham
> Subject: Re: [PHP] CURL cannot connect to URL - IP address - after
> successful connection
> 
> 
> 
> On 2010/04/27 1:13, Tommy Pham wrote:
> >> -Original Message-
> 
> > I assume that you did full testing with the browser as I suggested?
> > If everything works, one other thing to keep in mind is that the
> > target also may implement reverse DNS lookup in their anti-bot.  One
> > good way to test that is to remote in via SSH (if on Linux/Unix) to test
> > with wget.
> > Otherwise, I'm pretty sure that target site have some anti-bot
> > mechanisms in place.  Microseconds of analyzing valid 'user' requests
> > is better than processing 2-3 seconds and sending the response which
> > will consume bandwidth.  What you could also try is setting different
> > user-agents for every request and use cookies in cURL should the
> > target site have an anti-bot mechanism.
> >
> > Regards,
> > Tommy
> >
> >
> Yes, I think I have tested with/without cookies on the browser, trying
> different user agents (code emailed previously using array and rand) and
> cookies are used in script/not used.

And it works on subsequent requests?

> 
> I cannot work out how to use Putty/ssh/public private keys etc..wget...
> 

Learning how to use that is easier than learning to code PHP, IMO.
 
> I read about some problems with curl setting the port and a required patch
> on the server.
> 
> John

If cURL required some kind of patch as you say, then it wouldn't have worked
in the first place.  Perhaps it's better to post your code (with personal
data obfuscated).  Or try it on your local PC with a local web server to rule
out proxies, anti-bots and similar problems, and to confirm that your code
works as intended and that this is not a cURL problem.  I haven't had problems
using cURL before.  But then my targeted sites were very big companies that
didn't care about bots much.

Regards,
Tommy

PS:  Always reply to the list so others in the future can benefit unless
it's something personal.





RE: [PHP] CURL cannot connect to URL - IP address - after successful connection

2010-04-26 Thread Tommy Pham
> -----Original Message-----
> From: ioan...@btinternet.com [mailto:ioan...@btinternet.com]
> Sent: Sunday, April 25, 2010 10:44 PM
> To: a...@ashleysheridan.co.uk; tommy...@gmail.com >> Tommy Pham
> Subject: Re: [PHP] CURL cannot connect to URL - IP address - after
> successful connection
> 
> The answer I got from support desk on my shared server: 'You are trying to
> curl to a datapipe server, if it is rejecting the server name and port,
> you will need to take that up with them.'
> 
> John

I assume that you did full testing with the browser as I suggested?  If
everything works, one other thing to keep in mind is that the target also
may implement reverse DNS lookup in their anti-bot.  One good way to test
that is to remote in via SSH (if on Linux/Unix) to test with wget.
Otherwise, I'm pretty sure that the target site has some anti-bot mechanisms in
place.  For the target, spending microseconds checking whether a request comes
from a valid 'user' is cheaper than spending 2-3 seconds processing it and
sending a response that consumes bandwidth.  What you could also try is setting
a different user-agent for every request and using cookies in cURL, should the
target site have an anti-bot mechanism.
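
A minimal sketch of the user-agent and cookie suggestion, assuming the curl extension is loaded. The helper name browser_like_options(), the two agent strings and the cookie file path are all assumptions chosen for illustration. It only matters if the target actually inspects request headers; a "failed to connect" error happens before any header is sent.

```php
<?php
// Hypothetical helper: build cURL options that send a chosen user agent
// and keep cookies between requests in one file.
function browser_like_options($userAgent, $cookieFile)
{
    return array(
        CURLOPT_RETURNTRANSFER => true,        // return the body as a string
        CURLOPT_USERAGENT      => $userAgent,  // value of the User-Agent header
        CURLOPT_COOKIEFILE     => $cookieFile, // read cookies saved earlier
        CURLOPT_COOKIEJAR      => $cookieFile, // write cookies back when done
    );
}

// Example values, all assumptions: pick one agent string per request.
$agents = array(
    'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3',
    'Opera/9.80 (Windows NT 5.1; U; en) Presto/2.5.24 Version/10.53',
);
$options = browser_like_options(
    $agents[array_rand($agents)],
    sys_get_temp_dir() . '/curl_cookies.txt'
);
```

The array can then be passed to curl_setopt_array($ch, $options) on a handle created with curl_init().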

Regards,
Tommy





Re: [PHP] CURL cannot connect to URL - IP address - after successful connection

2010-04-26 Thread ioan...@btinternet.com



On 2010/04/26 20:01, Ashley Sheridan wrote:
> How frequently do you request the page? Maybe playing about with that
> would resolve it? Is it possible to randomise the request frequency a
> bit?
>
> Thanks,
> Ash
> http://www.ashleysheridan.co.uk



Just manually for testing, and it would be used for human requests, say
occasionally at 5, 10 or 30 minute intervals.  There must be other
parameters being passed that let the site determine that the
request is coming from the same user and through a script,
because it works normally from the browser.  So simply refusing a second
call from the same IP address (which could be a browser with a static or
unchanged IP address) is not what is happening.  It must be determining
that the request comes through a server from another site, via curl or similar.


John




Re: [PHP] CURL cannot connect to URL - IP address - after successful connection

2010-04-26 Thread Ashley Sheridan
On Mon, 2010-04-26 at 12:05 +0900, ioan...@btinternet.com wrote:

> >>
> >> Just to eliminate all possibilities, are you to open the same URL/URI in
> >> the web pages repeatedly?  Also, what happens when you fake the user agent in
> >> the web browser?  The target site may have some anti bot mechanism in
> >> place to reduce stress/load on the server(s).
> >>
> >> Regards,
> >> Tommy
> >
> > One more thing, check it with cookies enabled/disabled in the web browser
> > too.
> >
> >
> 
> Having deleted cookies on the browser and disabled them, it still does 
> not like various user agents:
> 
>   $useragent = array('Mozilla','Opera','Microsoft Internet Explorer','ia_archiver');
>   $os = array('Windows','Windows XP','Linux','Windows NT','Windows 2000','OSX');
>   //random user agent code
>   $agent = $useragent[rand(0,3)].'/'.rand(1,8).'.'.rand(0,9).' ('.$os[rand(0,5)].' '.rand(1,7).'.'.rand(0,9).'; en-US;)';
>   //would give something like Mozilla/3.5 (Windows 5.4; en-US;)
>   
> -- OR --
> 
>   //$useragent='Google Image - Googlebot-Image/1.0 ( http://www.googlebot.com/bot.html)';
>   //$useragent="MSN Live - msnbot-Products/1.0 (+http://search.msn.com/msnbot.htm)";
> 
> --  OR --
>   //$agent = "DocZilla/1.0 (Windows; U; WinNT4.0; en-US; rv:1.0.0) Gecko/20020804";
> 
> I am just calling the page manually, once at a time.  It is probable 
> that there is some anti-bot measures.  Page would probably not want to 
> be indexed as it is providing ever changing content.  How to use this 
> for normal level of use for real user just in a different site?
> 
> John
> 


How frequently do you request the page? Maybe playing about with that
would resolve it? Is it possible to randomise the request frequency a
bit?
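
One way to randomise the interval, as a small sketch. The function name jittered_delay() and the 60 and 30 second figures are assumptions, not anything measured against the target site, and this only helps if the block is based on request timing.

```php
<?php
// Hypothetical helper: a base delay plus a random extra, in seconds.
function jittered_delay($baseSeconds, $maxExtraSeconds)
{
    return $baseSeconds + mt_rand(0, $maxExtraSeconds);
}

// Assumed numbers: wait between 60 and 90 seconds before the next request.
$delay = jittered_delay(60, 30);
// sleep($delay);  // uncomment in a real script
```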

Thanks,
Ash
http://www.ashleysheridan.co.uk




Re: [PHP] CURL cannot connect to URL - IP address - after successful connection

2010-04-26 Thread ioan...@btinternet.com




>> Just to eliminate all possibilities, are you to open the same URL/URI in
>> the web pages repeatedly?  Also, what happens when you fake the user agent in
>> the web browser?  The target site may have some anti bot mechanism in
>> place to reduce stress/load on the server(s).
>>
>> Regards,
>> Tommy
>
> One more thing, check it with cookies enabled/disabled in the web browser
> too.




Having deleted cookies on the browser and disabled them, it still does 
not like various user agents:


	$useragent = array('Mozilla','Opera','Microsoft Internet Explorer','ia_archiver');
	$os = array('Windows','Windows XP','Linux','Windows NT','Windows 2000','OSX');

	//random user agent code
	$agent = $useragent[rand(0,3)].'/'.rand(1,8).'.'.rand(0,9).' ('.$os[rand(0,5)].' '.rand(1,7).'.'.rand(0,9).'; en-US;)';

	//would give something like Mozilla/3.5 (Windows 5.4; en-US;)

-- OR --

	//$useragent='Google Image - Googlebot-Image/1.0 ( http://www.googlebot.com/bot.html)';
	//$useragent="MSN Live - msnbot-Products/1.0 (+http://search.msn.com/msnbot.htm)";


--  OR --
	//$agent = "DocZilla/1.0 (Windows; U; WinNT4.0; en-US; rv:1.0.0) Gecko/20020804";


I am just calling the page manually, one request at a time.  There are
probably some anti-bot measures in place.  The page would probably not want to
be indexed, as it provides ever-changing content.  How can I use it at a
normal level of use, for real users, just from a different site?


John




RE: [PHP] CURL cannot connect to URL - IP address - after successful connection

2010-04-26 Thread Tommy Pham
> -----Original Message-----
> From: Tommy Pham [mailto:tommy...@gmail.com]
> Sent: Monday, April 26, 2010 1:59 AM
> To: 'php-general@lists.php.net'
> Subject: RE: [PHP] CURL cannot connect to URL - IP address - after
> successful connection
> 
> > -----Original Message-----
> > From: ioan...@btinternet.com [mailto:ioan...@btinternet.com]
> > Sent: Sunday, April 25, 2010 6:18 AM
> > To: php-general@lists.php.net
> > Subject: [PHP] CURL cannot connect to URL - IP address - after
> > successful connection
> >
> > I can return a target page - once, but then on refresh within a few
> > hours the script curl_error is that it cannot connect to the host and
> > return is empty.
> > The target URL is an ip address, not a named url, so maybe it has
> > something to do with DNS.  I am on a shared server.  Any ideas on why
> > this happens?
> >
> > John
> >
> 
> Just to eliminate all possibilities, are you to open the same URL/URI in
> the web pages repeatedly?  Also, what happens when you fake the user agent in
> the web browser?  The target site may have some anti bot mechanism in
> place to reduce stress/load on the server(s).
> 
> Regards,
> Tommy

One more thing, check it with cookies enabled/disabled in the web browser
too.





RE: [PHP] CURL cannot connect to URL - IP address - after successful connection

2010-04-26 Thread Tommy Pham
> -----Original Message-----
> From: ioan...@btinternet.com [mailto:ioan...@btinternet.com]
> Sent: Sunday, April 25, 2010 6:18 AM
> To: php-general@lists.php.net
> Subject: [PHP] CURL cannot connect to URL - IP address - after successful
> connection
> 
> I can return a target page - once, but then on refresh within a few hours
> the script curl_error is that it cannot connect to the host and return is
> empty.
> The target URL is an ip address, not a named url, so maybe it has
> something to do with DNS.  I am on a shared server.  Any ideas on why this
> happens?
> 
> John
> 

Just to eliminate all possibilities, are you able to open the same URL/URI in
the web browser repeatedly?  Also, what happens when you fake the user agent in
the web browser?  The target site may have some anti-bot mechanism in place
to reduce stress/load on the server(s).

Regards,
Tommy





Re: [PHP] CURL cannot connect to URL - IP address - after successful connection

2010-04-25 Thread ioan...@btinternet.com


This is all I see in the error log:

SUEXEC error_log:


[2010-04-25 16:45:42]: uid: (1116/myname) gid: (1118/myname) cmd: fcgiwrapper



John




Re: [PHP] CURL cannot connect to URL - IP address - after successful connection

2010-04-25 Thread Ashley Sheridan
On Sun, 2010-04-25 at 22:17 +0900, ioan...@btinternet.com wrote:

> I can return a target page - once, but then on refresh within a few 
> hours the script curl_error is that it cannot connect to the host and 
> return is empty.  The target URL is an ip address, not a named url, so 
> maybe it has something to do with DNS.  I am on a shared server.  Any 
> ideas on why this happens?
> 
> John
> 


No, DNS is the Domain Name System, used to turn a domain name into an IP
address. As you say you're using an IP address directly, it won't go
near DNS.
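
A quick way to check that from PHP, as a sketch: if the host part of the URL is already an IP literal, there is no name for the client to look up. The function name host_needs_dns() is invented for this example, and the addresses are documentation placeholders rather than the real target.

```php
<?php
// Hypothetical helper: true if the URL's host is a name that needs a DNS
// lookup, false if it is already an IPv4 or IPv6 literal.
function host_needs_dns($url)
{
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        return false; // no host part found, nothing to resolve
    }
    // IPv6 literals appear in URLs as [::1]; strip the brackets first.
    $host = trim($host, '[]');
    return filter_var($host, FILTER_VALIDATE_IP) === false;
}

var_dump(host_needs_dns('http://192.0.2.10/page'));  // bool(false)
var_dump(host_needs_dns('http://example.com/page')); // bool(true)
```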

Are there any messages in the logs that would give more specific
information?

Thanks,
Ash
http://www.ashleysheridan.co.uk




[PHP] CURL cannot connect to URL - IP address - after successful connection

2010-04-25 Thread ioan...@btinternet.com
I can return a target page once, but on refresh within a few hours
curl_error() in the script reports that it cannot connect to the host, and
the return value is empty.  The target URL is an IP address, not a named URL,
so maybe it has something to do with DNS.  I am on a shared server.  Any
ideas on why this happens?


John
