Re: [PHP] CURL cannot connect to URL - IP address - after successful connection
On 4/29/10, ioan...@btinternet.com wrote:
> On 2010/04/29 19:46, Gary . wrote:
>> Failed to connect to host is a pretty strange error if they're doing
>> anything regarding cookies and so on, IMO - I think I'd expect at
>> least a connection to be established before they decide they don't
>> like you. Have you used curl's --trace & --trace-ascii options?
>
> Is that debug_backtrace() in php

Not sure :-P

> as I am not using the command line
> (can't work out how to get the window up having downloaded curl, I am
> not up to 'building libraries' that seems to be needed).

Windows? http://curl.haxx.se/download.html
*n*x variants should allow installing via their package management systems.

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
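For readers without command-line access: debug_backtrace() only reports PHP's own call stack, so it says nothing about the connection. The closest equivalent to curl's --trace-ascii from inside a script is probably libcurl's verbose log, enabled with CURLOPT_VERBOSE and redirected with CURLOPT_STDERR. The following is a minimal sketch under that assumption; fetch_with_trace() and the array keys it returns are hypothetical names, not part of PHP.

```php
<?php
// Sketch only: fetch_with_trace() is a hypothetical helper name.
// It runs one request and returns the body, the libcurl error details
// and the verbose connection log that libcurl wrote while running.
function fetch_with_trace(string $url, int $timeout = 10): array
{
    $log = fopen('php://temp', 'w+');      // in-memory stream for the verbose log
    $ch  = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,    // return the body instead of printing it
        CURLOPT_CONNECTTIMEOUT => $timeout,
        CURLOPT_TIMEOUT        => $timeout,
        CURLOPT_VERBOSE        => true,    // ask libcurl for its connection log
        CURLOPT_STDERR         => $log,    // and send that log to our stream
    ]);

    $body  = curl_exec($ch);               // false on a transport-level failure
    $errno = curl_errno($ch);              // 0 when no transport error occurred
    $error = curl_error($ch);
    unset($ch);                            // release the handle before reading the log

    rewind($log);
    $trace = stream_get_contents($log);
    fclose($log);

    return ['body' => $body, 'errno' => $errno, 'error' => $error, 'trace' => $trace];
}
```

Calling it with the target URL and dumping the 'trace' entry should show how far the connection got (address tried, connect result, request and response headers), which is the information the --trace options were suggested for.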
Re: [PHP] CURL cannot connect to URL - IP address - after successful connection
On 2010/04/29 19:46, Gary . wrote:
> On 4/25/10, ioan...@btinternet.com wrote:
>> I can return a target page - once, but then on refresh within a few
>> hours the script curl_error is that it cannot connect to the host and
>> return is empty.
>
> Failed to connect to host is a pretty strange error if they're doing
> anything regarding cookies and so on, IMO - I think I'd expect at
> least a connection to be established before they decide they don't
> like you. Have you used curl's --trace & --trace-ascii options?

Is that debug_backtrace() in PHP? I ask because I am not using the command line (I can't work out how to get the window up after downloading curl, and I am not up to the 'building libraries' that seems to be needed).

debug_backtrace() does not give any useful information other than saying the target link fails to connect (this is after it connects once; on refresh, and for several hours afterwards, it does not connect). I guess there is some program that notes the calling IP address and, if it is in a range it does not like, adds it to a list and refuses subsequent connections from that address for a while. Cookies are not required when using the browser directly.

John
Re: [PHP] CURL cannot connect to URL - IP address - after successful connection
On 4/25/10, ioan...@btinternet.com wrote:
> I can return a target page - once, but then on refresh within a few
> hours the script curl_error is that it cannot connect to the host and
> return is empty.

Failed to connect to host is a pretty strange error if they're doing
anything regarding cookies and so on, IMO - I think I'd expect at
least a connection to be established before they decide they don't
like you. Have you used curl's --trace & --trace-ascii options?
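The distinction drawn here, between a failure before any connection exists and a refusal after one is established, can be read off the number returned by curl_errno(). The sketch below maps a few libcurl error codes to the stage at which the request failed. explain_curl_errno() is a hypothetical helper and the wording of its messages is illustrative; the CURLE_* constants are the ones PHP's cURL extension defines.

```php
<?php
// Sketch only: explain_curl_errno() is a hypothetical helper, not part of PHP.
// It names the stage at which a cURL request failed, based on curl_errno().
function explain_curl_errno(int $errno): string
{
    switch ($errno) {
        case 0:
            // The transport worked; any rejection happened at the HTTP level
            // (status code, cookies, user agent checks and so on).
            return 'HTTP: a response was received, check the status code';
        case CURLE_COULDNT_RESOLVE_HOST:
            return 'DNS: the host name could not be resolved';
        case CURLE_COULDNT_CONNECT:
            // No TCP connection was made, so no headers or cookies were ever sent.
            return 'TCP: the connection was refused, filtered or unreachable';
        case CURLE_OPERATION_TIMEOUTED:
            return 'TIMEOUT: the operation took longer than the configured limit';
        default:
            return 'OTHER: libcurl error ' . $errno;
    }
}
```

A "TCP" result for the second request would be consistent with blocking at the network level (for example by source IP address), since at that point the server has not yet seen the user agent or any cookies.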
RE: [PHP] CURL cannot connect to URL - IP address - after successful connection
> -----Original Message-----
> From: ioan...@btinternet.com [mailto:ioan...@btinternet.com]
> Sent: Wednesday, April 28, 2010 7:03 AM
> To: 'PHP'
> Subject: Re: [PHP] CURL cannot connect to URL - IP address - after successful
> connection
>
> I think the answer is: ISPs have a different range of addresses from host
> providers, so it is possible to block requests from host servers, so from
> scripts.
>
> John
>

That's possible but very unlikely in your case. Since you were able to get the necessary information on the first request but failed on subsequent ones, the site most likely has an anti-bot mechanism in place. Bypassing an anti-bot mechanism requires in-depth reverse-engineering of the targeted site.

Regards,
Tommy
Re: [PHP] CURL cannot connect to URL - IP address - after successful connection
I think the answer is: ISPs use different address ranges from hosting providers, so a site can block requests that come from hosting servers, and therefore from scripts.

John
RE: [PHP] CURL cannot connect to URL - IP address - after successful connection
> -----Original Message-----
> From: ioan...@btinternet.com [mailto:ioan...@btinternet.com]
> Sent: Monday, April 26, 2010 7:10 AM
> To: Tommy Pham
> Subject: Re: [PHP] CURL cannot connect to URL - IP address - after successful
> connection
>
> On 2010/04/27 1:13, Tommy Pham wrote:
> >> -----Original Message-----
> >
> > I assume that you did full testing with the browser as I suggested?
> > If everything works, one other thing to keep in mind is that the
> > target also may implement reverse DNS lookup in their anti-bot. One
> > good way to test that is to remote in via SSH (if on Linux/Unix) to
> > test with wget.
> > Otherwise, I'm pretty sure that target site have some anti-bot
> > mechanisms in place. Microseconds of analyzing valid 'user' requests
> > is better than processing 2-3 seconds and sending the response which
> > will consume bandwidth. What you could also try is setting different
> > user-agents for every request and use cookies in cURL should the
> > target site have an anti-bot mechanism.
> >
> > Regards,
> > Tommy
> >
>
> Yes, I think I have tested with/without cookies on the browser, trying
> different user agents (code emailed previously using array and rand) and
> cookies are used in script/not used.

And it works on subsequent requests?

> I cannot work out how to use Putty/ssh/public private keys etc..wget...

Learning how to use that is easier than learning to code PHP, IMO.

> I read about some problems with curl setting the port and a required patch
> on the server.
>
> John

If cURL required some kind of patch as you say, then it wouldn't have worked in the first place. Perhaps it's better to post your code (with personal data obfuscated). Or try it on your local PC with a local web server, to rule out proxies, anti-bot measures and similar problems, and to confirm that your code works as intended and that this is not a cURL problem as you say. I didn't have problems using cURL before. But then my targeted sites were very big companies and didn't care about bots much.

Regards,
Tommy

PS: Always reply to the list so others in the future can benefit unless it's something personal.
RE: [PHP] CURL cannot connect to URL - IP address - after successful connection
> -----Original Message-----
> From: ioan...@btinternet.com [mailto:ioan...@btinternet.com]
> Sent: Sunday, April 25, 2010 10:44 PM
> To: a...@ashleysheridan.co.uk; tommy...@gmail.com >> Tommy Pham
> Subject: Re: [PHP] CURL cannot connect to URL - IP address - after successful
> connection
>
> The answer I got from support desk on my shared server: 'You are trying to
> curl to a datapipe server, if it is rejecting the server name and port, you
> will need to take that up with them.'
>
> John

I assume that you did full testing with the browser as I suggested? If everything works, one other thing to keep in mind is that the target may also implement a reverse DNS lookup in their anti-bot checks. One good way to test that is to remote in via SSH (if on Linux/Unix) and test with wget. Otherwise, I'm pretty sure that the target site has some anti-bot mechanism in place. Spending microseconds analyzing whether a request comes from a valid 'user' is better than spending 2-3 seconds processing it and sending a response that consumes bandwidth. What you could also try is setting a different user-agent for every request and using cookies in cURL, should the target site have an anti-bot mechanism.

Regards,
Tommy
Re: [PHP] CURL cannot connect to URL - IP address - after successful connection
On 2010/04/26 20:01, Ashley Sheridan wrote:
> How frequently do you request the page? Maybe playing about with that
> would resolve it? Is it possible to randomise the request frequency a bit?
>
> Thanks,
> Ash
> http://www.ashleysheridan.co.uk

Just manually for testing, and it would be used for human requests, say occasionally at 5, 10 or 30 minute intervals. There must be other parameters being passed that let the site determine that the request is coming from the same user and through a script, because it works normally from the browser. So simply refusing a second call from the same IP address (which could be a browser with a static or unchanged IP address) is not what is happening. It must be determining that the request comes through a server at another site via curl or similar.

John
Re: [PHP] CURL cannot connect to URL - IP address - after successful connection
On Mon, 2010-04-26 at 12:05 +0900, ioan...@btinternet.com wrote:
> >>
> >> Just to eliminate all possibilities, are you to open the same URL/URI in
> >> the web pages repeatedly? Also, what happens when you fake the user agent
> >> in the web browser? The target site may have some anti bot mechanism in
> >> place to reduce stress/load on the server(s).
> >>
> >> Regards,
> >> Tommy
> >
> > One more thing, check it with cookies enabled/disabled in the web browser
> > too.
> >
>
> Having deleted cookies on the browser and disabled them, it still does
> not like various user agents:
>
> $useragent = array('Mozilla','Opera','Microsoft Internet Explorer','ia_archiver');
> $os = array('Windows','Windows XP','Linux','Windows NT','Windows 2000','OSX');
> //random user agent code
> $agent = $useragent[rand(0,3)].'/'.rand(1,8).'.'.rand(0,9).' ('.$os[rand(0,5)].' '.rand(1,7).'.'.rand(0,9).'; en-US;)';
> //would give something like Mozilla/3.5 (Windows 5.4; en-US;)
>
> -- OR --
>
> //$useragent='Google Image - Googlebot-Image/1.0 ( http://www.googlebot.com/bot.html)';
> //$useragent="MSN Live - msnbot-Products/1.0 (+http://search.msn.com/msnbot.htm)";
>
> -- OR --
> //$agent = "DocZilla/1.0 (Windows; U; WinNT4.0; en-US; rv:1.0.0) Gecko/20020804";
>
> I am just calling the page manually, once at a time. It is probable
> that there is some anti-bot measures. Page would probably not want to
> be indexed as it is providing ever changing content. How to use this
> for normal level of use for real user just in a different site?
>
> John
>

How frequently do you request the page? Maybe playing about with that would resolve it? Is it possible to randomise the request frequency a bit?

Thanks,
Ash
http://www.ashleysheridan.co.uk
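If the requests were ever scheduled rather than triggered by hand, randomising the frequency could be sketched as below. This is an illustration of the idea only, not something the thread confirms would help; jittered_interval() and its parameters are hypothetical names.

```php
<?php
// Sketch only: jittered_interval() is a hypothetical helper.
// It returns a wait time in seconds chosen at random from
// [$base - $spread, $base + $spread], never less than zero.
function jittered_interval(int $base, int $spread): int
{
    if ($base < 0 || $spread < 0) {
        throw new InvalidArgumentException('base and spread must not be negative');
    }
    return max(0, $base + random_int(-$spread, $spread));
}
```

For example, sleep(jittered_interval(600, 120)) would pause for somewhere between 8 and 12 minutes before the next request.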
Re: [PHP] CURL cannot connect to URL - IP address - after successful connection
>> Just to eliminate all possibilities, are you to open the same URL/URI in
>> the web pages repeatedly? Also, what happens when you fake the user agent
>> in the web browser? The target site may have some anti bot mechanism in
>> place to reduce stress/load on the server(s).
>>
>> Regards,
>> Tommy
>
> One more thing, check it with cookies enabled/disabled in the web browser
> too.

Having deleted cookies on the browser and disabled them, the site still does not like the various user agents I send from the script:

$useragent = array('Mozilla','Opera','Microsoft Internet Explorer','ia_archiver');
$os = array('Windows','Windows XP','Linux','Windows NT','Windows 2000','OSX');
//random user agent code
$agent = $useragent[rand(0,3)].'/'.rand(1,8).'.'.rand(0,9).' ('.$os[rand(0,5)].' '.rand(1,7).'.'.rand(0,9).'; en-US;)';
//would give something like Mozilla/3.5 (Windows 5.4; en-US;)

-- OR --

//$agent='Google Image - Googlebot-Image/1.0 ( http://www.googlebot.com/bot.html)';
//$agent="MSN Live - msnbot-Products/1.0 (+http://search.msn.com/msnbot.htm)";

-- OR --

//$agent = "DocZilla/1.0 (Windows; U; WinNT4.0; en-US; rv:1.0.0) Gecko/20020804";

I am just calling the page manually, one request at a time. There are probably some anti-bot measures in place. The page would probably not want to be indexed, as it provides ever-changing content. How can I use it at a normal level of use, for real users, just from a different site?

John
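The snippet above builds the agent string but does not show it being handed to cURL, nor the cookie handling suggested elsewhere in the thread. A minimal sketch of both follows, assuming a writable cookie file; random_agent(), curl_options_for() and the file path in the usage note are hypothetical names chosen for illustration.

```php
<?php
// Sketch only: both helpers are hypothetical, not part of PHP.

// Pick one agent string at random without hard-coding the array size.
function random_agent(array $agents): string
{
    if ($agents === []) {
        throw new InvalidArgumentException('at least one agent string is required');
    }
    return $agents[array_rand($agents)];
}

// Build cURL options that send the given user agent and keep cookies
// between requests in a single file.
function curl_options_for(string $agent, string $cookieFile): array
{
    return [
        CURLOPT_RETURNTRANSFER => true,         // return the body instead of printing it
        CURLOPT_USERAGENT      => $agent,       // value of the User-Agent header
        CURLOPT_COOKIEFILE     => $cookieFile,  // read cookies saved by earlier requests
        CURLOPT_COOKIEJAR      => $cookieFile,  // save cookies when the handle is released
        CURLOPT_CONNECTTIMEOUT => 10,
    ];
}
```

Usage would be along the lines of curl_setopt_array($ch, curl_options_for($agent, '/tmp/cookies.txt')) before curl_exec($ch). Note that none of these options can matter if the failure is "cannot connect to host", because headers and cookies are only sent once a connection exists.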
RE: [PHP] CURL cannot connect to URL - IP address - after successful connection
> -----Original Message-----
> From: Tommy Pham [mailto:tommy...@gmail.com]
> Sent: Monday, April 26, 2010 1:59 AM
> To: 'php-general@lists.php.net'
> Subject: RE: [PHP] CURL cannot connect to URL - IP address - after successful
> connection
>
> > -----Original Message-----
> > From: ioan...@btinternet.com [mailto:ioan...@btinternet.com]
> > Sent: Sunday, April 25, 2010 6:18 AM
> > To: php-general@lists.php.net
> > Subject: [PHP] CURL cannot connect to URL - IP address - after
> > successful connection
> >
> > I can return a target page - once, but then on refresh within a few
> > hours the script curl_error is that it cannot connect to the host and
> > return is empty.
> > The target URL is an ip address, not a named url, so maybe it has
> > something to do with DNS. I am on a shared server. Any ideas on why
> > this happens?
> >
> > John
> >
>
> Just to eliminate all possibilities, are you to open the same URL/URI in the
> web pages repeatedly? Also, what happens when you fake the user agent in
> the web browser? The target site may have some anti bot mechanism in
> place to reduce stress/load on the server(s).
>
> Regards,
> Tommy

One more thing, check it with cookies enabled/disabled in the web browser too.
RE: [PHP] CURL cannot connect to URL - IP address - after successful connection
> -----Original Message-----
> From: ioan...@btinternet.com [mailto:ioan...@btinternet.com]
> Sent: Sunday, April 25, 2010 6:18 AM
> To: php-general@lists.php.net
> Subject: [PHP] CURL cannot connect to URL - IP address - after successful
> connection
>
> I can return a target page - once, but then on refresh within a few hours the
> script curl_error is that it cannot connect to the host and return is empty.
> The target URL is an ip address, not a named url, so maybe it has something
> to do with DNS. I am on a shared server. Any ideas on why this happens?
>
> John
>

Just to eliminate all possibilities, are you able to open the same URL/URI in the web browser repeatedly? Also, what happens when you fake the user agent in the web browser? The target site may have some anti-bot mechanism in place to reduce stress/load on the server(s).

Regards,
Tommy
Re: [PHP] CURL cannot connect to URL - IP address - after successful connection
This is all I see in the error log:

SUEXEC error_log:
[2010-04-25 16:45:42]: uid: (1116/myname) gid: (1118/myname) cmd: fcgiwrapper

John
Re: [PHP] CURL cannot connect to URL - IP address - after successful connection
On Sun, 2010-04-25 at 22:17 +0900, ioan...@btinternet.com wrote:
> I can return a target page - once, but then on refresh within a few
> hours the script curl_error is that it cannot connect to the host and
> return is empty. The target URL is an ip address, not a named url, so
> maybe it has something to do with DNS. I am on a shared server. Any
> ideas on why this happens?
>
> John
>

No, DNS is the Domain Name System, which is used to turn a domain name into an IP address. As you say you're using an IP address directly, it won't go near DNS.

Are there any messages in the logs that would give more specific information?

Thanks,
Ash
http://www.ashleysheridan.co.uk
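The point about DNS can be checked from the URL itself: when the host part is already an IP literal, no name resolution step is needed. A small sketch, in which host_needs_dns() is a hypothetical helper and the addresses in the usage note are documentation examples rather than the real target:

```php
<?php
// Sketch only: host_needs_dns() is a hypothetical helper, not part of PHP.
// It returns true when the URL's host is a name that has to be resolved,
// and false when it is already an IPv4 or IPv6 literal.
function host_needs_dns(string $url): bool
{
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        throw new InvalidArgumentException('no host found in URL: ' . $url);
    }
    // IPv6 literals are written in brackets inside URLs, e.g. http://[::1]/
    $host = trim($host, '[]');
    return filter_var($host, FILTER_VALIDATE_IP) === false;
}
```

For instance, host_needs_dns('http://192.0.2.10:8080/page') is false while host_needs_dns('http://www.php.net/') is true, so for a URL like the former a "cannot connect" error has to come from somewhere other than DNS.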
[PHP] CURL cannot connect to URL - IP address - after successful connection
I can return a target page once, but on refresh within the next few hours curl_error() reports that it cannot connect to the host, and the returned content is empty. The target URL is an IP address, not a named URL, so maybe it has something to do with DNS. I am on a shared server. Any ideas on why this happens?

John