Do you have the ability to use Wireshark or tcpdump on this machine? If so, can you set up a crawl with only that URL, and compare and contrast the crawler's fetches with curl's? There must be some key difference.
Karl

Sent from my Windows Phone

From: Erlend Garåsen
Sent: 4/23/2013 8:03 AM
To: [email protected]
Subject: Re: Timeout problems with web crawling

On 23.04.13 13.48, Erlend Garåsen wrote:
> -bash-3.2$ curl -vvv -H "User-Agent: Mozilla/5.0
> (ApacheManifoldCFWebCrawler; [email protected])"
> "http://www.ibsen.uio.no/REGINFO_peAGa.xhtml?bokstav=G|1366644879398+299979"

A small typo in the URL, so the correct command is:

curl -vvv -H "User-Agent: Mozilla/5.0 (ApacheManifoldCFWebCrawler; [email protected])" "http://www.ibsen.uio.no/REGINFO_peAGa.xhtml?bokstav=G"

But same result. An immediate response.

Erlend

-- 
Erlend Garåsen
Center for Information Technology Services
University of Oslo
P.O. Box 1086 Blindern, N-0317 OSLO, Norway
Ph: (+47) 22840193, Fax: (+47) 22852970, Mobile: (+47) 91380968, VIP: 31050
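
Following up on the packet-capture suggestion in this thread: once the crawler's request and curl's request for the same URL have both been captured (for example, copied out of Wireshark's "Follow TCP Stream" view), the comparison can be done mechanically. The Python below is only a rough sketch of that step. The function names `parse_request` and `diff_requests` are hypothetical, and the two sample requests are made up for illustration (including the placeholder User-Agent and the `Connection` header); they are not captured traffic from either client.

```python
def parse_request(raw):
    """Split a raw HTTP request into (request_line, headers).

    Header names are lower-cased because HTTP header names are
    case-insensitive. Repeated headers are not handled: the last
    occurrence wins.
    """
    lines = raw.replace("\r\n", "\n").strip().split("\n")
    request_line = lines[0].strip()
    headers = {}
    for line in lines[1:]:
        if not line.strip():
            break  # a blank line ends the header section
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return request_line, headers


def diff_requests(raw_a, raw_b):
    """Return (field, value_in_a, value_in_b) for every field that differs.

    A value of None means the header is absent from that request.
    """
    line_a, headers_a = parse_request(raw_a)
    line_b, headers_b = parse_request(raw_b)
    differences = []
    if line_a != line_b:
        differences.append(("request-line", line_a, line_b))
    for name in sorted(set(headers_a) | set(headers_b)):
        value_a = headers_a.get(name)
        value_b = headers_b.get(name)
        if value_a != value_b:
            differences.append((name, value_a, value_b))
    return differences


# Hypothetical sample requests, NOT real captures from curl or the crawler.
CURL_REQUEST = (
    "GET /REGINFO_peAGa.xhtml?bokstav=G HTTP/1.1\r\n"
    "Host: www.ibsen.uio.no\r\n"
    "User-Agent: placeholder-agent\r\n"
    "Accept: */*\r\n"
    "\r\n"
)
CRAWLER_REQUEST = (
    "GET /REGINFO_peAGa.xhtml?bokstav=G HTTP/1.1\r\n"
    "Host: www.ibsen.uio.no\r\n"
    "User-Agent: placeholder-agent\r\n"
    "Connection: Keep-Alive\r\n"
    "\r\n"
)

if __name__ == "__main__":
    for field, curl_value, crawler_value in diff_requests(CURL_REQUEST, CRAWLER_REQUEST):
        print(f"{field}: curl={curl_value!r} crawler={crawler_value!r}")
```

This only compares the request line and headers. Differences in timing, connection reuse, or TLS/proxy behaviour would still have to be read from the capture itself.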
