I get a 500. I have tried removing "Nutch" from my user-agent string and still
get the same result.
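
For anyone who wants to check whether the server is keying on the User-Agent
header, here is a minimal Python sketch (not something from this thread). It
sends the same GET request with different User-Agent values and reports the
status code for each. The function names fetch_status and compare_user_agents
are made up for illustration, and the agent strings in the note below are
hypothetical, not the exact strings sent by Nutch, curl or wget.

```python
import urllib.error
import urllib.request


def fetch_status(url, user_agent, timeout=10):
    """Return the HTTP status code for a GET sent with the given User-Agent.

    Redirects are followed by urllib, so this is the status of the final response.
    """
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx responses; the code is still what we want.
        return error.code


def compare_user_agents(url, user_agents, timeout=10):
    """Map each User-Agent string to the status code the server returned."""
    return {agent: fetch_status(url, agent, timeout) for agent in user_agents}
```

Calling compare_user_agents("http://www.nature.com", ["MyCrawler/Nutch-1.x",
"curl/7.40.0"]) would show whether only the Nutch-style agent draws the 500.
If every agent gets the same status, the difference more likely lies in other
request headers or settings, which this sketch does not vary.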

-----Original Message-----
From: Markus Jelsma [mailto:[email protected]] 
Sent: Friday, February 27, 2015 12:05 PM
To: [email protected]
Subject: RE: Can anyone fetch this page?

Seems fine to me
http://oldservice.openindex.io/extract.php?url=http%3A%2F%2Fwww.nature.com%2Fnature%2Fjournal%2Fv518%2Fn7540%2Ffull%2Fnature14236.html
 
 
-----Original message-----
> From:Lewis John Mcgibbney <[email protected]>
> Sent: Friday 27th February 2015 18:56
> To: [email protected]
> Subject: Can anyone fetch this page?
> 
> Hi Folks,
> I was getting a 500 Internal Server Error using Nutch trunk when 
> attempting to fetch content from this domain.
> http://www.nature.com
> Just for detail, Nature.com is a catalogue of journals and science 
> resources, including the journal *Nature*. It publishes science news and 
> articles across a wide range of scientific fields, so there is nothing 
> malicious, sensitive or offensive about the content.
> Can anyone else fetch this URL?
> I can get it with curl and wget but not Nutch.
> Thanks
> Lewis
> 
> 
> --
> *Lewis*
> 
