I agree with Sebastian's suggestion that you can use a network traffic analyzer to inspect the HTTP request and response headers exchanged between Nutch and the server, and compare them with what the browser sends. Maybe they send different request headers.
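To illustrate why the same URL can return 200 to a browser but 403 to a crawler, here is a minimal, self-contained sketch (not Nutch's actual behavior): a hypothetical local server that denies requests whose User-Agent header does not look browser-like. The server policy and the crawler User-Agent string are both invented for the example.

```python
import threading
import urllib.request
import urllib.error
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        ua = self.headers.get("User-Agent", "")
        # Hypothetical policy: deny any client whose User-Agent
        # does not look like a browser.
        self.send_response(200 if "Mozilla" in ua else 403)
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the example's output quiet

server = HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = "http://127.0.0.1:%d/" % server.server_port

def status(user_agent):
    """Fetch the URL with the given User-Agent and return the HTTP status."""
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        return urllib.request.urlopen(req).getcode()
    except urllib.error.HTTPError as e:
        return e.code

print(status("Mozilla/5.0"))           # browser-like User-Agent -> 200
print(status("my-crawler/1.0"))        # crawler-like User-Agent -> 403
server.shutdown()
```

If a capture in wireshark/tcpdump shows the crawler's request headers differing from the browser's (User-Agent is the usual suspect), configuring the crawler to send a browser-like User-Agent is one way to confirm the cause.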
On Fri, Aug 2, 2013 at 7:16 AM, A Laxmi <[email protected]> wrote:
> Sebastian - thanks for your help!
>
> I can access the link from a browser without any issue. I am getting fetch
> failed with http code = 403 only while the crawler is trying to fetch.
>
> On Thursday, August 1, 2013, Sebastian Nagel <[email protected]> wrote:
> > Hi,
> >
> > why are you sure that you didn't get a real 403 (forbidden)?
> > - the answering web server logs a delivery with 200 (ok)?
> > - a network traffic analyzer (wireshark, tcpdump) shows
> >   that HTTP response headers have a different status code?
> >
> > In general, servers may deliver different responses to a crawler
> > and a browser, or even deny to deliver a document.
> >
> > Sebastian
> >
> > On 08/01/2013 10:56 PM, A Laxmi wrote:
> >> For some reason, I am not able to crawl; the fetcher seems to have an
> >> issue. It complains - "fetch of http://www.someurldomain.com/ failed with:
> >> Http code = 403, url = http://www.someurldomain.com/"
> >>
> >> Please help. I tried to google this issue but could not find anything
> >> that can address this issue.

--
Don't Grow Old, Grow Up... :-)

