Hi Kartik,

I've tried the same URL, and parsing worked well with Nutch 1.x (trunk).
Which Nutch version are you using?

The log message indicates that the fetch did not succeed with HTTP status 200,
which can happen (it could be a temporary failure). If no failure is indicated
in the logs, you can get more information via
  % bin/nutch readdb
and for 1.x also:
  % bin/nutch readseg

Best,
Sebastian

On 01/13/2015 09:52 PM, Krishnanand, Kartik wrote:
> Hi,
>
> As a nutch newbie, I am trying to crawl a single URL at a depth of 1, I am
> seeing the following behavior
>
> I don't know why this could be happening. I loaded the URL in browser, this
> did not work for me. What could be the possible reason for this behavior? Any
> advice would be gratefully appreciated.
>
> 2015-01-12 16:53:48,237 INFO fetcher.Fetcher - fetching
> http://promo.bank.com (queue crawl delay=5000ms)
> 2015-01-12 16:54:57,278 INFO parse.ParseSegment - Skipping
> http://promo.bank.com as content is not fetched successfully.
>
> Thanks,
>
> Kartik
>
> ----------------------------------------------------------------------
> This message, and any attachments, is for the intended recipient(s) only, may
> contain information that is privileged, confidential and/or proprietary and
> subject to important terms and conditions available at
> http://www.bankofamerica.com/emaildisclaimer. If you are not the intended
> recipient, please delete this message.
>

