Re: Parser not returning any results

Sebastian Nagel Wed, 14 Jan 2015 12:50:15 -0800

Hi Kartik,

I've tried the same URL, and parsing worked well with Nutch 1.x (trunk).

Which Nutch version are you using?

The log message indicates that the fetch did not return HTTP status 200.
That can happen; it could be a temporary failure.

If no failure is indicated in the logs, it's possible
to get more information via

 % bin/nutch readdb
and for 1.x also:
 % bin/nutch readseg
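
As a rough sketch for Nutch 1.x, the two commands could be wrapped as
below. The crawl directory "crawl", the segment and output paths, and
the helper names show_url_status and dump_segment are assumptions made
for illustration, not taken from the original setup. In 1.x, readdb
with -url prints the CrawlDb entry (including the fetch status) for one
URL, and readseg -dump writes a segment's records as plain text.

```shell
# Assumed locations; adjust to the actual installation and crawl directory.
NUTCH="${NUTCH:-bin/nutch}"
CRAWL_DIR="${CRAWL_DIR:-crawl}"

# Print the CrawlDb entry (status, fetch time, retries) for a single URL.
show_url_status() {
    "$NUTCH" readdb "$CRAWL_DIR/crawldb" -url "$1"
}

# Dump one segment as text, leaving out raw content and parse text so the
# fetch status records are easier to find in the output.
dump_segment() {
    "$NUTCH" readseg -dump "$1" "$2" -nocontent -noparsetext
}

# Example calls (the segment name is a placeholder to fill in):
#   show_url_status http://promo.bank.com
#   dump_segment crawl/segments/SEGMENT_NAME segdump
```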

Best,
Sebastian


On 01/13/2015 09:52 PM, Krishnanand, Kartik wrote:
> Hi,
> 
> As a Nutch newbie, I am trying to crawl a single URL at a depth of 1, and I 
> am seeing the behavior shown in the log below.
> 
> I don't know why this could be happening. I loaded the URL in a browser; this 
> did not work for me. What could be the possible reason for this behavior? Any 
> advice would be greatly appreciated.
> 
> 2015-01-12 16:53:48,237 INFO  fetcher.Fetcher - fetching 
> http://promo.bank.com (queue crawl delay=5000ms)
> 2015-01-12 16:54:57,278 INFO  parse.ParseSegment - Skipping 
> http://promo.bank.com as content is not fetched successfully.
> 
> Thanks,
> 
> Kartik
> 
