Hi,

the file hadoop.log should contain more information about the error.
It's located in ${NUTCH_RUNTIME_HOME}/logs/, or wherever $NUTCH_LOG_DIR points to.

Could you have a look at hadoop.log and send the snippet
where the error is logged?
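Something like the following usually pulls out the relevant part (just a
sketch; the fallback to the current directory is only a guard in case
neither variable is set in your shell):

```shell
# Locate hadoop.log: $NUTCH_LOG_DIR if set, else ${NUTCH_RUNTIME_HOME}/logs.
LOG="${NUTCH_LOG_DIR:-${NUTCH_RUNTIME_HOME:-.}/logs}/hadoop.log"

# Print the first exceptions with the 20 lines that follow each match,
# which normally includes the full stack trace.
grep -n -A 20 "Exception" "$LOG" | head -n 40
```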

Thanks,
Sebastian

On 10/25/2017 05:35 PM, Ankit Goel wrote:
> Hi,
> I am using Nutch 1.13 with Solr 5.5.0. I have not started hadoop on my 
> system, and I'm trying to run this as a single node. When I run the nutch 
> script I get the following error:
> 
> $ bin/crawl -i -D solr.server.url=http://localhost:8983/solr/nutch ./urls/ 
> TestCrawl2/  2
> Injecting seed URLs
> ${NUTCH_RUNTIME_HOME}/bin/nutch inject TestCrawl2//crawldb ./urls/
> Injector: starting at 2017-10-25 19:52:11
> Injector: crawlDb: TestCrawl2/crawldb
> Injector: urlDir: urls
> Injector: Converting injected urls to crawl db entries.
> Injector: Total urls rejected by filters: 0
> Injector: Total urls injected after normalization and filtering: 0
> Injector: Total urls injected but already in CrawlDb: 0
> Injector: Total new urls injected: 0
> Injector: finished at 2017-10-25 19:52:14, elapsed: 00:00:03
> Wed Oct 25 19:52:14 IST 2017 : Iteration 1 of 2
> Generating a new segment
> ${NUTCH_RUNTIME_HOME}/bin/nutch generate -D mapreduce.job.reduces=2 -D 
> mapred.child.java.opts=-Xmx1000m -D mapreduce.reduce.speculative=false -D 
> mapreduce.map.speculative=false -D mapreduce.map.output.compress=true 
> TestCrawl2//crawldb TestCrawl2//segments -topN 50000 -numFetchers 1 -noFilter
> Generator: starting at 2017-10-25 19:52:16
> Generator: Selecting best-scoring urls due for fetch.
> Generator: filtering: false
> Generator: normalizing: true
> Generator: topN: 50000
> Generator: java.io.IOException: Job failed!
>       at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:865)
>       at org.apache.nutch.crawl.Generator.generate(Generator.java:591)
>       at org.apache.nutch.crawl.Generator.run(Generator.java:766)
>       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>       at org.apache.nutch.crawl.Generator.main(Generator.java:719)
> 
> Error running:
>  ${NUTCH_RUNTIME_HOME}/bin/nutch generate -D mapreduce.job.reduces=2 -D 
> mapred.child.java.opts=-Xmx1000m -D mapreduce.reduce.speculative=false -D 
> mapreduce.map.speculative=false -D mapreduce.map.output.compress=true 
> TestCrawl2//crawldb TestCrawl2//segments -topN 50000 -numFetchers 1 -noFilter
> Failed with exit value 255.
> 
> Unsure as to why I am getting this error for the crawl generator. I followed 
> the instructions on the nutch tutorial page. I never got this previously with 
> 1.9 or 1.10.
> 
> Thanks,
> Ankit Goel
> 
