Hi Sebastian,

The error logs showed it was catching a wrong "\" in the seed.txt. Thanks for pointing me there.

Regards,
Ankit Goel
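In case it helps anyone else searching the archives: the injector output below already hints at the problem, since "Total urls injected after normalization and filtering: 0" means every seed URL was dropped, and the generate job then had nothing to work on. A quick way to spot a stray backslash is to grep the seed list for one (assuming the seed file sits at ./urls/seed.txt, as in the tutorial layout):

$ grep -n '\\' ./urls/seed.txt

grep -n prints the line number of every line containing a backslash, so the offending URL can be fixed or removed before re-running the inject step.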
> On 25-Oct-2017, at 9:38 PM, Sebastian Nagel <[email protected]> wrote:
>
> Hi,
>
> the file hadoop.log should contain more information about the error.
> It's located in ${NUTCH_RUNTIME_HOME}/logs/ or where $NUTCH_LOG_DIR points to.
>
> Could you have a look at the hadoop.log and eventually send the snippet
> where the error is logged.
>
> Thanks,
> Sebastian
>
> On 10/25/2017 05:35 PM, Ankit Goel wrote:
>> Hi,
>> I am using Nutch 1.13 with Solr 5.5.0. I have not started Hadoop on my
>> system, and I'm trying to run this as a single node. When I run the nutch
>> script I get the following error:
>>
>> $ bin/crawl -i -D solr.server.url=http://localhost:8983/solr/nutch ./urls/
>> TestCrawl2/ 2
>> Injecting seed URLs
>> ${NUTCH_RUNTIME_HOME}/bin/nutch inject TestCrawl2//crawldb ./urls/
>> Injector: starting at 2017-10-25 19:52:11
>> Injector: crawlDb: TestCrawl2/crawldb
>> Injector: urlDir: urls
>> Injector: Converting injected urls to crawl db entries.
>> Injector: Total urls rejected by filters: 0
>> Injector: Total urls injected after normalization and filtering: 0
>> Injector: Total urls injected but already in CrawlDb: 0
>> Injector: Total new urls injected: 0
>> Injector: finished at 2017-10-25 19:52:14, elapsed: 00:00:03
>> Wed Oct 25 19:52:14 IST 2017 : Iteration 1 of 2
>> Generating a new segment
>> ${NUTCH_RUNTIME_HOME}/bin/nutch generate -D mapreduce.job.reduces=2 -D
>> mapred.child.java.opts=-Xmx1000m -D mapreduce.reduce.speculative=false -D
>> mapreduce.map.speculative=false -D mapreduce.map.output.compress=true
>> TestCrawl2//crawldb TestCrawl2//segments -topN 50000 -numFetchers 1 -noFilter
>> Generator: starting at 2017-10-25 19:52:16
>> Generator: Selecting best-scoring urls due for fetch.
>> Generator: filtering: false
>> Generator: normalizing: true
>> Generator: topN: 50000
>> Generator: java.io.IOException: Job failed!
>>     at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:865)
>>     at org.apache.nutch.crawl.Generator.generate(Generator.java:591)
>>     at org.apache.nutch.crawl.Generator.run(Generator.java:766)
>>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>>     at org.apache.nutch.crawl.Generator.main(Generator.java:719)
>>
>> Error running:
>> ${NUTCH_RUNTIME_HOME}/bin/nutch generate -D mapreduce.job.reduces=2 -D
>> mapred.child.java.opts=-Xmx1000m -D mapreduce.reduce.speculative=false -D
>> mapreduce.map.speculative=false -D mapreduce.map.output.compress=true
>> TestCrawl2//crawldb TestCrawl2//segments -topN 50000 -numFetchers 1 -noFilter
>> Failed with exit value 255.
>>
>> I'm unsure why I am getting this error from the crawl generator. I followed the
>> instructions on the Nutch tutorial page and never got this previously with 1.9
>> or 1.10.
>>
>> Thanks,
>> Ankit Goel
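P.S. For anyone who wants to pull the relevant snippet out of hadoop.log, something along these lines works (assuming the default log location; adjust the path if $NUTCH_LOG_DIR points elsewhere):

$ grep -n -B 2 -A 10 'ERROR' ${NUTCH_RUNTIME_HOME}/logs/hadoop.log

That prints every ERROR line with two lines of context before and ten after; in my case that was enough to find the line complaining about the bad "\" in the seed list.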

