Hi Sebastian,

The error logs showed it was catching a wrong "\" in the seed.txt. Thanks for pointing me there.

Regards,
Ankit Goel
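In case it helps anyone else searching the archives: the injector output below already hints at the problem, since "Total urls injected after normalization and filtering: 0" means every seed URL was dropped, and the generate job then had nothing to work on. A quick way to spot a stray backslash is to grep the seed list for one (assuming the seed file sits at ./urls/seed.txt, as in the tutorial layout):

$ grep -n '\\' ./urls/seed.txt

grep -n prints the line number of every line containing a backslash, so the offending URL can be fixed or removed before re-running the inject step.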
> On 25-Oct-2017, at 9:38 PM, Sebastian Nagel <[email protected]> wrote:
>
> Hi,
>
> the file hadoop.log should contain more information about the error.
> It's located in ${NUTCH_RUNTIME_HOME}/logs/ or where $NUTCH_LOG_DIR points to.
>
> Could you have a look at the hadoop.log and eventually send the snippet
> where the error is logged.
>
> Thanks,
> Sebastian
>
> On 10/25/2017 05:35 PM, Ankit Goel wrote:
>> Hi,
>> I am using Nutch 1.13 with Solr 5.5.0. I have not started Hadoop on my
>> system, and I'm trying to run this as a single node. When I run the nutch
>> script I get the following error:
>>
>> $ bin/crawl -i -D solr.server.url=http://localhost:8983/solr/nutch ./urls/
>> TestCrawl2/ 2
>> Injecting seed URLs
>> ${NUTCH_RUNTIME_HOME}/bin/nutch inject TestCrawl2//crawldb ./urls/
>> Injector: starting at 2017-10-25 19:52:11
>> Injector: crawlDb: TestCrawl2/crawldb
>> Injector: urlDir: urls
>> Injector: Converting injected urls to crawl db entries.
>> Injector: Total urls rejected by filters: 0
>> Injector: Total urls injected after normalization and filtering: 0
>> Injector: Total urls injected but already in CrawlDb: 0
>> Injector: Total new urls injected: 0
>> Injector: finished at 2017-10-25 19:52:14, elapsed: 00:00:03
>> Wed Oct 25 19:52:14 IST 2017 : Iteration 1 of 2
>> Generating a new segment
>> ${NUTCH_RUNTIME_HOME}/bin/nutch generate -D mapreduce.job.reduces=2 -D
>> mapred.child.java.opts=-Xmx1000m -D mapreduce.reduce.speculative=false -D
>> mapreduce.map.speculative=false -D mapreduce.map.output.compress=true
>> TestCrawl2//crawldb TestCrawl2//segments -topN 50000 -numFetchers 1 -noFilter
>> Generator: starting at 2017-10-25 19:52:16
>> Generator: Selecting best-scoring urls due for fetch.
>> Generator: filtering: false
>> Generator: normalizing: true
>> Generator: topN: 50000
>> Generator: java.io.IOException: Job failed!
>>     at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:865)
>>     at org.apache.nutch.crawl.Generator.generate(Generator.java:591)
>>     at org.apache.nutch.crawl.Generator.run(Generator.java:766)
>>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>>     at org.apache.nutch.crawl.Generator.main(Generator.java:719)
>>
>> Error running:
>> ${NUTCH_RUNTIME_HOME}/bin/nutch generate -D mapreduce.job.reduces=2 -D
>> mapred.child.java.opts=-Xmx1000m -D mapreduce.reduce.speculative=false -D
>> mapreduce.map.speculative=false -D mapreduce.map.output.compress=true
>> TestCrawl2//crawldb TestCrawl2//segments -topN 50000 -numFetchers 1 -noFilter
>> Failed with exit value 255.
>>
>> I'm unsure why I am getting this error from the crawl generator. I followed the
>> instructions on the Nutch tutorial page and never got this previously with 1.9
>> or 1.10.
>>
>> Thanks,
>> Ankit Goel
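P.S. For anyone who wants to pull the relevant snippet out of hadoop.log, something along these lines works (assuming the default log location; adjust the path if $NUTCH_LOG_DIR points elsewhere):

$ grep -n -B 2 -A 10 'ERROR' ${NUTCH_RUNTIME_HOME}/logs/hadoop.log

That prints every ERROR line with two lines of context before and ten after; in my case that was enough to find the line complaining about the bad "\" in the seed list.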

