Hi,

I am trying to install and use Apache Nutch for web crawling.

I'm using Nutch 1.13 and Solr 5.5.5.


I'm following the steps on https://wiki.apache.org/nutch/NutchTutorial.

Everything seems to work properly until I get to the step "Cleaning Solr". When I 
run the command


bin/nutch clean crawl/crawldb/ http://localhost:8983/solr


I get an exception:


SolrIndexer: deleting 1/1 documents
SolrIndexer: deleting 1/1 documents
ERROR CleaningJob: java.io.IOException: Job failed!
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:865)
    at org.apache.nutch.indexer.CleaningJob.delete(CleaningJob.java:174)
    at org.apache.nutch.indexer.CleaningJob.run(CleaningJob.java:197)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.nutch.indexer.CleaningJob.main(CleaningJob.java:208)



I searched for a solution to this error but couldn't find anything 
helpful.


What might be the problem?


Thank you very much for your help.

Anna
