Hi Markus,

Thank you so much for your response. I went through the log and searched 
Google, but could not find much. Please see the log below:


2016-06-25 20:27:20,396 INFO  indexer.CleaningJob - CleaningJob: starting at 2016-06-25 20:27:20
2016-06-25 20:27:20,737 WARN  util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2016-06-25 20:27:21,879 WARN  conf.Configuration - file:/tmp/hadoop-root/mapred/staging/root1345410474/.staging/job_local1345410474_0001/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval;  Ignoring.
2016-06-25 20:27:21,884 WARN  conf.Configuration - file:/tmp/hadoop-root/mapred/staging/root1345410474/.staging/job_local1345410474_0001/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts;  Ignoring.
2016-06-25 20:27:22,077 WARN  conf.Configuration - file:/tmp/hadoop-root/mapred/local/localRunner/root/job_local1345410474_0001/job_local1345410474_0001.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval;  Ignoring.
2016-06-25 20:27:22,084 WARN  conf.Configuration - file:/tmp/hadoop-root/mapred/local/localRunner/root/job_local1345410474_0001/job_local1345410474_0001.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts;  Ignoring.
2016-06-25 20:27:22,111 WARN  output.FileOutputCommitter - Output Path is null in setupJob()
2016-06-25 20:27:22,882 INFO  indexer.IndexWriters - Adding org.apache.nutch.indexwriter.solr.SolrIndexWriter
2016-06-25 20:27:23,282 INFO  solr.SolrMappingReader - source: content dest: content
2016-06-25 20:27:23,282 INFO  solr.SolrMappingReader - source: title dest: title
2016-06-25 20:27:23,282 INFO  solr.SolrMappingReader - source: host dest: host
2016-06-25 20:27:23,282 INFO  solr.SolrMappingReader - source: segment dest: segment
2016-06-25 20:27:23,282 INFO  solr.SolrMappingReader - source: boost dest: boost
2016-06-25 20:27:23,282 INFO  solr.SolrMappingReader - source: digest dest: digest
2016-06-25 20:27:23,282 INFO  solr.SolrMappingReader - source: tstamp dest: tstamp
2016-06-25 20:27:23,312 INFO  solr.SolrIndexWriter - SolrIndexer: deleting 2/2 documents
2016-06-25 20:27:23,610 WARN  output.FileOutputCommitter - Output Path is null in cleanupJob()
2016-06-25 20:27:23,611 WARN  mapred.LocalJobRunner - job_local1345410474_0001
java.lang.Exception: java.lang.IllegalStateException: Connection pool shut down
        at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:529)
Caused by: java.lang.IllegalStateException: Connection pool shut down
        at org.apache.http.util.Asserts.check(Asserts.java:34)
        at org.apache.http.pool.AbstractConnPool.lease(AbstractConnPool.java:169)
        at org.apache.http.pool.AbstractConnPool.lease(AbstractConnPool.java:202)


Your reply is highly appreciated.


Regards,

Munim

________________________________
From: Markus Jelsma <[email protected]>
Sent: 21 June 2016 17:55:56
To: [email protected]
Subject: RE: nutch clean in crawl script throwing error

Hello Abdul - please check the logs, the real errors are reported there.
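
For reference, a minimal sketch of one way to pull the relevant entries out of the log. It assumes Nutch runs in local mode and writes to logs/hadoop.log under the install directory (the usual default, but not confirmed for this setup). The function name show_log_errors is made up for illustration.

```shell
# Print WARN/ERROR/FATAL entries and Java stack-trace lines from a Nutch log.
# Assumption: local mode, with the log at logs/hadoop.log relative to the
# Nutch install directory. Pass another path as the first argument if needed.
show_log_errors() {
  log_file="${1:-logs/hadoop.log}"
  # Match log4j level markers, any line naming an exception,
  # and indented "at ..." stack frames.
  grep -E ' (WARN|ERROR|FATAL) |Exception|^[[:space:]]+at ' "$log_file"
}
```

For example, `show_log_errors /opt/nutch-latest/logs/hadoop.log | tail -n 40` would show the most recent matches, assuming the log lives at that path.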

Markus



-----Original message-----
> From:Abdul Munim <[email protected]>
> Sent: Sunday 19th June 2016 21:29
> To: [email protected]
> Subject: nutch clean in crawl script throwing error
>
> Hi folks,
>
>
> I've set up Nutch 1.12 and Solr 6.1 and am using the crawl script as described 
> in the tutorial. However, the last command in the script, i.e. nutch clean, is 
> throwing the following exception:
>
>
> Cleaning up index if possible
> /opt/nutch-latest/bin/nutch clean -Dsolr.server.url=http://192.168.99.100:8983/solr/test/ crawl/crawldb
> SolrIndexer: deleting 2/2 documents
> ERROR CleaningJob: java.io.IOException: Job failed!
>         at org.apache.nutch.indexer.CleaningJob.delete(CleaningJob.java:172)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70):195)
>         at org.apache.nutch.indexer.CleaningJob.main(CleaningJob.java:206)
>
> Error running:
>   /opt/nutch-latest/bin/nutch clean -Dsolr.server.url=http://192.168.99.100:8983/solr/test/ crawl/crawldb
> Failed with exit value 255.
>
>
> I ran the following command:
>
>
> [root@2a563cff0511 nutch-latest]# bin/crawl -i \
> > -D solr.server.url=http://192.168.99.100:8983/solr/bt-business/ urls/ 
> > crawl1 1
>
>
> Nutch is installed in the following environment:
>
>   *   Centos 6.8
>   *   Java x64 1.7.0_79
>   *   Nutch 1.12
>   *   Solr 6.1 (installed in a different instance of docker)
>
> However, the previous commands in the script executed successfully, 
> and I'm able to perform queries in Solr.
>
>
> Any help is highly appreciated.
>
>
> Regards,
>
> Munim
>
