Hi Markus,
Thank you so much for your response. I tried to make sense of the log and did some
searching online, but that left me with nothing much. Please see the log below:
2016-06-25 20:27:20,396 INFO indexer.CleaningJob - CleaningJob: starting at 2016-06-25 20:27:20
2016-06-25 20:27:20,737 WARN util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2016-06-25 20:27:21,879 WARN conf.Configuration - file:/tmp/hadoop-root/mapred/staging/root1345410474/.staging/job_local1345410474_0001/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring.
2016-06-25 20:27:21,884 WARN conf.Configuration - file:/tmp/hadoop-root/mapred/staging/root1345410474/.staging/job_local1345410474_0001/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring.
2016-06-25 20:27:22,077 WARN conf.Configuration - file:/tmp/hadoop-root/mapred/local/localRunner/root/job_local1345410474_0001/job_local1345410474_0001.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring.
2016-06-25 20:27:22,084 WARN conf.Configuration - file:/tmp/hadoop-root/mapred/local/localRunner/root/job_local1345410474_0001/job_local1345410474_0001.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring.
2016-06-25 20:27:22,111 WARN output.FileOutputCommitter - Output Path is null in setupJob()
2016-06-25 20:27:22,882 INFO indexer.IndexWriters - Adding org.apache.nutch.indexwriter.solr.SolrIndexWriter
2016-06-25 20:27:23,282 INFO solr.SolrMappingReader - source: content dest: content
2016-06-25 20:27:23,282 INFO solr.SolrMappingReader - source: title dest: title
2016-06-25 20:27:23,282 INFO solr.SolrMappingReader - source: host dest: host
2016-06-25 20:27:23,282 INFO solr.SolrMappingReader - source: segment dest: segment
2016-06-25 20:27:23,282 INFO solr.SolrMappingReader - source: boost dest: boost
2016-06-25 20:27:23,282 INFO solr.SolrMappingReader - source: digest dest: digest
2016-06-25 20:27:23,282 INFO solr.SolrMappingReader - source: tstamp dest: tstamp
2016-06-25 20:27:23,312 INFO solr.SolrIndexWriter - SolrIndexer: deleting 2/2 documents
2016-06-25 20:27:23,610 WARN output.FileOutputCommitter - Output Path is null in cleanupJob()
2016-06-25 20:27:23,611 WARN mapred.LocalJobRunner - job_local1345410474_0001
java.lang.Exception: java.lang.IllegalStateException: Connection pool shut down
        at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:529)
Caused by: java.lang.IllegalStateException: Connection pool shut down
        at org.apache.http.util.Asserts.check(Asserts.java:34)
        at org.apache.http.pool.AbstractConnPool.lease(AbstractConnPool.java:169)
        at org.apache.http.pool.AbstractConnPool.lease(AbstractConnPool.java:202)
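The bottom of the trace (Asserts.check / AbstractConnPool.lease) looks to me like
the HttpClient connection pool has already been shut down when the cleaning job
tries to send the delete request to Solr. If I understand it right, the same
IllegalStateException can be reproduced with plain HttpClient 4.x by using a
client after it has been closed. This is only a minimal sketch of that pattern as
I understand it, not Nutch code; the class name and URL are made up for
illustration:

    import org.apache.http.client.methods.HttpGet;
    import org.apache.http.impl.client.CloseableHttpClient;
    import org.apache.http.impl.client.HttpClients;

    // Sketch: once the client (and its pooling connection manager) is
    // closed, any further request fails inside AbstractConnPool.lease().
    public class PoolShutDownSketch {
        public static void main(String[] args) throws Exception {
            CloseableHttpClient client = HttpClients.createDefault();
            client.close(); // the underlying connection pool is shut down here
            // throws java.lang.IllegalStateException: Connection pool shut down
            client.execute(new HttpGet("http://192.168.99.100:8983/solr/test/admin/ping"));
        }
    }

So my guess is that something closes the Solr client (or its HTTP connection pool)
before, or while, the delete request is sent, but I don't know where that would
happen in Nutch. Does that sound plausible, or am I off track?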
Your reply is highly appreciated.
Regards,
Munim
________________________________
From: Markus Jelsma <[email protected]>
Sent: 21 June 2016 17:55:56
To: [email protected]
Subject: RE: nutch clean in crawl script throwing error
Hello Abdul - please check the logs, the real errors are reported there.
Markus
-----Original message-----
> From:Abdul Munim <[email protected]>
> Sent: Sunday 19th June 2016 21:29
> To: [email protected]
> Subject: nutch clean in crawl script throwing error
>
> Hi folks,
>
>
> I've set up Nutch 1.12 and Solr 6.1. I'm using the crawl script as defined in the
> tutorial, but the last command in the script, i.e. nutch clean, is throwing the
> following exception:
>
>
> Cleaning up index if possible
> /opt/nutch-latest/bin/nutch clean
> -Dsolr.server.url=http://192.168.99.100:8983/solr/test/ crawl/crawldb
> SolrIndexer: deleting 2/2 documents
> ERROR CleaningJob: java.io.IOException: Job failed!
>         at org.apache.nutch.indexer.CleaningJob.delete(CleaningJob.java:172)
>         at org.apache.nutch.indexer.CleaningJob.run(CleaningJob.java:195)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>         at org.apache.nutch.indexer.CleaningJob.main(CleaningJob.java:206)
>
> Error running:
> /opt/nutch-latest/bin/nutch clean
> -Dsolr.server.url=http://192.168.99.100:8983/solr/test/ crawl/crawldb
> Failed with exit value 255.
>
>
> I ran the following command:
>
>
> [root@2a563cff0511 nutch-latest]# bin/crawl -i \
> > -D solr.server.url=http://192.168.99.100:8983/solr/bt-business/ urls/
> > crawl1 1
>
>
> Nutch is installed in the following environment:
>
> * Centos 6.8
> * Java x64 1.7.0_79
> * Nutch 1.12
> * Solr 6.1 (installed in a different instance of docker)
>
> However, the previous commands in the script executed successfully and I'm able
> to perform queries in Solr.
>
>
> Any help is highly appreciated.
>
>
> Regards,
>
> Munim
>