Thank you so much, Lewis.
On 8/19/16, 4:53 AM, "lewis john mcgibbney" <[email protected]> wrote:

>Evening Madhvi,
>I will set this up and debug a clean. I'll report over on
>https://issues.apache.org/jira/browse/NUTCH-2269
>
>Thank you for reporting.
>Lewis
>
>On Thu, Aug 18, 2016 at 7:08 AM, <[email protected]> wrote:
>
>> From: "Arora, Madhvi" <[email protected]>
>> To: "[email protected]" <[email protected]>
>> Cc:
>> Date: Wed, 17 Aug 2016 13:30:09 +0000
>> Subject: Upgrade to Nutch 1.12
>>
>> Hi,
>>
>> I want to find out how to correct the issue below and would appreciate
>> any help.
>>
>> I am trying to upgrade to Nutch 1.12, and I am using Solr 5.3.1. The
>> reasons I am upgrading are:
>> 1: HTTPS crawling
>> 2: Boilerplate content extraction through Tika
>>
>> The only problem I am having so far is an IOException; please see below.
>> I searched, and there is an existing JIRA issue,
>> NUTCH-2269 <https://issues.apache.org/jira/browse/NUTCH-2269>
>> ("Clean not working after crawl"), which says:
>> "It seems like the database on Lucene can only be called crawldb.
>> However, a couple of bundled versions we can find online use linkdb for
>> Lucene as default."
>>
>> I get the same error if I try to clean via the old command:
>> bin/nutch solrclean crawl-adc/crawldb http://localhost:8983/solr/nutch
>>
>> But cleaning through the linkdb worked, as described in the JIRA issue,
>> i.e.:
>> bin/nutch solrclean crawl-adc/linkdb http://localhost:8983/solr/nutch
>>
>> I just want to know whether there is a fix or an alternate way of
>> cleaning, whether cleaning via the linkdb is okay, and what the
>> repercussions of cleaning via the linkdb might be.
>>
>> Exception from the logs:
>> java.lang.Exception: java.lang.IllegalStateException: Connection pool
>> shut down
>>         at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(
>> LocalJobRunner.java:462)
>>         at org.apache.hadoop.mapred.LocalJobRunner$Job.run(
>> LocalJobRunner.java:529)
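For anyone reproducing this, here is a minimal sketch of the commands
discussed in the thread. It assumes the crawl directory crawl-adc and the
Solr core nutch named above, and (for the plugin-based CleaningJob) that
solr.server.url is set in conf/nutch-site.xml; none of this is confirmed as
a fix for NUTCH-2269.

  # Nutch 1.12 plugin-based CleaningJob; per NUTCH-2269 this fails against
  # the crawldb with "Connection pool shut down":
  bin/nutch clean crawl-adc/crawldb

  # Older Solr-specific command; reported in the thread to fail the same way:
  bin/nutch solrclean crawl-adc/crawldb http://localhost:8983/solr/nutch

  # Workaround reported in the thread and on the JIRA issue: point the job
  # at the linkdb instead. Whether this has repercussions is the open
  # question above.
  bin/nutch solrclean crawl-adc/linkdb http://localhost:8983/solr/nutch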

