Evening Madhvi,

I will set this up and debug the clean job. I'll report over on
https://issues.apache.org/jira/browse/NUTCH-2269
Thank you for reporting.

Lewis

On Thu, Aug 18, 2016 at 7:08 AM, <[email protected]> wrote:
>
> From: "Arora, Madhvi" <[email protected]>
> To: "[email protected]" <[email protected]>
> Cc:
> Date: Wed, 17 Aug 2016 13:30:09 +0000
> Subject: Upgrade to Nutch 1.12
>
> Hi,
>
> I wanted to find out how to correct the issue below and would appreciate
> any help.
>
> I am trying to upgrade to Nutch 1.12. I am using Solr 5.3.1. The reasons I
> am upgrading are:
> 1: https crawling
> 2: Boilerplate content extraction through Tika
>
> The only problem I am having so far is an IOException. Please see below. I
> searched and there is an existing JIRA issue,
> NUTCH-2269 <https://issues.apache.org/jira/browse/NUTCH-2269>
>
> [NUTCH-2269] Clean not working after crawl - ASF JIRA
> <https://issues.apache.org/jira/browse/NUTCH-2269>
> issues.apache.org
> It seems like the database on Lucene can only be called crawldb. However,
> a couple of bundled versions we can find online use linkdb for Lucene as
> the default.
>
> I get the same error if I try to clean via the old command:
> bin/nutch solrclean crawl-adc/crawldb http://localhost:8983/solr/nutch
>
> But cleaning through the linkdb worked, as stated in the JIRA issue, i.e.:
> bin/nutch solrclean crawl-adc/linkdb http://localhost:8983/solr/nutch
>
> I just want to know if there is a fix or an alternate way of cleaning,
> whether cleaning via the linkdb might be okay, and what the repercussions
> of cleaning via the linkdb would be.
>
> Exception from logs:
> java.lang.Exception: java.lang.IllegalStateException: Connection pool
> shut down
>         at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
>         at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:529
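For reference, a minimal sketch of the equivalent invocation through the generic clean command in 1.12, assuming the same crawl-adc layout and Solr core as in the report above (CleaningJob runs through Hadoop's ToolRunner, so the Solr URL can be passed as a -D property before the arguments; the job scans the crawldb for entries marked gone and sends delete requests for them):

  # Paths and URL assumed from the report above; adjust to your install.
  bin/nutch clean -Dsolr.server.url=http://localhost:8983/solr/nutch crawl-adc/crawldb

  # Same job, skipping the final Solr commit:
  bin/nutch clean -Dsolr.server.url=http://localhost:8983/solr/nutch crawl-adc/crawldb -noCommit

Until NUTCH-2269 is resolved, this will likely hit the same "Connection pool
shut down" error against the crawldb that the solrclean command produced.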

