Thank you so much, Lewis.
On 8/19/16, 4:53 AM, "lewis john mcgibbney" <[email protected]> wrote:

>Evening Madhvi,
>I will set this up and debug a clean. I'll report over on
>https://issues.apache.org/jira/browse/NUTCH-2269
>
>Thank you for reporting.
>Lewis
>
>On Thu, Aug 18, 2016 at 7:08 AM, <[email protected]> wrote:
>
>> From: "Arora, Madhvi" <[email protected]>
>> To: "[email protected]" <[email protected]>
>> Cc:
>> Date: Wed, 17 Aug 2016 13:30:09 +0000
>> Subject: Upgrade to Nutch 1.12
>>
>> Hi,
>>
>> I want to find out how to correct the issue below and would appreciate
>> any help.
>>
>> I am trying to upgrade to Nutch 1.12, and I am using Solr 5.3.1. The
>> reasons I am upgrading are:
>> 1: HTTPS crawling
>> 2: Boilerplate content extraction through Tika
>>
>> The only problem I am having so far is an IOException; please see below.
>> I searched, and there is an existing JIRA issue,
>> NUTCH-2269 <https://issues.apache.org/jira/browse/NUTCH-2269>
>> ("Clean not working after crawl"), which says:
>> "It seems like the database on Lucene can only be called crawldb.
>> However, a couple of bundled versions we can find online use linkdb for
>> Lucene as default."
>>
>> I get the same error if I try to clean via the old command:
>> bin/nutch solrclean crawl-adc/crawldb http://localhost:8983/solr/nutch
>>
>> But cleaning through the linkdb worked, as described in the JIRA issue,
>> i.e.:
>> bin/nutch solrclean crawl-adc/linkdb http://localhost:8983/solr/nutch
>>
>> I just want to know whether there is a fix or an alternate way of
>> cleaning, whether cleaning via the linkdb is okay, and what the
>> repercussions of cleaning via the linkdb might be.
>>
>> Exception from the logs:
>> java.lang.Exception: java.lang.IllegalStateException: Connection pool
>> shut down
>>         at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(
>> LocalJobRunner.java:462)
>>         at org.apache.hadoop.mapred.LocalJobRunner$Job.run(
>> LocalJobRunner.java:529)
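For anyone reproducing this, here is a minimal sketch of the commands
discussed in the thread. It assumes the crawl directory crawl-adc and the
Solr core nutch named above, and (for the plugin-based CleaningJob) that
solr.server.url is set in conf/nutch-site.xml; none of this is confirmed as
a fix for NUTCH-2269.

  # Nutch 1.12 plugin-based CleaningJob; per NUTCH-2269 this fails against
  # the crawldb with "Connection pool shut down":
  bin/nutch clean crawl-adc/crawldb

  # Older Solr-specific command; reported in the thread to fail the same way:
  bin/nutch solrclean crawl-adc/crawldb http://localhost:8983/solr/nutch

  # Workaround reported in the thread and on the JIRA issue: point the job
  # at the linkdb instead. Whether this has repercussions is the open
  # question above.
  bin/nutch solrclean crawl-adc/linkdb http://localhost:8983/solr/nutch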

