Hi Markus and Sethi,

Thank you for your replies. I was stuck, but I tried the following and it works
for me:
bin/nutch solrindex http://127.0.0.1:8983/solr/ crawl/crawldb crawl/linkdb crawl/segments/*

May I know where the Lucene index directory for the crawl folder is? I would
like to open the index with Luke (the Lucene Index Toolbox).

On a side note, I think Nutch 1.2 is much better than Nutch 1.3. Nutch 1.2
creates a Lucene index automatically, does not need Solr, and has more
functions. Why was the Lucene index removed from Nutch 1.3 in the first place?



________________________________
From: "Sethi, Parampreet" <[email protected]>
To: "[email protected]" <[email protected]>; Kelvin <[email protected]>
Sent: Tuesday, 19 July 2011 11:51 PM
Subject: Re: SolrDeleteDuplicates error

Hey Kelvin,

The issue is a version incompatibility between the SolrJ client bundled with
Nutch 1.3 and the Solr server that you are running. Remove the SolrJ client jar
from the nutch/runtime/local/lib folder and copy the SolrJ client jar from your
Solr installation into it.

This will solve your problem. (I was facing the same error yesterday and
this fix resolved it =) )
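
For anyone who wants to script the jar swap described above, here is a minimal
sketch. It is not an official Nutch or Solr procedure. The function name
swap_solrj_jar, the *solrj*.jar file pattern, and the backup folder name are
assumptions for illustration, so check the actual jar names in both folders
first.

```shell
#!/bin/sh
# Sketch only: replace the SolrJ client jar bundled with Nutch with the one
# from the Solr installation. The "*solrj*.jar" pattern is an assumption about
# how the jar is named; adjust it to match the files you actually have.
swap_solrj_jar() {
    nutch_lib="$1"   # e.g. nutch/runtime/local/lib
    solr_jars="$2"   # folder in the Solr installation that holds its SolrJ jar
    backup_dir="$(dirname "$nutch_lib")/lib-solrj-backup"   # hypothetical name

    # Do nothing unless a replacement jar is really present.
    found=0
    for jar in "$solr_jars"/*solrj*.jar; do
        if [ -f "$jar" ]; then
            found=1
        fi
    done
    if [ "$found" -eq 0 ]; then
        echo "no SolrJ jar found in $solr_jars" >&2
        return 1
    fi

    # Move the bundled jar(s) out of lib instead of deleting them,
    # so the change can be undone.
    mkdir -p "$backup_dir"
    for jar in "$nutch_lib"/*solrj*.jar; do
        if [ -f "$jar" ]; then
            mv "$jar" "$backup_dir/"
        fi
    done

    # Copy the SolrJ jar that matches the running Solr server.
    cp "$solr_jars"/*solrj*.jar "$nutch_lib"/
}
```

It could be called as swap_solrj_jar nutch/runtime/local/lib /path/to/solr/dist,
where the second path is a placeholder for wherever your Solr installation
keeps its SolrJ jar. The old jar is moved aside rather than removed, so it is
easy to restore if the new one causes other problems.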

I am writing up notes on how I set up Nutch with Solr, including the issues I
faced and their solutions, on my blog:
http://param-techie.blogspot.com/2011/07/nutch-13-and-solr-integration.html

Hope it helps!
-param 


On 7/19/11 11:28 AM, "Markus Jelsma" <[email protected]> wrote:

> The solrdedup job completes without failure; it is the solrindex job that's
> actually failing. See your hadoop.log and check Solr's output.
> 
> On Tuesday 19 July 2011 17:23:51 Kelvin wrote:
>> Sorry for the multiple postings. I am trying out Nutch 1.3, which requires
>> Solr for indexing.
>> 
>> I tried to crawl and index with Solr using this simple command:
>> bin/nutch crawl urls -solr http://localhost:8983/solr/ -depth 3 -topN 10
>> 
>> But why does it give me the following error? Thank you for your kind help.
>> 
>> 
>> SolrIndexer: starting at 2011-07-19 23:13:31
>> java.io.IOException: Job failed!
>> SolrDeleteDuplicates: starting at 2011-07-19 23:13:33
>> SolrDeleteDuplicates: Solr url: http://localhost:8983/solr/
>> SolrDeleteDuplicates: finished at 2011-07-19 23:13:34, elapsed: 00:00:01
>> crawl finished: crawl-20110719231304
