Two questions about Integration Solr with Nutch on the Nutch 1.x tutorial

Junqiang Zhang Sun, 15 Dec 2013 23:20:51 -0800

Hi,

I am new to Nutch. I followed the Nutch 1.x tutorial to install
version 1.7 and ran into two problems when integrating Solr with
Nutch.



(1) Section 6 (Integrate Solr with Nutch) of the Nutch 1.x tutorial
basically replaces the original schema.xml file in Solr with the
schema.xml from Nutch and makes some modifications to it. Is it
necessary to replace the file? After I replaced it, I was not able to
start Solr with the command "java -jar start.jar".
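
The replacement step described above can be sketched roughly as
follows. This is only an illustration: the function name
replace_schema and both directory arguments are hypothetical
placeholders, not names or paths taken from the tutorial, and the
backup copy is an added precaution so that the original Solr schema
can be restored if Solr no longer starts.

```shell
# Sketch: swap Solr's schema.xml for the one shipped with Nutch.
# $1 = Nutch conf directory (assumed to contain schema.xml)
# $2 = Solr conf directory  (assumed to contain schema.xml)
replace_schema() {
  nutch_conf="$1"
  solr_conf="$2"

  # Keep the original Solr schema so the change can be undone.
  cp "$solr_conf/schema.xml" "$solr_conf/schema.xml.orig" &&
  # Overwrite Solr's schema with the Nutch one.
  cp "$nutch_conf/schema.xml" "$solr_conf/schema.xml"
}
```

To undo the change, copy schema.xml.orig back over schema.xml in the
Solr conf directory and restart Solr.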



(2) If I do not replace schema.xml, I can run "java -jar start.jar".
However, an exception occurs when I run the following Solr index
command from Section 6.6 of the tutorial:

bin/nutch solrindex http://127.0.0.1:8983/solr/ crawl/crawldb -linkdb crawl/linkdb crawl/segments/*

The exception is:
Indexer: java.io.IOException: Job failed!
at org.apache.hadoop.mapred.JobClient.runJob.

I think this exception is related to Hadoop. How can I fix it?


I hope somebody can help me with these two problems or point me to
where I can find the answers. Thanks in advance.

Regards,
Junqiang
