Hi,

I am brand new to Nutch and Solr. I've been trying to install both
programs and integrate them, following these two tutorials:
http://wiki.apache.org/nutch/RunningNutchAndSolr
http://blog.foofactory.fi/2007/02/online-indexing-integrating-nutch-with.html

The Solr installation appears to have succeeded, since I was able to start
the server with 'java -jar start.jar'.

However, when I tried to run Nutch with './bin/crawl.sh crawl.s', I got this
error:

Injector: starting
Injector: crawlDb: crawl.s/crawldb
Injector: urlDir: urls
Injector: Converting injected urls to crawl db entries.
Injector: Merging injected urls into crawl db.
Injector: done
Generator: Selecting best-scoring urls due for fetch.
Generator: starting
Generator: segment: crawl.s/segments/20081225010322
Generator: filtering: true
Generator: topN: 1000
Generator: jobtracker is 'local', generating exactly one partition.
Generator: Partitioning selected urls by host, for politeness.
Generator: done.
processing segment drwxr-xr-x - tomcat tomcat 4096 2008-12-23 00:38 /opt/tomcat6/nutch/crawl.s/segments/20081223003839
Fetcher: starting
Fetcher: segment: drwxr-xr-x
Fetcher: java.io.IOException: Segment already fetched!
        at org.apache.nutch.fetcher.FetcherOutputFormat.checkOutputSpecs(FetcherOutputFormat.java:50)
        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:778)
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1127)
        at org.apache.nutch.fetcher.Fetcher.fetch(Fetcher.java:531)
        at org.apache.nutch.fetcher.Fetcher.run(Fetcher.java:566)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.nutch.fetcher.Fetcher.main(Fetcher.java:538)

Command exited with abnormal status, bailing out.
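
One thing I notice in the log: the Fetcher reports its segment as
'drwxr-xr-x', which looks like the first column of a directory listing rather
than a segment path. I have not confirmed how crawl.sh picks the segment, so
the snippet below is only a sketch of what I would have expected, assuming the
path is the last whitespace-separated field of the listing line. The variable
names are mine, not taken from the script.

```shell
# Listing line copied from the "processing segment" output above.
line='drwxr-xr-x - tomcat tomcat 4096 2008-12-23 00:38 /opt/tomcat6/nutch/crawl.s/segments/20081223003839'

# Take the last field (the path) instead of the first (the permissions).
segment=$(printf '%s\n' "$line" | awk '{print $NF}')

echo "$segment"
```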

I don't know what is wrong. Could someone on the list help me out? Thanks!
-- 
Signature: Success is a journey that never ends.
