Re: index-more problem?

vkblogger Tue, 29 Apr 2008 22:15:50 -0700

In the latest Nutch revision (652259), I am getting the following error:

bin/nutch crawl urls -dir crawl
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/nutch/crawl/Crawl


For the most recent builds (e.g. nutch-2008-04-30_04-01-32), I get this
error after running bin/nutch:



Fetcher: done
CrawlDb update: starting
CrawlDb update: db: crawlfs/crawldb
CrawlDb update: segments: [crawlfs/segments/20080430051112]
CrawlDb update: additions allowed: true
CrawlDb update: URL normalizing: true
CrawlDb update: URL filtering: true
CrawlDb update: Merging segment data into db.
CrawlDb update: done
Generator: Selecting best-scoring urls due for fetch.
Generator: starting
Generator: segment: crawlfs/segments/20080430051126
Generator: filtering: true
Generator: topN: 100000
Generator: jobtracker is 'local', generating exactly one partition.
Generator: 0 records selected for fetching, exiting ...
Stopping at depth=2 - no more URLs to fetch.
LinkDb: starting
LinkDb: linkdb: crawlfs/linkdb
LinkDb: URL normalize: true
LinkDb: URL filter: true
LinkDb: adding segment: file:/home/admin/nutch-2008-04-30_04-01-32/crawlfs/segments/20080430051112
LinkDb: adding segment: file:/home/admin/nutch-2008-04-30_04-01-32/crawlfs/segments/20080430051053
LinkDb: done
Indexer: starting
Indexer: linkdb: crawlfs/linkdb
Indexer: adding segment: file:/home/admin/nutch-2008-04-30_04-01-32/crawlfs/segments/20080430051112
Indexer: adding segment: file:/home/admin/nutch-2008-04-30_04-01-32/crawlfs/segments/20080430051053
IFD [Thread-102]: setInfoStream
[EMAIL PROTECTED]
IW 0 [Thread-102]: setInfoStream:
dir=org.apache.lucene.store.FSDirectory@/tmp/hadoop-admin/mapred/local/index/_1406110510
autoCommit=true
[EMAIL PROTECTED]
[EMAIL PROTECTED]
ramBufferSizeMB=16.0 maxBuffereDocs=50 maxBuffereDeleteTerms=-1
maxFieldLength=10000 index=
Exception in thread "main" java.io.IOException: Job failed!
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:894)
        at org.apache.nutch.indexer.Indexer.index(Indexer.java:311)
        at org.apache.nutch.crawl.Crawl.main(Crawl.java:144)


This error does not happen after I remove the index-more plugin from
plugin.includes in the conf/nutch-site.xml file.
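
For reference, the workaround amounts to a plugin.includes entry along these
lines. This is only a sketch: the other plugin names in the value are assumed
from a typical Nutch configuration and are not the exact list from this setup.
The one relevant detail is that index-more is absent.

```xml
<!-- conf/nutch-site.xml (sketch; every plugin name below is an assumed example) -->
<property>
  <name>plugin.includes</name>
  <!-- index-more is deliberately left out; only index-basic remains -->
  <value>protocol-http|urlfilter-regex|parse-(text|html|js)|index-basic|query-(basic|site|url)|summary-basic|scoring-opic|urlnormalizer-(pass|regex|basic)</value>
</property>
```

With index-more added back to the value (for example as index-(basic|more)),
the indexing job fails as shown in the log above.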

Any idea why this is happening?
-- 
View this message in context: 
http://www.nabble.com/index-more-problem--tp16757538p16975481.html
Sent from the Nutch - User mailing list archive at Nabble.com.
