The first error is simply a symptom of a bad CLASSPATH. The second error ends with a bare "index=" -- does that look a little strange to you? There should be a path to the index directory there (e.g. index=/my/index/here), no? My guess is you don't have things configured correctly.
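A couple of things to check. For the NoClassDefFoundError, a fresh checkout has no compiled classes for bin/nutch to put on the classpath, so you need to build first. A minimal sketch, assuming you are running from a source checkout rather than a binary release:

  # assumption: a trunk checkout; ant compiles into build/,
  # which is where bin/nutch looks for org.apache.nutch.crawl.Crawl
  cd nutch
  ant
  bin/nutch crawl urls -dir crawl

For the indexing failure, look at your plugin.includes value in conf/nutch-site.xml. It should be a single regular expression listing index-more alongside the other plugins, roughly like this (an illustrative sketch, not a known-good configuration -- the exact plugin list depends on your setup):

  <property>
    <name>plugin.includes</name>
    <value>protocol-http|urlfilter-regex|parse-(text|html)|index-(basic|more)|query-(basic|site|url)|summary-basic|scoring-opic|urlnormalizer-(pass|regex|basic)</value>
  </property>

A malformed value there (a typo in the regex, or a plugin listed that isn't actually present in your plugins/ directory) is the kind of misconfiguration that could make the indexing job fail the way yours does.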
I also suggest you wait 48 hours before taking your problem from a mailing-list message into JIRA. You may have created more work for people who now have to read your message here and then go to JIRA to close and clean up that issue if it turns out to be invalid.

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message ----
> From: vkblogger <[EMAIL PROTECTED]>
> To: [email protected]
> Sent: Wednesday, April 30, 2008 1:15:16 AM
> Subject: Re: index-more problem?
>
> In the latest nutch revision (652259) I am getting the following error:
>
> bin/nutch crawl urls -dir crawl
> Exception in thread "main" java.lang.NoClassDefFoundError:
>     org/apache/nutch/crawl/Crawl
>
> For the most recent builds (e.g. nutch-2008-04-30_04-01-32) I get this
> error after running bin/nutch:
>
> Fetcher: done
> CrawlDb update: starting
> CrawlDb update: db: crawlfs/crawldb
> CrawlDb update: segments: [crawlfs/segments/20080430051112]
> CrawlDb update: additions allowed: true
> CrawlDb update: URL normalizing: true
> CrawlDb update: URL filtering: true
> CrawlDb update: Merging segment data into db.
> CrawlDb update: done
> Generator: Selecting best-scoring urls due for fetch.
> Generator: starting
> Generator: segment: crawlfs/segments/20080430051126
> Generator: filtering: true
> Generator: topN: 100000
> Generator: jobtracker is 'local', generating exactly one partition.
> Generator: 0 records selected for fetching, exiting ...
> Stopping at depth=2 - no more URLs to fetch.
> LinkDb: starting
> LinkDb: linkdb: crawlfs/linkdb
> LinkDb: URL normalize: true
> LinkDb: URL filter: true
> LinkDb: adding segment:
> file:/home/admin/nutch-2008-04-30_04-01-32/crawlfs/segments/20080430051112
> LinkDb: adding segment:
> file:/home/admin/nutch-2008-04-30_04-01-32/crawlfs/segments/20080430051053
> LinkDb: done
> Indexer: starting
> Indexer: linkdb: crawlfs/linkdb
> Indexer: adding segment:
> file:/home/admin/nutch-2008-04-30_04-01-32/crawlfs/segments/20080430051112
> Indexer: adding segment:
> file:/home/admin/nutch-2008-04-30_04-01-32/crawlfs/segments/20080430051053
> IFD [Thread-102]: setInfoStream [EMAIL PROTECTED]
> IW 0 [Thread-102]: setInfoStream:
> dir=org.apache.lucene.store.FSDirectory@/tmp/hadoop-admin/mapred/local/index/_1406110510
> autoCommit=true
> [EMAIL PROTECTED]
> [EMAIL PROTECTED]
> ramBufferSizeMB=16.0 maxBuffereDocs=50 maxBuffereDeleteTerms=-1
> maxFieldLength=10000 index=
> Exception in thread "main" java.io.IOException: Job failed!
>     at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:894)
>     at org.apache.nutch.indexer.Indexer.index(Indexer.java:311)
>     at org.apache.nutch.crawl.Crawl.main(Crawl.java:144)
>
> This error does not happen after I remove the index-more plugin from
> plugin.includes in the conf/nutch-site.xml file.
>
> Any idea why this is happening?
> --
> View this message in context:
> http://www.nabble.com/index-more-problem--tp16757538p16975481.html
> Sent from the Nutch - User mailing list archive at Nabble.com.
