I guess I'll have to put this on hold for the time being. I tried tweaking a number of configuration settings, but the crawl still fails at the indexing step, and I've run out of ideas for now. I'm not even using the Solr indexer yet; I just want to make sure that the basic crawl/index works before I proceed.
In any case, any help would be greatly appreciated.

Cheers,

Tony Wang-3 wrote:
>
> man, I have exactly the same problem with nutch 1.0 in the SVN trunk! I
> wonder when the nutch team will release the official 1.0. really cannot
> wait.
>
> On Mon, Mar 2, 2009 at 12:09 PM, ahammad <[email protected]> wrote:
>
>>
>> I am aware that this is still a development version, but I need to test a
>> few things with Nutch/Solr so I installed the latest dev version of Nutch
>> 1.0.
>>
>> I tried running a crawl like I did with the working 0.9 version. From the
>> log, it seems to fetch all the pages properly, but it fails at the
>> indexing:
>>
>> CrawlDb update: starting
>> CrawlDb update: db: kb/crawldb
>> CrawlDb update: segments: [kb/segments/20090302135858]
>> CrawlDb update: additions allowed: true
>> CrawlDb update: URL normalizing: true
>> CrawlDb update: URL filtering: true
>> CrawlDb update: Merging segment data into db.
>> CrawlDb update: done
>> LinkDb: starting
>> LinkDb: linkdb: kb/linkdb
>> LinkDb: URL normalize: true
>> LinkDb: URL filter: true
>> LinkDb: adding segment: file:/c:/nutch-2009-03-02_04-01-53/kb/segments/20090302135757
>> LinkDb: adding segment: file:/c:/nutch-2009-03-02_04-01-53/kb/segments/20090302135807
>> LinkDb: adding segment: file:/c:/nutch-2009-03-02_04-01-53/kb/segments/20090302135858
>> LinkDb: done
>> Indexer: starting
>> Exception in thread "main" java.io.IOException: Job failed!
>>     at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1232)
>>     at org.apache.nutch.indexer.Indexer.index(Indexer.java:72)
>>     at org.apache.nutch.crawl.Crawl.main(Crawl.java:146)
>>
>>
>> I took a look at all the configuration and as far as I can tell, I did
>> the same thing with my 0.9 install. Could it be that I didn't install it
>> properly? I unzipped it and ran ant and ant war in the root directory.
>>
>> Thanks
>>
>> --
>> View this message in context:
>> http://www.nabble.com/Problem-with-crawling-using-the-latest-1.0-trunk-tp22294581p22294581.html
>> Sent from the Nutch - User mailing list archive at Nabble.com.
>>
>>
>
>
> --
> Are you RCholic? www.RCholic.com
> 温 良 恭 俭 让 仁 义 礼 智 信
>
>

--
View this message in context: http://www.nabble.com/Problem-with-crawling-using-the-latest-1.0-trunk-tp22294581p22294739.html
Sent from the Nutch - User mailing list archive at Nabble.com.
