Same problem here with build #740 (Mar 2, 2009 4:01:53 AM).
I switched to build #736 (Feb 26, 2009 4:01:15 AM) and it worked.

Justin

Tony Wang wrote:
> Man, I have exactly the same problem with Nutch 1.0 from the SVN trunk! I
> wonder when the Nutch team will release the official 1.0. I really cannot
> wait.
> 
> On Mon, Mar 2, 2009 at 12:09 PM, ahammad <[email protected]> wrote:
> 
>> I am aware that this is still a development version, but I need to test a
>> few things with Nutch/Solr, so I installed the latest dev version of Nutch 1.0.
>>
>> I tried running a crawl like I did with the working 0.9 version. From the
>> log, it seems to fetch all the pages properly, but it fails at the
>> indexing:
>>
>> CrawlDb update: starting
>> CrawlDb update: db: kb/crawldb
>> CrawlDb update: segments: [kb/segments/20090302135858]
>> CrawlDb update: additions allowed: true
>> CrawlDb update: URL normalizing: true
>> CrawlDb update: URL filtering: true
>> CrawlDb update: Merging segment data into db.
>> CrawlDb update: done
>> LinkDb: starting
>> LinkDb: linkdb: kb/linkdb
>> LinkDb: URL normalize: true
>> LinkDb: URL filter: true
>> LinkDb: adding segment: file:/c:/nutch-2009-03-02_04-01-53/kb/segments/20090302135757
>> LinkDb: adding segment: file:/c:/nutch-2009-03-02_04-01-53/kb/segments/20090302135807
>> LinkDb: adding segment: file:/c:/nutch-2009-03-02_04-01-53/kb/segments/20090302135858
>> LinkDb: done
>> Indexer: starting
>> Exception in thread "main" java.io.IOException: Job failed!
>>        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1232)
>>        at org.apache.nutch.indexer.Indexer.index(Indexer.java:72)
>>        at org.apache.nutch.crawl.Crawl.main(Crawl.java:146)
>>
>>
>> I took a look at all the configuration and, as far as I can tell, I set it
>> up the same way as my 0.9 install. Could it be that I didn't install it
>> properly? I unzipped it and ran ant and ant war in the root directory.
>>
>> Thanks
>>
>> --
>> View this message in context:
>> http://www.nabble.com/Problem-with-crawling-using-the-latest-1.0-trunk-tp22294581p22294581.html
>> Sent from the Nutch - User mailing list archive at Nabble.com.
>>
>>
> 
> 

