Thanks, Justin. Build #736 works flawlessly!

On Mon, Mar 2, 2009 at 1:34 PM, Justin Yao <[email protected]> wrote:
> Same problem here if using build #740 (Mar 2, 2009 4:01:53 AM)
> I switched to build #736 (Feb 26, 2009 4:01:15 AM) and it worked then.
>
> Justin
>
> Tony Wang wrote:
> > man, I have exactly the same problem with nutch 1.0 in the SVN trunk! I
> > wonder when the nutch team will release the official 1.0. really cannot
> > wait.
> >
> > On Mon, Mar 2, 2009 at 12:09 PM, ahammad <[email protected]> wrote:
> >
> >> I am aware that this is still a development version, but I need to
> >> test a few things with Nutch/Solr so I installed the latest dev
> >> version of Nutch 1.0.
> >>
> >> I tried running a crawl like I did with the working 0.9 version. From
> >> the log, it seems to fetch all the pages properly, but it fails at the
> >> indexing:
> >>
> >> CrawlDb update: starting
> >> CrawlDb update: db: kb/crawldb
> >> CrawlDb update: segments: [kb/segments/20090302135858]
> >> CrawlDb update: additions allowed: true
> >> CrawlDb update: URL normalizing: true
> >> CrawlDb update: URL filtering: true
> >> CrawlDb update: Merging segment data into db.
> >> CrawlDb update: done
> >> LinkDb: starting
> >> LinkDb: linkdb: kb/linkdb
> >> LinkDb: URL normalize: true
> >> LinkDb: URL filter: true
> >> LinkDb: adding segment: file:/c:/nutch-2009-03-02_04-01-53/kb/segments/20090302135757
> >> LinkDb: adding segment: file:/c:/nutch-2009-03-02_04-01-53/kb/segments/20090302135807
> >> LinkDb: adding segment: file:/c:/nutch-2009-03-02_04-01-53/kb/segments/20090302135858
> >> LinkDb: done
> >> Indexer: starting
> >> Exception in thread "main" java.io.IOException: Job failed!
> >>     at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1232)
> >>     at org.apache.nutch.indexer.Indexer.index(Indexer.java:72)
> >>     at org.apache.nutch.crawl.Crawl.main(Crawl.java:146)
> >>
> >> I took a look at all the configuration and as far as I can tell, I did
> >> the same thing with my 0.9 install. Could it be that I didn't install
> >> it properly? I unzipped it and ran ant and ant war in the root
> >> directory.
> >>
> >> Thanks
> >>
> >> --
> >> View this message in context:
> >> http://www.nabble.com/Problem-with-crawling-using-the-latest-1.0-trunk-tp22294581p22294581.html
> >> Sent from the Nutch - User mailing list archive at Nabble.com.

--
Are you RCholic? www.RCholic.com
温 良 恭 俭 让 仁 义 礼 智 信
