I finally finished my first successful experience with Nutch/Hadoop. I started from 70,000 seeds, and these are the results of a one-cycle crawl:

060207 153956 Statistics for CrawlDb: t1/crawldb
060207 153956 TOTAL urls: 850726
060207 153956 avg score: 1.037
060207 153956 max score: 133.269
060207 153956 min score: 1.0
060207 153956 retry 0: 849588
060207 153956 retry 1: 1138
060207 153956 status 1 (DB_unfetched): 788522
060207 153956 status 2 (DB_fetched): 60703
060207 153956 status 3 (DB_gone): 1501
060207 153956 CrawlDb statistics: done

But I still have a problem with hadoop.jar: I have to use the built classes instead!

Thanks,
Mike

On 2/7/06, Mike Smith <[EMAIL PROTECTED]> wrote:
>
> Rafit,
>
> Get the hadoop project, build it, and use the build folder instead of the
> jar file. It will work fine then. Something is probably missing from the
> hadoop jar.
>
> M
>
> On 2/7/06, Rafit Izhak_Ratzin <[EMAIL PROTECTED]> wrote:
> >
> > I am still getting the following exception:
> >
> > Exception in thread "main" java.lang.NullPointerException
> >     at org.apache.hadoop.mapred.JobTrackerInfoServer.<init>(JobTrackerInfoServer.java:56)
> >     at org.apache.hadoop.mapred.JobTracker.<init>(JobTracker.java:303)
> >     at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:50)
> >     at org.apache.hadoop.mapred.JobTracker.main(JobTracker.java:813)
> > 060207 161923 Server handler 9 on 50020: starting
> >
> > >From: Doug Cutting <[EMAIL PROTECTED]>
> > >Reply-To: [email protected]
> > >To: [email protected]
> > >Subject: Re: new svn version:NoClassDefFoundError - JobTracker
> > >Date: Tue, 07 Feb 2006 13:04:37 -0800
> > >
> > >Mike Smith wrote:
> > >>The problem is that jetty jar files are missing from the SVN. I replaced
> > >>the Jetty jar files but I get another exception:
> > >
> > >I just restored the jetty lib and the jetty-ext libs. Does that help?
> > >
> > >Doug
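
P.S. For anyone who wants to sanity-check CrawlDb statistics like the ones at the top of this message, here is a minimal Python sketch. It is not part of Nutch; the function name parse_counts and the assumption that every log line has the form "date time label: integer" are mine, based only on the output pasted above.

```python
import re

# Integer-valued lines copied from the CrawlDb statistics in this message.
STATS = """\
060207 153956 TOTAL urls: 850726
060207 153956 retry 0: 849588
060207 153956 retry 1: 1138
060207 153956 status 1 (DB_unfetched): 788522
060207 153956 status 2 (DB_fetched): 60703
060207 153956 status 3 (DB_gone): 1501
"""


def parse_counts(text):
    """Return {label: count} for each 'date time label: integer' line.

    Hypothetical helper: the line layout is assumed from the pasted log,
    not taken from any Nutch documentation.
    """
    counts = {}
    for line in text.splitlines():
        # Skip the two leading timestamp fields, then split label from value.
        m = re.match(r"\d{6} \d{6} (.+): (\d+)$", line.strip())
        if m:
            counts[m.group(1)] = int(m.group(2))
    return counts


counts = parse_counts(STATS)
total = counts["TOTAL urls"]

# The per-status and per-retry counts should each add up to the total.
status_sum = sum(v for k, v in counts.items() if k.startswith("status "))
retry_sum = sum(v for k, v in counts.items() if k.startswith("retry "))

# Share of known URLs that were actually fetched in this cycle.
fetched_share = counts["status 2 (DB_fetched)"] / total

print("status sum matches total:", status_sum == total)
print("retry sum matches total:", retry_sum == total)
print(f"fetched share: {fetched_share:.1%}")
```

For the numbers above, both the status counts and the retry counts add up to the 850726 total, and the fetched URLs are roughly 7% of the database after one cycle.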
