I was able to fix the problem from my last email and run through the whole-web crawl commands from the nutch-0.8.x tutorial.
But the resulting crawl folder is still very small, and when I ran the final search for "apache", I got a "hit 0" message. Something is still wrong. Please give me some feedback.

Adam Shuy, President
ePacific Web Design & Hosting
Professional Web/Software developer
TEL: 408-272-6946
www.epacificweb.com

-----Original Message-----
From: Tsengtan A Shuy [mailto:[EMAIL PROTECTED]
Sent: Saturday, July 14, 2007 12:11 PM
To: [EMAIL PROTECTED]
Subject: inject command fail on whole-web run

I am running the "bin/nutch inject crawl/crawldb dmoz" command on my Ubuntu OS, following the nutch-0.8.x tutorial, but I got the following error message:

2007-07-14 11:38:35,238 WARN mapred.LocalJobRunner (LocalJobRunner.java:run(120)) - job_ij0atx
java.lang.NoClassDefFoundError: dk/brics/automaton/RunAutomaton
    at org.apache.nutch.urlfilter.automaton.AutomatonURLFilter$Rule.<init>(AutomatonURLFilter.java:89)
    at org.apache.nutch.urlfilter.automaton.AutomatonURLFilter.createRule(AutomatonURLFilter.java:70)
    at org.apache.nutch.urlfilter.api.RegexURLFilterBase.readRulesFile(RegexURLFilterBase.java:191)
    at org.apache.nutch.urlfilter.api.RegexURLFilterBase.setConf(RegexURLFilterBase.java:140)
    at org.apache.nutch.plugin.Extension.getExtensionInstance(Extension.java:153)
    at org.apache.nutch.net.URLFilters.<init>(URLFilters.java:53)
    at org.apache.nutch.crawl.Injector$InjectMapper.configure(Injector.java:56)
    at org.apache.hadoop.mapred.JobConf.newInstance(JobConf.java:443)
    at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:33)
    at org.apache.hadoop.mapred.JobConf.newInstance(JobConf.java:443)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:125)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:91)
Exception in thread "main" java.io.IOException: Job failed!
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:357)
    at org.apache.nutch.crawl.Injector.inject(Injector.java:138)
    at org.apache.nutch.crawl.Injector.main(Injector.java:164)
[EMAIL PROTECTED]:~/nutch-0.8.1$

What is wrong with my Ubuntu environment? Please help!!

_______________________________________________
Nutch-developers mailing list
Nutch-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nutch-developers
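
The NoClassDefFoundError in the trace above means the JVM could not load dk.brics.automaton.RunAutomaton at run time. One way to narrow this down is to check whether any jar under the Nutch install actually contains that class. The following is only a rough diagnostic sketch in Python, not part of Nutch: the helper name find_jars_with_class is made up for illustration, and the install path is assumed from the shell prompt (~/nutch-0.8.1).

```python
import zipfile
from pathlib import Path


def find_jars_with_class(root, class_name):
    """Return the sorted paths of .jar files under root that contain class_name.

    Hypothetical helper for illustration; it only inspects jar contents and
    says nothing about how Nutch builds its classpath.
    """
    # A class a.b.C is stored inside a jar as the entry a/b/C.class
    entry = class_name.replace(".", "/") + ".class"
    matches = []
    for jar in Path(root).rglob("*.jar"):
        try:
            with zipfile.ZipFile(jar) as zf:
                if entry in zf.namelist():
                    matches.append(str(jar))
        except (zipfile.BadZipFile, OSError):
            # Skip files that are unreadable or are not valid zip archives
            continue
    return sorted(matches)
```

Calling find_jars_with_class with the expanded path to ~/nutch-0.8.1 and "dk.brics.automaton.RunAutomaton" would list every jar that provides the class. An empty list would suggest the jar is missing from the install. A non-empty list would suggest the jar is present but is not being picked up when the inject job runs.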