Hi,

I've switched from Nutch 0.7 to 0.8 and done the following steps. I created a urls.txt in a directory named seeds, then ran:

bin/hadoop dfs -put seeds seeds

060317 121440 parsing jar:file:/home/../nutch-nightly/lib/hadoop-0.1-dev.jar!/hadoop-default.xml
060317 121441 No FS indicated, using default:local

bin/nutch crawl seeds -dir crawled -depth 2 >& crawl.log

but in crawl.log:

060419 124302 parsing jar:file:/home/../nutch-nightly/lib/hadoop-0.1-dev.jar!/hadoop-default.xml
060419 124302 parsing jar:file:/home/../nutch-nightly/lib/hadoop-0.1-dev.jar!/mapred-default.xml
060419 124302 parsing /tmp/hadoop/mapred/local/job_e7cpf1.xml/localRunner
060419 124302 parsing file:/home/../nutch-nightly/conf/hadoop-site.xml
java.io.IOException: No input directories specified in: Configuration: defaults: hadoop-default.xml , mapred-default.xml , /tmp/hadoop/mapred/local/job_e7cpf1.xml/localRunnerfinal: hadoop-site.xml
    at org.apache.hadoop.mapred.InputFormatBase.listFiles(InputFormatBase.java:84)
    at org.apache.hadoop.mapred.InputFormatBase.getSplits(InputFormatBase.java:94)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:70)
060419 124302 Running job: job_e7cpf1
Exception in thread "main" java.io.IOException: Job failed!
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:310)
    at org.apache.nutch.crawl.Injector.inject(Injector.java:114)
    at org.apache.nutch.crawl.Crawl.main(Crawl.java:104)

Any ideas?
