Hi,

I've switched from Nutch 0.7 to 0.8 and did the following steps:
created a urls.txt in a directory named seeds
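
A minimal sketch of that step, for reference. The URL is a placeholder (an assumption, not the actual seed list), and the check at the end only confirms that the local seeds directory holds a non-empty file before it is handed to the crawl:

```shell
# Create the seed directory and a one-URL seed file.
# http://example.com/ is a placeholder, not the real seed list.
mkdir -p seeds
printf '%s\n' 'http://example.com/' > seeds/urls.txt

# Confirm the seed file exists and is non-empty.
if [ -s seeds/urls.txt ]; then
    echo "seed file ok"
else
    echo "seed file missing or empty" >&2
fi
```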

bin/hadoop dfs -put seeds seeds

060317 121440 parsing jar:file:/home/../nutch-nightly/lib/hadoop-0.1-dev.jar!/hadoop-default.xml
060317 121441 No FS indicated, using default:local

bin/nutch crawl seeds -dir crawled -depth 2 >& crawl.log
but crawl.log contains:
060419 124302 parsing jar:file:/home/../nutch-nightly/lib/hadoop-0.1-dev.jar!/hadoop-default.xml
060419 124302 parsing jar:file:/home/../nutch-nightly/lib/hadoop-0.1-dev.jar!/mapred-default.xml
060419 124302 parsing /tmp/hadoop/mapred/local/job_e7cpf1.xml/localRunner
060419 124302 parsing file:/home/../nutch-nightly/conf/hadoop-site.xml
java.io.IOException: No input directories specified in: Configuration: defaults: hadoop-default.xml , mapred-default.xml , /tmp/hadoop/mapred/local/job_e7cpf1.xml/localRunnerfinal: hadoop-site.xml
   at org.apache.hadoop.mapred.InputFormatBase.listFiles(InputFormatBase.java:84)
   at org.apache.hadoop.mapred.InputFormatBase.getSplits(InputFormatBase.java:94)
   at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:70)
060419 124302 Running job: job_e7cpf1
Exception in thread "main" java.io.IOException: Job failed!
   at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:310)
   at org.apache.nutch.crawl.Injector.inject(Injector.java:114)
   at org.apache.nutch.crawl.Crawl.main(Crawl.java:104)

Any ideas?


_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
