Add this to conf/core-site.xml:

<property>
  <name>fs.default.name</name>
  <value>hdfs://cluster0:9000</value>
</property>
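(The property goes inside the <configuration> element of core-site.xml.) Once fs.default.name points at HDFS, relative paths like crawl/urls.txt resolve against HDFS instead of the local disk, so the seed list has to be copied into HDFS first. A minimal sketch, assuming the seed file sits at ~/crawl/urls.txt locally as your error log suggests:

# create the crawl directory in HDFS (relative to your HDFS home directory)
hadoop fs -mkdir crawl
# copy the local seed list into it
hadoop fs -put ~/crawl/urls.txt crawl/urls.txt
# verify the file is now visible through HDFS
hadoop fs -ls crawl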
Thanks!
Xiao

On Sun, Jul 25, 2010 at 9:59 PM, Alex Luya <[email protected]> wrote:
> Hello,
> I ran this:
>
> a...@alexluya:~$ nutch crawl crawl/urls.txt -dir crawl -depth 3
>
> and got these errors:
>
> crawl started in: crawl
> rootUrlDir = crawl/urls.txt
> threads = 10
> depth = 3
> Injector: starting
> Injector: crawlDb: crawl/crawldb
> Injector: urlDir: crawl/urls.txt
> Injector: Converting injected urls to crawl db entries.
> Exception in thread "main" org.apache.hadoop.mapred.InvalidInputException:
> Input path does not exist: file:/home/alex/crawl/urls.txt
>         at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:179)
>         at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:190)
>         at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:797)
>         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1142)
>         at org.apache.nutch.crawl.Injector.inject(Injector.java:160)
>         at org.apache.nutch.crawl.Crawl.main(Crawl.java:113)
>
> Obviously, it is using the local file system by default. How can I configure it to use HDFS by default?

