Hi all,

When I use Nutch and Hadoop (in pseudo-distributed mode) and try to crawl a web site, I get an RPC timeout error:

    Injector: java.lang.RuntimeException: java.net.SocketTimeoutException: timed out waiting for rpc response
            at org.apache.hadoop.mapred.JobConf.getWorkingDirectory(JobConf.java:252)
            at org.apache.hadoop.mapred.JobConf.setInputPath(JobConf.java:155)
            at org.apache.nutch.crawl.Injector.inject(Injector.java:154)
            at org.apache.nutch.crawl.Injector.run(Injector.java:192)
            at org.apache.hadoop.util.ToolBase.doMain(ToolBase.java:189)
            at org.apache.nutch.crawl.Injector.main(Injector.java:182)
    Caused by: java.net.SocketTimeoutException: timed out waiting for rpc response
            at org.apache.hadoop.ipc.Client.call(Client.java:473)
            at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:163)
            at org.apache.hadoop.dfs.$Proxy0.getProtocolVersion(Unknown Source)
            at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:247)
            at org.apache.hadoop.dfs.DFSClient.<init>(DFSClient.java:105)
            at org.apache.hadoop.dfs.DistributedFileSystem$RawDistributedFileSystem.initialize(DistributedFileSystem.java:67)
            at org.apache.hadoop.fs.FilterFileSystem.initialize(FilterFileSystem.java:57)
            at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:160)
            at org.apache.hadoop.fs.FileSystem.getNamed(FileSystem.java:119)
            at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:91)
            at org.apache.hadoop.mapred.JobConf.getWorkingDirectory(JobConf.java:248)
            ... 5 more
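In case the configuration matters: my hadoop-site.xml follows the usual pseudo-distributed template. The host/port values below are illustrative placeholders rather than a verbatim copy of my file:

    <?xml version="1.0"?>
    <configuration>
      <!-- address the DFS client connects to; the RPC timeout above
           occurs while contacting this namenode (placeholder value) -->
      <property>
        <name>fs.default.name</name>
        <value>localhost:9000</value>
      </property>
      <!-- jobtracker address for MapReduce (placeholder value) -->
      <property>
        <name>mapred.job.tracker</name>
        <value>localhost:9001</value>
      </property>
      <!-- single node, so one replica is enough -->
      <property>
        <name>dfs.replication</name>
        <value>1</value>
      </property>
    </configuration>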
I have already loaded the seed list with

    hadoop dfs -put urls urls

(urls is a directory containing a single txt file), and when I run

    hadoop dfs -ls

I can see the urls directory on the DFS. start-all.sh also works fine.

Does anyone know how to solve this problem? Thanks.

--
程越强 Cheng Yueqiang