brainstorm
Sun, 13 Jul 2008 11:50:55 -0700
Looks like you haven't run

    bin/hadoop namenode -format

*before anything else* (run start-all.sh only *after* formatting the namenode)... this is just a guess.

I *do* recommend starting from scratch: read this howto and follow it strictly, step by step:

http://wiki.apache.org/nutch/NutchHadoopTutorial

It worked for me... good luck! ;)

On Fri, Jul 11, 2008 at 7:57 AM, kranthi reddy <[EMAIL PROTECTED]> wrote:
> Hi ,
>
> I am trying to crawl a few sites using nutch and hadoop . I have a cluster
> of 10 pc's and i have given nutch as a job file to hadoop. I am able to
> execute most commands like
>
> bin_temp/hadoop dfs -put xxx yyy (ls, mkdir) etc
>
> But when i try to run nutch then i get the following error.
>
> bin_temp/nutch crawl tempcrawl/urls -dir tempcrawl/crawl -depth 1
>
> Exception in thread "main" java.net.SocketTimeoutException: timed out waiting for rpc response
>     at org.apache.hadoop.ipc.Client.call(Client.java:473)
>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:163)
>     at org.apache.hadoop.dfs.$Proxy0.getProtocolVersion(Unknown Source)
>     at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:247)
>     at org.apache.hadoop.dfs.DFSClient.<init>(DFSClient.java:105)
>     at org.apache.hadoop.dfs.DistributedFileSystem$RawDistributedFileSystem.initialize(DistributedFileSystem.java:67)
>     at org.apache.hadoop.fs.FilterFileSystem.initialize(FilterFileSystem.java:57)
>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:160)
>     at org.apache.hadoop.fs.FileSystem.getNamed(FileSystem.java:119)
>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:91)
>     at org.apache.nutch.crawl.Crawl.main(Crawl.java:83)
>
> Some one please help me out.
>
> When i remove the hadoop-env.sh ,hadoop-site.xml and masters file and
> replace slaves with "localhost" ....i am able to crawl perfectly well (but
> only on master pc :(( )
>
> Thank you in advance.
> Kranthi reddy.B
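
PS: in case it helps, the startup order suggested in the reply can be sketched as a small script. This is only a rough sketch under a few assumptions: HADOOP_BIN and DRY_RUN are hypothetical variable names introduced here (set HADOOP_BIN=bin_temp for the layout in the quoted message), the script is run from the Hadoop install directory on the master, and the final "dfs -ls /" is just one possible sanity check, not something the tutorial prescribes. By default it only prints the commands; note that formatting the namenode wipes any existing HDFS metadata.

```shell
#!/bin/sh
# Sketch of the suggested order: format the namenode first, start the daemons second.
# HADOOP_BIN and DRY_RUN are hypothetical names used for illustration only.
HADOOP_BIN="${HADOOP_BIN:-bin}"   # directory holding the hadoop scripts
DRY_RUN="${DRY_RUN:-1}"           # 1 = only print the commands, 0 = execute them

run() {
    # Print the command in dry-run mode, otherwise execute it.
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

start_cluster() {
    # 1. Format the namenode before anything else (erases existing HDFS metadata).
    run "$HADOOP_BIN/hadoop" namenode -format || return 1
    # 2. Only after formatting, start the daemons.
    run "$HADOOP_BIN/start-all.sh" || return 1
    # 3. Sanity check: the DFS should answer before any nutch job is submitted.
    run "$HADOOP_BIN/hadoop" dfs -ls /
}

start_cluster
```

Running it with DRY_RUN=0 executes the three commands for real; if the last one hangs or times out, the namenode is still not reachable and the nutch crawl will fail the same way.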