You need to make sure that each slave node has a copy of the public key you generated on the master node, appended to that slave's ~/.ssh/authorized_keys file. The steps you ran below only set up passwordless login from the master to itself; the key also has to be installed on every slave.
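For example, something like this run on the master for every host in conf/slaves (a minimal sketch -- 'slave' stands for each slave's hostname, and the remote account is assumed to be the same user that runs the Hadoop daemons):

    # Append the master's public key to the slave's authorized_keys.
    cat $HOME/.ssh/id_rsa.pub | ssh slave 'mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys'

    # sshd falls back to password prompts if these files are group- or
    # world-writable, so tighten the permissions as well.
    ssh slave 'chmod 700 ~/.ssh && chmod 600 ~/.ssh/authorized_keys'

    # Verify: should print the slave's hostname with no password prompt.
    ssh slave hostname

Once every machine in conf/masters and conf/slaves (including the master itself) accepts the key, start-dfs.sh and start-mapred.sh should stop prompting.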
Miles

On 03/01/2008, Dhaya007 <[EMAIL PROTECTED]> wrote:
>
> Thanks Arun,
>
> I am able to run the datanode on the slave (as per the solution you gave
> about the listening port), but it still asks for the password while
> starting DFS and MapReduce.
>
> First I generated a passwordless RSA key as follows:
>
> ssh-keygen -t rsa -P ""
> cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
> ssh master
> ssh slave
>
> I started DFS on the master as follows:
>
> bin/start-dfs.sh
>
> It asks for the password. Please help me to resolve this (I don't know
> whether I am doing the ssh part correctly).
>
> Dhaya007 wrote:
> >
> > Arun C Murthy wrote:
> >>
> >> What version of Hadoop are you running?
> >> Dhaya007: hadoop-0.15.1
> >>
> >> http://wiki.apache.org/lucene-hadoop/Help
> >>
> >> Dhaya007 wrote:
> >>> datanode-slave.log
> >>> 2007-12-19 19:30:55,579 WARN org.apache.hadoop.dfs.DataNode: Invalid
> >>> directory in dfs.data.dir: directory is not writable:
> >>> /tmp/hadoop-hdpusr/dfs/data
> >>> 2007-12-19 19:30:55,579 ERROR org.apache.hadoop.dfs.DataNode: All
> >>> directories in dfs.data.dir are invalid.
> >>
> >> Did you check that directory?
> >> Dhaya007: Yes, I have checked the folder; there is no file saved in it.
> >>
> >> The DataNode is complaining that it doesn't have any 'valid'
> >> directories to store data in.
> >>
> >>> Tasktracker_slave.log
> >>> 2008-01-02 15:10:34,419 ERROR org.apache.hadoop.mapred.TaskTracker:
> >>> Can not start task tracker because java.net.UnknownHostException:
> >>> unknown host: localhost
> >>>   at org.apache.hadoop.ipc.Client$Connection.<init>(Client.java:136)
> >>>   at org.apache.hadoop.ipc.Client.getConnection(Client.java:532)
> >>>   at org.apache.hadoop.ipc.Client.call(Client.java:471)
> >>>   at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:184)
> >>>   at org.apache.hadoop.mapred.$Proxy0.getProtocolVersion(Unknown Source)
> >>>   at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:269)
> >>>   at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:293)
> >>>   at org.apache.hadoop.ipc.RPC.waitForProxy(RPC.java:246)
> >>>   at org.apache.hadoop.mapred.TaskTracker.initialize(TaskTracker.java:427)
> >>>   at org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:717)
> >>>   at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:1880)
> >>
> >> That probably means that the TaskTracker's hadoop-site.xml says that
> >> 'localhost' is the JobTracker, which isn't true...
> >>
> >> The hadoop-site.xml is as follows:
> >>
> >> <?xml version="1.0"?>
> >> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
> >>
> >> <!-- Put site-specific property overrides in this file. -->
> >>
> >> <configuration>
> >> <property>
> >>   <name>hadoop.tmp.dir</name>
> >>   <value>/home/hdusr/hadoop-${user.name}</value>
> >>   <description>A base for other temporary directories.</description>
> >> </property>
> >>
> >> <property>
> >>   <name>fs.default.name</name>
> >>   <value>hdfs://master:54310</value>
> >>   <description>The name of the default file system. A URI whose
> >>   scheme and authority determine the FileSystem implementation. The
> >>   uri's scheme determines the config property (fs.SCHEME.impl) naming
> >>   the FileSystem implementation class. The uri's authority is used to
> >>   determine the host, port, etc. for a filesystem.</description>
> >> </property>
> >>
> >> <property>
> >>   <name>mapred.job.tracker</name>
> >>   <value>master:54311</value>
> >>   <description>The host and port that the MapReduce job tracker runs at.
> >>   If "local", then jobs are run in-process as a single map
> >>   and reduce task.
> >>   </description>
> >> </property>
> >>
> >> <property>
> >>   <name>dfs.replication</name>
> >>   <value>2</value>
> >>   <description>Default block replication.
> >>   The actual number of replications can be specified when the file is
> >>   created. The default is used if replication is not specified in
> >>   create time.
> >>   </description>
> >> </property>
> >>
> >> <property>
> >>   <name>mapred.map.tasks</name>
> >>   <value>20</value>
> >>   <description>As a rule of thumb, use 10x the number of slaves
> >>   (i.e., number of tasktrackers).
> >>   </description>
> >> </property>
> >>
> >> <property>
> >>   <name>mapred.reduce.tasks</name>
> >>   <value>4</value>
> >>   <description>As a rule of thumb, use 2x the number of slave
> >>   processors (i.e., number of tasktrackers).
> >>   </description>
> >> </property>
> >> </configuration>
> >>
> >>> namenode-master.log
> >>> 2008-01-02 14:44:02,636 INFO org.apache.hadoop.dfs.Storage: Storage
> >>> directory /tmp/hadoop-hdpusr/dfs/name does not exist.
> >>> 2008-01-02 14:44:02,638 INFO org.apache.hadoop.ipc.Server: Stopping
> >>> server on 54310
> >>> 2008-01-02 14:44:02,653 ERROR org.apache.hadoop.dfs.NameNode:
> >>> org.apache.hadoop.dfs.InconsistentFSStateException: Directory
> >>> /tmp/hadoop-hdpusr/dfs/name is in an inconsistent state: storage
> >>> directory does not exist or is not accessible.
> >>
> >> That means that /tmp/hadoop-hdpusr/dfs/name doesn't exist or isn't
> >> accessible.
> >>
> >> Dhaya007: I have checked for the name folder, but I can't find any
> >> such folder in the specified dir.
> >>
> >> Overall, this looks like an acute case of wrong-configuration-itis.
> >> Dhaya007: Please provide a correct example configuration for a
> >> multi-node cluster other than
> >> http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_%28Single-Node_Cluster%29
> >> because I followed that one.
> >>
> >> Have you got the same hadoop-site.xml on all your nodes?
> >> Dhaya007: Yes
> >>
> >> More info here:
> >> http://lucene.apache.org/hadoop/docs/r0.15.1/cluster_setup.html
> >> Dhaya007: I followed the site you mentioned, but it did not solve
> >> the problem.
> >>
> >> Arun
> >>
> >>> 2008-01-02 15:10:34,420 INFO org.apache.hadoop.mapred.TaskTracker:
> >>> SHUTDOWN_MSG:
> >>> /************************************************************
> >>> SHUTDOWN_MSG: Shutting down TaskTracker at slave/172.16.0.58
> >>> ************************************************************/
> >>>
> >>> And all the ports are running. Sometimes it asks for a password and
> >>> sometimes it doesn't while starting DFS.
> >>>
> >>> Master logs:
> >>> 2008-01-02 14:44:02,677 INFO org.apache.hadoop.dfs.NameNode:
> >>> SHUTDOWN_MSG:
> >>> /************************************************************
> >>> SHUTDOWN_MSG: Shutting down NameNode at master/172.16.0.25
> >>> ************************************************************/
> >>>
> >>> Datanode-master.log
> >>> 2008-01-02 16:26:32,380 INFO org.apache.hadoop.ipc.RPC: Server at
> >>> localhost/127.0.0.1:54310 not available yet, Zzzzz...
> >>> 2008-01-02 16:26:33,390 INFO org.apache.hadoop.ipc.Client: Retrying
> >>> connect to server: localhost/127.0.0.1:54310. Already tried 1 time(s).
> >>> 2008-01-02 16:26:34,400 INFO org.apache.hadoop.ipc.Client: Retrying
> >>> connect to server: localhost/127.0.0.1:54310. Already tried 2 time(s).
> >>> 2008-01-02 16:26:35,410 INFO org.apache.hadoop.ipc.Client: Retrying
> >>> connect to server: localhost/127.0.0.1:54310. Already tried 3 time(s).
> >>> 2008-01-02 16:26:36,420 INFO org.apache.hadoop.ipc.Client: Retrying
> >>> connect to server: localhost/127.0.0.1:54310. Already tried 4 time(s).
> >>> ***********************************************
> >>> Jobtracker_master.log
> >>> 2008-01-02 16:25:41,040 INFO org.apache.hadoop.ipc.Client: Retrying
> >>> connect to server: localhost/127.0.0.1:54310. Already tried 10 time(s).
> >>> 2008-01-02 16:25:42,050 INFO org.apache.hadoop.mapred.JobTracker:
> >>> problem cleaning system directory: /tmp/hadoop-hdpusr/mapred/system
> >>> java.net.ConnectException: Connection refused
> >>>   at java.net.PlainSocketImpl.socketConnect(Native Method)
> >>>   at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
> >>>   at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
> >>>   at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
> >>>   at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
> >>>   at java.net.Socket.connect(Socket.java:520)
> >>>   at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:152)
> >>>   at org.apache.hadoop.ipc.Client.getConnection(Client.java:542)
> >>>   at org.apache.hadoop.ipc.Client.call(Client.java:471)
> >>>   at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:184)
> >>>   at org.apache.hadoop.dfs.$Proxy0.getProtocolVersion(Unknown Source)
> >>>   at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:269)
> >>>   at org.apache.hadoop.dfs.DFSClient.createNamenode(DFSClient.java:147)
> >>>   at org.apache.hadoop.dfs.DFSClient.<init>(DFSClient.java:161)
> >>>   at org.apache.hadoop.dfs.DistributedFileSystem.initialize(DistributedFileSystem.java:65)
> >>>   at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:159)
> >>>   at org.apache.hadoop.fs.FileSystem.getNamed(FileSystem.java:118)
> >>>   at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:90)
> >>>   at org.apache.hadoop.mapred.JobTracker.<init>(JobTracker.java:683)
> >>>   at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:120)
> >>>   at org.apache.hadoop.mapred.JobTracker.main(JobTracker.java:2052)
> >>> 2008-01-02 16:25:42,931 INFO org.apache.hadoop.ipc.Server: IPC Server
> >>> handler 5 on 54311, call getFilesystemName() from 127.0.0.1:49283:
> >>> error: org.apache.hadoop.mapred.JobTracker$IllegalStateException:
> >>> FileSystem object not available yet
> >>> org.apache.hadoop.mapred.JobTracker$IllegalStateException: FileSystem
> >>> object not available yet
> >>>   at org.apache.hadoop.mapred.JobTracker.getFilesystemName(JobTracker.java:1475)
> >>>   at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
> >>>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >>>   at java.lang.reflect.Method.invoke(Method.java:585)
> >>>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
> >>>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)
> >>> 2008-01-02 16:25:47,942 INFO org.apache.hadoop.ipc.Server: IPC Server
> >>> handler 6 on 54311, call getFilesystemName() from 127.0.0.1:49293:
> >>> error: org.apache.hadoop.mapred.JobTracker$IllegalStateException:
> >>> FileSystem object not available yet
> >>> org.apache.hadoop.mapred.JobTracker$IllegalStateException: FileSystem
> >>> object not available yet
> >>>   at org.apache.hadoop.mapred.JobTracker.getFilesystemName(JobTracker.java:1475)
> >>>   at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
> >>>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >>>   at java.lang.reflect.Method.invoke(Method.java:585)
> >>>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
> >>>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)
> >>> 2008-01-02 16:25:52,061 INFO org.apache.hadoop.ipc.Client: Retrying
> >>> connect to server: localhost/127.0.0.1:54310. Already tried 1 time(s).
> >>> 2008-01-02 16:25:52,951 INFO org.apache.hadoop.ipc.Server: IPC Server
> >>> handler 7 on 54311, call getFilesystemName() from 127.0.0.1:49304:
> >>> error: org.apache.hadoop.mapred.JobTracker$IllegalStateException:
> >>> FileSystem object not available yet
> >>> org.apache.hadoop.mapred.JobTracker$IllegalStateException: FileSystem
> >>> object not available yet
> >>>   at org.apache.hadoop.mapred.JobTracker.getFilesystemName(JobTracker.java:1475)
> >>>   at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
> >>>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >>>   at java.lang.reflect.Method.invoke(Method.java:585)
> >>>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
> >>>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)
> >>> 2008-01-02 16:25:53,070 INFO org.apache.hadoop.ipc.Client: Retrying
> >>> connect to server: localhost/127.0.0.1:54310. Already tried 2 time(s).
> >>> 2008-01-02 16:25:54,080 INFO org.apache.hadoop.ipc.Client: Retrying
> >>> connect to server: localhost/127.0.0.1:54310. Already tried 3 time(s).
> >>> 2008-01-02 16:25:55,090 INFO org.apache.hadoop.ipc.Client: Retrying
> >>> connect to server: localhost/127.0.0.1:54310. Already tried 4 time(s).
> >>> 2008-01-02 16:25:56,100 INFO org.apache.hadoop.ipc.Client: Retrying
> >>> connect to server: localhost/127.0.0.1:54310. Already tried 5 time(s).
> >>> 2008-01-02 16:25:56,281 INFO org.apache.hadoop.mapred.JobTracker:
> >>> SHUTDOWN_MSG:
> >>> /************************************************************
> >>> SHUTDOWN_MSG: Shutting down JobTracker at master/172.16.0.25
> >>> ************************************************************/
> >>>
> >>> Tasktracker_master.log
> >>> 2008-01-02 16:26:14,080 INFO org.apache.hadoop.ipc.Client: Retrying
> >>> connect to server: localhost/127.0.0.1:54311. Already tried 2 time(s).
> >>> 2008-01-02 16:28:34,510 INFO org.apache.hadoop.mapred.TaskTracker:
> >>> STARTUP_MSG:
> >>> /************************************************************
> >>> STARTUP_MSG: Starting TaskTracker
> >>> STARTUP_MSG:   host = master/172.16.0.25
> >>> STARTUP_MSG:   args = []
> >>> ************************************************************/
> >>> 2008-01-02 16:28:34,739 INFO org.mortbay.util.Credential: Checking
> >>> Resource aliases
> >>> 2008-01-02 16:28:34,827 INFO org.mortbay.http.HttpServer: Version
> >>> Jetty/5.1.4
> >>> 2008-01-02 16:28:35,281 INFO org.mortbay.util.Container: Started
> >>> [EMAIL PROTECTED]
> >>> 2008-01-02 16:28:35,332 INFO org.mortbay.util.Container: Started
> >>> WebApplicationContext[/,/]
> >>> 2008-01-02 16:28:35,332 INFO org.mortbay.util.Container: Started
> >>> HttpContext[/logs,/logs]
> >>> 2008-01-02 16:28:35,332 INFO org.mortbay.util.Container: Started
> >>> HttpContext[/static,/static]
> >>> 2008-01-02 16:28:35,336 INFO org.mortbay.http.SocketListener: Started
> >>> SocketListener on 0.0.0.0:50060
> >>> 2008-01-02 16:28:35,336 INFO org.mortbay.util.Container: Started
> >>> [EMAIL PROTECTED]
> >>> 2008-01-02 16:28:35,383 INFO org.apache.hadoop.metrics.jvm.JvmMetrics:
> >>> Initializing JVM Metrics with processName=TaskTracker, sessionId=
> >>> 2008-01-02 16:28:35,402 INFO org.apache.hadoop.mapred.TaskTracker:
> >>> TaskTracker up at: /127.0.0.1:49599
> >>> 2008-01-02 16:28:35,402 INFO org.apache.hadoop.mapred.TaskTracker:
> >>> Starting tracker tracker_master:/127.0.0.1:49599
> >>> 2008-01-02 16:28:35,406 INFO org.apache.hadoop.ipc.Server: IPC Server
> >>> listener on 49599: starting
> >>> 2008-01-02 16:28:35,406 INFO org.apache.hadoop.ipc.Server: IPC Server
> >>> handler 0 on 49599: starting
> >>> 2008-01-02 16:28:35,406 INFO org.apache.hadoop.ipc.Server: IPC Server
> >>> handler 1 on 49599: starting
> >>> 2008-01-02 16:28:35,490 INFO org.apache.hadoop.mapred.TaskTracker:
> >>> Starting thread: Map-events fetcher for all reduce tasks on
> >>> tracker_master:/127.0.0.1:49599
> >>> 2008-01-02 16:28:35,500 INFO org.apache.hadoop.mapred.TaskTracker:
> >>> Lost connection to JobTracker [localhost/127.0.0.1:54311]. Retrying...
> >>> org.apache.hadoop.ipc.RemoteException:
> >>> org.apache.hadoop.mapred.JobTracker$IllegalStateException: FileSystem
> >>> object not available yet
> >>>   at org.apache.hadoop.mapred.JobTracker.getFilesystemName(JobTracker.java:1475)
> >>>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >>>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> >>>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >>>   at java.lang.reflect.Method.invoke(Method.java:585)
> >>>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
> >>>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)
> >>>
> >>>   at org.apache.hadoop.ipc.Client.call(Client.java:482)
> >>>   at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:184)
> >>>   at org.apache.hadoop.mapred.$Proxy0.getFilesystemName(Unknown Source)
> >>>   at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:773)
> >>>   at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:1179)
> >>>   at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:1880)
> >>> *******************************************
> >>>
> >>> Please help me to resolve this.
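Those logs above are telling, by the way: the daemons on the master keep trying to reach the NameNode and JobTracker at localhost/127.0.0.1:54310 and :54311, even though hadoop-site.xml points them at master:54310 and master:54311. That usually means 'master' is resolving to the loopback address. A sketch of what /etc/hosts on both machines probably needs to contain -- the IPs are the ones shown in the STARTUP_MSG/SHUTDOWN_MSG lines, so substitute your own:

    127.0.0.1    localhost
    172.16.0.25  master
    172.16.0.25  slave

(Use 172.16.0.58 for slave, per the slave's SHUTDOWN_MSG.) If 'master' maps to 127.0.0.1 on the master itself -- on Ubuntu also watch for a '127.0.1.1 master' line -- the daemons bind only to the loopback interface and the slave can never reach them; and the slave's own "unknown host: localhost" error suggests its hosts file is missing the localhost entry entirely.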
> >>> Khalil Honsali wrote:
> >>>
> >>>> Hi,
> >>>>
> >>>> I think you need to post more information, for example an excerpt
> >>>> of the failing datanode log. Also, please clarify the issue of
> >>>> connectivity:
> >>>> - are you able to ssh passwordless (from master to slave, slave to
> >>>>   master, slave to slave, master to master)? You shouldn't be
> >>>>   entering a password every time...
> >>>> - are you able to telnet (not necessary but preferred)?
> >>>> - have you verified the ports are up using the netstat command?
> >>>>
> >>>> Besides, the tasktracker starts OK but not the datanode?
> >>>>
> >>>> K. Honsali
> >>>>
> >>>> On 02/01/2008, Dhaya007 <[EMAIL PROTECTED]> wrote:
> >>>>
> >>>>> I am new to Hadoop; if anything is wrong, please correct me...
> >>>>> I have configured a single/multi-node cluster using the following
> >>>>> link:
> >>>>> http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_%28Single-Node_Cluster%29
> >>>>> I have followed the link, but I am not able to start Hadoop in the
> >>>>> multi-node environment. The problems I am facing are as follows:
> >>>>>
> >>>>> 1. I have configured the master and slave nodes with passwordless
> >>>>> ssh; if I try to run start-dfs.sh, it prompts for the password for
> >>>>> the master and slave machines. (I have copied the .ssh/id_rsa.pub
> >>>>> key of the master into the slave's authorized_keys file.)
> >>>>>
> >>>>> 2. After giving the password, the datanode, namenode, jobtracker
> >>>>> and tasktracker started successfully on the master, but the
> >>>>> datanode is not started on the slave.
> >>>>>
> >>>>> 3. Sometimes step 2 works and sometimes it says permission denied.
> >>>>>
> >>>>> 4. I have checked the datanode log file on the slave; it says
> >>>>> "incompatible node". I then formatted the slave and the master and
> >>>>> started DFS with start-dfs.sh, but I am still getting the error.
> >>>>>
> >>>>> The host entries in /etc/hosts on both master and slave:
> >>>>> master
> >>>>> slave
> >>>>> conf/masters:
> >>>>> master
> >>>>> conf/slaves:
> >>>>> master
> >>>>> slave
> >>>>>
> >>>>> The hadoop-site.xml for both master and slave:
> >>>>>
> >>>>> <?xml version="1.0"?>
> >>>>> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
> >>>>>
> >>>>> <!-- Put site-specific property overrides in this file. -->
> >>>>>
> >>>>> <configuration>
> >>>>> <property>
> >>>>>   <name>hadoop.tmp.dir</name>
> >>>>>   <value>/home/hdusr/hadoop-${user.name}</value>
> >>>>>   <description>A base for other temporary directories.</description>
> >>>>> </property>
> >>>>>
> >>>>> <property>
> >>>>>   <name>fs.default.name</name>
> >>>>>   <value>hdfs://master:54310</value>
> >>>>>   <description>The name of the default file system. A URI whose
> >>>>>   scheme and authority determine the FileSystem implementation.
> >>>>>   The uri's scheme determines the config property (fs.SCHEME.impl)
> >>>>>   naming the FileSystem implementation class. The uri's authority
> >>>>>   is used to determine the host, port, etc. for a
> >>>>>   filesystem.</description>
> >>>>> </property>
> >>>>>
> >>>>> <property>
> >>>>>   <name>mapred.job.tracker</name>
> >>>>>   <value>master:54311</value>
> >>>>>   <description>The host and port that the MapReduce job tracker
> >>>>>   runs at. If "local", then jobs are run in-process as a single
> >>>>>   map and reduce task.
> >>>>>   </description>
> >>>>> </property>
> >>>>>
> >>>>> <property>
> >>>>>   <name>dfs.replication</name>
> >>>>>   <value>2</value>
> >>>>>   <description>Default block replication.
> >>>>>   The actual number of replications can be specified when the file
> >>>>>   is created.
> >>>>>   The default is used if replication is not specified in create
> >>>>>   time.
> >>>>>   </description>
> >>>>> </property>
> >>>>>
> >>>>> <property>
> >>>>>   <name>mapred.map.tasks</name>
> >>>>>   <value>20</value>
> >>>>>   <description>As a rule of thumb, use 10x the number of slaves
> >>>>>   (i.e., number of tasktrackers).
> >>>>>   </description>
> >>>>> </property>
> >>>>>
> >>>>> <property>
> >>>>>   <name>mapred.reduce.tasks</name>
> >>>>>   <value>4</value>
> >>>>>   <description>As a rule of thumb, use 2x the number of slave
> >>>>>   processors (i.e., number of tasktrackers).
> >>>>>   </description>
> >>>>> </property>
> >>>>> </configuration>
> >>>>>
> >>>>> Please help me to resolve this, or else provide any other tutorial
> >>>>> for multi-node cluster setup. I am eagerly waiting for the
> >>>>> tutorials.
> >>>>>
> >>>>> Thanks
> >>>>>
> >>>>> --
> >>>>> View this message in context:
> >>>>> http://www.nabble.com/Not-able-to-start-Data-Node-tp14573889p14573889.html
> >>>>> Sent from the Hadoop Users mailing list archive at Nabble.com.
>
> --
> View this message in context:
> http://www.nabble.com/Not-able-to-start-Data-Node-tp14573889p14594256.html
> Sent from the Hadoop Users mailing list archive at Nabble.com.
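One more thing, on the storage errors ("directory is not writable: /tmp/hadoop-hdpusr/dfs/data" and "Storage directory /tmp/hadoop-hdpusr/dfs/name does not exist"): a sketch of the usual fix, assuming the hdpusr account and the paths from the logs above:

    # On each node: make sure the storage root exists and is owned by
    # the user that starts the daemons (hdpusr in these logs).
    mkdir -p /tmp/hadoop-hdpusr/dfs/data
    chown -R hdpusr /tmp/hadoop-hdpusr

    # On the master only: (re)create the name directory by formatting
    # HDFS. Warning: this erases any existing HDFS data.
    bin/hadoop namenode -format

Note also that the daemons are writing under /tmp/hadoop-hdpusr, which is Hadoop's built-in default of /tmp/hadoop-${user.name}, even though the hadoop-site.xml you posted sets hadoop.tmp.dir to /home/hdusr/hadoop-${user.name}. That suggests the daemons are not actually reading that hadoop-site.xml, which would fit Arun's "wrong-configuration-itis" diagnosis -- double-check which conf directory each node is really using.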