Thanks Arun, I am now able to run the DataNode on the slave (as per the solution you gave about the listening port).
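(In case it helps anyone else, this is roughly how I verified it on the slave. I am assuming the JDK's jps tool is on the PATH, and 50010 is just the default DataNode data-transfer port, so adjust if yours differs:)

jps                        # should now list a DataNode process
netstat -an | grep 50010   # the DataNode's default data port should show LISTEN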
But it still asks for the password while starting DFS and MapReduce. First I generated a passphrase-less RSA key as follows:

ssh-keygen -t rsa -P ""
cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
ssh master
ssh slave

Then I started DFS on the master with bin/start-dfs.sh, and it asks for the password. Please help me resolve this. I am not sure I am doing the SSH part correctly; I have written out the full sequence of commands I ran at the bottom of this mail, below the quoted thread.

Dhaya007 wrote:
>
> Arun C Murthy wrote:
>>
>> What version of Hadoop are you running?
>> Dhaya007: hadoop-0.15.1
>>
>> http://wiki.apache.org/lucene-hadoop/Help
>>
>> Dhaya007 wrote:
>>> datanode-slave.log
>>> 2007-12-19 19:30:55,579 WARN org.apache.hadoop.dfs.DataNode: Invalid directory in dfs.data.dir: directory is not writable: /tmp/hadoop-hdpusr/dfs/data
>>> 2007-12-19 19:30:55,579 ERROR org.apache.hadoop.dfs.DataNode: All directories in dfs.data.dir are invalid.
>>
>> Did you check that directory?
>> Dhaya007: Yes, I have checked the folder; there are no files saved in it.
>>
>> DataNode is complaining that it doesn't have any 'valid' directories to store data in.
>>
>>> Tasktracker_slave.log
>>> 2008-01-02 15:10:34,419 ERROR org.apache.hadoop.mapred.TaskTracker: Can not start task tracker because java.net.UnknownHostException: unknown host: localhost
>>> at org.apache.hadoop.ipc.Client$Connection.<init>(Client.java:136)
>>> at org.apache.hadoop.ipc.Client.getConnection(Client.java:532)
>>> at org.apache.hadoop.ipc.Client.call(Client.java:471)
>>> at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:184)
>>> at org.apache.hadoop.mapred.$Proxy0.getProtocolVersion(Unknown Source)
>>> at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:269)
>>> at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:293)
>>> at org.apache.hadoop.ipc.RPC.waitForProxy(RPC.java:246)
>>> at org.apache.hadoop.mapred.TaskTracker.initialize(TaskTracker.java:427)
>>> at org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:717)
>>> at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:1880)
>>
>> That probably means that the TaskTracker's hadoop-site.xml says that 'localhost' is the JobTracker, which isn't true...
>>
>> hadoop-site.xml is as follows:
>> <?xml version="1.0"?>
>> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>>
>> <!-- Put site-specific property overrides in this file. -->
>>
>> <configuration>
>> <property>
>>   <name>hadoop.tmp.dir</name>
>>   <value>/home/hdusr/hadoop-${user.name}</value>
>>   <description>A base for other temporary directories.</description>
>> </property>
>>
>> <property>
>>   <name>fs.default.name</name>
>>   <value>hdfs://master:54310</value>
>>   <description>The name of the default file system. A URI whose scheme and authority determine the FileSystem implementation. The uri's scheme determines the config property (fs.SCHEME.impl) naming the FileSystem implementation class. The uri's authority is used to determine the host, port, etc. for a filesystem.</description>
>> </property>
>>
>> <property>
>>   <name>mapred.job.tracker</name>
>>   <value>master:54311</value>
>>   <description>The host and port that the MapReduce job tracker runs at. If "local", then jobs are run in-process as a single map and reduce task.</description>
>> </property>
>>
>> <property>
>>   <name>dfs.replication</name>
>>   <value>2</value>
>>   <description>Default block replication. The actual number of replications can be specified when the file is created. The default is used if replication is not specified in create time.
>>   </description>
>> </property>
>>
>> <property>
>>   <name>mapred.map.tasks</name>
>>   <value>20</value>
>>   <description>As a rule of thumb, use 10x the number of slaves (i.e., number of tasktrackers).</description>
>> </property>
>>
>> <property>
>>   <name>mapred.reduce.tasks</name>
>>   <value>4</value>
>>   <description>As a rule of thumb, use 2x the number of slave processors (i.e., number of tasktrackers).</description>
>> </property>
>> </configuration>
>>
>>> namenode-master.log
>>> 2008-01-02 14:44:02,636 INFO org.apache.hadoop.dfs.Storage: Storage directory /tmp/hadoop-hdpusr/dfs/name does not exist.
>>> 2008-01-02 14:44:02,638 INFO org.apache.hadoop.ipc.Server: Stopping server on 54310
>>> 2008-01-02 14:44:02,653 ERROR org.apache.hadoop.dfs.NameNode: org.apache.hadoop.dfs.InconsistentFSStateException: Directory /tmp/hadoop-hdpusr/dfs/name is in an inconsistent state: storage directory does not exist or is not accessible.
>>
>> That means that /tmp/hadoop-hdpusr/dfs/name doesn't exist or isn't accessible.
>> Dhaya007: I have checked the name folder, but I cannot find any folder in the specified directory.
>>
>> Overall, this looks like an acute case of wrong-configuration-itis.
>> Dhaya007: Please provide a correct example site configuration for a multi-node cluster other than http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_%28Single-Node_Cluster%29, because I followed that one.
>>
>> Have you got the same hadoop-site.xml on all your nodes?
>> Dhaya007: Yes.
>>
>> More info here: http://lucene.apache.org/hadoop/docs/r0.15.1/cluster_setup.html
>> Dhaya007: I followed the same site you have mentioned, but no solution.
>>
>> Arun
>>
>>> 2008-01-02 15:10:34,420 INFO org.apache.hadoop.mapred.TaskTracker: SHUTDOWN_MSG:
>>> /************************************************************
>>> SHUTDOWN_MSG: Shutting down TaskTracker at slave/172.16.0.58
>>> ************************************************************/
>>>
>>> And all the ports are running.
>>> Sometimes it asks for the password and sometimes it doesn't while starting the DFS.
>>>
>>> Master logs
>>> 2008-01-02 14:44:02,677 INFO org.apache.hadoop.dfs.NameNode: SHUTDOWN_MSG:
>>> /************************************************************
>>> SHUTDOWN_MSG: Shutting down NameNode at master/172.16.0.25
>>> ************************************************************/
>>>
>>> Datanode-master.log
>>> 2008-01-02 16:26:32,380 INFO org.apache.hadoop.ipc.RPC: Server at localhost/127.0.0.1:54310 not available yet, Zzzzz...
>>> 2008-01-02 16:26:33,390 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 1 time(s).
>>> 2008-01-02 16:26:34,400 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 2 time(s).
>>> 2008-01-02 16:26:35,410 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 3 time(s).
>>> 2008-01-02 16:26:36,420 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 4 time(s).
>>> ***********************************************
>>> Jobtracker_master.log
>>> 2008-01-02 16:25:41,040 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 10 time(s).
>>> 2008-01-02 16:25:42,050 INFO org.apache.hadoop.mapred.JobTracker: problem cleaning system directory: /tmp/hadoop-hdpusr/mapred/system
>>> java.net.ConnectException: Connection refused
>>> at java.net.PlainSocketImpl.socketConnect(Native Method)
>>> at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
>>> at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
>>> at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
>>> at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
>>> at java.net.Socket.connect(Socket.java:520)
>>> at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:152)
>>> at org.apache.hadoop.ipc.Client.getConnection(Client.java:542)
>>> at org.apache.hadoop.ipc.Client.call(Client.java:471)
>>> at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:184)
>>> at org.apache.hadoop.dfs.$Proxy0.getProtocolVersion(Unknown Source)
>>> at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:269)
>>> at org.apache.hadoop.dfs.DFSClient.createNamenode(DFSClient.java:147)
>>> at org.apache.hadoop.dfs.DFSClient.<init>(DFSClient.java:161)
>>> at org.apache.hadoop.dfs.DistributedFileSystem.initialize(DistributedFileSystem.java:65)
>>> at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:159)
>>> at org.apache.hadoop.fs.FileSystem.getNamed(FileSystem.java:118)
>>> at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:90)
>>> at org.apache.hadoop.mapred.JobTracker.<init>(JobTracker.java:683)
>>> at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:120)
>>> at org.apache.hadoop.mapred.JobTracker.main(JobTracker.java:2052)
>>> 2008-01-02 16:25:42,931 INFO org.apache.hadoop.ipc.Server: IPC Server handler 5 on 54311, call getFilesystemName() from 127.0.0.1:49283: error: org.apache.hadoop.mapred.JobTracker$IllegalStateException: FileSystem object not available yet
>>> org.apache.hadoop.mapred.JobTracker$IllegalStateException: FileSystem object not available yet
>>> at org.apache.hadoop.mapred.JobTracker.getFilesystemName(JobTracker.java:1475)
>>> at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
>>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>> at java.lang.reflect.Method.invoke(Method.java:585)
>>> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
>>> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)
>>> 2008-01-02 16:25:47,942 INFO org.apache.hadoop.ipc.Server: IPC Server handler 6 on 54311, call getFilesystemName() from 127.0.0.1:49293: error: org.apache.hadoop.mapred.JobTracker$IllegalStateException: FileSystem object not available yet
>>> org.apache.hadoop.mapred.JobTracker$IllegalStateException: FileSystem object not available yet
>>> at org.apache.hadoop.mapred.JobTracker.getFilesystemName(JobTracker.java:1475)
>>> at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
>>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>> at java.lang.reflect.Method.invoke(Method.java:585)
>>> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
>>> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)
>>> 2008-01-02 16:25:52,061 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 1 time(s).
>>> 2008-01-02 16:25:52,951 INFO org.apache.hadoop.ipc.Server: IPC Server handler 7 on 54311, call getFilesystemName() from 127.0.0.1:49304: error: org.apache.hadoop.mapred.JobTracker$IllegalStateException: FileSystem object not available yet
>>> org.apache.hadoop.mapred.JobTracker$IllegalStateException: FileSystem object not available yet
>>> at org.apache.hadoop.mapred.JobTracker.getFilesystemName(JobTracker.java:1475)
>>> at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
>>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>> at java.lang.reflect.Method.invoke(Method.java:585)
>>> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
>>> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)
>>> 2008-01-02 16:25:53,070 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 2 time(s).
>>> 2008-01-02 16:25:54,080 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 3 time(s).
>>> 2008-01-02 16:25:55,090 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 4 time(s).
>>> 2008-01-02 16:25:56,100 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 5 time(s).
>>> 2008-01-02 16:25:56,281 INFO org.apache.hadoop.mapred.JobTracker: SHUTDOWN_MSG:
>>> /************************************************************
>>> SHUTDOWN_MSG: Shutting down JobTracker at master/172.16.0.25
>>> ************************************************************/
>>>
>>> Tasktracker_master.log
>>> 2008-01-02 16:26:14,080 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:54311. Already tried 2 time(s).
>>> 2008-01-02 16:28:34,510 INFO org.apache.hadoop.mapred.TaskTracker: STARTUP_MSG:
>>> /************************************************************
>>> STARTUP_MSG: Starting TaskTracker
>>> STARTUP_MSG: host = master/172.16.0.25
>>> STARTUP_MSG: args = []
>>> ************************************************************/
>>> 2008-01-02 16:28:34,739 INFO org.mortbay.util.Credential: Checking Resource aliases
>>> 2008-01-02 16:28:34,827 INFO org.mortbay.http.HttpServer: Version Jetty/5.1.4
>>> 2008-01-02 16:28:35,281 INFO org.mortbay.util.Container: Started [EMAIL PROTECTED]
>>> 2008-01-02 16:28:35,332 INFO org.mortbay.util.Container: Started WebApplicationContext[/,/]
>>> 2008-01-02 16:28:35,332 INFO org.mortbay.util.Container: Started HttpContext[/logs,/logs]
>>> 2008-01-02 16:28:35,332 INFO org.mortbay.util.Container: Started HttpContext[/static,/static]
>>> 2008-01-02 16:28:35,336 INFO org.mortbay.http.SocketListener: Started SocketListener on 0.0.0.0:50060
>>> 2008-01-02 16:28:35,336 INFO org.mortbay.util.Container: Started [EMAIL PROTECTED]
>>> 2008-01-02 16:28:35,383 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=TaskTracker, sessionId=
>>> 2008-01-02 16:28:35,402 INFO org.apache.hadoop.mapred.TaskTracker: TaskTracker up at: /127.0.0.1:49599
>>> 2008-01-02 16:28:35,402 INFO org.apache.hadoop.mapred.TaskTracker: Starting tracker tracker_master:/127.0.0.1:49599
>>> 2008-01-02 16:28:35,406 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 49599: starting
>>> 2008-01-02 16:28:35,406 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 49599: starting
>>> 2008-01-02 16:28:35,406 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 49599: starting
>>> 2008-01-02 16:28:35,490 INFO org.apache.hadoop.mapred.TaskTracker: Starting thread: Map-events fetcher for all reduce tasks on tracker_master:/127.0.0.1:49599
>>> 2008-01-02 16:28:35,500 INFO org.apache.hadoop.mapred.TaskTracker: Lost connection to JobTracker [localhost/127.0.0.1:54311]. Retrying...
>>> org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.mapred.JobTracker$IllegalStateException: FileSystem object not available yet
>>> at org.apache.hadoop.mapred.JobTracker.getFilesystemName(JobTracker.java:1475)
>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>> at java.lang.reflect.Method.invoke(Method.java:585)
>>> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
>>> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)
>>>
>>> at org.apache.hadoop.ipc.Client.call(Client.java:482)
>>> at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:184)
>>> at org.apache.hadoop.mapred.$Proxy0.getFilesystemName(Unknown Source)
>>> at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:773)
>>> at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:1179)
>>> at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:1880)
>>> *******************************************
>>>
>>> Please help me to resolve the same.
>>>
>>> Khalil Honsali wrote:
>>>
>>>> Hi,
>>>>
>>>> I think you need to post more information, for example an excerpt of the failing datanode log.
>>>> Also, please clarify the issue of connectivity:
>>>> - Are you able to ssh passwordless (from master to slave, slave to master, slave to slave, master to master)? You shouldn't be entering a password every time.
>>>> - Are you able to telnet (not necessary, but preferred)?
>>>> - Have you verified that the ports are RUNNING using the netstat command?
>>>>
>>>> Besides, the tasktracker starts OK but not the datanode?
>>>>
>>>> K. Honsali
>>>>
>>>> On 02/01/2008, Dhaya007 <[EMAIL PROTECTED]> wrote:
>>>>>
>>>>> I am new to Hadoop; if anything is wrong, please correct me.
>>>>> I have configured a single/multi-node cluster using the following link:
>>>>> http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_%28Single-Node_Cluster%29
>>>>> I have followed the link, but I am not able to start Hadoop in the multi-node environment.
>>>>> The problems I am facing are as follows:
>>>>> 1. I have configured the master and slave nodes with passphrase-less SSH, but if I try to run start-dfs.sh it prompts for the password for the master and slave machines. (I have copied the master's .ssh/id_rsa.pub key into the slave's authorized_keys file.)
>>>>>
>>>>> 2. After giving the password, the datanode, namenode, jobtracker, and tasktracker start successfully on the master, but the datanode is not started on the slave.
>>>>>
>>>>> 3. Sometimes step 2 works and sometimes it says permission denied.
>>>>>
>>>>> 4. I have checked the datanode log file on the slave; it says "incompatible node", so I formatted the slave and the master and started DFS with start-dfs.sh, but I am still getting the error.
>>>>>
>>>>> The host entries in /etc/hosts on both master and slave:
>>>>> master
>>>>> slave
>>>>> conf/masters:
>>>>> master
>>>>> conf/slaves:
>>>>> master
>>>>> slave
>>>>>
>>>>> The hadoop-site.xml for both master/slave:
>>>>> <?xml version="1.0"?>
>>>>> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>>>>>
>>>>> <!-- Put site-specific property overrides in this file. -->
>>>>>
>>>>> <configuration>
>>>>> <property>
>>>>>   <name>hadoop.tmp.dir</name>
>>>>>   <value>/home/hdusr/hadoop-${user.name}</value>
>>>>>   <description>A base for other temporary directories.</description>
>>>>> </property>
>>>>>
>>>>> <property>
>>>>>   <name>fs.default.name</name>
>>>>>   <value>hdfs://master:54310</value>
>>>>>   <description>The name of the default file system. A URI whose scheme and authority determine the FileSystem implementation. The uri's scheme determines the config property (fs.SCHEME.impl) naming the FileSystem implementation class. The uri's authority is used to determine the host, port, etc. for a filesystem.</description>
>>>>> </property>
>>>>>
>>>>> <property>
>>>>>   <name>mapred.job.tracker</name>
>>>>>   <value>master:54311</value>
>>>>>   <description>The host and port that the MapReduce job tracker runs at. If "local", then jobs are run in-process as a single map and reduce task.</description>
>>>>> </property>
>>>>>
>>>>> <property>
>>>>>   <name>dfs.replication</name>
>>>>>   <value>2</value>
>>>>>   <description>Default block replication. The actual number of replications can be specified when the file is created. The default is used if replication is not specified in create time.</description>
>>>>> </property>
>>>>>
>>>>> <property>
>>>>>   <name>mapred.map.tasks</name>
>>>>>   <value>20</value>
>>>>>   <description>As a rule of thumb, use 10x the number of slaves (i.e., number of tasktrackers).
>>>>>   </description>
>>>>> </property>
>>>>>
>>>>> <property>
>>>>>   <name>mapred.reduce.tasks</name>
>>>>>   <value>4</value>
>>>>>   <description>As a rule of thumb, use 2x the number of slave processors (i.e., number of tasktrackers).</description>
>>>>> </property>
>>>>> </configuration>
>>>>>
>>>>> Please help me to resolve the same, or else provide any other tutorial for multi-node cluster setup. I am eagerly waiting for the tutorials.
>>>>>
>>>>> Thanks
>>>>>
>>>>> --
>>>>> View this message in context: http://www.nabble.com/Not-able-to-start-Data-Node-tp14573889p14573889.html
>>>>> Sent from the Hadoop Users mailing list archive at Nabble.com.
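P.S. Since I am not sure the SSH part is right, here is the complete sequence I am using for the keys, written out as a sketch. It assumes the same Hadoop user account (hdpusr in my case) exists on both master and slave; please tell me if any step is wrong:

# on the master, as the hadoop user
ssh-keygen -t rsa -P ""                          # creates ~/.ssh/id_rsa and ~/.ssh/id_rsa.pub
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys  # so the master can ssh to itself

# copy the master's public key to the slave and append it there
scp ~/.ssh/id_rsa.pub hdpusr@slave:/tmp/master_key.pub
ssh hdpusr@slave "mkdir -p ~/.ssh && cat /tmp/master_key.pub >> ~/.ssh/authorized_keys"

# sshd usually refuses key authentication if these files are group/world writable
chmod 700 ~/.ssh && chmod 600 ~/.ssh/authorized_keys
ssh hdpusr@slave "chmod 700 ~/.ssh && chmod 600 ~/.ssh/authorized_keys"

# both of these should now log in without prompting for a password
ssh master
ssh slave

As far as I understand, start-dfs.sh also logs in to "master" itself (it is listed in conf/slaves), so the master has to accept its own key too; that is why I append the key to the master's own authorized_keys as well.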
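Also, about the earlier DataNode error "directory is not writable: /tmp/hadoop-hdpusr/dfs/data": this is the sequence I am now using to prepare the storage directories before formatting. It is only a sketch under my own assumptions; the /tmp/hadoop-hdpusr paths are taken from my logs, and I notice they do not match the configured hadoop.tmp.dir of /home/hdusr/hadoop-${user.name}, so maybe my hadoop-site.xml is not even being read on those nodes.

# on each node, as the hadoop user (hdpusr in my logs)
rm -rf /tmp/hadoop-hdpusr/dfs/data   # clear leftovers from an earlier format; stale data was giving me the "incompatible" error
mkdir -p /tmp/hadoop-hdpusr/dfs/data /tmp/hadoop-hdpusr/dfs/name
chmod -R 755 /tmp/hadoop-hdpusr      # the DataNode refuses directories it cannot write to

# on the master only: reformat HDFS (this wipes all data, which is fine for my empty cluster) and restart
bin/hadoop namenode -format
bin/start-dfs.sh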
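Finally, since Arun asked whether all nodes have the same hadoop-site.xml: to be sure, I am now copying the conf files from the master instead of editing them by hand on the slave. The install path below is just an assumption (whatever HADOOP_HOME is on your machines):

# run on the master; assumes Hadoop is unpacked at ~/hadoop-0.15.1 on both machines
scp ~/hadoop-0.15.1/conf/hadoop-site.xml ~/hadoop-0.15.1/conf/masters ~/hadoop-0.15.1/conf/slaves \
    hdpusr@slave:~/hadoop-0.15.1/conf/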