Sandy, I have no idea about your issue :(

Zander, your problem is probably this JIRA issue:
http://issues.apache.org/jira/browse/HADOOP-1212
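In short, the fix is either to wipe the datanode's data directory or to make its namespaceID match the namenode's; both workarounds are spelled out in the link below. Here is a rough, untested sketch using the data directory and IDs from your log (run it on the node whose datanode fails to start, adjust the paths to your setup, and note that the first option throws away the blocks stored on that node):

  # option 1: clear the datanode's storage and let it re-register with the namenode's ID
  bin/stop-all.sh
  rm -rf /usr/local/hadoop-datastore/hadoop-hadoop/dfs/data
  bin/start-all.sh

  # option 2: edit the datanode's namespaceID to match the namenode
  # (if I remember the layout right, it lives in current/VERSION under the data dir)
  bin/stop-all.sh
  vi /usr/local/hadoop-datastore/hadoop-hadoop/dfs/data/current/VERSION
  #   change  namespaceID=722953254  to the namenode's value  namespaceID=1050914495
  bin/start-all.sh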
Here are two workarounds explained:
http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_(Multi-Node_Cluster)#java.io.IOException:_Incompatible_namespaceIDs
I haven't tried them myself; hope it helps.

Rasit

2009/2/17 zander1013 <zander1...@gmail.com>:
>
> hi,
>
> i am not seeing the DataNode run either. but i am seeing an extra process
> TaskTracker run.
>
> here is what happens when i start the cluster, run jps, and stop the cluster...
>
> had...@node0:/usr/local/hadoop$ bin/start-all.sh
> starting namenode, logging to /usr/local/hadoop/bin/../logs/hadoop-hadoop-namenode-node0.out
> node0.local: starting datanode, logging to /usr/local/hadoop/bin/../logs/hadoop-hadoop-datanode-node0.out
> node1.local: starting datanode, logging to /usr/local/hadoop/bin/../logs/hadoop-hadoop-datanode-node1.out
> node0.local: starting secondarynamenode, logging to /usr/local/hadoop/bin/../logs/hadoop-hadoop-secondarynamenode-node0.out
> starting jobtracker, logging to /usr/local/hadoop/bin/../logs/hadoop-hadoop-jobtracker-node0.out
> node0.local: starting tasktracker, logging to /usr/local/hadoop/bin/../logs/hadoop-hadoop-tasktracker-node0.out
> node1.local: starting tasktracker, logging to /usr/local/hadoop/bin/../logs/hadoop-hadoop-tasktracker-node1.out
> had...@node0:/usr/local/hadoop$ jps
> 13353 TaskTracker
> 13126 SecondaryNameNode
> 12846 NameNode
> 13455 Jps
> 13232 JobTracker
> had...@node0:/usr/local/hadoop$ bin/stop-all.sh
> stopping jobtracker
> node0.local: stopping tasktracker
> node1.local: stopping tasktracker
> stopping namenode
> node0.local: no datanode to stop
> node1.local: no datanode to stop
> node0.local: stopping secondarynamenode
> had...@node0:/usr/local/hadoop$
>
> here is the tail of the log file for the session above...
> ************************************************************/
> 2009-02-16 19:35:13,999 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
> /************************************************************
> STARTUP_MSG: Starting DataNode
> STARTUP_MSG:   host = node1/127.0.1.1
> STARTUP_MSG:   args = []
> STARTUP_MSG:   version = 0.19.0
> STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.19 -r 713890;
> compiled by 'ndaley' on Fri Nov 14 03:12:29 UTC 2008
> ************************************************************/
> 2009-02-16 19:35:18,999 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException:
> Incompatible namespaceIDs in /usr/local/hadoop-datastore/hadoop-hadoop/dfs/data:
> namenode namespaceID = 1050914495; datanode namespaceID = 722953254
>         at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:233)
>         at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:148)
>         at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:287)
>         at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:205)
>         at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1199)
>         at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1154)
>         at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1162)
>         at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1284)
>
> 2009-02-16 19:35:19,000 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
> /************************************************************
> SHUTDOWN_MSG: Shutting down DataNode at node1/127.0.1.1
> ************************************************************/
>
> i have not seen DataNode run yet. i have only started and stopped the
> cluster a couple of times.
>
> i tried to reformat the datanode and namenode with bin/hadoop datanode -format
> and bin/hadoop namenode -format from the /usr/local/hadoop dir.
>
> please advise
>
> zander
>
>
> Mithila Nagendra wrote:
>>
>> Hey Sandy
>> I had a similar problem with Hadoop. All I did was stop all the daemons
>> using stop-all.sh, then format the namenode again using hadoop namenode
>> -format. After this I went on to restart everything using start-all.sh.
>>
>> I hope you don't have much data on the datanode; reformatting would erase
>> everything.
>>
>> Hope this helps!
>> Mithila
>>
>>
>> On Sat, Feb 14, 2009 at 2:39 AM, james warren <ja...@rockyou.com> wrote:
>>
>>> Sandy -
>>>
>>> I suggest you take a look into your NameNode and DataNode logs. From the
>>> information posted, these would likely be at
>>>
>>> /Users/hadoop/hadoop-0.18.2/bin/../logs/hadoop-hadoop-namenode-loteria.cs.tamu.edu.log
>>> /Users/hadoop/hadoop-0.18.2/bin/../logs/hadoop-hadoop-jobtracker-loteria.cs.tamu.edu.log
>>>
>>> If the cause isn't obvious from what you see there, could you please post
>>> the last few lines from each log?
>>>
>>> -jw
>>>
>>> On Fri, Feb 13, 2009 at 3:28 PM, Sandy <snickerdoodl...@gmail.com> wrote:
>>>
>>> > Hello,
>>> >
>>> > I would really appreciate any help I can get on this! I've suddenly run
>>> > into a very strange error.
>>> >
>>> > when I do:
>>> > bin/start-all
>>> > I get:
>>> > hadoop$ bin/start-all.sh
>>> > starting namenode, logging to
>>> > /Users/hadoop/hadoop-0.18.2/bin/../logs/hadoop-hadoop-namenode-loteria.cs.tamu.edu.out
>>> > starting jobtracker, logging to
>>> > /Users/hadoop/hadoop-0.18.2/bin/../logs/hadoop-hadoop-jobtracker-loteria.cs.tamu.edu.out
>>> >
>>> > No datanode, secondary namenode or jobtracker are being started.
>>> >
>>> > When I try to upload anything to the dfs, I get a "node in safemode" error
>>> > (even after waiting 5 minutes), presumably because it's trying to reach a
>>> > datanode that does not exist. The same "safemode" error occurs when I try
>>> > to run jobs.
>>> >
>>> > I have tried bin/stop-all and then bin/start-all again. I get the same
>>> > problem!
>>> >
>>> > This is incredibly strange, since I was previously able to start and run
>>> > jobs without any issue using this version on this machine. I am running
>>> > jobs on a single Mac Pro running OS X 10.5.
>>> >
>>> > I have tried updating to hadoop-0.19.0, and I get the same problem. I have
>>> > even tried this using previous versions, and I'm getting the same problem!
>>> >
>>> > Anyone have any idea why this could suddenly be happening? What am I doing
>>> > wrong?
>>> >
>>> > For convenience, I'm including portions of both conf/hadoop-env.sh and
>>> > conf/hadoop-site.xml:
>>> >
>>> > --- hadoop-env.sh ---
>>> > # Set Hadoop-specific environment variables here.
>>> >
>>> > # The only required environment variable is JAVA_HOME. All others are
>>> > # optional. When running a distributed configuration it is best to
>>> > # set JAVA_HOME in this file, so that it is correctly defined on
>>> > # remote nodes.
>>> >
>>> > # The java implementation to use. Required.
>>> > export JAVA_HOME=/System/Library/Frameworks/JavaVM.framework/Versions/1.6.0/Home
>>> >
>>> > # Extra Java CLASSPATH elements. Optional.
>>> > # export HADOOP_CLASSPATH=
>>> >
>>> > # The maximum amount of heap to use, in MB. Default is 1000.
>>> > export HADOOP_HEAPSIZE=3000
>>> > ...
>>> > --- hadoop-site.xml ---
>>> > <configuration>
>>> >
>>> > <property>
>>> >   <name>hadoop.tmp.dir</name>
>>> >   <value>/Users/hadoop/hadoop-0.18.2/hadoop-${user.name}</value>
>>> >   <description>A base for other temporary directories.</description>
>>> > </property>
>>> >
>>> > <property>
>>> >   <name>fs.default.name</name>
>>> >   <value>hdfs://localhost:9000</value>
>>> >   <description>The name of the default file system. A URI whose
>>> >   scheme and authority determine the FileSystem implementation. The
>>> >   uri's scheme determines the config property (fs.SCHEME.impl) naming
>>> >   the FileSystem implementation class. The uri's authority is used to
>>> >   determine the host, port, etc. for a filesystem.</description>
>>> > </property>
>>> >
>>> > <property>
>>> >   <name>mapred.job.tracker</name>
>>> >   <value>localhost:9001</value>
>>> >   <description>The host and port that the MapReduce job tracker runs
>>> >   at. If "local", then jobs are run in-process as a single map
>>> >   and reduce task.
>>> >   </description>
>>> > </property>
>>> >
>>> > <property>
>>> >   <name>mapred.tasktracker.tasks.maximum</name>
>>> >   <value>1</value>
>>> >   <description>The maximum number of tasks that will be run simultaneously
>>> >   by a task tracker
>>> >   </description>
>>> > </property>
>>> > ...
>>>
>>
>
> --
> View this message in context:
> http://www.nabble.com/datanode-not-being-started-tp22006929p22049288.html
> Sent from the Hadoop core-user mailing list archive at Nabble.com.

--
M. Raşit ÖZDAŞ