Re: Inconsistency in namenode's and datanode's namespaceID

2008-07-03 Thread Konstantin Shvachko

Yes, this is a known bug:
http://issues.apache.org/jira/browse/HADOOP-1212
After reformatting the name-node, you should manually remove the current
directory from every data-node and then start the cluster again.
I do not believe there is any other way.
Thanks,
--Konstantin
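
For illustration only, the manual cleanup described above could be sketched in Python roughly as follows. This is a minimal sketch, not a Hadoop tool: the function name remove_datanode_current and the dry_run flag are hypothetical, and it assumes you pass in the same directories you configured in dfs.data.dir. Note that removing the current directory discards whatever block data the data-node holds.

```python
import shutil
from pathlib import Path


def remove_datanode_current(data_dirs, dry_run=True):
    """Remove the 'current' subdirectory under each data-node storage dir.

    data_dirs: iterable of paths, assumed to be the values of dfs.data.dir.
    dry_run:   hypothetical safety flag; when True nothing is deleted.
    Returns the list of 'current' directories that were (or would be) removed.
    """
    removed = []
    for d in data_dirs:
        current = Path(d) / "current"
        if current.is_dir():
            if not dry_run:
                # Deletes the stored blocks together with the VERSION file.
                shutil.rmtree(current)
            removed.append(current)
    return removed
```

Run it with dry_run=True first to see which directories would be affected, stop the data-node before deleting anything, and restart the cluster afterwards.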

Taeho Kang wrote:

No, I don't think it's a bug.

Your datanode's data partition/directory was probably used in another HDFS
setup and thus had a different namespaceID.

Alternatively, you could have used a different partition/directory for your
new HDFS setup by setting a different value for dfs.data.dir on your
datanode. In that case, though, you can't access your old HDFS's data.


On Thu, Jul 3, 2008 at 4:21 AM, Xuan Dzung Doan [EMAIL PROTECTED]
wrote:


I was following the quickstart guide to run pseudo-distributed operations
with Hadoop 0.16.4. I got it to work successfully the first time, but I
failed when I repeated the steps (I tried to redo everything, starting from
reformatting HDFS). Looking at the log files of the daemons, I found that
the datanode failed to start because its namespaceID didn't match the
namenode's. I then found that the namespaceID is stored in the text
file VERSION under dfs/data/current and dfs/name/current for the datanode
and the namenode, respectively. The reformatting step changes the
namespaceID of the namenode but not that of the datanode, and that's the
cause of the inconsistency. So after reformatting, if I manually update
the namespaceID for the datanode, things work totally fine again.

I guess there are probably others who had this same experience. Is it a bug
in Hadoop 0.16.4? If so, has it been taken care of in later versions?

Thanks,
David.


Re: Inconsistency in namenode's and datanode's namespaceID

2008-07-02 Thread Taeho Kang
No, I don't think it's a bug.

Your datanode's data partition/directory was probably used in another HDFS
setup and thus had a different namespaceID.

Alternatively, you could have used a different partition/directory for your
new HDFS setup by setting a different value for dfs.data.dir on your
datanode. In that case, though, you can't access your old HDFS's data.


On Thu, Jul 3, 2008 at 4:21 AM, Xuan Dzung Doan [EMAIL PROTECTED]
wrote:

 I was following the quickstart guide to run pseudo-distributed operations
 with Hadoop 0.16.4. I got it to work successfully the first time, but I
 failed when I repeated the steps (I tried to redo everything, starting from
 reformatting HDFS). Looking at the log files of the daemons, I found that
 the datanode failed to start because its namespaceID didn't match the
 namenode's. I then found that the namespaceID is stored in the text
 file VERSION under dfs/data/current and dfs/name/current for the datanode
 and the namenode, respectively. The reformatting step changes the
 namespaceID of the namenode but not that of the datanode, and that's the
 cause of the inconsistency. So after reformatting, if I manually update
 the namespaceID for the datanode, things work totally fine again.

 I guess there are probably others who had this same experience. Is it a bug
 in Hadoop 0.16.4? If so, has it been taken care of in later versions?

 Thanks,
 David.
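
The check described in the original question, comparing the namespaceID recorded under dfs/name/current and dfs/data/current, could be sketched in Python along these lines. This is only an illustrative sketch: the helper names read_namespace_id and namespace_ids_match are hypothetical, and it assumes the VERSION file consists of key=value lines (with optional # comment lines), one of which is namespaceID.

```python
from pathlib import Path


def read_namespace_id(version_file):
    """Return the namespaceID value from a VERSION file, or None if absent.

    Assumes a simple key=value text format with optional '#' comment lines.
    """
    for line in Path(version_file).read_text().splitlines():
        line = line.strip()
        if line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key.strip() == "namespaceID":
            return value.strip()
    return None


def namespace_ids_match(name_dir, data_dir):
    """Compare the IDs stored in <name_dir>/current and <data_dir>/current."""
    name_id = read_namespace_id(Path(name_dir) / "current" / "VERSION")
    data_id = read_namespace_id(Path(data_dir) / "current" / "VERSION")
    # A missing ID on either side is treated as a mismatch.
    return name_id is not None and name_id == data_id
```

For the pseudo-distributed layout mentioned in the thread, a call such as namespace_ids_match("dfs/name", "dfs/data") would report whether the datanode is likely to be rejected at startup; the paths should be adjusted to wherever dfs.name.dir and dfs.data.dir actually point.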