On the first run you want the Namenode to initialize its directories (where it stores the VERSION file, fsimage and edits). On subsequent formats you are making sure you have a new, EMPTY file system. If you don't format, the Namenode will load the existing fsimage and edits. There is also the matter of generating a new namespace ID, which is matched against the Datanodes' IDs. So if you format the Namenode, you need to clean up the data on the Datanodes.
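
To see whether a Datanode still carries the ID of a previous format, you can compare the namespaceID lines in the two VERSION files. Below is a rough sketch, not an official tool. It assumes each storage directory keeps a current/VERSION file with a namespaceID= line; the function names and the throwaway demo files are made up for illustration and stand in for the real files under your dfs.name.dir and dfs.data.dir.

```shell
#!/bin/sh
# Print the namespaceID recorded in a Hadoop storage VERSION file.
# (namespace_id is a hypothetical helper name, not a Hadoop command.)
namespace_id() {
    sed -n 's/^namespaceID=//p' "$1"
}

# Succeed (exit status 0) only if both VERSION files carry the same namespaceID.
same_namespace() {
    [ "$(namespace_id "$1")" = "$(namespace_id "$2")" ]
}

# Demo with temporary files standing in for
# <dfs.name.dir>/current/VERSION and <dfs.data.dir>/current/VERSION.
demo_dir=$(mktemp -d)
printf 'namespaceID=111\nstorageType=NAME_NODE\n' > "$demo_dir/nn_VERSION"
printf 'namespaceID=222\nstorageType=DATA_NODE\n' > "$demo_dir/dn_VERSION"

if same_namespace "$demo_dir/nn_VERSION" "$demo_dir/dn_VERSION"; then
    echo "namespaceID matches"
else
    echo "namespaceID mismatch: clean up the Datanode storage directory"
fi

rm -rf "$demo_dir"
```

With the demo files above the IDs differ, so the script reports a mismatch, which is the situation you get after re-formatting the Namenode without cleaning the Datanodes.
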
On the other hand, if you just add Datanodes to a running cluster - you don't have to format NN.

Boris.

On 3/9/11 8:27 PM, "Adarsh Sharma" <[email protected]> wrote:

> Dear all,
>
> I have configured several times a Hadoop Cluster of 2,3,5,8 nodes but
> one doubt in my mind always occur.
> Why it is necessary to format Hadoop Namenode by *bin/hadoop -namenode
> format *command.
> What is the reason and logic behind this.
>
> Please justify if someone knows.
>
>
> Thanks & best Regards,
>
> Adarsh Sharma
