Thanks, Jeff. Deleting the contents of dfs.data.dir on the cloned data node worked.
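
In case it helps anyone else who hits the same storage ID conflict, this is roughly what I ran on the cloned node before starting its datanode. It is just a scripted sketch of "delete everything under dfs.data.dir"; the path below is an example, so substitute whatever dfs.data.dir points to in your hdfs-site.xml:

    # Sketch: empty the cloned datanode's dfs.data.dir so it registers as a fresh node.
    # The path is an assumption -- use the actual value of dfs.data.dir from hdfs-site.xml.
    import os
    import shutil

    data_dir = "/data/dfs/dn"  # assumed dfs.data.dir on the cloned node

    # Remove everything under dfs.data.dir (block files and the VERSION metadata
    # that carries the duplicated storage ID), but keep the directory itself.
    for entry in os.listdir(data_dir):
        path = os.path.join(data_dir, entry)
        if os.path.isdir(path):
            shutil.rmtree(path)
        else:
            os.remove(path)

The same thing can of course be done by hand with rm on the node. Once dfs.data.dir was empty and the datanode was started again, it registered without the conflict in the namenode log.
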
On Wed, May 11, 2011 at 5:02 PM, Jeff Bean <jwfb...@cloudera.com> wrote:

> If I understand correctly, the datanode reports its blocks based on the
> contents of dfs.data.dir.
>
> When you cloned the data node, you cloned all of its blocks as well.
>
> When you add a "fresh" datanode to the cluster, you add one that has an
> empty dfs.data.dir.
>
> Try clearing out dfs.data.dir before adding the new node.
>
> Jeff
>
> On Wed, May 11, 2011 at 1:59 PM, Steve Cohen <mail4st...@gmail.com> wrote:
>
>> Hello,
>>
>> We are running an HDFS cluster and we decided we wanted to add a new
>> datanode. Since we are using virtual machines, we just cloned an existing
>> datanode. We added it to the slaves list and started up the cluster. We
>> started getting log messages like this in the namenode log:
>>
>> 2011-05-11 15:59:44,148 ERROR hdfs.StateChange - BLOCK*
>> NameSystem.getDatanode: Data node 10.104.211.58:50010 is attempting to
>> report storage ID DS-1360904153-10.104.211.57-50010-1293288346692. Node
>> 10.104.211.57:50010 is expected to serve this storage.
>> 2011-05-11 15:59:46,975 ERROR hdfs.StateChange - BLOCK*
>> NameSystem.getDatanode: Data node 10.104.211.57:50010 is attempting to
>> report storage ID DS-1360904153-10.104.211.57-50010-1293288346692. Node
>> 10.104.211.58:50010 is expected to serve this storage.
>>
>> I understand that this is because the datanodes have exactly the same
>> information, so the first datanode that connects takes precedence.
>>
>> Is it possible to just wipe one of the datanodes so it is blank, or do we
>> have to format the entire HDFS filesystem from the namenode to add the new
>> datanode?
>>
>> Thanks,
>> Steve Cohen