What is the replication factor you have set on the cluster? If it is 2, then every block has a replica on each of the two datanodes, so both nodes hold a full copy of the data and their data directories will be the same size.
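
To put rough numbers on the replication point, here is a small back-of-the-envelope sketch in Python. It is not part of Hadoop; the function name is made up for illustration, and it assumes replicas are spread evenly across datanodes and that HDFS never puts two replicas of the same block on the same datanode.

```python
def expected_bytes_per_node(logical_bytes, replication, num_datanodes):
    """Rough estimate of raw HDFS usage on each datanode.

    Assumptions (illustrative only): replicas are spread evenly, and a
    block never has more replicas than there are datanodes.
    """
    # A block cannot have two replicas on one datanode, so the number of
    # replicas actually stored is capped by the number of datanodes.
    effective_replicas = min(replication, num_datanodes)
    raw_total = logical_bytes * effective_replicas
    return raw_total / num_datanodes


if __name__ == "__main__":
    logical = 100  # e.g. 100 GB of logical data in HDFS
    for rep in (1, 2):
        per_node = expected_bytes_per_node(logical, rep, 2)
        print(f"replication={rep}: about {per_node:g} GB on each of 2 nodes")
```

With replication 2 on two nodes, each node ends up with the whole data set, which matches identical "du -sh" output on both machines. With replication 1 you would expect roughly half on each node, although in practice placement can be skewed toward the node the writer runs on.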
Use the "hadoop dfsadmin -report" command to get a full report of the datanodes. AFAIK, running HBase on top of HDFS makes no difference to how data is distributed/replicated between the HDFS datanodes.

Thanks
Divye Sheth

On Wed, Mar 12, 2014 at 12:37 PM, Kashif Jawed Siddiqui <[email protected]> wrote:

> You should use hadoop fs command OR hdfs dfs command to check
>
> Usage: hadoop fs [generic options] -du [-s] [-h] <path> ...
>
> OR
>
> Usage: hdfs dfs -du [-s] [-h] <path> ...
>
> Regards
> KASHIF
>
> -----Original Message-----
> From: Vimal Jain [mailto:[email protected]]
> Sent: 12 March 2014 11:31
> To: [email protected]; [email protected]
> Subject: Size of data directory same on all nodes in cluster
>
> Hi,
> I have set up a 2 node HBase cluster on top of a 2 node HDFS cluster.
> When I perform the "du -sh" command on the data directory (where hadoop stores data) on both machines, it shows the same size.
> As per my understanding, half of the entire data is stored on one machine and the other half on the other machine.
> Please help.
>
> --
> Thanks and Regards,
> Vimal Jain
