The 5 RS will connect to all 10 DNs, so the HBase data ends up distributed 
across every DN.  However, when writing to HDFS the first replica always goes 
to the local node.  Because of this, the 5 DNs that are hosting the 5 RS could 
potentially hold more data than the other 5 DNs.
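
To get a rough feel for the size of that skew, here is a toy simulation.  It 
is only a sketch under stated assumptions, not HDFS code: it assumes a 
replication factor of 3, equally sized blocks, every write coming from an RS, 
and the non-local replicas landing on uniformly random other DNs (real HDFS 
placement is rack-aware, which this ignores).  The function name and its 
parameters are made up for illustration.

```python
import random
from collections import Counter


def simulate_block_placement(num_datanodes=10, num_regionservers=5,
                             num_blocks=10000, replication=3, seed=0):
    """Toy model: count block replicas per DataNode.

    Assumptions (not taken from HDFS itself): the first replica lands on
    the writer's local DataNode, the remaining replicas land on distinct
    DataNodes chosen uniformly at random, and rack awareness is ignored.
    """
    rng = random.Random(seed)
    datanodes = list(range(num_datanodes))
    # The first `num_regionservers` DataNodes also host a RegionServer.
    rs_nodes = datanodes[:num_regionservers]
    counts = Counter({dn: 0 for dn in datanodes})

    for _ in range(num_blocks):
        writer = rng.choice(rs_nodes)  # only RegionServers write HBase data
        others = [dn for dn in datanodes if dn != writer]
        replicas = [writer] + rng.sample(others, replication - 1)
        for dn in replicas:
            counts[dn] += 1
    return counts


if __name__ == "__main__":
    counts = simulate_block_placement()
    for dn in sorted(counts):
        role = "DN+RS" if dn < 5 else "DN only"
        print(f"node {dn} ({role}): {counts[dn]} replicas")
```

Under these assumptions the DNs co-located with an RS receive every first 
replica plus a share of the remote ones, so they finish with noticeably more 
replicas than the DN-only nodes.  A real cluster may behave differently.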

In almost all installations I've been a part of, #RS == #DN, so I don't have 
much practical experience with how this might behave.

JG
 
> -----Original Message-----
> From: Hari Sreekumar [mailto:[email protected]]
> Sent: Tuesday, December 14, 2010 8:49 PM
> To: user
> Subject: Confusion on the role of regionserver
> 
> Hi,
> 
>         HBase stores data on HDFS, and RS handles read/write requests. But how
> does all this fit in? e.g., say I have a 10 node cluster, all running datanode
> and only 5 of them running RS process. So, will the HBase data be only on the
> RS nodes or distributed across all DNs?
> 
> Thanks,
> Hari
