The 5 RS will connect to all 10 DNs, so the HBase data will be distributed across all of them. However, when a process running on a DN host writes to HDFS, the first replica always goes to that local node. Because of this, the 5 DNs hosting the 5 RS could end up holding more data than the other 5 DNs.
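To make the possible skew concrete, here is a rough Python sketch of that placement behaviour. It is only a toy model, not HDFS code: the function name `simulate_block_placement`, the block count, and the replication factor of 3 are assumptions for illustration, and the remote replicas are picked uniformly at random, which ignores the rack awareness of the real placement policy.

```python
import random


def simulate_block_placement(num_datanodes=10, num_regionservers=5,
                             num_blocks=10_000, replication=3, seed=0):
    """Toy model: count block replicas per DN when only some DNs host an RS.

    Assumption: RS i runs on DN i, the first replica is always local to the
    writing RS, and the remaining replicas go to random other DNs.
    """
    rng = random.Random(seed)
    counts = [0] * num_datanodes
    for _ in range(num_blocks):
        # Pick which RS writes this block; its DN gets the first replica.
        writer = rng.randrange(num_regionservers)
        counts[writer] += 1
        # Remaining replicas land on distinct DNs other than the writer.
        others = [dn for dn in range(num_datanodes) if dn != writer]
        for dn in rng.sample(others, replication - 1):
            counts[dn] += 1
    return counts


counts = simulate_block_placement()
rs_avg = sum(counts[:5]) / 5       # DNs that also host an RS
other_avg = sum(counts[5:]) / 5    # DNs without an RS
print("replicas per DN:", counts)
print("average on RS nodes:", rs_avg, "average on other nodes:", other_avg)
```

In this model every DN receives replicas, but the DNs that co-host an RS hold noticeably more because they also take every local first replica.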
In almost all installations I've been a part of, #RS == #DN, so I don't have much practical experience with how this might behave.

JG

> -----Original Message-----
> From: Hari Sreekumar [mailto:[email protected]]
> Sent: Tuesday, December 14, 2010 8:49 PM
> To: user
> Subject: Confusion on the role of regionserver
>
> Hi,
>
> HBase stores data on HDFS, and RS handles read/write requests. But how
> does all this fit in? e.g, say I have a 10 node cluster, all running datanode
> and only 5 of them running RS process. So, will the HBase data be only on the RS
> nodes or distributed across all DNs?
>
> Thanks,
> Hari
