Hi Yong,

HBase trades availability for consistency (please see the recent thread on
this list: "HBase and Consistency in CAP").

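RegionServers read data from HDFS. If you set aside the performance
implications of data locality for a moment, it does not matter where the
RegionServer reads the data from. So when a RegionServer goes down, its
regions can be reassigned to another RegionServer, which reads the same data
from HDFS.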

You can look at it this way: DataNodes (HDFS) handle the distribution of the
data, while RegionServers handle the distribution of CPU load and control
access to the data.

Hotspotting is a problem. You deal with it by avoiding it :), i.e. distribute 
the data in a way that does not lead to hotspotting (by choosing proper row 
keys).
Hot regions can be split, etc.
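
To illustrate what "choosing proper row keys" can look like, here is a minimal
sketch of one common approach, salting. It assumes monotonically increasing
keys (e.g. timestamps) that would otherwise all land in the same region. The
class name, the salt() helper and the bucket count are made up for the
example; none of this is an HBase API.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class SaltedRowKey {
    // Hypothetical helper (not part of HBase): prefix the key with one
    // bucket byte derived from a hash of the key, so that consecutive keys
    // are spread over `buckets` key ranges instead of a single one.
    // Assumes 1 <= buckets <= 128 so the bucket fits in a non-negative byte.
    static byte[] salt(byte[] key, int buckets) {
        int bucket = Math.floorMod(Arrays.hashCode(key), buckets);
        byte[] salted = new byte[key.length + 1];
        salted[0] = (byte) bucket;
        System.arraycopy(key, 0, salted, 1, key.length);
        return salted;
    }

    public static void main(String[] args) {
        int buckets = 8; // assumption: table pre-split into 8 regions
        for (long ts = 1323459600000L; ts < 1323459600005L; ts++) {
            byte[] key = Long.toString(ts).getBytes(StandardCharsets.UTF_8);
            byte[] salted = salt(key, buckets);
            // Print the bucket prefix next to the original key.
            System.out.println(salted[0] + " | " + ts);
        }
    }
}
```

The salt is computed from the key itself, so a Get can recompute it. The
price is that a range scan has to be issued once per bucket.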

According to the documentation found at
https://issues.apache.org/jira/browse/HDFS-265, hflush only returns to the
client when all nodes in the pipeline have received and acknowledged the data
(which does not necessarily mean it is on disk yet).

-- Lars


----- Original Message -----
From: yonghu <[email protected]>
To: [email protected]
Cc: 
Sent: Friday, December 9, 2011 11:40 AM
Subject: availability and data replica issues of HBase

Hello,

I read some discussions on the mailing list. They mention that read and
write operations for the same data object are routed to the same
RegionServer. This strategy can guarantee data consistency. But what
about availability? If this RegionServer is down or temporarily
unavailable, will the master assign a new RegionServer to process
data requests, or just wait until that RegionServer comes back? If the
master assigns a new RegionServer, how does the new RegionServer obtain
the data?

The other issue is about load balancing. If a huge number of read and
write operations apply only to a small set of data, one RegionServer
may become a hot spot. How does HBase deal with this problem?

The last question is about data replicas. The HBase data is still
stored in HDFS. HDFS uses eager synchronization (pipelining) to
synchronize all replicas. When HBase writes data into HDFS, when does
HDFS return the write acknowledgement to HBase: after one replica has
been updated, or only after all replicas have been updated?

Thanks

Yong
