The feature depends on hdfs support. Once we have that, we can implement this feature in HBase.
Cheers

On Tue, Jan 22, 2013 at 8:49 AM, Otis Gospodnetic <otis.gospodne...@gmail.com> wrote:

> This sounds hugely useful to me and is one of those "why doesn't HBase have
> that" things that bugged me.
>
> Is there an issue to watch?
>
> http://search-hadoop.com/?q=region+failover+secondary&fc_project=HBase&fc_type=issue
> doesn't find any.
>
> Thanks,
> Otis
> --
> HBASE Performance Monitoring - http://sematext.com/spm/index.html
>
>
> On Mon, Jan 21, 2013 at 7:55 PM, Jonathan Hsieh <j...@cloudera.com> wrote:
>
> > The main motivation is to maintain good performance on RS failovers.
> > This is also tied with hdfs and its block placement policy. Let me
> > explain as I understand it. If we control the hdfs block placement
> > strategy we can write all blocks for a hfile (or for all hfiles
> > related to a region) to the same set of data nodes. If the RS fails,
> > they favor failover to a node that has a local copy of all the blocks.
> >
> > Today, when you write an hfile to hdfs, for each block the first
> > replica goes to the local data node but the others get dispersed
> > around the cluster randomly at a per block granularity. The problem
> > here is that if the rs fails, the new rs that gets the responsibility
> > for the region has to read files that are spread all over the cluster
> > and with roughly 1/nth of the data local. This means that the
> > recovered region is slower until a compaction localizes the data again.
> >
> > They've gone in and modified hdfs and their hbase to take advantage of
> > this idea. I believe the randomization policy is enforced per region
> > -- if an rs serves 25 regions, all the files within each region are
> > sent to the same set of secondary/tertiary nodes, but each region
> > sends to a different set of secondary/tertiary nodes.
> >
> > Jon.
> >
> > On Mon, Jan 21, 2013 at 3:48 PM, Devaraj Das <d...@hortonworks.com> wrote:
> > > In 0.89-fb branch I stumbled upon stuff that indicated that there is a
> > > concept of secondary and tertiary regionserver. Could someone with
> > > more insights please shed some light on this?
> > > Might be useful to do the analysis on whether it makes sense for trunk..
> > > Thanks
> > > Devaraj
> >
> > --
> > // Jonathan Hsieh (shay)
> > // Software Engineer, Cloudera
> > // j...@cloudera.com
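
P.S. To make the placement argument in Jon's quoted message concrete, here is a rough toy simulation. It is only a sketch under stated assumptions, not how HDFS or the 0.89-fb branch actually implements anything: the function names, node names, and cluster size are all made up for illustration, and the "random" policy ignores rack awareness, which the real HDFS default placement does take into account. It compares per-block placement of the second and third replicas with a per-region choice of secondary/tertiary nodes, and reports how much of a region's data the best surviving node holds locally after the primary fails.

```python
import random


def place_per_block(num_blocks, primary, nodes, rng, replicas=3):
    """Per-block placement: first replica on the primary, the rest random per block."""
    others = [n for n in nodes if n != primary]
    return [[primary] + rng.sample(others, replicas - 1) for _ in range(num_blocks)]


def place_per_region(num_blocks, primary, nodes, rng, replicas=3):
    """Per-region placement: secondary/tertiary nodes are chosen once for all blocks."""
    others = [n for n in nodes if n != primary]
    favored = [primary] + rng.sample(others, replicas - 1)
    return [list(favored) for _ in range(num_blocks)]


def best_failover_locality(placement, primary):
    """Fraction of blocks held by the best surviving node once the primary is gone."""
    counts = {}
    for block in placement:
        for node in block:
            if node != primary:
                counts[node] = counts.get(node, 0) + 1
    return max(counts.values()) / len(placement)


rng = random.Random(0)                      # fixed seed so runs are repeatable
nodes = [f"dn{i}" for i in range(20)]       # hypothetical 20-node cluster
per_block = place_per_block(200, "dn0", nodes, rng)
per_region = place_per_region(200, "dn0", nodes, rng)

print("per-block  best locality:", best_failover_locality(per_block, "dn0"))
print("per-region best locality:", best_failover_locality(per_region, "dn0"))
```

With per-region placement the secondary and tertiary nodes each hold every block, so failing over to either of them keeps the region fully local. With per-block placement no single surviving node holds more than a small share of the blocks, which matches the "roughly 1/nth of the data local" point above.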