[jira] [Commented] (HBASE-10070) HBase read high-availability using eventually consistent region replicas

stack (JIRA) Fri, 17 Jan 2014 15:46:34 -0800

    [ 
https://issues.apache.org/jira/browse/HBASE-10070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13875417#comment-13875417
 ]


stack commented on HBASE-10070:
-------------------------------

Ok on the timing.  You know how I feel about 1.0 -- sooner rather than later -- 
but hopefully this feature gets done in time.

Looking at HBASE-10347, I have a 'design level' concern so let me raise it here 
rather than there.  Let me repeat a comment I made there:

{quote}
After thinking more on this, I 'get' why you have the replicas listed inside 
the row rather than as rows themselves [in hbase:meta].  The row in hbase:meta 
becomes a proxy or facade for the little cluster of regions, one of which is 
the primary with the others read replicas.  If that is the case, let's 
recognize it as such and make proper accommodation in the code base and model.

Problems I see are:

+ HRegionInfo is now overloaded.  Before, it was the info on a specific region.  
Now it is trying to serve two purposes: its original intent and also as a 
descriptor of the region-serving 'cluster' made of a primary and replicas.  
Let's avoid overloading what up to now has had a clear role in the hbase model.
+ The primary holds the 'pole position' being the name of the region in meta.  
The read replicas are differently named with the 00001 and 00002, etc., 
interpolated into the middle of the region name.  I suppose doing it this way 
'minimizes' the disturbance in the code base but I'm worried this naming 
exception will only confuse though it minimizes change.  Why would the primary 
not be named like the replica regions?  

On the latter I can hear a reply that goes, "Those who do not need read 
replicas will be unaffected", to which I would counter that this ensures the 
feature will forever be ghetto and no one will use it because it is 
unexercised.

Trying to ensure that we do not paint ourselves into a corner and to avoid the 
ghetto, looking beyond read replicas to full-on quorum read/writes, I can 
imagine we'd need some means like the above where the hbase:meta row name is 
no longer the physical name of a region but rather a logical name.  With read 
replicas, the primary is region number 00000, but when doing quorum 
read/writes, the leader will need to be able to change over the life of the 
quorum.
{quote}

Going forward, all regions get an index?  By default the index is zero.  When 
there are replicas or quorum members, the indices distinguish the members.  
With read replicas, the region with index 0 is the primary.  In a quorum, the 
index has no special meaning.  In the past we have had two naming conventions 
for regions live side by side in the one live cluster.  We could do it again.
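
To make the suggestion concrete, here is a rough sketch of what "every region 
carries an index" could look like.  It is only an illustration under 
assumptions: the class IndexedRegionName, the Mode enum, and the 
"<logical name>_<5-digit index>" layout are all hypothetical and are not 
existing HBase classes or the naming used in the HBASE-10347 patch.

```java
import java.util.Objects;

/**
 * Hypothetical sketch: the hbase:meta row name is a logical name, and every
 * physical region name is derived from it plus an index (default 0).
 */
final class IndexedRegionName {
    /** Index of a region without replicas, and of the primary under read replicas. */
    static final int DEFAULT_INDEX = 0;

    /** How the members sharing one logical name are coordinated (assumed modes). */
    enum Mode { READ_REPLICAS, QUORUM }

    private final String logicalName; // assumed to be the hbase:meta row name
    private final int index;          // distinguishes replicas or quorum members

    IndexedRegionName(String logicalName, int index) {
        if (index < 0 || index > 99999) {
            throw new IllegalArgumentException("index out of range: " + index);
        }
        this.logicalName = Objects.requireNonNull(logicalName, "logicalName");
        this.index = index;
    }

    /** A region with no replicas still gets an index: the default, zero. */
    static IndexedRegionName of(String logicalName) {
        return new IndexedRegionName(logicalName, DEFAULT_INDEX);
    }

    /** Index 0 is the primary only under read replicas; in a quorum it means nothing. */
    boolean isPrimary(Mode mode) {
        return mode == Mode.READ_REPLICAS && index == DEFAULT_INDEX;
    }

    /** The primary is named exactly like its replicas, with no special case. */
    String physicalName() {
        return String.format("%s_%05d", logicalName, index);
    }

    public static void main(String[] args) {
        String row = "t1,,1389999999999"; // made-up logical name for illustration
        for (int i = 0; i < 3; i++) {
            IndexedRegionName r = new IndexedRegionName(row, i);
            System.out.println(r.physicalName()
                + " primary(read replicas)=" + r.isPrimary(Mode.READ_REPLICAS)
                + " primary(quorum)=" + r.isPrimary(Mode.QUORUM));
        }
    }
}
```

The point of the sketch is that the primary goes through the same naming path 
as every other member, and "primary" is a property of the mode, not of the 
name, so a quorum leader could change without renaming anything.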

> HBase read high-availability using eventually consistent region replicas
> ------------------------------------------------------------------------
>
>                 Key: HBASE-10070
>                 URL: https://issues.apache.org/jira/browse/HBASE-10070
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Enis Soztutar
>            Assignee: Enis Soztutar
>         Attachments: HighAvailabilityDesignforreadsApachedoc.pdf
>
>
> In the present HBase architecture, it is hard, probably impossible, to 
> satisfy constraints like 99th percentile of the reads will be served under 10 
> ms. One of the major factors that affects this is the MTTR for regions. There 
> are three phases in the MTTR process - detection, assignment, and recovery. 
> Of these, the detection is usually the longest and is presently in the order 
> of 20-30 seconds. During this time, the clients would not be able to read the 
> region data.
> However, some clients will be better served if regions will be available for 
> reads during recovery for doing eventually consistent reads. This will help 
> with satisfying low latency guarantees for some class of applications which 
> can work with stale reads.
> For improving read availability, we propose a replicated read-only region 
> serving design, also referred to as secondary regions, or region shadows. 
> Extending the current model of a region being opened for reads and writes in 
> a single region server, the region will also be opened for reading in other 
> region servers. The region server which hosts the region for reads and 
> writes (as in the current case) will be declared as PRIMARY, while 0 or more 
> region servers might be hosting the region as SECONDARY. There may be more 
> than one secondary (replica count > 2).
> Will attach a design doc shortly which contains most of the details and some 
> thoughts about development approaches. Reviews are more than welcome. 
> We also have a proof of concept patch, which includes the master and region 
> server side of the changes. Client side changes will be coming soon as well. 



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
