[ 
https://issues.apache.org/jira/browse/HBASE-8815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15133752#comment-15133752
 ] 

cuixin commented on HBASE-8815:
-------------------------------

Are you talking about https://github.com/tmalaska/HBase.MCC ?

> A replicated cross cluster client
> ---------------------------------
>
>                 Key: HBASE-8815
>                 URL: https://issues.apache.org/jira/browse/HBASE-8815
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Varun Sharma
>
> I would like to float this idea for brainstorming.
> HBase is a strongly consistent system modelled after Bigtable, which means a 
> machine going down results in a loss of availability of around 2 minutes as it 
> stands today. So there is a trade-off.
> However, for high availability and redundancy, it is common practice for 
> online/mission-critical applications to run replicated clusters. For example, 
> we run replicated clusters at Pinterest in different EC2 AZs, and at Google 
> critical data is always replicated across Bigtable cells.
> At high volumes, 2 minutes of downtime can also be critical. However, today 
> our client does not make use of the fact that there is an available slave 
> replica cluster from which slightly inconsistent data can be read. It only 
> reads from one cluster. When you have replication, it is very common 
> practice to read from the slave if the error rate from the master is high. 
> That is how web sites serve data out of MySQL and survive machine failures: 
> by directing their reads to slave machines when the master goes down.
> I am sure folks love the strong consistency guarantee from HBase, but I think 
> that this way, we can make better use of the replica cluster, much in the 
> same way people use MySQL slaves for reads. In case of regions going offline, 
> it would be nice if, for the offline regions only (a small fraction), reads 
> could be directed to the slave cluster.
> I know one company which follows this model. At Google, a replicated client 
> API is used for reads which is able to farm reads out to multiple clusters 
> and also writes to multiple clusters depending on availability in the case 
> of multi-master replication.
>  
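
For illustration only, here is a minimal sketch of the fallback read the
description proposes: try the primary cluster first and, only when the row's
region is unavailable there, read a possibly stale value from the replica
cluster. Everything in it is an assumption made for the example, not HBase
client API: the RegionUnavailable error, the FallbackReader class, and the
two in-memory dicts that stand in for the clusters.

```python
class RegionUnavailable(Exception):
    """Hypothetical error: the primary cluster cannot serve this row right now."""


class FallbackReader:
    """Reads from the primary; falls back to the replica for unavailable rows."""

    def __init__(self, read_primary, read_replica):
        # Both arguments are callables taking a row key and returning a value.
        self.read_primary = read_primary
        self.read_replica = read_replica

    def get(self, row_key):
        """Return (value, source), where source is "primary" or "replica"."""
        try:
            return self.read_primary(row_key), "primary"
        except RegionUnavailable:
            # Replication is asynchronous, so this value may be slightly stale.
            return self.read_replica(row_key), "replica"


# In-memory stand-ins for the two clusters (assumed data for the example).
primary = {"row1": "v2", "row2": "v2"}
replica = {"row1": "v1", "row2": "v1"}  # lags behind the primary
offline_rows = {"row2"}                 # rows whose region is offline on the primary


def read_primary(key):
    if key in offline_rows:
        raise RegionUnavailable(key)
    return primary[key]


reader = FallbackReader(read_primary, replica.__getitem__)
print(reader.get("row1"))
print(reader.get("row2"))
```

Returning the source alongside the value lets the caller see when it was
served data from the replica and decide whether slightly inconsistent data
is acceptable for that read.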



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
