Hi Andrew,

Interesting use case, thanks for sharing. I'm curious about a few things:

On Feb 1, 2011, at 5:38 PM, Andrew Purtell wrote:

Two ideas actually:

1) Do pretty straightforward log shipping from region master to read only replicas.


I don't understand why you have to ship the log to the read-only replicas. Aren't you storing the log on HDFS currently? Can't the replicas read it from HDFS directly?

2) Divide the cluster into quorum 3-cliques. Extract ZAB and use it to maintain consensus on writes from region master to two read only replicas. Run the consensus protocol in parallel with HDFS hflush to the write ahead log. Needs a lot of work filling in the detail, obviously, but that's the general notion.


I wonder why you are choosing 3 for the size of a clique and not letting it be a free parameter. I would think that this is a decision for the user. Are you choosing 3 to avoid the replication overhead?
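
To make the question concrete, here is a minimal sketch of the trade-off behind the clique size, assuming majority quorums (which ZooKeeper uses by default) and assuming the region master counts as a clique member. The function name `clique_properties` is purely illustrative and is not part of HBase or ZooKeeper:

```python
def clique_properties(size: int) -> dict:
    """Majority-quorum arithmetic for a clique of `size` members.

    `size` counts the region master plus its read-only replicas.
    Hypothetical helper, for illustration only.
    """
    if size < 1:
        raise ValueError("clique size must be at least 1")
    quorum = size // 2 + 1        # smallest strict majority of the clique
    tolerated = size - quorum     # members that may fail while a quorum survives
    copies_per_write = size - 1   # proposals the master sends to replicas per write
    return {
        "quorum": quorum,
        "tolerated_failures": tolerated,
        "copies_per_write": copies_per_write,
    }


if __name__ == "__main__":
    for n in (3, 5, 7):
        print(n, clique_properties(n))
```

Under these assumptions, a clique of 3 needs one replica acknowledgment besides the master and survives one failure, while a clique of 5 survives two failures at the cost of four replica messages per write.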


#1 is relatively simple but trades away the consistency for which HBase is chosen, in exchange for higher availability (for reads) when regions are in transition.


I don't see where you could have inconsistencies here. Would you mind elaborating a bit further?

#2 is not simple at all but may let us maintain replicas that are fully consistent at all times with the region master, without lowering region master write performance unacceptably, while also gaining the higher availability (for reads) when regions are in transition.


Agreed, it will be tricky, especially because we would have to extract Zab first.

Cheers,
-Flavio


Problems worthy of attack prove their worth by hitting back.
 - Piet Hein (via Tom White)






flavio junqueira
research scientist

direct +34 93-183-8828
avinguda diagonal 177, 8th floor, barcelona, 08018, es
phone (408) 349 3300    fax (408) 349 3301

