Re: [DISCUSS] Effective HBase in the Cloud

Josh Elser Wed, 09 May 2018 08:37:08 -0700

Hi Duo,

Both replication and the backup&restore work suffer from this problem.

The approach we think will work best is that when we get to a certain size limit (e.g. 100MB), we will take the current Log Stream (the RAFT quorum), flip over the RegionServer to use a new Log Stream, and then write this to a distributed FileSystem all at once, finally cleaning up the old Log Stream.


This approach:

* Avoids forcing us to change Replication, B&R, and other things that are implicitly depending on a file-based WAL. We can change this later, but are not forced to do anything immediately
* Allows replication to buffer on a filesystem as opposed to the RAFT quorums (keeping on the FS is much much "cheaper")
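
The roll described above (hit a size limit, flip to a new Log Stream, write the old one to the filesystem in one shot, then clean it up) can be sketched roughly as follows. This is a minimal Python sketch and not HBase or RAFT code: `LogStream`, `RollingWal`, the `/wal/stream-NNNNNN` path layout, and the plain dict standing in for the distributed FileSystem are all hypothetical names chosen for illustration. Only the 100MB figure comes from the message.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# The "e.g. 100MB" threshold mentioned in the email.
SIZE_LIMIT_BYTES = 100 * 1024 * 1024


@dataclass
class LogStream:
    """Hypothetical stand-in for one RAFT-quorum-backed Log Stream."""
    stream_id: int
    entries: List[bytes] = field(default_factory=list)
    size_bytes: int = 0
    closed: bool = False

    def append(self, entry: bytes) -> None:
        if self.closed:
            raise RuntimeError("cannot append to a closed log stream")
        self.entries.append(entry)
        self.size_bytes += len(entry)


class RollingWal:
    """Hypothetical RegionServer-side writer that rolls streams to files."""

    def __init__(self, fs: Dict[str, bytes], size_limit: int = SIZE_LIMIT_BYTES):
        self.fs = fs  # assumption: a dict (path -> bytes) models the FileSystem
        self.size_limit = size_limit
        self.current = LogStream(stream_id=0)
        self.live_streams = {0: self.current}  # streams still held by quorums

    def append(self, entry: bytes) -> None:
        self.current.append(entry)
        if self.current.size_bytes >= self.size_limit:
            self.roll()

    def roll(self) -> str:
        old = self.current
        # 1. Flip the writer over to a fresh Log Stream.
        self.current = LogStream(stream_id=old.stream_id + 1)
        self.live_streams[self.current.stream_id] = self.current
        old.closed = True
        # 2. Write the old stream to the filesystem all at once.
        #    (A real WAL file would frame each entry; this just concatenates.)
        path = f"/wal/stream-{old.stream_id:06d}"
        self.fs[path] = b"".join(old.entries)
        # 3. Clean up the old Log Stream now that a file copy exists.
        del self.live_streams[old.stream_id]
        return path
```

Under these assumptions, anything that consumes file-based WALs (replication, B&R) only ever looks at the entries in `fs`, which is the point of the first bullet: the file-based consumers do not need to know the live log is a quorum.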

I have some more on this in the detailed doc I mentioned to Stack in another branch of the conversation. Working on making sure I can share all of that :)


On 5/7/18 7:26 PM, 张铎(Duo Zhang) wrote:

How do we deal with replication? It is file based...

2018-05-08 10:12 GMT+08:00 Josh Elser <els...@apache.org>:



On 5/7/18 2:53 PM, Stack wrote:

On Thu, May 3, 2018 at 9:04 AM, Josh Elser <els...@apache.org> wrote:

Hi,


... I'm happy to delve some more into how I think we can implement this.

I'd be interested in this part.

St.Ack


You got it, boss. Let me find the time to get that document exported as
well. Will get back to you.

- Josh


[1] https://docs.google.com/document/d/1Su5py_T5Ytfh9RoTTX2s20KbSJwBHVxbO7ge5ORqbCk/edit#
[2] https://home.apache.org/~elserj/Effective%20HBase%20in%20the%20Cloud.pdf
