Hi Duo,

Both replication and the backup & restore (B&R) work suffer from this problem.

The approach we think will work best is this: when the current Log Stream (the RAFT quorum) reaches a certain size limit (e.g. 100MB), we flip the RegionServer over to a new Log Stream, write the old Log Stream to a distributed FileSystem all at once, and finally clean up the old Log Stream.

This approach:

* Avoids forcing us to change Replication, B&R, and other things that are implicitly depending on a file-based WAL. We can change this later, but are not forced to do anything immediately.
* Allows replication to buffer on a filesystem as opposed to the RAFT quorums (keeping on the FS is much, much "cheaper").
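
To make the roll sequence above concrete, here is a rough sketch of the idea in Java. Everything in it is hypothetical and for illustration only: `LogStreamRollSketch`, `LogStream`, `ArchiveFs`, and the file naming are made-up stand-ins, not HBase or RAFT library APIs, and the in-memory lists only model the quorum and the distributed FileSystem.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch only: none of these types are real HBase or RAFT APIs.
class LogStreamRollSketch {

    /** Stand-in for a RAFT-backed Log Stream: an ordered list of WAL entries. */
    static final class LogStream {
        final int id;
        final List<byte[]> entries = new ArrayList<>();
        long sizeBytes = 0;
        boolean sealed = false;

        LogStream(int id) { this.id = id; }

        void append(byte[] entry) {
            if (sealed) {
                throw new IllegalStateException("Log Stream " + id + " is sealed");
            }
            entries.add(entry);
            sizeBytes += entry.length;
        }
    }

    /** Stand-in for the distributed FileSystem: one "file" per archived stream. */
    static final class ArchiveFs {
        final List<String> files = new ArrayList<>();

        void writeWholeStream(LogStream stream) {
            // A real implementation would copy every entry into a WAL file here.
            files.add("wal-" + stream.id);
        }
    }

    final long sizeLimitBytes;                                   // e.g. 100MB in the proposal
    final ArchiveFs fs;
    final List<Integer> cleanedUpStreams = new ArrayList<>();   // ids of removed streams
    LogStream active;
    int nextStreamId = 0;

    LogStreamRollSketch(long sizeLimitBytes, ArchiveFs fs) {
        this.sizeLimitBytes = sizeLimitBytes;
        this.fs = fs;
        this.active = new LogStream(nextStreamId++);
    }

    /** Appends to the active stream and rolls once the size limit is reached. */
    void append(byte[] entry) {
        active.append(entry);
        if (active.sizeBytes >= sizeLimitBytes) {
            roll();
        }
    }

    /** Flip to a new stream, archive the old one in one shot, then clean it up. */
    void roll() {
        LogStream old = active;
        old.sealed = true;                        // no more writes to the old stream
        active = new LogStream(nextStreamId++);   // RegionServer now writes here
        fs.writeWholeStream(old);                 // written to the FS all at once
        cleanedUpStreams.add(old.id);             // old RAFT-side state can be dropped
    }
}
```

With a 100-byte limit and three 40-byte appends, the third append would trigger a roll: the FS ends up with one file for the first stream, and the RegionServer is writing to a fresh, empty stream. Replication and B&R would then read the archived files just as they read a file-based WAL today.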

I have some more on this in the detailed doc I mentioned to Stack in another branch of the conversation. Working on making sure I can share all of that :)

On 5/7/18 7:26 PM, 张铎(Duo Zhang) wrote:
How do we deal with replication? It is file based...

2018-05-08 10:12 GMT+08:00 Josh Elser <els...@apache.org>:



On 5/7/18 2:53 PM, Stack wrote:

On Thu, May 3, 2018 at 9:04 AM, Josh Elser <els...@apache.org> wrote:

Hi,

... I'm happy to delve some more into how I think we can implement this.

I'd be interested in this part.
St.Ack


You got it, boss. Let me find the time to get that document exported as
well. Will get back to you.



- Josh

[1] https://docs.google.com/document/d/1Su5py_T5Ytfh9RoTTX2s20KbSJwBHVxbO7ge5ORqbCk/edit#
[2] https://home.apache.org/~elserj/Effective%20HBase%20in%20the%20Cloud.pdf



