+1 to starting the work. I think most of the concerns can be figured out on
the JIRAs and we can have a project update every X weeks if enough people
are interested.

I also agree that we should frame the feature correctly. "Decoupling from an
HDFS WAL" or "WAL on Ratis" would be more appropriate names that better convey
the scope. I think there are a number of projects necessary to complete "HBase
on Cloud", with this being one of them.


Thanks for driving this initiative!

Zach


On Wed, Jul 25, 2018 at 11:55 AM, Josh Elser <[email protected]> wrote:

> Let me give an update on-list for everyone:
>
> First and foremost, thank you very much to everyone who took the time to
> read this, with an extra thanks to those who participated in discussion.
> There were lots of great points raised. Some about things that were unclear
> in the doc, and others shining light onto subjects I hadn't considered yet.
>
> My biggest take-away is that I complicated this document by tying it too
> closely with "HBase on Cloud", treating the WAL+Ratis LogService as the
> only/biggest thing to figure out. This was inaccurate and overly bold of
> me: I apologize. I think this complicated discussion on a number of points,
> and ate a good bit of some of your time.
>
> My goal was to present this as an important part of a transition to the
> "cloud", giving justification to what WAL+Ratis helps HBase achieve. I did
> not want this document to be a step-by-step guide to a perfect HBase on
> Cloud design. I need to do a better job with this in the future; sorry.
>
> That said, my feeling is that, on the whole, folks are in support of the
> proposed changes/architecture described for the WAL+Ratis work (tl;dr
> revisit WAL API, plug in current WAL implementation to any API
> modification, build new Ratis-backed WAL impl). There were some concerns
> which still need immediate action that I am aware of:
>
> * Sync with Ram and Anoop re: in-memory WAL [1]
> * Where is Ratis LogService metadata kept? How do we know what LogStreams
> were being used/maintained by a RS? How does this tie into recovery?
>
> There are also long-term concerns which I don't think I have an answer for
> yet (for either reasons out of my control or a lack of technical
> understanding):
>
> * Maturity of the Ratis community
> * Required performance by HBase and the ability of the LogService to
> provide that perf (Areas already mentioned: gRPC perf, fsyncs bogging down
> disks, ability to scale RAFT quorums).
> * Continue with WAL-per-RS or move to WAL-per-Region? Related to perf,
> dependent upon Ratis scalability.
> * I/O amplification on WAL retention for backup&restore and replication
> ("logstream export")
> * Ensure that LogStreams can be exported to a dist-filesystem in a manner
> which requires no additional metadata/handling (avoid more storage/mgmt
> complexity)
> * Ability to build krb5 authn into Ratis (really, gRPC)
>
> I will continue the two immediate action items. I think the latter
> concerns are some that will require fingers-on-keyboard -- I don't know
> enough about runtime characteristics without seeing it for myself.
>
> All this said, I'd like to start moving toward the point where we start
> breaking out this work into a feature-branch off of master and start
> building code. My hope is that this is amenable to everyone, with the
> acknowledgment that the Ratis work is considered "experimental" and not an
> attempt to make all of HBase use Ratis-backed WALs.
>
> Finally, I do *not* want this message to be interpreted as me squashing
> anyone's concerns. My honest opinion is that discussion has died down, but
> I will be the first to apologize if I have missed any outstanding concerns.
> Please, please, please ping me if I am negligent.
>
> Thanks once again for everyone's participation.
>
> [1] https://docs.google.com/document/d/1Su5py_T5Ytfh9RoTTX2s20KbSJwBHVxbO7ge5ORqbCk/edit?disco=AAAACBm3RLM
>
> On 2018/07/13 20:15:45, Josh Elser <[email protected]> wrote:
>
>> Hi all,
>>
>> A long time ago, I shared a document about a (I'll call it..) "vision"
>> where we make some steps towards decoupling HBase from HDFS in an effort to
>> make deploying HBase on Cloud IaaS providers a bit easier (operational
>> simplicity, effective use of common IaaS paradigms, etc).
>>
>> https://docs.google.com/document/d/1Su5py_T5Ytfh9RoTTX2s20KbSJwBHVxbO7ge5ORqbCk/edit?usp=sharing
>>
>> A good ask from our Stack back then was: "[can you break down this
>> work]?" The original document was very high-level, and asking for some more
>> details makes a lot of sense. Months later, I'd like to share that I've
>> updated the original document with some new content at the bottom (as well
>> as addressed some comments which went unanswered by me -- sorry!)
>>
>> Based on a discussion I had earlier this week (and some discussions
>> during HBaseCon in California in June), I've tried to add a brief
>> "refresher" on what some of the big goals for this effort are. Please check
>> it out at your leisure and let me know what you think. Would like to start
>> getting some fingers behind this all and pump out some code :)
>>
>> https://docs.google.com/document/d/1Su5py_T5Ytfh9RoTTX2s20KbSJwBHVxbO7ge5ORqbCk/edit#bookmark=id.fml9ynrqagk
>>
>> - Josh
>>
>>
