> My biggest take-away is that I complicated this document by tying it too
> closely with "HBase on Cloud", treating the WAL+Ratis LogService as the
> only/biggest thing to figure out.

Understanding this now helps a lot in making sense of the positions taken
in the doc. At first glance it read as an initially interesting document
that quickly went to a weird place where there was a preconceived solution
working backward toward a problem: engineering run in reverse.

I think it's perfectly fine if the Ratis podling and those associated with
it want to drive development and/or adoption by finding candidate use cases
in other ecosystem projects. As long as we have good interfaces which don't
leak internals, no breaking core changes, no hard dependencies on incubating
artifacts, and at least a potential path forward to alternate
implementations, it's all good!

On Wed, Jul 25, 2018 at 11:55 AM Josh Elser <[email protected]> wrote:

> Let me give an update on-list for everyone:
>
> First and foremost, thank you very much to everyone who took the time to
> read this, with an extra thanks to those who participated in discussion.
> There were lots of great points raised. Some about things that were
> unclear in the doc, and others shining light onto subjects I hadn't
> considered yet.
>
> My biggest take-away is that I complicated this document by tying it too
> closely with "HBase on Cloud", treating the WAL+Ratis LogService as the
> only/biggest thing to figure out. This was inaccurate and overly bold of
> me: I apologize. I think this complicated discussion on a number of
> points, and ate a good bit of your time.
>
> My goal was to present this as an important part of a transition to the
> "cloud", giving justification to what WAL+Ratis helps HBase achieve. I
> did not want this document to be a step-by-step guide to a perfect HBase
> on Cloud design. I need to do a better job with this in the future; sorry.
>
> That said, my feeling is that, on the whole, folks are in support of the
> proposed changes/architecture described for the WAL+Ratis work (tl;dr
> revisit WAL API, plug in current WAL implementation to any API
> modification, build new Ratis-backed WAL impl). There were some concerns
> which still need immediate action that I am aware of:
>
> * Sync with Ram and Anoop re: in-memory WAL [1]
> * Where is Ratis LogService metadata kept? How do we know what
>   LogStreams were being used/maintained by a RS? How does this tie into
>   recovery?
>
> There are also long-term concerns which I don't think I have an answer
> for yet (for either reasons out of my control or a lack of technical
> understanding):
>
> * Maturity of the Ratis community
> * Required performance by HBase and the ability of the LogService to
>   provide that perf (Areas already mentioned: gRPC perf, fsyncs bogging
>   down disks, ability to scale RAFT quorums).
> * Continue with WAL-per-RS or move to WAL-per-Region? Related to perf,
>   dependent upon Ratis scalability.
> * I/O amplification on WAL retention for backup&restore and replication
>   ("logstream export")
> * Ensure that LogStreams can be exported to a dist-filesystem in a
>   manner which requires no additional metadata/handling (avoid more
>   storage/mgmt complexity)
> * Ability to build krb5 authn into Ratis (really, gRPC)
>
> I will continue the two immediate action items. I think the latter
> concerns are some that will require fingers-on-keyboard -- I don't know
> enough about runtime characteristics without seeing it for myself.
>
> All this said, I'd like to start moving toward the point where we start
> breaking out this work into a feature-branch off of master and start
> building code. My hope is that this is amenable to everyone, with the
> acknowledgement that the Ratis work is considered "experimental" and not an
> attempt to make all of HBase use Ratis-backed WALs.
>
> Finally, I do *not* want this message to be interpreted as me squashing
> anyone's concerns. My honest opinion is that discussion has died down,
> but I will be the first to apologize if I have missed any outstanding
> concerns. Please, please, please ping me if I am negligent.
>
> Thanks once again for everyone's participation.
>
> [1]
> https://docs.google.com/document/d/1Su5py_T5Ytfh9RoTTX2s20KbSJwBHVxbO7ge5ORqbCk/edit?disco=AAAACBm3RLM
>
> On 2018/07/13 20:15:45, Josh Elser <[email protected]> wrote:
> > Hi all,
> >
> > A long time ago, I shared a document about a (I'll call it..) "vision"
> > where we make some steps towards decoupling HBase from HDFS in an effort
> > to make deploying HBase on Cloud IaaS providers a bit easier
> > (operational simplicity, effective use of common IaaS paradigms, etc).
> >
> > https://docs.google.com/document/d/1Su5py_T5Ytfh9RoTTX2s20KbSJwBHVxbO7ge5ORqbCk/edit?usp=sharing
> >
> > A good ask from our Stack back then was: "[can you break down this
> > work]?" The original document was very high-level, and asking for some
> > more details makes a lot of sense. Months later, I'd like to share that
> > I've updated the original document with some new content at the bottom
> > (as well as addressed some comments which went unanswered by me --
> > sorry!)
> >
> > Based on a discussion I had earlier this week (and some discussions
> > during HBaseCon in California in June), I've tried to add a brief
> > "refresher" on what some of the big goals for this effort are. Please
> > check it out at your leisure and let me know what you think. Would like
> > to start getting some fingers behind this all and pump out some code :)
> >
> > https://docs.google.com/document/d/1Su5py_T5Ytfh9RoTTX2s20KbSJwBHVxbO7ge5ORqbCk/edit#bookmark=id.fml9ynrqagk
> >
> > - Josh
> >

--
Best regards,
Andrew

Words like orphans lost among the crosstalk, meaning torn from truth's
decrepit hands
   - A23, Crosstalk
