Folks,
I've been working with Vlad and Ted offline to make sure we have a plan
that addresses the implementation gaps Vlad sees and the previously
stated barriers to entry for keeping the feature in HBase 2.0.
My hope is that this can be an honest discussion given 2.0-beta
timelines, with a concrete action plan. I'm trying my best to not
re-hash the logic/reasoning/caveats behind previous concerns; anything
folks feel is a blocker that I haven't covered below is unintentional.
The list:
1. Documentation. It must be updated and committed, ensuring it covers
the details operators/architects need to know to use it effectively
(HBASE-16574). Vlad will help with content; Frank and/or I will get
it converted to asciidoc.
2. Distributed testing missing. Vlad has taken my previous document on
goals and translated that into an implementation outline[1]. Ted and I
have already weighed in -- I believe it hits the salient points for the
quality of testing we're looking for. I'll get started on this while
Vlad does #4 (after consensus on approach, of course). Needs JIRA issue
(maybe?).
3. Operator utility to verify backups. In the abstract, this should just
be the same guts as a tool like VerifyReplication. In practice, this
should be the same code that #2 uses (if not _actually_ the same guts as
VerifyReplication). The hope is that this will be encapsulated
(time-wise) by #2. Needs JIRA issue (maybe?).
4. Polish DistCP for bulk-loaded files/fault-tolerance (HBASE-17852). I
don't have specifics here -- will rely on Vlad to correct me if there's
a better JIRA issue to track than the aforementioned. I'll rely on
details showing up on that JIRA issue to track progress.
Current due dates:
1. End of week (2017/11/10)
2. Before US Thanksgiving (2017/11/22)
3. Same as #2
4. Same as #1
My current thought is that this is reasonable for implementation times,
and would not derail the rest of the beta-1 train. I appreciate the
patience from all parties, and I hope that those trying to make this
better can find a little more time to give some feedback. Thanks for the
long read if nothing else.
- Josh
[1] https://docs.google.com/document/d/1xbPlLKjOcPq2LDqjbSkF6uNDAG0mzgOxek6P3POLeMc/edit?usp=sharing