I came here only to track anything related to Distributed Log Replay, which I
am trying to purge, but it looks like this is another discussion thread about
hbase-backup.
I am coming to this with limited knowledge of the feature (I did an initial
review once and lost track after that). Still, it looks like the discussion is
not about the technical aspects of the feature, but about trust in it.

Something that can help build trust in B&R, or settle the question otherwise,
is an accurate summary of where it stands now. Basically:
1) What features are there in 2.0
2) What features are being targeted for 2.1 onwards
3) What testing has been done so far. Not just names, but details. E.g.
ITBLL w/ 50 node cluster and x,y,z fault tolerances.
4) What tests are planned before 2.0. I think a good basis to judge that
would be: will that testing convince Elliot/Andrew to use the feature in
their internal clusters?
5) List of existing bugs
5) List of existing bugs

Once it's there, hopefully everyone agrees that the list in (1) is enough and
the items in (2) are non-critical for basic B&R.
(3) and (4) are the most important.
Missing anything in (5) will be counter-productive.
I'd appreciate it if the summary comes first and opinions follow, rather than
the two being mixed together.

Just a suggestion which can help you get the right attention.
Thanks.

-- Appy



On Wed, Nov 1, 2017 at 3:33 PM, Vladimir Rodionov <[email protected]>
wrote:

> >> HBASE-19106 at least is a fundamental
>
> This new feature was requested 9 days ago (between the alpha 3 and alpha 4
> releases). It has never been on the list of features we agreed to
> implement for the 2.0 release.
> When backup started almost 2 years ago, we described what features and
> capabilities would be implemented. We have had discussions before and I do
> not remember any complaints from the community that we lack important
> functionality.
>
> You cannot point to it as a blocker for the 2.0 release, Stack.
>
> Testing at scale (the lack of it) is the only real issue I see in B&R now.
> The question: can it justify your willingness to postpone the feature till
> the next 2.x release, Stack?
>
> All blockers are resolved, including the pending HBASE-17852 patch. All
> functionality for 2.0 has been implemented. The scalability and performance
> improvements patch is in the works and is expected to be ready next week.
> In any case, this is an improvement, not a new feature.
>
> We have been testing B&R in our internal QA clusters for months. Others
> (SF) have done testing as well. I am pretty confident in the implementation.
>
>
>
> On Wed, Nov 1, 2017 at 3:15 PM, Josh Elser <[email protected]> wrote:
>
> > On 11/1/17 5:52 PM, Stack wrote:
> >
> >> On Wed, Nov 1, 2017 at 12:25 PM, Vladimir Rodionov <vladrodionov@gmail.com>
> >> wrote:
> >>
> >> 1. HBASE-19104 - 19109
> >>>
> >>> None of them are basic, Stack. These requests came from SF after a
> >>> discussion we had with them recently.
> >>> There were no comments from me because I was out of the country last
> >>> week.
> >>>
> >>> 2. Backup tables are not system ones, they belong to a separate
> >>> namespace -
> >>> "backup"
> >>>
> >>> 3. We make no assumptions on assignment order of these tables.
> >>>
> >>> As for real scale testing and documentation, we still have time before
> >>> 2.0 GA. Can't be a blocker IMO
> >>>
> >>>
> >> First off, wrong response.
> >>
> >> Better would have been pointers to a description of the feature as it
> >> stands in branch-2 (a list of JIRAs is insufficient), what is still to be
> >> done, and evidence of heavy testing, in particular at scale (as Josh
> >> reminds us, we agreed to this the last time backup-in-hbase2 was
> >> broached), ending with a list of what will be done between here and
> >> beta-1 to assuage any concerns that backup is incomplete. As to the
> >> issues filed, IMO, HBASE-19106 at least is fundamental. W/o it, how do
> >> you even know backup works at anything above toy scale?
> >>
> >> Pardon my mistake on 'system' tables. I'd made the statement 9 days ago
> >> up in HBASE-17852 trying to figure what was going on in the issue and it
> >> stood unchallenged (Josh did let me know later that you were traveling).
> >>
> >> I'm not up for waiting till GA before we decide what is in the release.
> >> This DISCUSSION is about deciding now, before beta-1, what's in and
> >> what's out. Backup would be great to have but it is currently on the
> >> chopping block. I've tried to spend time figuring what is there and where
> >> it stands but I always end up stymied (e.g. see HBASE-17852; see how it
> >> starts out; see the patch attached w/ no description of what it comprises
> >> or the approach decided upon; and so on). Maybe it's me, but hey,
> >> unfortunately, it's me who is the RM.
> >>
> >
> > As much as it pains me, I can't argue with the lack of confidence via
> > testing. While it feels like an eternity ago since we posited on B&R's
> > scale/correctness testing, it's only been 1.5 months. In reality, getting
> > to this was delayed by some of the (really good!) FT fixes that Vlad has
> > made.
> >
> > We set the bar for the feature and we missed it; there's no arguing
> > that. Yes, it stinks. I see two paths forward: 1) come up with its own
> > release to let those downstream use it now (risks notwithstanding) or
> > 2) shoot for HBase 2.1.0. The latter is how we've approached this in the
> > past. Building the test needs to happen regardless of the release vehicle.
> >
> > New issues/feature-requests are always going to come in as people
> > experiment with it. I hope to avoid getting bogged down in this -- I
> > sincerely doubt that there is any single answer to what is "required" for
> > an initial backup and restore implementation. I feel like anything more
> > will turn into a battle of opinions. When we bring up the feature again,
> > we should make a concerted effort to say "this is the state of the
> > feature, with the design choices made, and this is the result of our
> > testing for correctness." Hopefully much of this is already contained in
> > documentation and just needs to be collected/curated.
> >
> > - Josh
> >
>



