Re: [DISCUSSION] Merge Backup / Restore - Branch HBASE-7912

Vladimir Rodionov Fri, 09 Sep 2016 12:17:26 -0700

Do not worry, Sean: the doc is coming today as a preview, and our writer Frank
will be working on putting it into the Apache repo. The timeline depends on
Frank's schedule, but I hope we will get it sooner rather than later.


As for failure testing, we are focusing only on keeping the backup system
data in a consistent state in the presence of any type of failure. We are not
going to implement anything more "fancy" than that. We allow both backup and
restore to fail. What we do not allow is for system data to be corrupted.
Will that suffice for you? Do you have any other concerns you want us to
address?

-Vlad


On Fri, Sep 9, 2016 at 10:56 AM, Sean Busbey <[email protected]> wrote:

> "docs will come to Apache soon" does not address my concern around docs at
> all, unless said docs have already made it into the project repo. I don't
> want third party resources for using a major and important feature of the
> project, I want us to provide end users with what they need to get the job
> done.
>
> I see some calls for patience on the failure testing, but the appeal to us
> having done a bad job of requiring proper tests of previous features just
> makes me more concerned about not getting them here. I don't want to set
> yet another bad example that will then be pointed to in the future.
>
> On Sep 8, 2016 10:50, "Ted Yu" <[email protected]> wrote:
>
> > Is there any concern which is not addressed ?
> >
> > Do we need another Vote thread ?
> >
> > Thanks
> >
> > On Thu, Sep 8, 2016 at 9:21 AM, Andrew Purtell <[email protected]>
> > wrote:
> >
> > > Vlad,
> > >
> > > I apologize for using the term 'half-baked' in a way that could seem a
> > > description of HBASE-7912. I meant that as a general hypothetical.
> > >
> > > On Wed, Sep 7, 2016 at 9:36 AM, Vladimir Rodionov <[email protected]>
> > > wrote:
> > >
> > > > >> I'm not sure that "There is already lots of half-baked code in the
> > > > >> branch, so what's the harm in adding more?"
> > > >
> > > > I meant not production-ready yet. This is the 2.0 development branch and,
> > > > hence, many features are in the works, not well tested, etc. I do not
> > > > consider backup a half-baked feature: it has passed our internal QA and
> > > > has a very good doc, which we will provide to Apache shortly.
> > > >
> > > > -Vlad
> > > >
> > > > On Wed, Sep 7, 2016 at 9:13 AM, Andrew Purtell <[email protected]>
> > > > wrote:
> > > >
> > > > > We shouldn't admit half baked changes that won't be finished. However in
> > > > > this case the crew working on this feature are long timers and less likely
> > > > > than just about anyone to leave something in a half baked state. Of course
> > > > > there is no guarantee how anything will turn out, but I am willing to take
> > > > > a little on faith if they feel their best path forward now is to merge to
> > > > > trunk. I only wish I had bandwidth to have done some real kicking of the
> > > > > tires by now. Maybe this week.
> > > > >
> > > > > (Yes, I'm using some of that time for this email :-) but I type fast.)
> > > > >
> > > > > That said, I would like to agitate for making 2.0 more real and spend some
> > > > > time on it now that I'm winding down with 0.98. I think that means
> > > > > branching for 2.0 real soon now and even evicting things from 2.0 branch
> > > > > that aren't finished or stable, leaving them only once again in the master
> > > > > branch. Or, maybe just evicting them. Let's take it case by case.
> > > > >
> > > > > I think this feature can come in relatively safely. As added insurance,
> > > > > let's admit the possibility it could be reverted on the 2.0 branch if folks
> > > > > working on stabilizing 2.0 decide to evict it because it is unfinished or
> > > > > unstable, because that certainly can happen. I would expect if talk like
> > > > > that starts, we'd get help finishing or stabilizing what's under discussion
> > > > > for revert. Or, we'd have a revert. Either way the outcome is acceptable.
> > > > >
> > > > >
> > > > > On Wed, Sep 7, 2016 at 8:56 AM, Dima Spivak <[email protected]>
> > > > > wrote:
> > > > >
> > > > > > I'm not sure that "There is already lots of half-baked code in the
> > > > > > branch, so what's the harm in adding more?" is a good code commit
> > > > > > philosophy for a fault-tolerant distributed data store. ;)
> > > > > >
> > > > > > More seriously, a lack of test coverage for existing features shouldn't
> > > > > > be used as justification for introducing new features with the same
> > > > > > shortcomings. Ultimately, it's the end user who will feel the pain, so
> > > > > > shouldn't we do everything we can to mitigate that?
> > > > > >
> > > > > > -Dima
> > > > > >
> > > > > > On Wed, Sep 7, 2016 at 8:46 AM, Vladimir Rodionov <[email protected]>
> > > > > > wrote:
> > > > > >
> > > > > > > Sean,
> > > > > > >
> > > > > > > * have docs
> > > > > > >
> > > > > > > Agreed. We have a doc, and backup is the most documented feature :).
> > > > > > > We will release it to Apache shortly.
> > > > > > >
> > > > > > > * have sunny-day correctness tests
> > > > > > >
> > > > > > > The feature has close to 60 test cases, which run for approx. 30 min.
> > > > > > > We can add more, if the community does not mind :)
> > > > > > >
> > > > > > > * have correctness-in-face-of-failure tests
> > > > > > >
> > > > > > > Any examples of these tests in existing features? This is in the works;
> > > > > > > we have a clear understanding of what should be done by the time of the
> > > > > > > 2.0 release. Verifying the existing code with the IT monkey is a very
> > > > > > > near-term goal for us.
> > > > > > > * don't rely on things outside of HBase for normal operation (okay for
> > > > > > > advanced operation)
> > > > > > >
> > > > > > > We do not.
> > > > > > >
> > > > > > > Enormous time has already been spent on developing and testing the
> > > > > > > feature; it has passed our internal tests and many rounds of code
> > > > > > > reviews by HBase committers. We do not mind if someone from the HBase
> > > > > > > community (outside of HW) reviews the code, but waiting for a
> > > > > > > volunteer will probably take forever; the feature is quite large
> > > > > > > (1MB+ cumulative patch).
> > > > > > >
> > > > > > > The 2.0 branch is full of half-baked features, most of them in active
> > > > > > > development, so I am not following you here, Sean. Why is HBASE-7912
> > > > > > > not good enough yet to be integrated into the 2.0 branch?
> > > > > > >
> > > > > > > -Vlad
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Wed, Sep 7, 2016 at 8:23 AM, Sean Busbey <[email protected]>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > On Tue, Sep 6, 2016 at 10:36 PM, Josh Elser <[email protected]>
> > > > > > > > wrote:
> > > > > > > > > So, the answer to Sean's original question is "as robust as
> > > > > > > > > snapshots presently are"? (independence of backup/restore failure
> > > > > > > > > tolerance from snapshot failure tolerance)
> > > > > > > > >
> > > > > > > > > Is this just a question WRT context of the change, or is it means
> > > > > > > > > for a veto from you, Sean? Just trying to make sure I'm following
> > > > > > > > > along adequately.
> > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > > > I'd say ATM I'm -0, bordering on -1 but not for reasons I can
> > > > > > > > articulate well.
> > > > > > > >
> > > > > > > > Here's an attempt.
> > > > > > > >
> > > > > > > > We've been trying to move, as a community, towards minimizing risk to
> > > > > > > > downstream folks by getting "complete enough for use" gates in place
> > > > > > > > before we introduce new features. This was spurred by some features
> > > > > > > > getting in half-baked and never making it to "can really use" status
> > > > > > > > (I'm thinking of distributed log replay and the zk-less assignment
> > > > > > > > stuff, I don't recall if there was more).
> > > > > > > >
> > > > > > > > The gates, generally, included things like:
> > > > > > > >
> > > > > > > > * have docs
> > > > > > > > * have sunny-day correctness tests
> > > > > > > > * have correctness-in-face-of-failure tests
> > > > > > > > * don't rely on things outside of HBase for normal operation (okay for
> > > > > > > > advanced operation)
> > > > > > > >
> > > > > > > > As an example, we kept the MOB work off in a branch and out of master
> > > > > > > > until it could pass these criteria. The big exemption we've had to
> > > > > > > > this was the hbase-spark integration, where we all agreed it could
> > > > > > > > land in master because it was very well isolated (the slide away from
> > > > > > > > including docs as a first-class part of building up that integration
> > > > > > > > has led me to doubt the wisdom of this decision).
> > > > > > > >
> > > > > > > > We've also been treating inclusion in "probably will be released to
> > > > > > > > downstream" branches as a higher bar, requiring
> > > > > > > >
> > > > > > > > * don't moderately impact performance when the feature isn't in use
> > > > > > > > * don't severely impact performance when the feature is in use
> > > > > > > > * either default-to-on or show enough demand to believe a non-trivial
> > > > > > > > number of folks will turn the feature on
> > > > > > > >
> > > > > > > > The above has kept MOB and hbase-spark integration out of branch-1,
> > > > > > > > presumably while they've "gotten more stable" in master from the odd
> > > > > > > > vendor inclusion.
> > > > > > > >
> > > > > > > > Are we going to have a 2.0 release before the end of the year? We're
> > > > > > > > coming up on 1.5 years since the release of version 1.0; seems like
> > > > > > > > it's about time, though I haven't seen any concrete plans this year.
> > > > > > > > Presuming we are going to have one by the end of the year, it seems a
> > > > > > > > bit close to still be adding in "features that need maturing" on the
> > > > > > > > branch.
> > > > > > > >
> > > > > > > > The lack of a concrete plan for 2.0 keeps me from considering these
> > > > > > > > things blocker at the moment. But I know first hand how much trouble
> > > > > > > > folks have had with other features that have gone into downstream
> > > > > > > > facing releases without robustness checks (i.e. replication), and I'm
> > > > > > > > concerned about what we're setting up if 2.0 goes out with this
> > > > > > > > feature in its current state.
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Best regards,
> > > > >
> > > > >    - Andy
> > > > >
> > > > > Problems worthy of attack prove their worth by hitting back. - Piet Hein
> > > > > (via Tom White)
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > Best regards,
> > >
> > >    - Andy
> > >
> > > Problems worthy of attack prove their worth by hitting back. - Piet Hein
> > > (via Tom White)
> > >
> >
>
