Re: [DISCUSSION] Merge Backup / Restore - Branch HBASE-7912

Sean Busbey Thu, 22 Sep 2016 10:32:14 -0700

I'd like to see the docs proposed on HBASE-16574 integrated into our
project's documentation prior to merge.


On Thu, Sep 22, 2016 at 9:02 AM, Ted Yu <yuzhih...@gmail.com> wrote:
> This feature can be marked experimental due to some limitations such as
> security.
>
> Your previous round of comments has been addressed.
> Command line tool has gone through:
>
> HBASE-16620 Fix backup command-line tool usability issues
> HBASE-16655 hbase backup describe with incorrect backup id results in NPE
>
> The updated doc has been attached to HBASE-16574.
>
> Cheers
>
> On Thu, Sep 22, 2016 at 8:53 AM, Stack <st...@duboce.net> wrote:
>
>> On Wed, Sep 21, 2016 at 7:43 AM, Ted Yu <yuzhih...@gmail.com> wrote:
>>
>> > Are there more (review) comments ?
>> >
>> >
>> Are outstanding comments addressed?
>>
>> I don't see answer to my 'is this experimental/will it be marked
>> experimental' question.
>>
>> I ran into some issues trying to use the feature and suggested that a
>> feature like this needs polish else it'll just rot, unused. Has polish
>> been applied? All ready for another 'user' test? Suggest that you update
>> here going forward for the benefit of those trying to follow along and who
>> are not watching JIRA changes fly by.
>>
>> It looks like doc got a revision -- I have to check -- to take on
>> suggestion made above but again, suggest, that this thread gets updated.
>>
>> Thanks,
>> St.Ack
>>
>>
>>
>> > Thanks
>> >
>> > On Tue, Sep 20, 2016 at 10:02 AM, Devaraj Das <d...@hortonworks.com>
>> > wrote:
>> >
>> > > Just reviving this thread. Thanks Sean, Stack, Dima, and others for the
>> > > thorough reviews and testing. Thanks Ted and Vlad for taking care of
>> the
>> > > feedback. Are we all good to do the merge now? Rather do sooner than
>> > later.
>> > > ________________________________________
>> > > From: saint....@gmail.com <saint....@gmail.com> on behalf of Stack <
>> > > st...@duboce.net>
>> > > Sent: Monday, September 12, 2016 1:18 PM
>> > > To: HBase Dev List
>> > > Subject: Re: [DISCUSSION] Merge Backup / Restore - Branch HBASE-7912
>> > >
>> > > On Mon, Sep 12, 2016 at 12:19 PM, Ted Yu <yuzhih...@gmail.com> wrote:
>> > >
>> > > > Mega patch (rev 18) is on HBASE-14123.
>> > > >
>> > > > Please comment on HBASE-14123 on how you want to review.
>> > > >
>> > >
>> > >
>> > > Yeah. That was my lost tab. Last rb was 6 months ago. Suggest updating
>> > it.
>> > > RB is pretty good for review. Patch is only 1.5M so should be fine.
>> > >
>> > > St.Ack
>> > >
>> > >
>> > > >
>> > > > Thanks
>> > > >
>> > > > On Mon, Sep 12, 2016 at 12:15 PM, Stack <st...@duboce.net> wrote:
>> > > >
>> > > > > On review of the 'patch', do I just compare the branch to master or
>> > is
>> > > > > there a megapatch posted somewhere (I think I saw one but it seemed
>> > > stale
>> > > > > and then I 'lost' the tab). Sorry for dumb question.
>> > > > > St.Ack
>> > > > >
>> > > > > On Mon, Sep 12, 2016 at 12:01 PM, Stack <st...@duboce.net> wrote:
>> > > > >
>> > > > > > Late to the game. A few comments after rereading this thread as a
>> > > > 'user'.
>> > > > > >
>> > > > > > + Before merge, a user-facing feature like this should work (If
>> > this
>> > > is
>> > > > > "higher-bar
>> > > > > > for new features", bring it on -- smile).
>> > > > > > + As a user, I tried the branch with tools after reviewing the
>> > > > > just-posted
>> > > > > > doc. I had an 'interesting' experience (left comments up on
>> > issue). I
>> > > > > think
>> > > > > > the tooling/doc. is important to get right. If it breaks easily or
>> is
>> > > > > > inconsistent (or lacks 'polish'), operators will judge the whole
>> > > > > > backup/restore tooling chain as not trustworthy and abandon it.
>> > Let's
>> > > > not
>> > > > > > have this happen to this feature.
>> > > > > > + Matteo's suggestion (with a helpful starter list) that there
>> > needs
>> > > to
>> > > > > be
>> > > > > > explicit qualification on what is actually being delivered --
>> > > > including a
>> > > > > > listing of limitations (some look serious such as data bleed from
>> > > other
>> > > > > > regions in WALs, but maybe I don't care for my use case...) --
>> > needs
>> > > to
>> > > > > > accompany the merge. Let's fold them into the user doc. in the
>> > > technical
>> > > > > > overview area as suggested so user expectations are properly
>> > managed
>> > > > > > (otherwise, they expect the world and will just give up when we
>> > fall
>> > > > > > short). Vladimir did a list of what is in each of the phases
>> above
>> > > > which
>> > > > > > would serve as a good start.
>> > > > > > + Is this feature 'experimental' (Matteo asks above). I'd prefer
>> it
>> > > is
>> > > > > > not. If it is, it should be labelled all over that it is so. I
>> see
>> > > > > current
>> > > > > > state called out as a '... technical preview feature'. Does this
>> > mean
>> > > > > > not-for-users?
>> > > > > >
>> > > > > > St.Ack
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > > On Mon, Sep 12, 2016 at 8:03 AM, Ted Yu <yuzhih...@gmail.com>
>> > wrote:
>> > > > > >
>> > > > > >> Sean:
>> > > > > >> Do you have more comments ?
>> > > > > >>
>> > > > > >> Cheers
>> > > > > >>
>> > > > > >> On Fri, Sep 9, 2016 at 1:42 PM, Vladimir Rodionov <
>> > > > > vladrodio...@gmail.com
>> > > > > >> >
>> > > > > >> wrote:
>> > > > > >>
>> > > > > >> > Sean,
>> > > > > >> >
>> > > > > >> > Backup/Restore can fail due to various reasons: network outage
>> > > > > (cluster
>> > > > > >> > wide), various time-outs in HBase and HDFS layer, M/R failure
>> > due
>> > > to
>> > > > > >> "HDFS
>> > > > > >> > exceeded quota", user error (manual deletion of data) and so
>> on
>> > so
>> > > > on.
>> > > > > >> It
>> > > > > >> > is impossible to enumerate all possible types of failures in a
>> > > > > >> distributed
>> > > > > >> > system - that is not our goal/task.
>> > > > > >> >
>> > > > > >> > We focus completely on backup system table consistency in the
>> > > presence
>> > > > > of
>> > > > > >> any
>> > > > > >> > type of failure. That is what I call "tolerance to failures".
>> > > > > >> >
>> > > > > >> > On a failure:
>> > > > > >> >
>> > > > > >> > BACKUP. All backup system information (prior to backup) will
>> be
>> > > > > restored
>> > > > > >> > and all temporary data, related to a failed session, in HDFS
>> > will
>> > > be
>> > > > > >> > deleted.
>> > > > > >> > RESTORE. We do not care about system data, because restore
>> does
>> > > not
>> > > > > >> change
>> > > > > >> > it. Temporary data in HDFS will be cleaned up and table will
>> be
>> > > in a
>> > > > > >> state
>> > > > > >> > back to where it was before operation started.
>> > > > > >> >
>> > > > > >> > This is what user should expect in case of a failure.
>> > > > > >> >
>> > > > > >> > -Vlad
>> > > > > >> >
>> > > > > >> > On Fri, Sep 9, 2016 at 12:56 PM, Sean Busbey <
>> bus...@apache.org
>> > >
>> > > > > wrote:
>> > > > > >> >
>> > > > > >> > > Failing in a consistent way, with docs that explain the
>> > various
>> > > > > >> > > expected failures would be sufficient.
>> > > > > >> > >
>> > > > > >> > > On Fri, Sep 9, 2016 at 12:16 PM, Vladimir Rodionov
>> > > > > >> > > <vladrodio...@gmail.com> wrote:
>> > > > > >> > > > Do not worry Sean, doc is coming today as a preview and
>> our
>> > > > writer
>> > > > > >> > Frank
>> > > > > >> > > > will be working on putting it into the Apache repo.
>> Timeline
>> > > > > depends
>> > > > > >> on
>> > > > > >> > > > Frank's schedule but I hope we will get it rather sooner
>> than
>> > > > > later.
>> > > > > >> > > >
>> > > > > >> > > > As for failure testing, we are focusing only on a
>> consistent
>> > > > state
>> > > > > >> of
>> > > > > >> > > > backup system data in the presence of any type of failure.
>> We
>> > > are
>> > > > > not
>> > > > > >> > > going
>> > > > > >> > > > to implement anything more "fancy" than that. We allow
>> > both:
>> > > > > >> backup
>> > > > > >> > and
>> > > > > >> > > > restore to fail. What we do not allow is to have system
>> data
>> > > > > >> corrupted.
>> > > > > >> > > > Will it suffice for you? Do you have any other concerns
>> you
>> > > > want
>> > > > > >> us to
>> > > > > >> > > > address?
>> > > > > >> > > >
>> > > > > >> > > > -Vlad
>> > > > > >> > > >
>> > > > > >> > > >
>> > > > > >> > > > On Fri, Sep 9, 2016 at 10:56 AM, Sean Busbey <
>> > > bus...@apache.org
>> > > > >
>> > > > > >> > wrote:
>> > > > > >> > > >
>> > > > > >> > > >> "docs will come to Apache soon" does not address my
>> concern
>> > > > > around
>> > > > > >> > docs
>> > > > > >> > > at
>> > > > > >> > > >> all, unless said docs have already made it into the
>> project
>> > > > > repo. I
>> > > > > >> > > don't
>> > > > > >> > > >> want third party resources for using a major and
>> important
>> > > > > feature
>> > > > > >> of
>> > > > > >> > > the
>> > > > > >> > > >> project, I want us to provide end users with what they
>> need
>> > > to
>> > > > > get
>> > > > > >> the
>> > > > > >> > > job
>> > > > > >> > > >> done.
>> > > > > >> > > >>
>> > > > > >> > > >> I see some calls for patience on the failure testing, but
>> > the
>> > > > > >> appeal
>> > > > > >> > to
>> > > > > >> > > us
>> > > > > >> > > >> having done a bad job of requiring proper tests of
>> previous
>> > > > > >> features
>> > > > > >> > > just
>> > > > > >> > > >> makes me more concerned about not getting them here. I
>> > don't
>> > > > want
>> > > > > >> to
>> > > > > >> > set
>> > > > > >> > > >> yet another bad example that will then be pointed to in
>> the
>> > > > > future.
>> > > > > >> > > >>
>> > > > > >> > > >> On Sep 8, 2016 10:50, "Ted Yu" <yuzhih...@gmail.com>
>> > wrote:
>> > > > > >> > > >>
>> > > > > >> > > >> > Is there any concern which is not addressed ?
>> > > > > >> > > >> >
>> > > > > >> > > >> > Do we need another Vote thread ?
>> > > > > >> > > >> >
>> > > > > >> > > >> > Thanks
>> > > > > >> > > >> >
>> > > > > >> > > >> > On Thu, Sep 8, 2016 at 9:21 AM, Andrew Purtell <
>> > > > > >> apurt...@apache.org
>> > > > > >> > >
>> > > > > >> > > >> > wrote:
>> > > > > >> > > >> >
>> > > > > >> > > >> > > Vlad,
>> > > > > >> > > >> > >
>> > > > > >> > > >> > > I apologize for using the term 'half-baked' in a way
>> > that
>> > > > > could
>> > > > > >> > > seem a
>> > > > > >> > > >> > > description of HBASE-7912. I meant that as a general
>> > > > > >> hypothetical.
>> > > > > >> > > >> > >
>> > > > > >> > > >> > > On Wed, Sep 7, 2016 at 9:36 AM, Vladimir Rodionov <
>> > > > > >> > > >> > vladrodio...@gmail.com>
>> > > > > >> > > >> > > wrote:
>> > > > > >> > > >> > >
>> > > > > >> > > >> > > > >> I'm not sure that "There is already lots of
>> > > half-baked
>> > > > > >> code
>> > > > > >> > in
>> > > > > >> > > the
>> > > > > >> > > >> > > > branch,
>> > > > > >> > > >> > > > so what's the harm in adding more?"
>> > > > > >> > > >> > > >
>> > > > > >> > > >> > > > I meant not production-ready yet. This is the 2.0
>> > > > > development
>> > > > > >> > > branch
>> > > > > >> > > >> > and,
>> > > > > >> > > >> > > > hence many features are in the works,
>> > > > > >> > > >> > > > not being tested well etc. I do not consider backup
>> > as
>> > > > half
>> > > > > >> > baked
>> > > > > >> > > >> > > feature -
>> > > > > >> > > >> > > > it has passed our internal QA and has very good
>> doc,
>> > > > which
>> > > > > we
>> > > > > >> > will
>> > > > > >> > > >> > > provide
>> > > > > >> > > >> > > > to Apache shortly.
>> > > > > >> > > >> > > >
>> > > > > >> > > >> > > > -Vlad
>> > > > > >> > > >> > > >
>> > > > > >> > > >> > > > On Wed, Sep 7, 2016 at 9:13 AM, Andrew Purtell <
>> > > > > >> > > apurt...@apache.org>
>> > > > > >> > > >> > > > wrote:
>> > > > > >> > > >> > > >
>> > > > > >> > > >> > > > > We shouldn't admit half baked changes that won't
>> be
>> > > > > >> finished.
>> > > > > >> > > >> However
>> > > > > >> > > >> > > in
>> > > > > >> > > >> > > > > this case the crew working on this feature are
>> long
>> > > > > timers
>> > > > > >> and
>> > > > > >> > > less
>> > > > > >> > > >> > > > likely
>> > > > > >> > > >> > > > > than just about anyone to leave something in a
>> half
>> > > > baked
>> > > > > >> > > state. Of
>> > > > > >> > > >> > > > course
>> > > > > >> > > >> > > > > there is no guarantee how anything will turn out,
>> > > but I
>> > > > > am
>> > > > > >> > > willing
>> > > > > >> > > >> to
>> > > > > >> > > >> > > > take
>> > > > > >> > > >> > > > > a little on faith if they feel their best path
>> > > forward
>> > > > > now
>> > > > > >> is
>> > > > > >> > to
>> > > > > >> > > >> > merge
>> > > > > >> > > >> > > to
>> > > > > >> > > >> > > > > trunk. I only wish I had bandwidth to have done
>> > some
>> > > > real
>> > > > > >> > > kicking
>> > > > > >> > > >> of
>> > > > > >> > > >> > > the
>> > > > > >> > > >> > > > > tires by now. Maybe this week.
>> > > > > >> > > >> > > > >
>> > > > > >> > > >> > > > > (Yes, I'm using some of that time for this email
>> > :-)
>> > > > but
>> > > > > I
>> > > > > >> > type
>> > > > > >> > > >> > fast.)
>> > > > > >> > > >> > > > >
>> > > > > >> > > >> > > > > That said, I would like to agitate for making 2.0
>> > > more
>> > > > > real
>> > > > > >> > and
>> > > > > >> > > >> spend
>> > > > > >> > > >> > > > some
>> > > > > >> > > >> > > > > time on it now that I'm winding down with 0.98. I
>> > > think
>> > > > > >> that
>> > > > > >> > > means
>> > > > > >> > > >> > > > > branching for 2.0 real soon now and even evicting
>> > > > things
>> > > > > >> from
>> > > > > >> > > 2.0
>> > > > > >> > > >> > > branch
>> > > > > >> > > >> > > > > that aren't finished or stable, leaving them only
>> > > once
>> > > > > >> again
>> > > > > >> > in
>> > > > > >> > > the
>> > > > > >> > > >> > > > master
>> > > > > >> > > >> > > > > branch. Or, maybe just evicting them. Let's take
>> it
>> > > > case
>> > > > > by
>> > > > > >> > > case.
>> > > > > >> > > >> > > > >
>> > > > > >> > > >> > > > > I think this feature can come in relatively
>> safely.
>> > > As
>> > > > > >> added
>> > > > > >> > > >> > insurance,
>> > > > > >> > > >> > > > > let's admit the possibility it could be reverted
>> on
>> > > the
>> > > > > 2.0
>> > > > > >> > > branch
>> > > > > >> > > >> if
>> > > > > >> > > >> > > > folks
>> > > > > >> > > >> > > > > working on stabilizing 2.0 decide to evict it
>> > because
>> > > > it
>> > > > > is
>> > > > > >> > > >> > unfinished
>> > > > > >> > > >> > > or
>> > > > > >> > > >> > > > > unstable, because that certainly can happen. I
>> > would
>> > > > > >> expect if
>> > > > > >> > > talk
>> > > > > >> > > >> > > like
>> > > > > >> > > >> > > > > that starts, we'd get help finishing or
>> stabilizing
>> > > > > what's
>> > > > > >> > under
>> > > > > >> > > >> > > > discussion
>> > > > > >> > > >> > > > > for revert. Or, we'd have a revert. Either way
>> the
>> > > > > outcome
>> > > > > >> is
>> > > > > >> > > >> > > acceptable.
>> > > > > >> > > >> > > > >
>> > > > > >> > > >> > > > >
>> > > > > >> > > >> > > > > On Wed, Sep 7, 2016 at 8:56 AM, Dima Spivak <
>> > > > > >> > > dimaspi...@apache.org
>> > > > > >> > > >> >
>> > > > > >> > > >> > > > wrote:
>> > > > > >> > > >> > > > >
>> > > > > >> > > >> > > > > > I'm not sure that "There is already lots of
>> > > > half-baked
>> > > > > >> code
>> > > > > >> > in
>> > > > > >> > > >> the
>> > > > > >> > > >> > > > > branch,
>> > > > > >> > > >> > > > > > so what's the harm in adding more?" is a good
>> > code
>> > > > > commit
>> > > > > >> > > >> > philosophy
>> > > > > >> > > >> > > > for
>> > > > > >> > > >> > > > > a
>> > > > > >> > > >> > > > > > fault-tolerant distributed data store. ;)
>> > > > > >> > > >> > > > > >
>> > > > > >> > > >> > > > > > More seriously, a lack of test coverage for
>> > > existing
>> > > > > >> > features
>> > > > > >> > > >> > > shouldn't
>> > > > > >> > > >> > > > > be
>> > > > > >> > > >> > > > > > used as justification for introducing new
>> > features
>> > > > with
>> > > > > >> the
>> > > > > >> > > same
>> > > > > >> > > >> > > > > > shortcomings. Ultimately, it's the end user who
>> > > will
>> > > > > feel
>> > > > > >> > the
>> > > > > >> > > >> pain,
>> > > > > >> > > >> > > so
>> > > > > >> > > >> > > > > > shouldn't we do everything we can to mitigate
>> > that?
>> > > > > >> > > >> > > > > >
>> > > > > >> > > >> > > > > > -Dima
>> > > > > >> > > >> > > > > >
>> > > > > >> > > >> > > > > > On Wed, Sep 7, 2016 at 8:46 AM, Vladimir
>> > Rodionov <
>> > > > > >> > > >> > > > > vladrodio...@gmail.com>
>> > > > > >> > > >> > > > > > wrote:
>> > > > > >> > > >> > > > > >
>> > > > > >> > > >> > > > > > > Sean,
>> > > > > >> > > >> > > > > > >
>> > > > > >> > > >> > > > > > > * have docs
>> > > > > >> > > >> > > > > > >
>> > > > > >> > > >> > > > > > > Agree. We have a doc and backup is the most
>> > > > > documented
>> > > > > >> > > feature
>> > > > > >> > > >> > :),
>> > > > > >> > > >> > > we
>> > > > > >> > > >> > > > > > will
>> > > > > >> > > >> > > > > > > release it shortly to Apache.
>> > > > > >> > > >> > > > > > >
>> > > > > >> > > >> > > > > > > * have sunny-day correctness tests
>> > > > > >> > > >> > > > > > >
>> > > > > >> > > >> > > > > > > The feature has close to 60 test cases, which
>> run
>> > > for
>> > > > > >> approx
>> > > > > >> > 30
>> > > > > >> > > >> min.
>> > > > > >> > > >> > > We
>> > > > > >> > > >> > > > > can
>> > > > > >> > > >> > > > > > > add more, if the community does not mind :)
>> > > > > >> > > >> > > > > > >
>> > > > > >> > > >> > > > > > > * have correctness-in-face-of-failure tests
>> > > > > >> > > >> > > > > > >
>> > > > > >> > > >> > > > > > > Any examples of these tests in existing
>> > features?
>> > > > In
>> > > > > >> > works,
>> > > > > >> > > we
>> > > > > >> > > >> > > have a
>> > > > > >> > > >> > > > > > clear
>> > > > > >> > > >> > > > > > > understanding of what should be done by the
>> > time
>> > > of
>> > > > > 2.0
>> > > > > >> > > >> release.
>> > > > > >> > > >> > > > > > > That is a very close goal for us, to verify IT
>> > > monkey
>> > > > > for
>> > > > > >> > > >> existing
>> > > > > >> > > >> > > > code.
>> > > > > >> > > >> > > > > > >
>> > > > > >> > > >> > > > > > > * don't rely on things outside of HBase for
>> > > normal
>> > > > > >> > operation
>> > > > > >> > > >> > (okay
>> > > > > >> > > >> > > > for
>> > > > > >> > > >> > > > > > > advanced operation)
>> > > > > >> > > >> > > > > > >
>> > > > > >> > > >> > > > > > > We do not.
>> > > > > >> > > >> > > > > > >
>> > > > > >> > > >> > > > > > > Enormous time has been spent already on the
>> > > > > development
>> > > > > >> > and
>> > > > > >> > > >> > testing
>> > > > > >> > > >> > > > the
>> > > > > >> > > >> > > > > > > feature, it has passed our internal tests and
>> > > many
>> > > > > >> rounds
>> > > > > >> > of
>> > > > > >> > > >> code
>> > > > > >> > > >> > > > > reviews
>> > > > > >> > > >> > > > > > > by HBase committers. We do not mind if
>> someone
>> > > from
>> > > > > >> HBase
>> > > > > >> > > >> > community
>> > > > > >> > > >> > > > > > > (outside of HW) will review the code, but it
>> > will
>> > > > > >> probably
>> > > > > >> > > >> take
>> > > > > >> > > >> > > > > forever
>> > > > > >> > > >> > > > > > to
>> > > > > >> > > >> > > > > > > wait for a volunteer; the feature is quite
>> large
>> > > > (1MB+
>> > > > > >> > > >> cumulative
>> > > > > >> > > >> > > > patch)
>> > > > > >> > > >> > > > > > >
>> > > > > >> > > >> > > > > > > 2.0 branch is full of half baked features,
>> most
>> > > of
>> > > > > them
>> > > > > >> > are
>> > > > > >> > > in
>> > > > > >> > > >> > > active
>> > > > > >> > > >> > > > > > > development, therefore I am not following you
>> > > here,
>> > > > > >> Sean?
>> > > > > >> > > Why
>> > > > > >> > > >> > > > > HBASE-7912
>> > > > > >> > > >> > > > > > is
>> > > > > >> > > >> > > > > > > not good enough yet to be integrated into 2.0
>> > > > branch?
>> > > > > >> > > >> > > > > > >
>> > > > > >> > > >> > > > > > > -Vlad
>> > > > > >> > > >> > > > > > >
>> > > > > >> > > >> > > > > > >
>> > > > > >> > > >> > > > > > >
>> > > > > >> > > >> > > > > > >
>> > > > > >> > > >> > > > > > >
>> > > > > >> > > >> > > > > > > On Wed, Sep 7, 2016 at 8:23 AM, Sean Busbey <
>> > > > > >> > > bus...@apache.org
>> > > > > >> > > >> >
>> > > > > >> > > >> > > > wrote:
>> > > > > >> > > >> > > > > > >
>> > > > > >> > > >> > > > > > > > On Tue, Sep 6, 2016 at 10:36 PM, Josh
>> Elser <
>> > > > > >> > > >> > > josh.el...@gmail.com>
>> > > > > >> > > >> > > > > > > wrote:
>> > > > > >> > > >> > > > > > > > > So, the answer to Sean's original
>> question
>> > is
>> > > > "as
>> > > > > >> > > robust as
>> > > > > >> > > >> > > > > snapshots
>> > > > > >> > > >> > > > > > > > > presently are"? (independence of
>> > > backup/restore
>> > > > > >> > failure
>> > > > > >> > > >> > > tolerance
>> > > > > >> > > >> > > > > > from
>> > > > > >> > > >> > > > > > > > > snapshot failure tolerance)
>> > > > > >> > > >> > > > > > > > >
>> > > > > >> > > >> > > > > > > > > Is this just a question WRT context of
>> the
>> > > > > change,
>> > > > > >> or
>> > > > > >> > > is it
>> > > > > >> > > >> > > means
>> > > > > >> > > >> > > > > > for a
>> > > > > >> > > >> > > > > > > > veto
>> > > > > >> > > >> > > > > > > > > from you, Sean? Just trying to make sure
>> > I'm
>> > > > > >> following
>> > > > > >> > > >> along
>> > > > > >> > > >> > > > > > > adequately.
>> > > > > >> > > >> > > > > > > > >
>> > > > > >> > > >> > > > > > > > >
>> > > > > >> > > >> > > > > > > >
>> > > > > >> > > >> > > > > > > > I'd say ATM I'm -0, bordering on -1 but not
>> > for
>> > > > > >> reasons
>> > > > > >> > I
>> > > > > >> > > can
>> > > > > >> > > >> > > > > > articulate
>> > > > > >> > > >> > > > > > > > well.
>> > > > > >> > > >> > > > > > > >
>> > > > > >> > > >> > > > > > > > Here's an attempt.
>> > > > > >> > > >> > > > > > > >
>> > > > > >> > > >> > > > > > > > We've been trying to move, as a community,
>> > > > towards
>> > > > > >> > > minimizing
>> > > > > >> > > >> > > risk
>> > > > > >> > > >> > > > to
>> > > > > >> > > >> > > > > > > > downstream folks by getting "complete
>> enough
>> > > for
>> > > > > use"
>> > > > > >> > > gates
>> > > > > >> > > >> in
>> > > > > >> > > >> > > > place
>> > > > > >> > > >> > > > > > > > before we introduce new features. This was
>> > > > spurred
>> > > > > >> by a
>> > > > > >> > > some
>> > > > > >> > > >> > > > features
>> > > > > >> > > >> > > > > > > > getting in half-baked and never making it
>> to
>> > > "can
>> > > > > >> really
>> > > > > >> > > use"
>> > > > > >> > > >> > > > status
>> > > > > >> > > >> > > > > > > > (I'm thinking of distributed log replay and
>> > the
>> > > > > >> zk-less
>> > > > > >> > > >> > > assignment
>> > > > > >> > > >> > > > > > > > stuff, I don't recall if there was more).
>> > > > > >> > > >> > > > > > > >
>> > > > > >> > > >> > > > > > > > The gates, generally, included things like:
>> > > > > >> > > >> > > > > > > >
>> > > > > >> > > >> > > > > > > > * have docs
>> > > > > >> > > >> > > > > > > > * have sunny-day correctness tests
>> > > > > >> > > >> > > > > > > > * have correctness-in-face-of-failure tests
>> > > > > >> > > >> > > > > > > > * don't rely on things outside of HBase for
>> > > > normal
>> > > > > >> > > operation
>> > > > > >> > > >> > > (okay
>> > > > > >> > > >> > > > > for
>> > > > > >> > > >> > > > > > > > advanced operation)
>> > > > > >> > > >> > > > > > > >
>> > > > > >> > > >> > > > > > > > As an example, we kept the MOB work off in
>> a
>> > > > branch
>> > > > > >> and
>> > > > > >> > > out
>> > > > > >> > > >> of
>> > > > > >> > > >> > > > master
>> > > > > >> > > >> > > > > > > > until it could pass these criteria. The big
>> > > > > exemption
>> > > > > >> > > we've
>> > > > > >> > > >> had
>> > > > > >> > > >> > > to
>> > > > > >> > > >> > > > > > > > this was the hbase-spark integration, where
>> > we
>> > > > all
>> > > > > >> > agreed
>> > > > > >> > > it
>> > > > > >> > > >> > > could
>> > > > > >> > > >> > > > > > > > land in master because it was very well
>> > > isolated
>> > > > > (the
>> > > > > >> > > slide
>> > > > > >> > > >> > away
>> > > > > >> > > >> > > > from
>> > > > > >> > > >> > > > > > > > including docs as a first-class part of
>> > > building
>> > > > up
>> > > > > >> that
>> > > > > >> > > >> > > > integration
>> > > > > >> > > >> > > > > > > > has led me to doubt the wisdom of this
>> > > decision).
>> > > > > >> > > >> > > > > > > >
>> > > > > >> > > >> > > > > > > > We've also been treating inclusion in
>> > > "probably
>> > > > > >> will
>> > > > > >> > be
>> > > > > >> > > >> > > released
>> > > > > >> > > >> > > > to
>> > > > > >> > > >> > > > > > > > downstream" branches as a higher bar,
>> > requiring
>> > > > > >> > > >> > > > > > > >
>> > > > > >> > > >> > > > > > > > * don't moderately impact performance when
>> > the
>> > > > > >> feature
>> > > > > >> > > isn't
>> > > > > >> > > >> in
>> > > > > >> > > >> > > use
>> > > > > >> > > >> > > > > > > > * don't severely impact performance when
>> the
>> > > > > feature
>> > > > > >> is
>> > > > > >> > in
>> > > > > >> > > >> use
>> > > > > >> > > >> > > > > > > > * either default-to-on or show enough
>> demand
>> > to
>> > > > > >> believe
>> > > > > >> > a
>> > > > > >> > > >> > > > non-trivial
>> > > > > >> > > >> > > > > > > > number of folks will turn the feature on
>> > > > > >> > > >> > > > > > > >
>> > > > > >> > > >> > > > > > > > The above has kept MOB and hbase-spark
>> > > > integration
>> > > > > >> out
>> > > > > >> > of
>> > > > > >> > > >> > > branch-1,
>> > > > > >> > > >> > > > > > > > presumably while they've "gotten more
>> stable"
>> > > in
>> > > > > >> master
>> > > > > >> > > from
>> > > > > >> > > >> > the
>> > > > > >> > > >> > > > odd
>> > > > > >> > > >> > > > > > > > vendor inclusion.
>> > > > > >> > > >> > > > > > > >
>> > > > > >> > > >> > > > > > > > Are we going to have a 2.0 release before
>> the
>> > > end
>> > > > > of
>> > > > > >> the
>> > > > > >> > > >> year?
>> > > > > >> > > >> > > > We're
>> > > > > >> > > >> > > > > > > > coming up on 1.5 years since the release of
>> > > > version
>> > > > > >> 1.0;
>> > > > > >> > > >> seems
>> > > > > >> > > >> > > like
>> > > > > >> > > >> > > > > > > > it's about time, though I haven't seen any
>> > > > concrete
>> > > > > >> > plans
>> > > > > >> > > >> this
>> > > > > >> > > >> > > > year.
>> > > > > >> > > >> > > > > > > > Presuming we are going to have one by the
>> end
>> > > of
>> > > > > the
>> > > > > >> > > year, it
>> > > > > >> > > >> > > > seems a
>> > > > > >> > > >> > > > > > > > bit close to still be adding in "features
>> > that
>> > > > need
>> > > > > >> > > maturing"
>> > > > > >> > > >> > on
>> > > > > >> > > >> > > > the
>> > > > > >> > > >> > > > > > > > branch.
>> > > > > >> > > >> > > > > > > >
>> > > > > >> > > >> > > > > > > > The lack of a concrete plan for 2.0 keeps
>> me
>> > > from
>> > > > > >> > > considering
>> > > > > >> > > >> > > these
>> > > > > >> > > >> > > > > > > > things blocker at the moment. But I know
>> > first
>> > > > hand
>> > > > > >> how
>> > > > > >> > > much
>> > > > > >> > > >> > > > trouble
>> > > > > >> > > >> > > > > > > > folks have had with other features that
>> have
>> > > gone
>> > > > > >> into
>> > > > > >> > > >> > downstream
>> > > > > >> > > >> > > > > > > > facing releases without robustness checks
>> > (i.e.
>> > > > > >> > > replication),
>> > > > > >> > > >> > and
>> > > > > >> > > >> > > > I'm
>> > > > > >> > > >> > > > > > > > concerned about what we're setting up if
>> 2.0
>> > > goes
>> > > > > out
>> > > > > >> > with
>> > > > > >> > > >> this
>> > > > > >> > > >> > > > > > > > feature in its current state.
>> > > > > >> > > >> > > > > > > >
>> > > > > >> > > >> > > > > > >
>> > > > > >> > > >> > > > > >
>> > > > > >> > > >> > > > >
>> > > > > >> > > >> > > > >
>> > > > > >> > > >> > > > >
>> > > > > >> > > >> > > > > --
>> > > > > >> > > >> > > > > Best regards,
>> > > > > >> > > >> > > > >
>> > > > > >> > > >> > > > >    - Andy
>> > > > > >> > > >> > > > >
>> > > > > >> > > >> > > > > Problems worthy of attack prove their worth by
>> > > hitting
>> > > > > >> back. -
>> > > > > >> > > Piet
>> > > > > >> > > >> > > Hein
>> > > > > >> > > >> > > > > (via Tom White)
>> > > > > >> > > >> > > > >
>> > > > > >> > > >> > > >
>> > > > > >> > > >> > >
>> > > > > >> > > >> > >
>> > > > > >> > > >> > >
>> > > > > >> > > >> > > --
>> > > > > >> > > >> > > Best regards,
>> > > > > >> > > >> > >
>> > > > > >> > > >> > >    - Andy
>> > > > > >> > > >> > >
>> > > > > >> > > >> > > Problems worthy of attack prove their worth by
>> hitting
>> > > > back.
>> > > > > -
>> > > > > >> > Piet
>> > > > > >> > > >> Hein
>> > > > > >> > > >> > > (via Tom White)
>> > > > > >> > > >> > >
>> > > > > >> > > >> >
>> > > > > >> > > >>
>> > > > > >> > >
>> > > > > >> >
>> > > > > >>
>> > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>>



-- 
busbey
