On Mon, Sep 11, 2017 at 11:07 AM, Vladimir Rodionov <[email protected]>
wrote:

> Stack, Andrew
>
> We have the doc blocker and (partially) HBASE-15227: two sub-tasks remain. One
> is a unit test (you can't call it a blocker),
> and the other is FT support during incremental backup with bulk loading. The
> latter has probably been addressed
> already in other HBASE-15227 sub-tasks. I have to reassess this.
>
> That is mostly it. Yes, we have not done real testing with real data on a
> real cluster yet, except QA testing on a small OpenStack
> cluster (10 nodes). That is probably our biggest minus right now. I
> would like to inform the community that this week we are going to start
> full-scale testing with reasonably sized data sets.
>
> The recently committed improvements, such as the ability to run backup/restore on
> a particular YARN pool (queue), allow precise control
> of cluster utilization during the operation (so as not to interfere much with
> regular cluster operations). Another one,
> converting WALs on the fly to HFiles, significantly improves storage usage
> on the backup site.
>
> My plan is to finish HBASE-17825 (further performance optimizations). This
> will cut down the number of MR jobs during incremental backup
> from 2*N to 2 (where N is the number of tables). That will probably take 2-3 more
> days.
>
> Then:
>
> 1. Address remaining two sub-tasks in HBASE-15227
> 2. Update Release notes for all relevant B&R JIRAs
> 3. Work on doc
>
> After that we can call it feature complete. Taking into account the
> vast amount of effort
> spent on this feature (including QA testing), I would say that we are
> probably quite close to GA right now, but only
> after real testing is done (I do not anticipate significant issues, except
> possibly in correct failure handling).
>
> On the feature itself: we provide tools to fully automate backup and restore
> tasks: create backup (full and incremental), restore
> from image, delete backups, merge backups, history, history per table,
> backup set management.
>
> Hopefully, my write-up addresses at least some of your concerns.
>
>
Thanks for updating us (the community) w/ status. Completion of HA seems
important, as is the result of the scale testing.

St.Ack




> -Vlad
>
> On Sun, Sep 10, 2017 at 6:27 AM, Josh Elser <[email protected]> wrote:
>
> > On Sat, Sep 9, 2017 at 7:04 PM, stack <[email protected]> wrote:
> > > In spite of repeated requests for eng summary of state of this feature --
> > > summary of what is in 2.0, what is not, what the capabilities are, how well
> > > it has been tested and at what scale -- all I get, when the requests are
> > > not ignored, are pointers to lists of ill-describing jiras and some pending
> > > user facing doc update.
> >
> > Yes, this is a problem. We, especially you as RM, shouldn't have
> > outstanding questions as to the quality/state of B&R.
> >
> > > For other features, mob or region server groups, I know that they have been
> > > running at scale in production for as much as a year and more. I have some
> > > confidence these items basically work.  For backup/restore I have no such
> > > sense even after spending time in review and trying to use the feature.
> >
> > I can attest to the feature being tested on small clusters. I'm not
> > sure about tests larger than 10 nodes. If this is less a worry and more
> > a veto, let's get some criteria on the kind of testing you're looking
> > for to avoid having to rehash later.
> >
> > Do we have any kind of integration tests in the codebase now that can
> > help increase Stack's confidence?
> >
> > > As release manager, I have say over what makes it into a release. Unless
> > > the work is done to convince me that backup/restore is more than a lump of
> > > code and a few unit tests that can pass on some fellow's laptop, I am going
> > > to kick it out of branch-2.  Let the feature harden more in master branch
> > > before it ships in a release.
> >
> > While it was a few months ago now, I can also attest to this being
> > more than some unit tests (I think I looked at it after I saw you last
> > down in the weeds).
> >
> > I do worry about trying to remove it at this state.
> >
> > * Do you consider the B&R code in the repository implicitly harmful?
> > Is there harm in shipping with docs capturing the concern?
> > * Trying to revert all relevant pieces from branch-2 is non-trivial.
> > * I would feel quite dejected if some feature I spent a year+ working
> > on (*not* making assertions on my perception of quality) was removed
> > from the release line it was expected to land in.
> >
> > > S
> > >
> > > On Sep 8, 2017 10:59 PM, "Vladimir Rodionov" <[email protected]> wrote:
> > >
> > >> >> Have I grasped the state of things correctly, Vlad?
> > >>
> > >> Josh, the only thing which is still pending is the doc update. All other
> > >> features are good to have but not blockers for the 2.0 release.
> > >>
> > >> -Vlad
> > >>
> > >> On Fri, Sep 8, 2017 at 10:42 PM, Vladimir Rodionov <[email protected]> wrote:
> > >>
> > >> > >> What testing and at what
> > >> > >> scale has testing been done?
> > >> >
> > >> > Do we have that for other features?
> > >> >
> > >> >
> > >> > On Fri, Sep 8, 2017 at 10:41 PM, Vladimir Rodionov <[email protected]> wrote:
> > >> >
> > >> >> >> It asks: "How do I figure what of backup/restore feature is going to be in
> > >> >> >> hbase-2.0.0?
> > >> >>
> > >> >> Hmm, wait for doc update.
> > >> >>
> > >> >>
> > >> >> On Fri, Sep 8, 2017 at 2:39 PM, Stack <[email protected]> wrote:
> > >> >>
> > >> >>> HBASE-14414 is a JIRA with a list of random seeming issues w/ non-descript
> > >> >>> summaries: "Add nonce support to TableBackupProcedure, BackupID must
> > >> >>> include backup set name, ...". The last comment in that issue is from July.
> > >> >>> It asks: "How do I figure what of backup/restore feature is going to be in
> > >> >>> hbase-2.0.0? Thanks Vladimir Rodionov
> > >> >>> <https://issues.apache.org/jira/secure/ViewProfile.jspa?name=vrodionov>."
> > >> >>> to which there is no answer.  Doc update is TODO.
> > >> >>>
> > >> >>> Where is the summary of the capability in hbase-2? What testing and at what
> > >> >>> scale has testing been done? Is this 'stable or experimental'? If I can't
> > >> >>> get basic info on this feature though I ask repeatedly, what hope does the
> > >> >>> poor old operator have?
> > >> >>>
> > >> >>> St.Ack
> > >> >>>
> > >> >>>
> > >> >>> On Fri, Sep 8, 2017 at 1:59 PM, Vladimir Rodionov <[email protected]> wrote:
> > >> >>>
> > >> >>> > HBASE-14414
> > >> >>> >
> > >> >>> > On Fri, Sep 8, 2017 at 1:14 PM, Stack <[email protected]> wrote:
> > >> >>> >
> > >> >>> > > Where do I go to get the current status of this feature? Looking in JIRA I
> > >> >>> > > see loads of issues open against backup including some against hbase-2.0.0
> > >> >>> > > and no progress being made that I can discern.
> > >> >>> > >
> > >> >>> > > Thanks,
> > >> >>> > > S
> > >> >>> > >
> > >> >>> > >
> > >> >>> > >
> > >> >>> > > On Wed, Nov 23, 2016 at 8:52 AM, Stack <[email protected]> wrote:
> > >> >>> > >
> > >> >>> > > > On Tue, Nov 22, 2016 at 6:48 PM, Stack <[email protected]> wrote:
> > >> >>> > > >
> > >> >>> > > >> On Tue, Nov 22, 2016 at 3:17 PM, Vladimir Rodionov <
> > >> >>> > > >> [email protected]> wrote:
> > >> >>> > > >>
> > >> >>> > > >>> >> and/or he answered most of the review feedback
> > >> >>> > > >>>
> > >> >>> > > >>> No, questions are still open, but I do not see any blockers and we have
> > >> >>> > > >>> HBASE-16940 to address these questions.
> > >> >>> > > >>>
> > >> >>> > > >>>
> > >> >>> > > >> Agree. No blockers but stuff that should be dealt with (No one will pay
> > >> >>> > > >> me any attention once merge goes in -- smile).
> > >> >>> > > >>
> > >> >>> > > >>
> > >> >>> > > > Let me clarify the above. I want review addressed before merge happens.
> > >> >>> > > > Sorry if any confusion.
> > >> >>> > > > St.Ack
> > >> >>> > > >
> > >> >>> > > >
> > >> >>> > > >
> > >> >>> > > >
> > >> >>> > > >
> > >> >>> > > >
> > >> >>> > > >> St.Ack
> > >> >>> > > >>
> > >> >>> > > >>
> > >> >>> > > >>
> > >> >>> > > >>> On Tue, Nov 22, 2016 at 3:04 PM, Devaraj Das <[email protected]> wrote:
> > >> >>> > > >>>
> > >> >>> > > >>> > Hi Stack, hats off to you for spending so much time on this! Thanks! From
> > >> >>> > > >>> > my understanding, Vlad has raised follow-up jiras for the issues you
> > >> >>> > > >>> > raised, and/or he answered most of the review feedback. So, do you think we
> > >> >>> > > >>> > could do a merge vote now?
> > >> >>> > > >>> > Devaraj.
> > >> >>> > > >>> > ________________________________________
> > >> >>> > > >>> > From: Vladimir Rodionov <[email protected]>
> > >> >>> > > >>> > Sent: Monday, November 21, 2016 8:34 PM
> > >> >>> > > >>> > To: [email protected]
> > >> >>> > > >>> > Subject: Re: [DISCUSSION] Merge Backup / Restore - Branch HBASE-7912
> > >> >>> > > >>> >
> > >> >>> > > >>> > >> I have spent a good bit of time reviewing and testing this feature. I would
> > >> >>> > > >>> > >> like my review and concerns addressed and I'd like it to be clear how;
> > >> >>> > > >>> > >> either explicit follow-on issues, pointers to where in the patch or doc my
> > >> >>> > > >>> > >> remarks have been catered to, etc. Until then, I am against commit.
> > >> >>> > > >>> >
> > >> >>> > > >>> > Stack, mega patch review comments will be addressed in the dedicated JIRA:
> > >> >>> > > >>> > HBASE-16940
> > >> >>> > > >>> > I have opened several other JIRAs to address your other comments (not on
> > >> >>> > > >>> > review board).
> > >> >>> > > >>> >
> > >> >>> > > >>> > Details are here (end of the thread):
> > >> >>> > > >>> > https://issues.apache.org/jira/browse/HBASE-14123
> > >> >>> > > >>> >
> > >> >>> > > >>> > Let me know what else we should do to move the merge forward.
> > >> >>> > > >>> >
> > >> >>> > > >>> > -Vlad
> > >> >>> > > >>> >
> > >> >>> > > >>> >
> > >> >>> > > >>> > On Fri, Nov 18, 2016 at 4:54 PM, Stack <[email protected]> wrote:
> > >> >>> > > >>> >
> > >> >>> > > >>> > > On Fri, Nov 18, 2016 at 3:53 PM, Ted Yu <[email protected]> wrote:
> > >> >>> > > >>> > >
> > >> >>> > > >>> > > > Thanks, Matteo.
> > >> >>> > > >>> > > >
> > >> >>> > > >>> > > > bq. restore is not clear if given an incremental id it will do the full
> > >> >>> > > >>> > > > restore from full up to that point or if i need to apply manually
> > >> >>> > > >>> > > > everything
> > >> >>> > > >>> > > >
> > >> >>> > > >>> > > > The restore takes the dependent backup(s) into consideration.
> > >> >>> > > >>> > > > So there is no need to apply preceding backup(s) manually.
> > >> >>> > > >>> > > >
> > >> >>> > > >>> > > >
> > >> >>> > > >>> > > I ask this question on the issue. It is not clear from the usage or doc how
> > >> >>> > > >>> > > to run a restore from incremental. Can you fix in doc and usage how so I
> > >> >>> > > >>> > > can be clear and try it. Currently I am stuck verifying a round trip backup
> > >> >>> > > >>> > > restore made of incrementals.
> > >> >>> > > >>> > >
> > >> >>> > > >>> > > Thanks,
> > >> >>> > > >>> > > S
> > >> >>> > > >>> > >
> > >> >>> > > >>> > >
> > >> >>> > > >>> > >
> > >> >>> > > >>> > > > On Fri, Nov 18, 2016 at 3:48 PM, Matteo Bertozzi <[email protected]> wrote:
> > >> >>> > > >>> > > >
> > >> >>> > > >>> > > > > I did one last pass to the mega patch. I don't see anything major that
> > >> >>> > > >>> > > > > should block the merge.
> > >> >>> > > >>> > > > >
> > >> >>> > > >>> > > > > - most of the code is isolated in the backup package
> > >> >>> > > >>> > > > > - all the backup code is client side
> > >> >>> > > >>> > > > > - there are few changes to the server side, mainly for cleaners, wal
> > >> >>> > > >>> > > > > rolling and similar (which is ok)
> > >> >>> > > >>> > > > > - there is a good number of tests, and an integration test
> > >> >>> > > >>> > > > >
> > >> >>> > > >>> > > > > the code seems to still have some leftovers from the old implementation,
> > >> >>> > > >>> > > > > and some stuff needs a cleanup. but I don't think this should be used as an
> > >> >>> > > >>> > > > > argument to block the merge. I think the guys will keep working on this and
> > >> >>> > > >>> > > > > they may also get help from others once the patch is in master.
> > >> >>> > > >>> > > > >
> > >> >>> > > >>> > > > > I still have my concerns about the current limitations, but these are
> > >> >>> > > >>> > > > > things already planned for phase 3, so some of this stuff may even be in
> > >> >>> > > >>> > > > > the final 2.0.
> > >> >>> > > >>> > > > > but as long as we have a "current limitations" section in the user guide
> > >> >>> > > >>> > > > > mentioning important stuff like the ones below, I'm ok with it.
> > >> >>> > > >>> > > > >  - if you write to the table with Durability.SKIP_WAL your data will not
> > >> >>> > > >>> > > > > be in the incremental backup
> > >> >>> > > >>> > > > >  - if you bulkload files that data will not be in the incremental backup
> > >> >>> > > >>> > > > > (HBASE-14417)
> > >> >>> > > >>> > > > >  - the incremental backup will contain not only the data of the table you
> > >> >>> > > >>> > > > > specified but also the regions from other tables that are on the same set
> > >> >>> > > >>> > > > > of RSs (HBASE-14141) ...maybe a note about security around this topic
> > >> >>> > > >>> > > > >  - the incremental backup will not contain just the "latest row" between
> > >> >>> > > >>> > > > > backup A and B, but it will also contain all the updates that occurred in
> > >> >>> > > >>> > > > > between. but the restore does not allow you to restore up to a certain
> > >> >>> > > >>> > > > > point in time, the restore will always be up to the "latest backup point".
> > >> >>> > > >>> > > > >  - you should limit the number of "incremental" up to N (or maybe SIZE), to
> > >> >>> > > >>> > > > > avoid replay time becoming the bottleneck. (HBASE-14135)
> > >> >>> > > >>> > > > >
> > >> >>> > > >>> > > > > I'll be ok even with the above not being in the final 2.0,
> > >> >>> > > >>> > > > > but i'd like to see as blocker for the final 2.0 (not the merge)
> > >> >>> > > >>> > > > >  - the backup code moved in an hbase-backup module
> > >> >>> > > >>> > > > >  - and some more work around tools, especially to try to unify and make
> > >> >>> > > >>> > > > > simple the backup experience (simple example: in some case there is a
> > >> >>> > > >>> > > > > backup_id argument in others a backupId argument. or things like.. restore
> > >> >>> > > >>> > > > > is not clear if given an incremental id it will do the full restore from
> > >> >>> > > >>> > > > > full up to that point or if i need to apply manually everything).
> > >> >>> > > >>> > > > >
> > >> >>> > > >>> > > > > in conclusion, I think we can open a merge vote. I'll be +1 on it, and I
> > >> >>> > > >>> > > > > think we should try to reject -1 with just a "code cleanup" motivation,
> > >> >>> > > >>> > > > > since there will still be work going on on the code after the merge.
> > >> >>> > > >>> > > > >
> > >> >>> > > >>> > > > > Matteo
> > >> >>> > > >>> > > > >
> > >> >>> > > >>> > > > >
> > >> >>> > > >>> > > > > On Sun, Nov 6, 2016 at 10:54 PM, Devaraj Das <[email protected]> wrote:
> > >> >>> > > >>> > > > >
> > >> >>> > > >>> > > > > > Stack and others, anything else on the patch? Merge to master now?
> > >> >>> > > >>> > > > > >
> > >> >>> > > >>> > > > >
> > >> >>> > > >>> > > >
> > >> >>> > > >>> > >
> > >> >>> > > >>> >
> > >> >>> > > >>>
> > >> >>> > > >>
> > >> >>> > > >>
> > >> >>> > > >
> > >> >>> > >
> > >> >>> >
> > >> >>>
> > >> >>
> > >> >>
> > >> >
> > >>
> >
>
