Re: [DISCUSS] Hbase Backup design changes

Mallikarjun Sun, 31 Jan 2021 02:58:52 -0800

Bringing up this thread.

On Mon, Jan 25, 2021, 3:38 PM Viraj Jasani <[email protected]> wrote:


> Thanks, the image is visible now.
>
> > Since I wanted to open this for discussion, did not consider placing it in
> > *hbase/dev-support/design-docs*.
>
> Definitely, we should open a PR only after we reach a concrete conclusion
> with the reviewer. Until then, this thread is open for discussion anyway.
>
>
> On Mon, 25 Jan 2021 at 1:58 PM, Mallikarjun <[email protected]>
> wrote:
>
> > Hope this link works --> https://ibb.co/hYjRpgP
> >
> > Inline reply
> > On Mon, Jan 25, 2021 at 1:16 PM Viraj Jasani <[email protected]> wrote:
> >
> > > Hi,
> > >
> > > Still not available :)
> > > The attachments don’t work on mailing lists. You can try uploading the
> > > attachment on some public hosting site and provide the url to the same
> > > here.
> > >
> > > Since I am not aware of the contents, I cannot confirm right away, but if
> > > the reviewer feels we should have the attachment in our GitHub repo under
> > > hbase/dev-support/design-docs, it would be good to upload the content
> > > there later. For instance, a PDF file can contain the existing and new
> > > design diagrams and discuss pros and cons once we have things finalized.
> > >
> > >
> > Since I wanted to open this for discussion, did not consider placing it in
> > *hbase/dev-support/design-docs*.
> >
> >
> > >
> > > On Mon, 25 Jan 2021 at 12:13 PM, Mallikarjun <[email protected]> wrote:
> > >
> > > > Attached as image. Please let me know if it is available now.
> > > >
> > > > ---
> > > > Mallikarjun
> > > >
> > > >
> > > > On Mon, Jan 25, 2021 at 10:32 AM Sean Busbey <[email protected]> wrote:
> > > >
> > > >> Hi!
> > > >>
> > > >> Thanks for the write-up. Unfortunately, your image for the existing
> > > >> design didn't come through. Could you post it to some host and link it
> > > >> here?
> > > >>
> > > >> On Sun, Jan 24, 2021 at 3:12 AM Mallikarjun <[email protected]> wrote:
> > > >> >
> > > >> > Existing Design:
> > > >> >
> > > >> >
> > > >> >
> > > >> > Problem 1:
> > > >> >
> > > >> > With this design, incremental and full backups can't be run in
> > > >> > parallel, which leads to degraded RPOs when a full backup takes a
> > > >> > long time, especially for large tables.
> > > >> >
> > > >> > Example:
> > > >> > Expectation: Say you have a big 10 TB table, your RPO is 60 minutes,
> > > >> > and you are allowed to ship the remote backup at 800 Mbps. You are
> > > >> > allowed to take a full backup once a week, and the rest should be
> > > >> > incremental backups.
> > > >> >
> > > >> > Shortcoming: With the above design, one can't run parallel backups.
> > > >> > Whenever a full backup is running (which takes roughly 25 hours),
> > > >> > you are not allowed to take incremental backups, and that would be
> > > >> > a breach of your RPO.
> > > >> >
> > > >> > Proposed Solution: Barring some critical sections, such as modifying
> > > >> > the state of the backup in the meta tables, the other steps can run
> > > >> > in parallel. Incremental backups should be able to run based on
> > > >> > older successful full / incremental backups, and the completion time
> > > >> > of a backup should be used instead of its start time for ordering.
> > > >> > I have not worked on the full redesign, and will do so if this
> > > >> > proposal seems acceptable to the community.
> > > >> >
> > > >> > Problem 2:
> > > >> >
> > > >> > With only one backup at a time, this design fails easily for a
> > > >> > multi-tenant system. This poses the following problems:
> > > >> >
> > > >> > Admins will not be able to achieve the required RPOs for their tables
> > > >> > because of the dependence on other tenants present in the system, as
> > > >> > one tenant doesn't have control over other tenants' table sizes and
> > > >> > hence the duration of their backups.
> > > >> >
> > > >> > The management overhead of setting up the right sequence to achieve
> > > >> > the required RPOs for different tenants could be very high.
> > > >> >
> > > >> > Proposed Solution: Same as previous proposal
> > > >> >
> > > >> > Problem 3:
> > > >> >
> > > >> > Incremental backup works on WALs, and
> > > >> > org.apache.hadoop.hbase.backup.master.BackupLogCleaner ensures that
> > > >> > WALs are never cleaned up until the next backup (full / incremental)
> > > >> > is taken. This poses the following problem:
> > > >> >
> > > >> > WALs can grow unbounded in case of transient problems, such as the
> > > >> > backup site facing issues, until the next scheduled backup succeeds.
> > > >> >
> > > >> > Proposed Solution: I can't think of anything better, but I see that
> > > >> > this can be a potential problem. Also, one can force a full backup
> > > >> > if the required WAL files are missing for any other reason, not
> > > >> > necessarily those mentioned above.
> > > >> >
> > > >> > ---
> > > >> > Mallikarjun
> > > >>
> > > >
> > >
> >
>
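
As a sanity check on the full-backup duration in the example quoted above (10 TB shipped at 800 Mbps, RPO of 60 minutes), here is a minimal Python sketch. It assumes decimal units (1 TB = 10^12 bytes), a fully utilized link, and no compression or protocol overhead. The function and variable names are made up for illustration and are not part of HBase. Under these assumptions the transfer takes a bit under 28 hours, in the same ballpark as the roughly 25 hours mentioned in the thread.

```python
def transfer_hours(size_tb: float, link_mbps: float) -> float:
    """Hours needed to ship size_tb terabytes over a link_mbps link.

    Assumes decimal units (1 TB = 1e12 bytes, 1 Mbps = 1e6 bits/s),
    a fully utilized link, and no compression or protocol overhead.
    """
    bits = size_tb * 1e12 * 8
    seconds = bits / (link_mbps * 1e6)
    return seconds / 3600


# Figures taken from the example in the thread.
full_backup_hours = transfer_hours(size_tb=10, link_mbps=800)
rpo_minutes = 60

# Number of whole RPO windows that pass while the full backup holds the
# single backup slot, i.e. incremental backups that could not run.
missed_windows = int(full_backup_hours * 60 // rpo_minutes)

print(f"full backup: {full_backup_hours:.1f} h, "
      f"missed RPO windows: {missed_windows}")
```

With one backup allowed at a time, every one of those windows is an RPO breach, which is the motivation for letting incremental backups run alongside a full backup.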
