Thanks, the image is visible now.

> Since I wanted to open this for discussion, I did not consider placing it in
> *hbase/dev-support/design-docs*.

Definitely. We should open a PR only after we reach a concrete conclusion with
the reviewer. Until then, this thread is open for discussion anyway.


On Mon, 25 Jan 2021 at 1:58 PM, Mallikarjun <mallik.v.ar...@gmail.com>
wrote:

> Hope this link works --> https://ibb.co/hYjRpgP
>
> Inline reply
> On Mon, Jan 25, 2021 at 1:16 PM Viraj Jasani <vjas...@apache.org> wrote:
>
> > Hi,
> >
> > Still not available :)
> > Attachments don’t work on mailing lists. You can try uploading the
> > attachment to a public hosting site and sharing the URL here.
> >
> > Since I am not aware of the contents, I cannot confirm right away, but if
> > the reviewer feels we should have the attachment in our GitHub repo under
> > hbase/dev-support/design-docs, it would be good to upload the content there
> > later. For instance, a PDF file can contain the existing and new design
> > diagrams and discuss pros and cons once we have things finalized.
> >
> >
> Since I wanted to open this for discussion, I did not consider placing it in
> *hbase/dev-support/design-docs*.
>
>
> >
> > On Mon, 25 Jan 2021 at 12:13 PM, Mallikarjun <mallik.v.ar...@gmail.com>
> > wrote:
> >
> > > Attached as an image. Please let me know if it is available now.
> > >
> > > ---
> > > Mallikarjun
> > >
> > >
> > > On Mon, Jan 25, 2021 at 10:32 AM Sean Busbey <bus...@apache.org> wrote:
> > >
> > >> Hi!
> > >>
> > >> Thanks for the write-up. Unfortunately, your image for the existing
> > >> design didn't come through. Could you post it to some host and link it
> > >> here?
> > >>
> > >> On Sun, Jan 24, 2021 at 3:12 AM Mallikarjun <mallik.v.ar...@gmail.com>
> > >> wrote:
> > >> >
> > >> > Existing Design:
> > >> >
> > >> >
> > >> >
> > >> > Problem 1:
> > >> >
> > > >> > With this design, incremental and full backups can't run in parallel,
> > > >> > which degrades RPOs when a full backup takes a long time, especially
> > > >> > for large tables.
> > >> >
> > >> > Example:
> > > >> > Expectation: Say you have a big 10 TB table, your RPO is 60 minutes,
> > > >> > and you are allowed to ship the remote backup at 800 Mbps. You are
> > > >> > allowed to take a full backup once a week, and the rest should be
> > > >> > incremental backups.
> > >> >
> > > >> > Shortcoming: With the above design, backups can't run in parallel.
> > > >> > Whenever a full backup is running (which takes roughly 25 hours), you
> > > >> > are not allowed to take incremental backups, and that breaches your
> > > >> > RPO.
> > >> >
> > > >> > Proposed Solution: Barring some critical sections, such as modifying
> > > >> > the backup state in meta tables, everything else can happen in
> > > >> > parallel. Incremental backups should be able to run based on older
> > > >> > successful full / incremental backups, and the completion time of a
> > > >> > backup should be used instead of its start time for ordering. I have
> > > >> > not worked on the full redesign, and will do so if this proposal seems
> > > >> > acceptable to the community.
> > >> >
> > >> > Problem 2:
> > >> >
> > > >> > With one backup at a time, the design breaks down easily in a
> > > >> > multi-tenant system. This poses the following problems:
> > >> >
> > > >> > Admins will not be able to achieve the required RPOs for their tables
> > > >> > because they depend on the other tenants in the system: one tenant
> > > >> > has no control over other tenants' table sizes, and hence over the
> > > >> > duration of their backups.
> > > >> >
> > > >> > The management overhead of setting up the right sequence to achieve
> > > >> > the required RPOs for different tenants could be very high.
> > >> >
> > >> > Proposed Solution: Same as previous proposal
> > >> >
> > >> > Problem 3:
> > >> >
> > > >> > Incremental backup works on WALs, and
> > > >> > org.apache.hadoop.hbase.backup.master.BackupLogCleaner ensures that
> > > >> > WALs are never cleaned up until the next backup (full / incremental)
> > > >> > is taken. This poses the following problem:
> > >> >
> > > >> > WALs can grow unbounded when there are transient problems, such as
> > > >> > the backup site facing issues, until the next scheduled backup
> > > >> > succeeds.
> > >> >
> > > >> > Proposed Solution: I can't think of anything better, but I see that
> > > >> > this can be a potential problem. Also, one can force a full backup if
> > > >> > the required WAL files are missing for any other reason not
> > > >> > necessarily mentioned above.
> > >> >
> > >> > ---
> > >> > Mallikarjun
> > >>
> > >
> >
>
