Attached as image. Please let me know if it is available now.

---
Mallikarjun


On Mon, Jan 25, 2021 at 10:32 AM Sean Busbey <bus...@apache.org> wrote:

> Hi!
>
> Thanks for the write up. Unfortunately, your image for the existing
> design didn't come through. Could you post it to some host and link it
> here?
>
> On Sun, Jan 24, 2021 at 3:12 AM Mallikarjun <mallik.v.ar...@gmail.com>
> wrote:
> >
> > Existing Design:
> >
> >
> >
> > Problem 1:
> >
> > With this design, incremental and full backups can't run in parallel,
> > which degrades RPOs when a full backup takes a long time, especially
> > for large tables.
> >
> > Example:
> > Expectation: Say you have a big table of 10 TB, your RPO is 60
> > minutes, and you are allowed to ship the remote backup at 800 Mbps.
> > You are allowed to take a full backup once a week, and the rest should
> > be incremental backups.
> >
> > Shortcoming: With the above design, backups can't run in parallel.
> > Whenever a full backup is running (which takes roughly 25 hours), you
> > are not allowed to take incremental backups, and that breaches your
> > RPO.
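
To sanity-check the duration in the example, here is a minimal sketch of the arithmetic. It assumes decimal units (1 TB = 10^12 bytes, 1 Mbps = 10^6 bits/s) and ignores compression and protocol overhead; the function name is only illustrative. Under those assumptions the full backup comes out a little above the rough 25-hour figure, so the point stands: a full day's worth of 60-minute RPO windows is blocked.

```python
def transfer_hours(size_tb: float, link_mbps: float) -> float:
    """Hours needed to ship size_tb terabytes over a link_mbps link.

    Assumes decimal units and no compression or protocol overhead.
    """
    bits = size_tb * 1e12 * 8
    seconds = bits / (link_mbps * 1e6)
    return seconds / 3600


hours = transfer_hours(10, 800)          # 10 TB table, 800 Mbps link
rpo_minutes = 60
# Number of whole RPO windows that pass while the full backup is running.
blocked_windows = int(hours * 60 // rpo_minutes)
print(f"full backup: {hours:.1f} h, RPO windows blocked: {blocked_windows}")
```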
> >
> > Proposed Solution: Barring some critical sections, such as modifying
> > the state of the backup on meta tables, everything else can happen in
> > parallel. Incremental backups should be able to run based on older
> > successful full / incremental backups, and the completion time of a
> > backup should be used instead of its start time for ordering. I have
> > not worked out the full redesign, and will do so if this proposal
> > seems acceptable to the community.
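
To illustrate the ordering idea only (this is not the redesign, which has not been worked out), here is a rough sketch of picking the base for a new incremental backup by completion time. `BackupRecord` and `pick_base` are hypothetical names, not HBase classes, and the sketch assumes a backup that is still running or has failed simply has no completion time.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class BackupRecord:
    """Hypothetical backup history entry (not the HBase data model)."""
    backup_id: str
    kind: str                   # "FULL" or "INCREMENTAL"
    start_ts: int
    complete_ts: Optional[int]  # None while running or if it failed


def pick_base(history: List[BackupRecord]) -> Optional[BackupRecord]:
    """Return the most recently *completed* backup, ignoring running ones."""
    done = [b for b in history if b.complete_ts is not None]
    return max(done, key=lambda b: b.complete_ts, default=None)


history = [
    BackupRecord("full-1", "FULL", start_ts=0, complete_ts=100),
    BackupRecord("full-2", "FULL", start_ts=150, complete_ts=None),  # still running
    BackupRecord("incr-1", "INCREMENTAL", start_ts=160, complete_ts=200),
]
base = pick_base(history)
print(base.backup_id)  # the long-running full backup does not block the choice
```

With start-time ordering the running full backup would be the newest entry; with completion-time ordering the next incremental can build on the last backup that actually finished.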
> >
> > Problem 2:
> >
> > Allowing only one backup at a time breaks down easily in a
> > multi-tenant system. This poses the following problems:
> >
> > - Admins will not be able to achieve the required RPOs for their
> >   tables because they depend on the other tenants in the system: one
> >   tenant has no control over other tenants' table sizes, and hence
> >   over the duration of their backups.
> > - The management overhead of setting up the right sequence to achieve
> >   the required RPOs for different tenants could be very high.
> >
> > Proposed Solution: Same as the previous proposal.
> >
> > Problem 3:
> >
> > Incremental backup works on WALs, and
> > org.apache.hadoop.hbase.backup.master.BackupLogCleaner ensures that
> > WALs are never cleaned up until the next backup (full / incremental)
> > is taken. This poses the following problem:
> >
> > WALs can grow unbounded when there are transient problems, such as
> > the backup site facing issues, until the next scheduled backup
> > succeeds.
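
A toy model of the retention rule as described above, to show why the backlog grows. It is a simplification under my own assumptions, not BackupLogCleaner's actual logic: each WAL is represented only by a timestamp, and a WAL is deletable only once a successful backup has covered it. `deletable_wals` is a hypothetical helper.

```python
from typing import List, Optional


def deletable_wals(wal_timestamps: List[int],
                   last_successful_backup_ts: Optional[int]) -> List[int]:
    """WALs old enough to be covered by the last successful backup.

    Simplified model: with no successful backup, nothing can be deleted.
    """
    if last_successful_backup_ts is None:
        return []
    return [ts for ts in wal_timestamps if ts <= last_successful_backup_ts]


# The last successful backup covered up to ts=100; later backups keep failing.
wals = [100, 200, 300, 400]
retained = [ts for ts in wals if ts not in deletable_wals(wals, 100)]
print(retained)  # every WAL written after the last success must be kept
```

As long as the backup target stays unavailable, `retained` only grows, since no newer successful backup advances the cutoff.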
> >
> > Proposed Solution: I can't think of anything better, but I see this
> > as a potential problem. Also, one can force a full backup if required
> > WAL files are missing for reasons other than those mentioned above.
> >
> > ---
> > Mallikarjun
>
