Hope this link works --> https://ibb.co/hYjRpgP
Inline reply

On Mon, Jan 25, 2021 at 1:16 PM Viraj Jasani <vjas...@apache.org> wrote:

> Hi,
>
> Still not available :)
> The attachments don’t work on mailing lists. You can try uploading the
> attachment to some public hosting site and provide the URL here.
>
> Since I am not aware of the contents, I cannot confirm right away, but if
> the reviewer feels we should have the attachment in our GitHub repo under
> hbase/dev-support/design-docs, it would be good to upload the content there
> later. For instance, a PDF file can contain the existing and new design
> diagrams and discuss pros and cons once we have things finalized.
>

Since I wanted to open this for discussion, I did not consider placing it in *hbase/dev-support/design-docs*.

> On Mon, 25 Jan 2021 at 12:13 PM, Mallikarjun <mallik.v.ar...@gmail.com>
> wrote:
>
> > Attached as image. Please let me know if it is available now.
> >
> > ---
> > Mallikarjun
> >
> >
> > On Mon, Jan 25, 2021 at 10:32 AM Sean Busbey <bus...@apache.org> wrote:
> >
> >> Hi!
> >>
> >> Thanks for the write-up. Unfortunately, your image for the existing
> >> design didn't come through. Could you post it to some host and link it
> >> here?
> >>
> >> On Sun, Jan 24, 2021 at 3:12 AM Mallikarjun <mallik.v.ar...@gmail.com>
> >> wrote:
> >> >
> >> > Existing Design:
> >> >
> >> >
> >> >
> >> > Problem 1:
> >> >
> >> > With this design, incremental and full backups can't run in parallel,
> >> > which degrades RPOs when a full backup takes a long time, especially
> >> > for large tables.
> >> >
> >> > Example:
> >> > Expectation: Say you have a big 10 TB table, your RPO is 60 minutes,
> >> > and you are allowed to ship the remote backup at 800 Mbps. You are
> >> > allowed to take a full backup once a week, and the rest should be
> >> > incremental backups.
> >> >
> >> > Shortcoming: With the above design, one can't run parallel backups.
> >> > Whenever a full backup is running (which takes roughly 25 hours), you
> >> > are not allowed to take incremental backups, and that would breach
> >> > your RPO.
> >> >
> >> > Proposed Solution: Barring some critical sections, such as modifying
> >> > the state of the backup in meta tables, everything else can happen in
> >> > parallel. Incremental backups should be able to run based on older
> >> > successful full / incremental backups, and the completion time of a
> >> > backup should be used instead of its start time for ordering. I have
> >> > not worked on the full redesign, and will do so if this proposal seems
> >> > acceptable to the community.
> >> >
> >> > Problem 2:
> >> >
> >> > With one backup at a time, the design fails easily for a multi-tenant
> >> > system. This poses the following problems:
> >> >
> >> > Admins will not be able to achieve the required RPOs for their tables
> >> > because of the dependence on other tenants in the system: one tenant
> >> > has no control over other tenants' table sizes, and hence over the
> >> > duration of their backups.
> >> >
> >> > The management overhead of setting up the right sequence to achieve
> >> > the required RPOs for different tenants could be very high.
> >> >
> >> > Proposed Solution: Same as the previous proposal.
> >> >
> >> > Problem 3:
> >> >
> >> > Incremental backup works on WALs, and
> >> > org.apache.hadoop.hbase.backup.master.BackupLogCleaner ensures that
> >> > WALs are never cleaned up until the next backup (full / incremental)
> >> > is taken. This poses the following problem:
> >> >
> >> > WALs can grow unbounded if there are transient problems, such as the
> >> > backup site facing issues or anything else, until the next scheduled
> >> > backup succeeds.
> >> >
> >> > Proposed Solution: I can't think of anything better, but I see this
> >> > can be a potential problem. Also, one can force a full backup if the
> >> > required WAL files are missing for whatever other reasons, not
> >> > necessarily those mentioned above.
> >> >
> >> > ---
> >> > Mallikarjun
> >> >
> >
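
As a rough check on the "roughly 25 hours" figure in the quoted Problem 1 example, here is a minimal sketch of the transfer-time arithmetic. It assumes decimal units (1 TB = 10^12 bytes, 1 Mbps = 10^6 bits/s) and that the full 800 Mbps is sustained with no compression or protocol overhead. The function names are made up for illustration and are not part of HBase.

```python
def transfer_hours(size_tb: float, link_mbps: float) -> float:
    """Hours needed to ship size_tb terabytes over a link_mbps link.

    Assumes decimal units and a fully utilized link (illustrative only).
    """
    bits = size_tb * 1e12 * 8            # terabytes -> bits
    seconds = bits / (link_mbps * 1e6)   # bits / (bits per second)
    return seconds / 3600


def blocked_rpo_intervals(backup_hours: float, rpo_minutes: float) -> int:
    """Whole RPO intervals that elapse while one backup blocks all others."""
    return int(backup_hours * 60 // rpo_minutes)


# Numbers from the quoted example: 10 TB table, 800 Mbps link, 60 minute RPO.
full_backup_hours = transfer_hours(10, 800)
print(f"full backup transfer: {full_backup_hours:.1f} h")
print(f"RPO intervals blocked: {blocked_rpo_intervals(full_backup_hours, 60)}")
```

Under these assumptions the transfer alone comes to a bit under 28 hours, the same order of magnitude as the figure in the proposal, so a single full backup would block incremental backups for well over twenty consecutive 60-minute RPO intervals.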