On Sat, Apr 20, 2019 at 12:44 PM Andrey Borodin <x4...@yandex-team.ru> wrote:
> Incremental backup of 1Tb DB made with distance of few minutes (small change
> set) is few Gbs. All of this size is made of FSM (no LSN) and VM (hard to use
> LSN).
> Sure, this overhead size is fine if we make daily backup. But at some
> frequency of backups it will be too much.

It seems like if the backups are only a few minutes apart, PITR might
be a better choice than super-frequent incremental backups. What do
you think about that?

> I think that problem of incrementing FSM and VM is too distant now.
> But if I had to implement it right now I'd choose following way: do not
> backup FSM and VM, recreate it during restore. Looks like it is possible, but
> too much AM-specific.

Interesting idea - that's worth some more thought.

> BTW, I'm all hands for extensibility and "hackability". But, personally, I'd
> be happy if pg_basebackup would be ubiquitous and sufficient. And tools like
> WAL-G and others became part of a history. There is not fundamental reason
> why external backup tool can be better than backup tool in core. (Unlike many
> PLs, data types, hooks, tuners etc)

+1

> Here's 53 mentions of "parallel backup". I want to note that there may be
> parallel read from disk and parallel network transmission. Things between
> these two are neglectable and can be single-threaded. From my POV, it's not
> about threads, it's about saturated IO controllers.
> Also I think parallel restore matters more than parallel backup. Backups
> themself can be slow, on many clusters we even throttle disk IO. But users
> may want parallel backup to catch-up standby.

I'm not sure I entirely understand your point here -- are you saying
that parallel backup is important, or that it's not important, or
something in between? Do you think it's more or less important than
incremental backup?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company