On Sat, Apr 20, 2019 at 12:44 PM Andrey Borodin <x4...@yandex-team.ru> wrote:
> An incremental backup of a 1 TB DB taken a few minutes after the previous
> one (small change set) is a few GB. All of this size is made up of FSM (no
> LSN) and VM (hard to use LSN).
> Sure, this overhead is fine if we make a daily backup. But at some
> frequency of backups it will be too much.

It seems like if the backups are only a few minutes apart, PITR might
be a better choice than super-frequent incremental backups.  What do
you think about that?

> I think the problem of incremental backup of FSM and VM is too distant for now.
> But if I had to implement it right now, I'd choose the following way: do not
> back up FSM and VM, and recreate them during restore. It looks possible, but
> too AM-specific.

Interesting idea - that's worth some more thought.

> BTW, I'm all for extensibility and "hackability". But, personally, I'd
> be happy if pg_basebackup were ubiquitous and sufficient, and tools like
> WAL-G and others became part of history. There is no fundamental reason
> why an external backup tool can be better than a backup tool in core. (Unlike
> many PLs, data types, hooks, tuners etc)

+1

> Here are 53 mentions of "parallel backup". I want to note that there may be
> parallel read from disk and parallel network transmission. Things between
> these two are negligible and can be single-threaded. From my POV, it's not
> about threads, it's about saturated IO controllers.
> Also I think parallel restore matters more than parallel backup. Backups
> themselves can be slow; on many clusters we even throttle disk IO. But users
> may want parallel backup to catch up a standby.

I'm not sure I entirely understand your point here -- are you saying
that parallel backup is important, or that it's not important, or
something in between?  Do you think it's more or less important than
incremental backup?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

