Do you actually have 100G networking between the nodes? If not, a single 
CPU should be able to saturate a 10G link, so parallelizing the dump won't 
buy you much.

Likewise, the receiving end would need disks capable of keeping up. Which 
raises the question: why not write the dump directly to the destination 
rather than writing it locally and then copying it over?
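
For example, something along these lines skips the intermediate file 
entirely (host and database names are placeholders):

    # Run on the destination server: dump straight off the old server
    # and restore into the new one, no local dump file in between.
    pg_dump -h old-server -d mydb | psql -d mydb

The tradeoff is that a pipe rules out the directory format, so you give 
up pg_dump/pg_restore parallelism (-j).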

Do you require a dump/reload because of suspected corruption? That's a 
tough one. But if not, if the goal is just to get up and running on a new 
server, why not pg_basebackup, streaming replica, promote? That depends on 
data modification activity being low enough that the replica can stream 
and replay WAL faster than new WAL is generated. But given that your 
server is currently keeping up with writing that much WAL and flushing 
that many changes, it seems likely the replica would keep up as long as 
the network connection is fast enough. In that scenario, you don't need to 
care how long pg_basebackup takes.
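
A rough sketch of that sequence, assuming a replication user already 
exists on the source, with placeholder host names and paths:

    # On the new server. -X stream pulls WAL alongside the base backup;
    # -R writes standby.signal and primary_conninfo so the server comes
    # up as a streaming replica.
    pg_basebackup -h old-server -U replicator \
        -D /var/lib/postgresql/data -X stream -R -P
    pg_ctl -D /var/lib/postgresql/data start

    # Once replay has caught up, cut over:
    pg_ctl -D /var/lib/postgresql/data promote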

If you do need a dump/reload because of suspected corruption, the only thing I 
can think of is something like doing it a table at a time--partitioning would 
help here, if practical.
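
If it comes to that, a hypothetical sketch of the per-table loop (host, 
database, and file names are all placeholders); note that -t only moves 
the tables themselves, so the rest of the schema would need a separate 
schema-only dump:

    # Catch failures on either side of the pipe.
    set -o pipefail

    # List every non-catalog table. (Assumes identifiers without
    # whitespace, since the shell splits the list on it.)
    for t in $(psql -h old-server -d mydb -At -c \
        "select format('%I.%I', schemaname, tablename) from pg_tables
         where schemaname not in ('pg_catalog', 'information_schema')")
    do
      # Dump and restore one table at a time; a corrupt table only
      # fails its own step and gets logged instead of sinking the run.
      pg_dump -h old-server -d mydb -t "$t" | psql -d mydb \
        || echo "FAILED: $t" >> failed_tables.txt
    done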
