On Wed, Apr 13, 2022 at 03:36:19PM +0000, l...@laurent-hasson.com wrote:
> After a lot of back and forth, someone in IT informed us that the database VM 
> is under a backup schedule using Veeam. Apparently, during the backup window, 
> Veeam creates a snapshot and that takes the VM offline for a couple of 
> minutes… And of course, they scheduled this right at the busiest time of the 
> day for this machine which is during our nightly ETL. Their backup doesn’t 
> perform very well either, which explained why the failure seemed to randomly 
> happen at various points during our ETL (which takes about 2h30mn).
> 
> They moved the schedule out and the issue has not happened again over the 
> past 3 weeks. This looks like it was the root cause and would explain (I 
> think) how the database and the client simultaneously reported a connection 
> timeout.
> 
> Thank you so much for all your help in trying to figure this out and 
> exonerate Postgres.

Great, thanks for letting us know.
This time it wasn't postgres' fault; you're 2 for 3 ;)

One issue I've seen is if a VMware snapshot is taken and then saved for a long
time.  It can be okay if Veeam takes a transient snapshot, copies its data, and
then destroys the snapshot.  But it can be bad if multiple snapshots are taken
and then left around for a long time to use as a backup themselves.

-- 
Justin

