On Wed, Apr 13, 2022 at 03:36:19PM +0000, l...@laurent-hasson.com wrote:
> After a lot of back and forth, someone in IT informed us that the database VM
> is under a backup schedule using Veeam. Apparently, during the backup window,
> Veeam creates a snapshot and that takes the VM offline for a couple of
> minutes… And of course, they scheduled this right at the busiest time of the
> day for this machine which is during our nightly ETL. Their backup doesn’t
> perform very well either, which explained why the failure seemed to randomly
> happen at various points during our ETL (which takes about 2h30mn).
>
> They moved the schedule out and the issue has not happened again over the
> past 3 weeks. This looks like it was the root cause and would explain (I
> think) how the database and the client simultaneously reported a connection
> timeout.
>
> Thank you so much for all your help in trying to figure this out and
> exonerate Postgres.

Great, thanks for letting us know.
This time it wasn't postgres' fault; you're 2 for 3 ;)

One issue I've seen is if a VMware snapshot is taken and then saved for a long
time. It can be okay if Veeam takes a transient snapshot, copies its data, and
then destroys the snapshot. But it can be bad if multiple snapshots are taken
and then left around for a long time to use as a backup themselves.

--
Justin