Re: replication lag despite corrective config

2018-11-21 Thread Rene Romero Benavides
How long have the delays been since the new settings? I guess significantly shorter than before, right? How much have they decreased?
On Wed, Nov 21, 2018 at 13:18, Rene Romero Benavides <rene.romer...@gmail.com> wrote:
> You're welcome.
> Since last Saturday when you addressed the 10 hour

Re: replication lag despite corrective config

2018-11-21 Thread Rene Romero Benavides
You're welcome. Since last Saturday, when you addressed the 10-hour delay with the new settings, have you seen more such delay incidents? What were the previous settings? Beware that hot_standby_feedback = on, combined with such long queries on the replica, can increase bloat on the master; are you

Re: replication lag despite corrective config

2018-11-20 Thread Wyatt Alt
Hi Rene,
On 11/19/18 8:46 PM, Rene Romero Benavides wrote:
> Not sure about the root cause, but I can make these observations and raise some questions:
> 1) 9.6.6 is five bug-fix versions behind
Valid point to raise.
> 2) 300GB is a very big table; wouldn't it make sense to partition it?
> 2a) or

Re: replication lag despite corrective config

2018-11-20 Thread Wyatt Alt
On 11/19/18 11:09 PM, Laurenz Albe wrote:
> With these settings, any conflicting query will be canceled after five minutes. Perhaps your actual settings are different. What do you get for
> SELECT * FROM pg_settings WHERE name = 'max_standby_streaming_delay';
Hi Laurenz, thanks for backing up
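A rough sketch of how the output of that pg_settings query might be interpreted, assuming the row is fetched with some client and passed in as strings. The helper name standby_delay_seconds and the constant SETTINGS_QUERY are hypothetical and not from the thread; the query is narrowed to three columns and widened to two related settings for illustration. pg_settings reports time-based settings as an integer "setting" plus a base "unit", and -1 for the standby delay settings means the standby waits indefinitely instead of canceling conflicting queries.

```python
from typing import Optional

# Variant of the query quoted above (assumption: also checking two related
# settings); run it on the standby with whatever client is at hand.
SETTINGS_QUERY = (
    "SELECT name, setting, unit FROM pg_settings "
    "WHERE name IN ('max_standby_streaming_delay', "
    "'max_standby_archive_delay', 'hot_standby_feedback')"
)

# Milliseconds per base unit that pg_settings reports for time settings.
_UNIT_MILLISECONDS = {"ms": 1, "s": 1000, "min": 60000}


def standby_delay_seconds(setting: str, unit: str) -> Optional[float]:
    """Convert a pg_settings (setting, unit) pair to seconds.

    Returns None for -1, which for the max_standby_*_delay settings
    means "wait forever" rather than canceling conflicting queries.
    """
    value = int(setting)
    if value == -1:
        return None
    return value * _UNIT_MILLISECONDS[unit] / 1000


if __name__ == "__main__":
    # A five-minute limit shows up as setting='300000', unit='ms'.
    print(standby_delay_seconds("300000", "ms"))
```

If the value reported on the standby differs from what was configured, that would support the suspicion that the actual settings are not the intended ones.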

Re: replication lag despite corrective config

2018-11-19 Thread Laurenz Albe
On Mon, 2018-11-19 at 17:46 -0800, Wyatt Alt wrote:
> I've been struggling to eliminate replication lag on a Postgres 9.6.6
> instance on Amazon RDS. I believe the lag is caused by early cleanup
> conflicts from vacuums on the master, because I can reliably resolve
> it by killing long-running

Re: replication lag despite corrective config

2018-11-19 Thread Rene Romero Benavides
Not sure about the root cause, but I can make these observations and raise some questions:
1) 9.6.6 is five bug-fix versions behind
2) 300GB is a very big table; wouldn't it make sense to partition it?
2a) or, if it's partitioned, doesn't the time of creation or dropping of new partitions match

Re: replication lag despite corrective config

2018-11-19 Thread Wyatt Alt
Sorry, I see now there was a similar question a few days ago:
https://www.postgresql.org/message-id/CAJw4d1WtzOdYzd8Nq2=ufk+z0jy0l_pfg9tvcwprmt3nczq...@mail.gmail.com
Two ideas proposed (aside from disconnects):
* Autovacuum is truncating a page on the master and taking an AccessExclusiveLock on

replication lag despite corrective config

2018-11-19 Thread Wyatt Alt
I've been struggling to eliminate replication lag on a Postgres 9.6.6 instance on Amazon RDS. I believe the lag is caused by early cleanup conflicts from vacuums on the master, because I can reliably resolve it by killing long-running queries on the standby. I most recently saw ten hours of lag on