On Mon, 30 Sep 2019 at 23:38, Robert Haas <robertmh...@gmail.com> wrote:
>
> On Mon, Sep 30, 2019 at 7:35 AM Amit Khandekar <amitdkhan...@gmail.com> wrote:
> > Alright. Attached is the updated patch that splits the file into two
> > files, one that does only xmin related testing, and the other test
> > file that tests conflict recovery scenarios, and also one scenario
> > where drop-database drops the slots on the database on standby.
> > Removed get_slot_xmins() and get_node_from_slotname().
> > Renamed 'replica' to 'standby'.
> > Used node->backup() function instead of pg_basebackup command.
> > Renamed $master_slot to $master_slotname, similarly for $standby_slot.
>
> In general, I think this code is getting a lot clearer and easier to
> understand in these last few revisions.
>
> Why does create_logical_slot_on_standby include sleep(1)? Does the
> test fail if you take that out?

It has not failed for me, but I think sometimes it may happen that the
system command 'pg_recvlogical' is so slow to start that before it tries
to even create the slot, the subsequent checkpoint command concurrently
runs, causing a "running transactions" record to arrive on standby
*before* even pg_recvlogical decides the starting point from which to
receive records. So effectively pg_recvlogical can miss this record.

> If so, it's probably going to fail on
> the buildfarm even with that included, because some of the buildfarm
> machines are really slow (e.g. because they use CLOBBER_CACHE_ALWAYS,
> or because they're running on a shared system with low hardware
> specifications and an ancient disk).

Yeah right, then it makes sense to explicitly wait for the slot to
calculate the restart_lsn, and only then run the checkpoint command.
Did that now.

>
> Similarly for the sleep(1) just after you VACUUM FREEZE all the databases.

I checked that the VACUUM command returns only after updating
pg_database.datfrozenxid. So now I think it's safe to run the
checkpoint command immediately after vacuum. So I have removed the
sleep() now.

Attached is the updated patch series.

>
> I'm not sure what the point of the wait_for_xmins() stuff is in
> 019_standby_logical_decoding_conflicts.pl. Isn't that just duplicating
> stuff we've already tested in 018?

Actually, in 019, the function call is more to wait for
hot_standby_feedback to take effect.

--
Thanks,
-Amit Khandekar
EnterpriseDB Corporation
The Postgres Database Company
logicaldecodng_standby_v4.tar.gz
Description: application/gzip