On Sat, Sep 18, 2021 at 05:19:04PM -0300, Alvaro Herrera wrote:
> Hmm, sounds a possibly useful idea to explore, but I would only do so if
> the other ideas prove fruitless, because it sounds like it'd have more
> moving parts.  Can you please first test if the idea of sending the signal
> twice is enough?

This idea does not work.  I got one failure after 5 tries.

> If that doesn't work, let's try Horiguchi-san's idea
> of using some `ps` flags to find the process.

Tried this one as well, only to see the same failure.  I was just
looking at the state of the test while it was querying
pg_replication_slots, and it was the expected state after the WAL
sender received SIGCONT:
USER   PID  %CPU %MEM      VSZ    RSS   TT  STAT STARTED      TIME COMMAND
toto 12663   0.0  0.0  5014468   3384   ??  Ss    8:30PM   0:00.00 postgres: primary3: walsender toto [local] streaming 0/720000
toto 12662   0.0  0.0  4753092   3936   ??  Ts    8:30PM   0:00.01 postgres: standby_3: walreceiver streaming 0/7000D8

The test gets the right PIDs, as the logs showed:
ok 17 - have walsender pid 12663
ok 18 - have walreceiver pid 12662

So it does not seem that this is an issue with the signals.  Perhaps
we'd better wait for a checkpoint to complete, for example by scanning
the logs, before running the query on pg_replication_slots to make sure
that the slot is invalidated?
--
Michael
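
To illustrate the log-scanning idea, here is a minimal sketch in Python
(not the Perl used by the TAP tests, and not a proposed patch).  It
assumes that log_checkpoints is enabled so that the server writes a
"checkpoint complete" line; the helper name wait_for_log and its
parameters are hypothetical.

```python
import re
import time


def wait_for_log(log_path, pattern, offset=0, timeout=30.0, interval=0.1):
    """Poll a log file until `pattern` shows up after byte `offset`.

    Hypothetical helper: returns True once the pattern is found, or
    False if `timeout` seconds pass without a match.
    """
    # Work on bytes so that seeking to an arbitrary byte offset is safe.
    regex = re.compile(pattern.encode())
    deadline = time.monotonic() + timeout
    while True:
        try:
            with open(log_path, "rb") as f:
                f.seek(offset)
                if regex.search(f.read()):
                    return True
        except FileNotFoundError:
            pass  # the log file may not have been created yet
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A test would record the size of the server log before triggering the
checkpoint, pass that size as offset, and query pg_replication_slots
only once the call returns True.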
