Andres Freund <[email protected]> writes:
> On April 15, 2022 11:23:40 AM EDT, Tom Lane <[email protected]> wrote:
>> The something is the logical replication launcher. In the failing runs,
>> it is advertising xmin = 724 (the post-initdb NextXID) and continues to
>> do so well past the point where tenk1 gets vacuumed.
> That explains it. Before shmstat, autovac needed to wait for the stats
> collector to write out stats. Now it's near instantaneous. So the issue
> probably existed before, just unlikely to ever be reached.

Um, this is the logical replication launcher, not the autovac launcher.
Your observation that a sleep in get_database_list() reproduces it
confirms that, and I don't entirely see why the timing of the LR launcher
would have changed.

(On thinking about it, I suppose the AV launcher might trigger this
too, but that is not the PID I saw in testing.)

> We can't just ignore database-less xmins for non-shared rels, because
> walsender propagates hot_standby_feedback that way. But we can probably add a
> flag somewhere indicating whether a database-less PGPROC has to be accounted
> in the horizon for non-shared rels.
Yeah, I was also thinking about a flag in PGPROC being a more reliable
way to do this. Is there anything besides walsenders that should set
that flag?
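
To make the idea concrete, here is a rough, self-contained sketch of how
such a flag might enter the horizon computation for non-shared rels. This
is not the real procarray.c code: SketchProc, counts_for_all_dbs and
sketch_data_horizon are made-up names for illustration, and the plain "<"
comparison ignores XID wraparound, which real code cannot do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;
typedef uint32_t Oid;

#define InvalidOid				((Oid) 0)
#define InvalidTransactionId	((TransactionId) 0)

/* Hypothetical, heavily simplified stand-in for PGPROC. */
typedef struct SketchProc
{
	Oid			databaseId;		/* InvalidOid for a database-less process */
	TransactionId xmin;			/* InvalidTransactionId if none advertised */
	bool		counts_for_all_dbs; /* hypothetical flag, e.g. set by a
									 * walsender relaying hot_standby_feedback */
} SketchProc;

/*
 * Compute the removal horizon for non-shared relations of database "mydb".
 *
 * A process holds the horizon back only if it is connected to mydb or has
 * the (hypothetical) flag set.  An unflagged database-less process, such as
 * a launcher, is ignored here.
 */
static TransactionId
sketch_data_horizon(const SketchProc *procs, int nprocs,
					Oid mydb, TransactionId next_xid)
{
	TransactionId horizon = next_xid;

	assert(nprocs >= 0);

	for (int i = 0; i < nprocs; i++)
	{
		const SketchProc *proc = &procs[i];

		if (proc->xmin == InvalidTransactionId)
			continue;			/* advertises no xmin */

		if (proc->databaseId != mydb && !proc->counts_for_all_dbs)
			continue;			/* other database or unflagged db-less proc */

		/* Simplification: real code must use wraparound-aware comparison. */
		if (proc->xmin < horizon)
			horizon = proc->xmin;
	}

	return horizon;
}
```

With something of this shape, a launcher sitting at xmin = 724 with no
database would no longer hold back vacuum of tenk1, while a flagged
walsender still would.
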
regards, tom lane