Re: Intermittent buildfarm failures on wrasse

Tom Lane Fri, 15 Apr 2022 09:37:16 -0700

Andres Freund <and...@anarazel.de> writes:
> On April 15, 2022 11:23:40 AM EDT, Tom Lane <t...@sss.pgh.pa.us> wrote:
>> The something is the logical replication launcher.  In the failing runs,
>> it is advertising xmin = 724 (the post-initdb NextXID) and continues to
>> do so well past the point where tenk1 gets vacuumed.

> That explains it. Before shmstat, autovac needed to wait for the stats
> collector to write out stats. Now it's near-instantaneous. So the issue
> probably existed before, but was unlikely ever to be reached.

Um, this is the logical replication launcher, not the autovac launcher.
Your observation that a sleep in get_database_list() reproduces it
confirms that, and I don't entirely see why the timing of the LR launcher
would have changed.

(On thinking about it, I suppose the AV launcher might trigger this
too, but that is not the PID I saw in testing.)

> We can't just ignore database-less xmins for non-shared rels, because
> walsender propagates hot_standby_feedback that way. But we can probably add a
> flag somewhere indicating whether a database-less PGPROC has to be accounted
> for in the horizon for non-shared rels.

Yeah, I was also thinking about a flag in PGPROC being a more reliable
way to do this.  Is there anything besides walsenders that should set
that flag?

                        regards, tom lane
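
For readers following the thread, the flag idea can be sketched as below.
This is a toy model only, not PostgreSQL code: DemoProc, the field
xminAffectsAllDbs, and demo_data_horizon() are hypothetical names invented
for illustration, and XID wraparound is ignored by using plain integer
comparison. It assumes that a database-less process (databaseId = 0) counts
toward the horizon for non-shared relations only when it has set the flag,
as a walsender relaying hot_standby_feedback would.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t TransactionId;
typedef uint32_t Oid;

#define InvalidOid           ((Oid) 0)
#define InvalidTransactionId ((TransactionId) 0)

/* Simplified stand-in for a PGPROC entry; all names here are hypothetical. */
typedef struct DemoProc
{
    Oid           databaseId;        /* InvalidOid if not attached to a database */
    TransactionId xmin;              /* InvalidTransactionId if none advertised */
    bool          xminAffectsAllDbs; /* hypothetical flag, e.g. set by a walsender */
} DemoProc;

/*
 * Oldest xmin that must be respected when vacuuming a non-shared relation
 * in database "mydb".  Starts from next_xid and moves backwards for every
 * process whose xmin is relevant.  Wraparound is deliberately ignored.
 */
static TransactionId
demo_data_horizon(const DemoProc *procs, size_t nprocs,
                  Oid mydb, TransactionId next_xid)
{
    TransactionId horizon = next_xid;

    assert(procs != NULL || nprocs == 0);

    for (size_t i = 0; i < nprocs; i++)
    {
        const DemoProc *p = &procs[i];

        if (p->xmin == InvalidTransactionId)
            continue;           /* advertises nothing */

        /* Same database, or database-less but explicitly flagged. */
        if (p->databaseId == mydb ||
            (p->databaseId == InvalidOid && p->xminAffectsAllDbs))
        {
            if (p->xmin < horizon)
                horizon = p->xmin;
        }
    }
    return horizon;
}
```

In this model a database-less launcher advertising xmin = 724 without the
flag no longer holds back the horizon for tenk1's database, while a flagged
walsender still does.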
