On Tue, Jan 28, 2020 at 02:26:34PM +0100, Julien Rouhaud wrote:
On Tue, Jan 28, 2020 at 2:09 PM Tomas Vondra
<tomas.von...@2ndquadrant.com> wrote:
I agree a separate "leader_id" column is easier to work with, as it does
not require unnesting and so on.
As for the consistency, I agree we probably can't make this perfect, as
we're fetching and processing the PGPROC records one by one. Fixing that
would require acquiring a much stronger lock on PGPROC, and perhaps some
other locks. That's pre-existing behavior, of course; it's just not very
obvious as we don't have any dependencies between the rows, I think.
Adding the leader_id will change that, of course. But I think it's
still mostly OK, even with the possible inconsistency.
There were already some dependencies between the rows since parallel
queries were added, as you could see, e.g., a parallel worker while no
query is currently active. This patch will make those corner cases
more obvious.
Yeah, sure. I mean explicit dependencies, e.g. a column referencing
values from another row, like leader_id does.
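
To make the explicit-dependency point concrete, here is a minimal toy model of
a scan that copies entries one by one without holding a lock across the whole
pass. It is not PostgreSQL code: Proc, scan_one_by_one and dangling_workers
are hypothetical names invented for illustration, and the "concurrent"
change is simulated with a callback. It only shows how a row can end up
referencing a leader that is missing from the same result set.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Proc:
    pid: int
    leader_pid: Optional[int]  # None for a leader or a non-parallel backend


def scan_one_by_one(
    slots: List[Optional[Proc]],
    mutate_after: Optional[int] = None,
    mutate: Optional[Callable[[List[Optional[Proc]]], None]] = None,
) -> List[Proc]:
    """Copy each slot individually, with no lock held across the whole scan.

    `mutate` is invoked right after slot `mutate_after` has been copied,
    standing in for concurrent activity between two reads.
    """
    rows = []
    for i in range(len(slots)):
        slot = slots[i]
        if slot is not None:
            rows.append(Proc(slot.pid, slot.leader_pid))
        if i == mutate_after and mutate is not None:
            mutate(slots)
    return rows


def dangling_workers(rows: List[Proc]) -> List[Proc]:
    """Rows whose leader_pid does not match any pid in the same result."""
    pids = {r.pid for r in rows}
    return [r for r in rows if r.leader_pid is not None and r.leader_pid not in pids]


def end_query(slots: List[Optional[Proc]]) -> None:
    """Simulated concurrent event: the query ends and both entries vanish."""
    slots[0] = None
    slots[1] = None


# Worker (pid 101) sits in slot 0, its leader (pid 100) in slot 1.
slots = [Proc(101, 100), Proc(100, None)]
rows = scan_one_by_one(slots, mutate_after=0, mutate=end_query)
print(dangling_workers(rows))  # the worker row refers to a leader we never saw
```

Without the simulated change in the middle of the scan, the same function
returns both rows and dangling_workers() comes back empty, which is why the
inconsistency stays invisible until one row refers to another.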
Should I document the possible inconsistencies?
I think it's worth mentioning that as a comment in the code, say before
the pg_stat_get_activity function. IMO we don't need to document all
possible inconsistencies, a generic explanation is enough.
Not sure about the user docs. Does it currently say anything about this
topic - consistency with stat catalogs?
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services