I was just looking into why the -DCLOBBER_CACHE_ALWAYS buildfarm
critters aren't managing to run the new "timeouts" isolation test
successfully, despite very generous timeouts.  The answer is that
2 seconds isn't quite enough time to parse+plan+execute the query
that isolationtester uses to see if the current test session is
blocked on a lock, if CLOBBER_CACHE_ALWAYS is on.  Now, that query
is totally horrible:

    appendPQExpBufferStr(&wait_query,
                         "SELECT 1 FROM pg_locks holder, pg_locks waiter "
                         "WHERE NOT waiter.granted AND waiter.pid = $1 "
                         "AND holder.granted "
                         "AND holder.pid <> $1 AND holder.pid IN (");
    /* The spec syntax requires at least one session; assume that here. */
    appendPQExpBuffer(&wait_query, "%s", backend_pids[1]);
    for (i = 2; i < nconns; i++)
        appendPQExpBuffer(&wait_query, ", %s", backend_pids[i]);
    appendPQExpBufferStr(&wait_query,
                         ") "

                         "AND holder.mode = ANY (CASE waiter.mode "
                         "WHEN 'AccessShareLock' THEN ARRAY["
                         "'AccessExclusiveLock'] "
                         "WHEN 'RowShareLock' THEN ARRAY["
                         "'ExclusiveLock',"
                         "'AccessExclusiveLock'] "
                         "WHEN 'RowExclusiveLock' THEN ARRAY["
                         "'ShareLock',"
                         "'ShareRowExclusiveLock',"
                         "'ExclusiveLock',"
                         "'AccessExclusiveLock'] "
                         "WHEN 'ShareUpdateExclusiveLock' THEN ARRAY["
                         "'ShareUpdateExclusiveLock',"
                         "'ShareLock',"
                         "'ShareRowExclusiveLock',"
                         "'ExclusiveLock',"
                         "'AccessExclusiveLock'] "
                         "WHEN 'ShareLock' THEN ARRAY["
                         "'RowExclusiveLock',"
                         "'ShareUpdateExclusiveLock',"
                         "'ShareRowExclusiveLock',"
                         "'ExclusiveLock',"
                         "'AccessExclusiveLock'] "
                         "WHEN 'ShareRowExclusiveLock' THEN ARRAY["
                         "'RowExclusiveLock',"
                         "'ShareUpdateExclusiveLock',"
                         "'ShareLock',"
                         "'ShareRowExclusiveLock',"
                         "'ExclusiveLock',"
                         "'AccessExclusiveLock'] "
                         "WHEN 'ExclusiveLock' THEN ARRAY["
                         "'RowShareLock',"
                         "'RowExclusiveLock',"
                         "'ShareUpdateExclusiveLock',"
                         "'ShareLock',"
                         "'ShareRowExclusiveLock',"
                         "'ExclusiveLock',"
                         "'AccessExclusiveLock'] "
                         "WHEN 'AccessExclusiveLock' THEN ARRAY["
                         "'AccessShareLock',"
                         "'RowShareLock',"
                         "'RowExclusiveLock',"
                         "'ShareUpdateExclusiveLock',"
                         "'ShareLock',"
                         "'ShareRowExclusiveLock',"
                         "'ExclusiveLock',"
                         "'AccessExclusiveLock'] END) "

                  "AND holder.locktype IS NOT DISTINCT FROM waiter.locktype "
                  "AND holder.database IS NOT DISTINCT FROM waiter.database "
                  "AND holder.relation IS NOT DISTINCT FROM waiter.relation "
                         "AND holder.page IS NOT DISTINCT FROM waiter.page "
                         "AND holder.tuple IS NOT DISTINCT FROM waiter.tuple "
              "AND holder.virtualxid IS NOT DISTINCT FROM waiter.virtualxid "
        "AND holder.transactionid IS NOT DISTINCT FROM waiter.transactionid "
                    "AND holder.classid IS NOT DISTINCT FROM waiter.classid "
                         "AND holder.objid IS NOT DISTINCT FROM waiter.objid "
                "AND holder.objsubid IS NOT DISTINCT FROM waiter.objsubid ");

This is way more knowledge than we (should) want a client to embed about
which lock types block which others.  What's worse, it's still wrong.
The query will find cases where one of the test sessions *directly*
blocks another one, but not cases where the blockage is indirect.
For example, consider that A holds AccessShareLock, B is waiting for
AccessExclusiveLock on the same object, and C is queued up behind B
for another AccessShareLock.  This query will not think that C is
blocked, not even if B is part of the set of sessions of interest
(because B will show the lock as not granted); but especially so if
B is not part of the set.

I think that such situations may not arise in the specific context that
isolationtester says it's worried about, which is to disregard waits for
locks held by autovacuum.  But in general, you can't reliably tell who's
blocking whom with a query like this.

If isolationtester were the only market for this type of information,
maybe it wouldn't be worth worrying about.  But I'm pretty sure that
there are a *lot* of monitoring applications out there that are trying
to extract who-blocks-whom information from pg_locks.  I hadn't realized
before quite how painful it is to do that, even incorrectly.

I propose that we should add a backend function that simplifies this
type of query.  The API that comes to mind is (name subject to
bikeshedding)

        pg_blocking_pids(pid int) returns int[]

defined to return NULL if the argument isn't the PID of any backend or
that backend isn't waiting for a lock, and otherwise an array of the
PIDs of the backends that are blocking it from getting the lock.
I would compute the array as

        PIDs of backends already holding conflicting locks,
        plus PIDs of backends requesting conflicting locks that are
        ahead of this one in the lock's wait queue,
        plus PIDs of backends that block the latter group of PIDs
        (ie, are holding locks conflicting with their requests,
        or are awaiting such locks and are ahead of them in the queue)

There would be some cases where this definition would be too expansive,
ie we'd release the waiter after only some of the listed sessions had
released their lock or request.  (That could happen for instance if we
concluded we had to move up the waiter's request to escape a deadlock.)
But I think that it's better to err in that direction than to
underestimate the set of relevant PIDs.

In the isolationtester use-case, we'd get the right answer by testing
whether this function's result has any overlap with the set of PIDs of
test sessions, ie

        select pg_blocking_pids($1) && array[pid1, pid2, pid3, ...]

Thoughts?

                        regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to