On Tue, Nov 29, 2016 at 6:26 AM, Simon Riggs <si...@2ndquadrant.com> wrote: > New (9.4) XactLockTableWait() sets the reason for the wait, so that if > we emit a log message then it will produce a log message to say its > not waiting on a transaction, its waiting on a lock. > > Nice. log_lock_waits works great. > > New (9.6) wait_event code is not correctly set. In ProcSleep() we set > the wait reason according to the actual lock tag, effectively ignoring > the information specifically provided by XactLockTableWait(). > > Ugh. > > pg_stat_activity shows a wait_event of 'transactionid' rather than 'tuple' > pg_locks shows a transactionid lock rather than a wait for tuple lock > > That looks like a bug since we get different answers from > log_lock_wait and wait_event, which is annoying and confusing.
What you find annoying or confusing is up to you, but it's not a bug, and it's not particularly inconsistent, either. log_lock_waits reports a transaction lock, pg_stat_activity reports a transactionid lock, and pg_locks reports a transactionid lock. Now, it is true *in addition* to reporting that the lock in question is a transaction lock, log_lock_waits also reports some context information: 2016-11-29 13:49:59.702 EST [33658] LOG: process 33658 still waiting for ShareLock on transaction 2997486 after 1000.748 ms 2016-11-29 13:49:59.702 EST [33658] DETAIL: Process holding the lock: 33653. Wait queue: 33658. 2016-11-29 13:49:59.702 EST [33658] CONTEXT: while updating tuple (0,67) in relation "pgbench_accounts" 2016-11-29 13:49:59.702 EST [33658] STATEMENT: update pgbench_accounts set filler = 'b' where aid = 1; ...but right there on the first line, it clearly says "waiting for ShareLock on transaction 2997". I'm not totally unsympathetic to the argument that perhaps XactLockTableWait() ought to pass down the context information through LockAcquire() so that it reaches ProcSleep() and we can set the wait event to something like Tuple/Update instead of Lock/Transaction. But I don't think it's a slam-dunk, either. If we did that, it would be consistent with the additional detail provided by log_lock_waits rather than the primary message, and it would be inconsistent with pg_locks. Also, I think that the reason why we originally used an error-context callback here rather than passing down some additional detail into LockAcquire() is that the breaking the abstraction layer seemed messy and unappealing. IIRC, Tom, in particular, was concerned about that last point. But feel free to propose a patch... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers