On Mon, Feb 15, 2016 at 12:03 AM, Tom Lane <t...@sss.pgh.pa.us> wrote: > > Robert Haas <robertmh...@gmail.com> writes: > > > With respect to pg_locks - and for that matter also pg_stat_activity - > > I think you are right that improvement is needed. > > This is really the core of my concern at the moment. I think that > isolationtester is probably outright broken for any situation where the > queries-under-test are being parallel executed, and so will be any other > client that's trying to identify who blocks whom from pg_locks. > > > The simplest thing we could do to make that easier is, in > > pg_stat_activity, have parallel workers advertise the PID that > > launched them in a new field; and in pg_locks, have members of a lock > > group advertise the leader's PID in a new field. > >
The lock information for parallel query which uses one parallel worker and another unrelated backend trying to acquire Access Exclusive lock on the table which parallel query is using will be displayed as below by using lock monitoring query [1] blocked_pid | blocking_pid | blocked_statement | current_ statement_in_blocking_process -------------+--------------+-----------------------------------------+--------- ------------------------------- 5052 | 3464 | Lock table t1 in Access Exclusive Mode; | 5052 | 4128 | Lock table t1 in Access Exclusive Mode; | select c ount(*) from t1 where c1 < 10; (2 rows) Here, backend 5052 is waiting for acquiring Access Exclusive lock on t1 which is held in Access Share mode by master backend 4128 and parallel worker 3464. Now, I think it is tricky for user to find what exactly is going on by using current lock monitoring queries. Do you think by adding additional leader pid column, we can eliminate duplicity in above case and all such cases without having additional join or making above query more complex? Can we think of not having lock information in pg_locks for parallel worker for certain cases where parallel worker acquires lock in same mode on same resource as leader backend? Currently, pg_stat_activity displays NULL for both query and state for parallel workers. I think we should call pgstat_report_activity() in ParallelWorkerMain() or ParallelQueryMain() to display the information. One thing which needs some thought is what query should we display for parallel worker, currently while forming query descriptor in parallel worker we use "<parallel query>", so one idea is to just display the same or display the query the used by master backend (if currently not available, then we might need to pass it from master backend). > > That would be simple for us, but it would break every existing client-side > query that tries to identify blockers/blockees; and not only would those > queries need work but they would become very substantially more complex > and slower (probably at least 4-way joins not 2-way). We already know > that isolationtester's query has performance problems in the buildfarm. > I think more thought is needed here, Agreed. I think before deciding what exactly to add or change in pg_locks or lock monitoring queries, it might be helpful if we define how we want to convey the information to user/dba when group of parallel processes are involved as blockers/blockees. Some of the cases are: a) a standalone backend waits for group of parallel processes holding lock in same mode on same resource. b) group of parallel workers are waiting for lock held by standalone backend or some other parallel group c) a standalone backend waits for group of parallel processes holding lock in different mode on same resource. and other similar cases. Before jumping into discussion about solution, I would like to know do you and or Robert also have above kind of cases in mind where you want a better way to display information for user or something else? [1] - SELECT blocked_locks.pid AS blocked_pid, blocking_locks.pid AS blocking_pid, blocked_activity.query AS blocked_statement, blocking_activity.query AS current_statement_in_blocking_process FROM pg_catalog.pg_locks blocked_locks JOIN pg_catalog.pg_stat_activity blocked_activity ON blocked_activity.pid = blocked_locks.pid JOIN pg_catalog.pg_locks blocking_locks ON blocking_locks.locktype = blocked_locks.locktype AND blocking_locks.DATABASE IS NOT DISTINCT FROM blocked_locks.DATABASE AND blocking_locks.relation IS NOT DISTINCT FROM blocked_locks.relation AND blocking_locks.page IS NOT DISTINCT FROM blocked_locks.page AND blocking_locks.tuple IS NOT DISTINCT FROM blocked_locks.tuple AND blocking_locks.virtualxid IS NOT DISTINCT FROM blocked_locks.virtualxid AND blocking_locks.transactionid IS NOT DISTINCT FROM blocked_locks.transactionid AND blocking_locks.classid IS NOT DISTINCT FROM blocked_locks.classid AND blocking_locks.objid IS NOT DISTINCT FROM blocked_locks.objid AND blocking_locks.objsubid IS NOT DISTINCT FROM blocked_locks.objsubid AND blocking_locks.pid != blocked_locks.pid JOIN pg_catalog.pg_stat_activity blocking_activity ON blocking_activity.pid = blocking_locks.pid WHERE NOT blocked_locks.GRANTED; With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com