Hi,

2012/10/13 23:05, Satoshi Nagayasu wrote:
> Hi all,
>
> I have fixed my previous patch for the pg_stat_lwlocks view, and,
> as Josh commented, it now supports local and global (shared)
> statistics in the same system view.
Sorry, I found my mistakes. A new, fixed one is attached to this mail.

Regards,

> Local statistics means the counters are only effective within the
> same session, while shared ones means the counters are shared across
> the entire cluster.
>
> Also, the global statistics are collected via the pgstat collector
> process, as other statistics are.
>
> The global statistics struct has now been split into two parts for
> the two different uses: bgwriter stats and lwlock stats.
>
> Therefore, calling pg_stat_reset_shared('bgwriter') or
> pg_stat_reset_shared('lwlocks') resets only the dedicated struct,
> not the entire PgStat_GlobalStats.
>
> Comments and review are always welcome.
>
> Regards,
>
> ------------------------------------------------------------------------------
> postgres=# SELECT * FROM pg_stat_lwlocks;
>  lwlockid | local_calls | local_waits | local_time_ms | shared_calls | shared_waits | shared_time_ms
> ----------+-------------+-------------+---------------+--------------+--------------+----------------
>         0 |           0 |           0 |             0 |         4268 |            0 |              0
>         1 |          43 |           0 |             0 |          387 |            0 |              0
>         2 |           0 |           0 |             0 |           19 |            0 |              0
>         3 |           0 |           0 |             0 |           28 |            0 |              0
>         4 |           3 |           0 |             0 |          315 |            0 |              0
>         5 |           0 |           0 |             0 |           24 |            0 |              0
>         6 |           1 |           0 |             0 |           76 |            0 |              0
>         7 |           0 |           0 |             0 |        16919 |            0 |              0
>         8 |           0 |           0 |             0 |            0 |            0 |              0
>         9 |           0 |           0 |             0 |            0 |            0 |              0
>        10 |           0 |           0 |             0 |            0 |            0 |              0
>        11 |           0 |           0 |             0 |           75 |            0 |              0
>        12 |           0 |           0 |             0 |            0 |            0 |              0
>        13 |           0 |           0 |             0 |            0 |            0 |              0
>        14 |           0 |           0 |             0 |            0 |            0 |              0
>        15 |           0 |           0 |             0 |            0 |            0 |              0
>        16 |           0 |           0 |             0 |            0 |            0 |              0
>        17 |           0 |           0 |             0 |        61451 |            6 |              0
>        18 |           0 |           0 |             0 |            0 |            0 |              0
>        19 |           0 |           0 |             0 |            0 |            0 |              0
>        20 |           0 |           0 |             0 |            0 |            0 |              0
>        21 |           1 |           0 |             0 |            9 |            0 |              0
>        22 |           0 |           0 |             0 |            0 |            0 |              0
>        23 |           0 |           0 |             0 |            0 |            0 |              0
>        24 |           0 |           0 |             0 |            1 |            0 |              0
>        25 |           0 |           0 |             0 |            0 |            0 |              0
>        26 |           2 |           0 |             0 |           18 |            0 |              0
>        27 |           0 |           0 |             0 |            0 |            0 |              0
>        28 |           0 |           0 |             0 |            0 |            0 |              0
>        29 |           0 |           0 |             0 |            0 |            0 |              0
>        30 |           0 |           0 |             0 |            0 |            0 |              0
>        31 |           0 |           0 |             0 |            0 |            0 |              0
>        32 |           0 |           0 |             0 |            0 |            0 |              0
>        33 |           4 |           0 |             0 |       207953 |            0 |              0
>        50 |           8 |           0 |             0 |        33388 |            0 |              0
>        67 |           0 |           0 |             0 |            0 |            0 |              0
> (36 rows)
>
> postgres=#
> ------------------------------------------------------------------------------
>
>
> 2012/06/26 21:11, Satoshi Nagayasu wrote:
>> Hi all,
>>
>> I've modified the pg_stat_lwlocks patch to work with the latest
>> PostgreSQL Git code.
>>
>> This patch provides:
>>   pg_stat_lwlocks          New system view to show lwlock statistics.
>>   pg_stat_get_lwlocks()    New function to retrieve lwlock statistics.
>>   pg_stat_reset_lwlocks()  New function to reset lwlock statistics.
>>
>> Please try it out.
>>
>> Regards,
>>
>> 2012/06/26 5:29, Satoshi Nagayasu wrote:
>>> Hi all,
>>>
>>> I've been working on a new system view, pg_stat_lwlocks, to observe
>>> LWLock usage, and have just completed my proof-of-concept code that
>>> works with version 9.1.
>>>
>>> Now, I'd like to know whether this feature is a possibility for a
>>> future release.
>>>
>>> With this patch, a DBA can easily determine a bottleneck around lwlocks.
>>> --------------------------------------------------
>>> postgres=# SELECT * FROM pg_stat_lwlocks ORDER BY time_ms DESC LIMIT 10;
>>>  lwlockid | calls  | waits | time_ms
>>> ----------+--------+-------+---------
>>>        49 | 193326 | 32096 |   23688
>>>         8 |   3305 |   133 |    1335
>>>         2 |     21 |     0 |       0
>>>         4 | 135188 |     0 |       0
>>>         5 |  57935 |     0 |       0
>>>         6 |    141 |     0 |       0
>>>         7 |  24580 |     1 |       0
>>>         3 |   3282 |     0 |       0
>>>         1 |     41 |     0 |       0
>>>         9 |      3 |     0 |       0
>>> (10 rows)
>>>
>>> postgres=#
>>> --------------------------------------------------
>>>
>>> In this view,
>>> 'lwlockid' represents the LWLockId used in the backends,
>>> 'calls' represents how many times LWLockAcquire() was called,
>>> 'waits' represents how many times LWLockAcquire() needed to wait
>>> inside it before acquiring the lock, and
>>> 'time_ms' represents how long, in total, LWLockAcquire() waited on
>>> the lwlock.
>>>
>>> Lwlocks that use an LWLockId range, such as BufMappingLock or
>>> LockMgrLock, are grouped and summed up into a single record.
>>> For example, lwlockid 49 in the above view represents LockMgrLock
>>> statistics.
>>>
>>> Now, I know there are some considerations.
>>>
>>> (1) Performance
>>>
>>> I've measured LWLock performance both with and without the patch,
>>> and confirmed that this patch does not affect LWLock performance
>>> at all.
>>>
>>> pgbench scores with the patch:
>>>   tps = 900.906658 (excluding connections establishing)
>>>   tps = 908.528422 (excluding connections establishing)
>>>   tps = 903.900977 (excluding connections establishing)
>>>   tps = 910.470595 (excluding connections establishing)
>>>   tps = 909.685396 (excluding connections establishing)
>>>
>>> pgbench scores without the patch:
>>>   tps = 909.096785 (excluding connections establishing)
>>>   tps = 894.868712 (excluding connections establishing)
>>>   tps = 910.074669 (excluding connections establishing)
>>>   tps = 904.022770 (excluding connections establishing)
>>>   tps = 895.673830 (excluding connections establishing)
>>>
>>> Of course, this experiment was not I/O bound, and the cache hit ratio
>>> was >99.9%.
>>>
>>> (2) Memory space
>>>
>>> In this patch, I added three new members to the LWLock structure,
>>> as uint64, to collect statistics.
>>>
>>> That means those members must be held in shared memory, and I'm not
>>> sure whether that's appropriate.
>>>
>>> Another possible option is to hold those statistics values in local
>>> (backend) process memory and send them through the stats collector
>>> process, like other statistics values.
>>>
>>> (3) LWLock names (or labels)
>>>
>>> Right now, the pg_stat_lwlocks view shows the LWLockId itself, but an
>>> LWLockId is not easy for a DBA to map to an actual lock type.
>>>
>>> So, I want to show LWLock names (or labels), like 'WALWriteLock'
>>> or 'LockMgrLock', but how should I implement it?
>>>
>>> Any comments?
>>>
>>> Regards,
>>
>>
>
>

--
Satoshi Nagayasu <sn...@uptime.jp>
Uptime Technologies, LLC. http://www.uptime.jp
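As a quick illustration of how the view can be read (the view and column
names are as defined in the attached patch; the query itself and the
wait_ratio expression are only an illustrative sketch, not part of the
patch), a DBA could rank lwlocks by cluster-wide wait time like this:

--------------------------------------------------
-- Rank lwlocks by cluster-wide (shared) wait time; wait_ratio shows
-- how often an acquisition actually had to wait.
SELECT lwlockid,
       shared_calls,
       shared_waits,
       shared_time_ms,
       round(shared_waits::numeric / NULLIF(shared_calls, 0), 4) AS wait_ratio
  FROM pg_stat_lwlocks
 ORDER BY shared_time_ms DESC, shared_waits DESC
 LIMIT 10;
--------------------------------------------------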
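Likewise, a minimal sketch of the reset interface described above, assuming
the functions added by the patch (pg_stat_reset_lwlocks() clears the
per-session counters, while pg_stat_reset_shared('lwlocks') asks the stats
collector to clear the cluster-wide ones):

--------------------------------------------------
-- Clear this session's local lwlock counters.
SELECT pg_stat_reset_lwlocks();

-- Clear the cluster-wide (shared) lwlock counters kept by the
-- stats collector.
SELECT pg_stat_reset_shared('lwlocks');
--------------------------------------------------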
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 607a72f..2f84940 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -671,6 +671,17 @@ CREATE VIEW pg_stat_bgwriter AS
         pg_stat_get_buf_alloc() AS buffers_alloc,
         pg_stat_get_bgwriter_stat_reset_time() AS stats_reset;
 
+CREATE VIEW pg_stat_lwlocks AS
+    SELECT
+        S.lwlockid,
+        S.local_calls,
+        S.local_waits,
+        S.local_time_ms,
+        S.shared_calls,
+        S.shared_waits,
+        S.shared_time_ms
+    FROM pg_stat_get_lwlocks() AS S;
+
 CREATE VIEW pg_user_mappings AS
     SELECT
         U.oid       AS umid,
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 8389d5c..970e8bd 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -282,6 +282,7 @@
 static void pgstat_recv_bgwriter(PgStat_MsgBgWriter *msg, int len);
 static void pgstat_recv_funcstat(PgStat_MsgFuncstat *msg, int len);
 static void pgstat_recv_funcpurge(PgStat_MsgFuncpurge *msg, int len);
 static void pgstat_recv_recoveryconflict(PgStat_MsgRecoveryConflict *msg, int len);
+static void pgstat_recv_lwlockstat(PgStat_MsgLWLockstat *msg, int len);
 static void pgstat_recv_deadlock(PgStat_MsgDeadlock *msg, int len);
 static void pgstat_recv_tempfile(PgStat_MsgTempFile *msg, int len);
@@ -1188,6 +1189,8 @@ pgstat_reset_shared_counters(const char *target)
 
     if (strcmp(target, "bgwriter") == 0)
         msg.m_resettarget = RESET_BGWRITER;
+    else if (strcmp(target, "lwlocks") == 0)
+        msg.m_resettarget = RESET_LWLOCKSTAT;
     else
         ereport(ERROR,
                 (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
@@ -1344,6 +1347,72 @@ pgstat_report_recovery_conflict(int reason)
 }
 
 /* --------
+ * pgstat_report_lwlockstat() -
+ *
+ *  Tell the collector about lwlock statistics.
+ * --------
+ */
+void
+pgstat_report_lwlockstat(void)
+{
+    PgStat_MsgLWLockstat msg;
+
+    int32 lockid = 0;
+    int need_continue = 0;
+
+ report_continue:
+    memset(&msg, 0, sizeof(PgStat_MsgLWLockstat));
+
+    for ( ; lockid<NumFixedLWLocks+1 ; lockid++)
+    {
+        uint64 calls, waits, time_ms;
+
+        calls = waits = time_ms = 0;
+
+        calls = lwlock_get_stat_calls_global(lockid);
+        waits = lwlock_get_stat_waits_global(lockid);
+        time_ms = lwlock_get_stat_time_ms_global(lockid);
+
+        if ( calls>0 || waits>0 || time_ms>0 )
+        {
+            msg.m_entry[msg.m_nentries].lockid = lockid;
+            msg.m_entry[msg.m_nentries].calls = calls;
+            msg.m_entry[msg.m_nentries].waits = waits;
+            msg.m_entry[msg.m_nentries].waited_time = time_ms;
+
+            msg.m_nentries++;
+
+            lwlock_reset_stat_global(lockid);
+
+            /*
+             * Need to keep a message packet smaller than PGSTAT_MSG_PAYLOAD.
+             * So, going to split a report into multiple messages.
+             */
+            if ( msg.m_nentries>=MAX_LWLOCKSTAT_ENTRIES )
+            {
+                need_continue = 1;
+                break;
+            }
+        }
+    }
+
+    if (pgStatSock == PGINVALID_SOCKET || !pgstat_track_counts)
+        return;
+
+    pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_LWLOCKSTAT);
+    pgstat_send(&msg, sizeof(msg));
+
+    /*
+     * Need to continue because of the larger report?
+     */
+    if ( need_continue )
+    {
+        need_continue = 0;
+        goto report_continue;
+    }
+}
+
+/* --------
  * pgstat_report_deadlock() -
  *
  *  Tell the collector about a deadlock detected.
@@ -3219,6 +3288,10 @@ PgstatCollectorMain(int argc, char *argv[])
                     pgstat_recv_recoveryconflict((PgStat_MsgRecoveryConflict *) &msg, len);
                     break;
 
+                case PGSTAT_MTYPE_LWLOCKSTAT:
+                    pgstat_recv_lwlockstat((PgStat_MsgLWLockstat *) &msg, len);
+                    break;
+
                 case PGSTAT_MTYPE_DEADLOCK:
                     pgstat_recv_deadlock((PgStat_MsgDeadlock *) &msg, len);
                     break;
@@ -4379,8 +4452,15 @@ pgstat_recv_resetsharedcounter(PgStat_MsgResetsharedcounter *msg, int len)
     if (msg->m_resettarget == RESET_BGWRITER)
     {
         /* Reset the global background writer statistics for the cluster. */
-        memset(&globalStats, 0, sizeof(globalStats));
-        globalStats.stat_reset_timestamp = GetCurrentTimestamp();
+        memset(&globalStats.bgwriterstats, 0, sizeof(globalStats.bgwriterstats));
+        globalStats.bgwriterstats.reset_timestamp = GetCurrentTimestamp();
+    }
+
+    if (msg->m_resettarget == RESET_LWLOCKSTAT)
+    {
+        /* Reset the global lwlock statistics for the cluster. */
+        memset(&globalStats.lwlockstats, 0, sizeof(globalStats.lwlockstats));
+        globalStats.lwlockstats.reset_timestamp = GetCurrentTimestamp();
     }
 
     /*
@@ -4521,16 +4601,16 @@ pgstat_recv_analyze(PgStat_MsgAnalyze *msg, int len)
 static void
 pgstat_recv_bgwriter(PgStat_MsgBgWriter *msg, int len)
 {
-    globalStats.timed_checkpoints += msg->m_timed_checkpoints;
-    globalStats.requested_checkpoints += msg->m_requested_checkpoints;
-    globalStats.checkpoint_write_time += msg->m_checkpoint_write_time;
-    globalStats.checkpoint_sync_time += msg->m_checkpoint_sync_time;
-    globalStats.buf_written_checkpoints += msg->m_buf_written_checkpoints;
-    globalStats.buf_written_clean += msg->m_buf_written_clean;
-    globalStats.maxwritten_clean += msg->m_maxwritten_clean;
-    globalStats.buf_written_backend += msg->m_buf_written_backend;
-    globalStats.buf_fsync_backend += msg->m_buf_fsync_backend;
-    globalStats.buf_alloc += msg->m_buf_alloc;
+    globalStats.bgwriterstats.timed_checkpoints += msg->m_timed_checkpoints;
+    globalStats.bgwriterstats.requested_checkpoints += msg->m_requested_checkpoints;
+    globalStats.bgwriterstats.checkpoint_write_time += msg->m_checkpoint_write_time;
+    globalStats.bgwriterstats.checkpoint_sync_time += msg->m_checkpoint_sync_time;
+    globalStats.bgwriterstats.buf_written_checkpoints += msg->m_buf_written_checkpoints;
+    globalStats.bgwriterstats.buf_written_clean += msg->m_buf_written_clean;
+    globalStats.bgwriterstats.maxwritten_clean += msg->m_maxwritten_clean;
+    globalStats.bgwriterstats.buf_written_backend += msg->m_buf_written_backend;
+    globalStats.bgwriterstats.buf_fsync_backend += msg->m_buf_fsync_backend;
+    globalStats.bgwriterstats.buf_alloc += msg->m_buf_alloc;
 }
 
 /* ----------
@@ -4574,6 +4654,27 @@ pgstat_recv_recoveryconflict(PgStat_MsgRecoveryConflict *msg, int len)
 }
 
 /* ----------
+ * pgstat_recv_lwlockstat() -
+ *
+ *  Process a LWLockstat message.
+ * ----------
+ */
+static void
+pgstat_recv_lwlockstat(PgStat_MsgLWLockstat *msg, int len)
+{
+    int i;
+
+    for (i=0 ; i<msg->m_nentries ; i++)
+    {
+        int32 lockid = msg->m_entry[i].lockid;
+
+        globalStats.lwlockstats.lwlock_stat[lockid].calls += msg->m_entry[i].calls;
+        globalStats.lwlockstats.lwlock_stat[lockid].waits += msg->m_entry[i].waits;
+        globalStats.lwlockstats.lwlock_stat[lockid].waited_time += msg->m_entry[i].waited_time;
+    }
+}
+
+/* ----------
  * pgstat_recv_deadlock() -
  *
  *  Process a DEADLOCK message.
diff --git a/src/backend/storage/lmgr/lwlock.c b/src/backend/storage/lmgr/lwlock.c
index 5e1ce17..402799d 100644
--- a/src/backend/storage/lmgr/lwlock.c
+++ b/src/backend/storage/lmgr/lwlock.c
@@ -32,6 +32,7 @@
 #include "storage/proc.h"
 #include "storage/spin.h"
 
+#include <sys/time.h>
 
 /* We use the ShmemLock spinlock to protect LWLockAssign */
 extern slock_t *ShmemLock;
@@ -48,6 +49,25 @@ typedef struct LWLock
     /* tail is undefined when head is NULL */
 } LWLock;
 
+typedef struct LWLockCounter2
+{
+    /* statistics stuff */
+    uint64 calls;
+    uint64 waits;
+    uint64 time_ms;
+} LWLockCounter2;
+
+/*
+ * LWLockCounterLocal has <NumFixedLWLocks> counters
+ * and one additional counter for dynamic LWLocks
+ * to hold lwlock statistic in the local session.
+ */
+LWLockCounter2 LWLockCounterLocal[NumFixedLWLocks+1];
+
+LWLockCounter2 LWLockCounterGlobal[NumFixedLWLocks+1];
+
+#define LWLockCounterId(X) ((X) < (NumFixedLWLocks+1) ? (X) : (NumFixedLWLocks+1))
+
 /*
  * All the LWLock structs are allocated as an array in shared memory.
  * (LWLockIds are indexes into the array.)  We force the array stride to
@@ -90,6 +110,8 @@ static LWLockId held_lwlocks[MAX_SIMUL_LWLOCKS];
 static int  lock_addin_request = 0;
 static bool lock_addin_request_allowed = true;
 
+static void InitLWockCounter(void);
+
 #ifdef LWLOCK_STATS
 static int  counts_for_pid = 0;
 static int *sh_acquire_counts;
@@ -253,6 +275,26 @@ LWLockShmemSize(void)
     return size;
 }
 
+/*
+ * Initialize local and global counters for lwlock statistics.
+ */
+static void
+InitLWockCounter(void)
+{
+    int i;
+
+    for (i=0 ; i<NumFixedLWLocks+1 ; i++)
+    {
+        LWLockCounterLocal[i].calls = 0;
+        LWLockCounterLocal[i].waits = 0;
+        LWLockCounterLocal[i].time_ms = 0;
+
+        LWLockCounterGlobal[i].calls = 0;
+        LWLockCounterGlobal[i].waits = 0;
+        LWLockCounterGlobal[i].time_ms = 0;
+    }
+}
+
 
 /*
  * Allocate shmem space for LWLocks and initialize the locks.
@@ -298,6 +340,8 @@ CreateLWLocks(void)
     LWLockCounter = (int *) ((char *) LWLockArray - 2 * sizeof(int));
     LWLockCounter[0] = (int) NumFixedLWLocks;
     LWLockCounter[1] = numLocks;
+
+    InitLWockCounter();
 }
 
 
@@ -344,9 +388,13 @@ LWLockAcquire(LWLockId lockid, LWLockMode mode)
     PGPROC     *proc = MyProc;
     bool        retry = false;
    int         extraWaits = 0;
+    struct timeval wait_start,wait_done;
 
     PRINT_LWDEBUG("LWLockAcquire", lockid, lock);
 
+    LWLockCounterLocal[ LWLockCounterId(lockid) ].calls++;
+    LWLockCounterGlobal[ LWLockCounterId(lockid) ].calls++;
+
 #ifdef LWLOCK_STATS
     /* Set up local count state first time through in a given process */
     if (counts_for_pid != MyProcPid)
@@ -395,6 +443,7 @@ LWLockAcquire(LWLockId lockid, LWLockMode mode)
     for (;;)
     {
         bool        mustwait;
+        uint64      waited;
 
         /* Acquire mutex.  Time spent holding mutex should be short! */
 #ifdef LWLOCK_STATS
@@ -473,6 +522,9 @@ LWLockAcquire(LWLockId lockid, LWLockMode mode)
 #endif
 
         TRACE_POSTGRESQL_LWLOCK_WAIT_START(lockid, mode);
+        LWLockCounterLocal[ LWLockCounterId(lockid) ].waits++;
+        LWLockCounterGlobal[ LWLockCounterId(lockid) ].waits++;
+        gettimeofday(&wait_start, NULL);
 
         for (;;)
         {
@@ -484,6 +536,20 @@ LWLockAcquire(LWLockId lockid, LWLockMode mode)
         }
 
         TRACE_POSTGRESQL_LWLOCK_WAIT_DONE(lockid, mode);
+        gettimeofday(&wait_done, NULL);
+
+        if ( wait_done.tv_usec >= wait_start.tv_usec )
+        {
+            waited = ( wait_done.tv_usec - wait_start.tv_usec ) / 1000 ;
+            waited += ( wait_done.tv_sec - wait_start.tv_sec ) * 1000 ;
+        }
+        else
+        {
+            waited = ( wait_done.tv_usec + 1000*1000 - wait_start.tv_usec ) / 1000 ;
+            waited += ( wait_done.tv_sec - 1 - wait_start.tv_sec ) * 1000 ;
+        }
+        LWLockCounterLocal[ LWLockCounterId(lockid) ].time_ms += waited;
+        LWLockCounterGlobal[ LWLockCounterId(lockid) ].time_ms += waited;
 
         LOG_LWDEBUG("LWLockAcquire", lockid, "awakened");
 
@@ -885,3 +951,55 @@ LWLockHeldByMe(LWLockId lockid)
     }
     return false;
 }
+
+uint64
+lwlock_get_stat_calls_local(LWLockId lockid)
+{
+    return LWLockCounterLocal[ LWLockCounterId(lockid) ].calls;
+}
+
+uint64
+lwlock_get_stat_waits_local(LWLockId lockid)
+{
+    return LWLockCounterLocal[ LWLockCounterId(lockid) ].waits;
+}
+
+uint64
+lwlock_get_stat_time_ms_local(LWLockId lockid)
+{
+    return LWLockCounterLocal[ LWLockCounterId(lockid) ].time_ms;
+}
+
+void
+lwlock_reset_stat_local(LWLockId lockid)
+{
+    LWLockCounterLocal[ LWLockCounterId(lockid) ].calls = 0;
+    LWLockCounterLocal[ LWLockCounterId(lockid) ].waits = 0;
+    LWLockCounterLocal[ LWLockCounterId(lockid) ].time_ms = 0;
+}
+
+uint64
+lwlock_get_stat_calls_global(LWLockId lockid)
+{
+    return LWLockCounterGlobal[ LWLockCounterId(lockid) ].calls;
+}
+
+uint64
+lwlock_get_stat_waits_global(LWLockId lockid)
+{
+    return LWLockCounterGlobal[ LWLockCounterId(lockid) ].waits;
+}
+
+uint64
+lwlock_get_stat_time_ms_global(LWLockId lockid)
+{
+    return LWLockCounterGlobal[ LWLockCounterId(lockid) ].time_ms;
+}
+
+void
+lwlock_reset_stat_global(LWLockId lockid)
+{
+    LWLockCounterGlobal[ LWLockCounterId(lockid) ].calls = 0;
+    LWLockCounterGlobal[ LWLockCounterId(lockid) ].waits = 0;
+    LWLockCounterGlobal[ LWLockCounterId(lockid) ].time_ms = 0;
+}
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 585db1a..5ca2c6f 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -3919,6 +3919,8 @@ PostgresMain(int argc, char *argv[], const char *username)
                 pgstat_report_activity(STATE_IDLE, NULL);
             }
 
+            pgstat_report_lwlockstat();
+
             ReadyForQuery(whereToSendOutput);
             send_ready_for_query = false;
         }
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 7d4059f..0a26626 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -118,6 +118,8 @@ extern Datum pg_stat_reset_shared(PG_FUNCTION_ARGS);
 extern Datum pg_stat_reset_single_table_counters(PG_FUNCTION_ARGS);
 extern Datum pg_stat_reset_single_function_counters(PG_FUNCTION_ARGS);
 
+extern Datum pg_stat_get_lwlocks(PG_FUNCTION_ARGS);
+
 /* Global bgwriter statistics, from bgwriter.c */
 extern PgStat_MsgBgWriter bgwriterStats;
 
@@ -1399,69 +1401,69 @@ pg_stat_get_db_blk_write_time(PG_FUNCTION_ARGS)
 Datum
 pg_stat_get_bgwriter_timed_checkpoints(PG_FUNCTION_ARGS)
 {
-    PG_RETURN_INT64(pgstat_fetch_global()->timed_checkpoints);
+    PG_RETURN_INT64(pgstat_fetch_global()->bgwriterstats.timed_checkpoints);
 }
 
 Datum
 pg_stat_get_bgwriter_requested_checkpoints(PG_FUNCTION_ARGS)
 {
-    PG_RETURN_INT64(pgstat_fetch_global()->requested_checkpoints);
+    PG_RETURN_INT64(pgstat_fetch_global()->bgwriterstats.requested_checkpoints);
 }
 
 Datum
 pg_stat_get_bgwriter_buf_written_checkpoints(PG_FUNCTION_ARGS)
 {
-    PG_RETURN_INT64(pgstat_fetch_global()->buf_written_checkpoints);
+    PG_RETURN_INT64(pgstat_fetch_global()->bgwriterstats.buf_written_checkpoints);
 }
 
 Datum
 pg_stat_get_bgwriter_buf_written_clean(PG_FUNCTION_ARGS)
 {
-    PG_RETURN_INT64(pgstat_fetch_global()->buf_written_clean);
+    PG_RETURN_INT64(pgstat_fetch_global()->bgwriterstats.buf_written_clean);
 }
 
 Datum
 pg_stat_get_bgwriter_maxwritten_clean(PG_FUNCTION_ARGS)
 {
-    PG_RETURN_INT64(pgstat_fetch_global()->maxwritten_clean);
+    PG_RETURN_INT64(pgstat_fetch_global()->bgwriterstats.maxwritten_clean);
 }
 
 Datum
 pg_stat_get_checkpoint_write_time(PG_FUNCTION_ARGS)
 {
     /* time is already in msec, just convert to double for presentation */
-    PG_RETURN_FLOAT8((double) pgstat_fetch_global()->checkpoint_write_time);
+    PG_RETURN_FLOAT8((double) pgstat_fetch_global()->bgwriterstats.checkpoint_write_time);
 }
 
 Datum
 pg_stat_get_checkpoint_sync_time(PG_FUNCTION_ARGS)
 {
     /* time is already in msec, just convert to double for presentation */
-    PG_RETURN_FLOAT8((double) pgstat_fetch_global()->checkpoint_sync_time);
+    PG_RETURN_FLOAT8((double) pgstat_fetch_global()->bgwriterstats.checkpoint_sync_time);
 }
 
 Datum
 pg_stat_get_bgwriter_stat_reset_time(PG_FUNCTION_ARGS)
 {
-    PG_RETURN_TIMESTAMPTZ(pgstat_fetch_global()->stat_reset_timestamp);
+    PG_RETURN_TIMESTAMPTZ(pgstat_fetch_global()->bgwriterstats.reset_timestamp);
 }
 
 Datum
 pg_stat_get_buf_written_backend(PG_FUNCTION_ARGS)
 {
-    PG_RETURN_INT64(pgstat_fetch_global()->buf_written_backend);
+    PG_RETURN_INT64(pgstat_fetch_global()->bgwriterstats.buf_written_backend);
 }
 
 Datum
 pg_stat_get_buf_fsync_backend(PG_FUNCTION_ARGS)
 {
-    PG_RETURN_INT64(pgstat_fetch_global()->buf_fsync_backend);
+    PG_RETURN_INT64(pgstat_fetch_global()->bgwriterstats.buf_fsync_backend);
 }
 
 Datum
 pg_stat_get_buf_alloc(PG_FUNCTION_ARGS)
 {
-    PG_RETURN_INT64(pgstat_fetch_global()->buf_alloc);
+    PG_RETURN_INT64(pgstat_fetch_global()->bgwriterstats.buf_alloc);
 }
 
 Datum
@@ -1701,3 +1703,162 @@ pg_stat_reset_single_function_counters(PG_FUNCTION_ARGS)
 
     PG_RETURN_VOID();
 }
+
+Datum
+pg_stat_get_lwlocks(PG_FUNCTION_ARGS)
+{
+    FuncCallContext *funcctx;
+
+    /* stuff done only on the first call of the function */
+    if (SRF_IS_FIRSTCALL())
+    {
+        MemoryContext oldcontext;
+        TupleDesc tupdesc;
+
+        /* create a function context for cross-call persistence */
+        funcctx = SRF_FIRSTCALL_INIT();
+
+        oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
+
+        tupdesc = CreateTemplateTupleDesc(7, false);
+        TupleDescInitEntry(tupdesc, (AttrNumber) 1, "lockid",
+                           INT8OID, -1, 0);
+        TupleDescInitEntry(tupdesc, (AttrNumber) 2, "local_calls",
+                           INT8OID, -1, 0);
+        TupleDescInitEntry(tupdesc, (AttrNumber) 3, "local_waits",
+                           INT8OID, -1, 0);
+        TupleDescInitEntry(tupdesc, (AttrNumber) 4, "local_time_ms",
+                           INT8OID, -1, 0);
+        TupleDescInitEntry(tupdesc, (AttrNumber) 5, "shared_calls",
+                           INT8OID, -1, 0);
+        TupleDescInitEntry(tupdesc, (AttrNumber) 6, "shared_waits",
+                           INT8OID, -1, 0);
+        TupleDescInitEntry(tupdesc, (AttrNumber) 7, "shared_time_ms",
+                           INT8OID, -1, 0);
+
+        funcctx->tuple_desc = BlessTupleDesc(tupdesc);
+        funcctx->max_calls = NumFixedLWLocks + 1;
+
+        MemoryContextSwitchTo(oldcontext);
+    }
+
+    /* stuff done on every call of the function */
+    funcctx = SRF_PERCALL_SETUP();
+
+    if (funcctx->call_cntr < funcctx->max_calls)
+    {
+        Datum values[7];
+        bool nulls[7];
+        HeapTuple tuple;
+        LWLockId lockid;
+        uint64 local_calls,local_waits,local_time_ms;
+        uint64 shared_calls,shared_waits,shared_time_ms;
+        int i;
+        PgStat_LWLockEntry *lwlock_stat = pgstat_fetch_global()->lwlockstats.lwlock_stat;
+
+        MemSet(values, 0, sizeof(values));
+        MemSet(nulls, 0, sizeof(nulls));
+
+        lockid = funcctx->call_cntr;
+
+        local_calls = local_waits = local_time_ms = 0;
+        shared_calls = shared_waits = shared_time_ms = 0;
+
+        /*
+         * Partitioned locks need to be summed up by the lock group.
+         */
+        if ( FirstBufMappingLock <= lockid && lockid < FirstLockMgrLock )
+        {
+            for (i=0 ; i<NUM_BUFFER_PARTITIONS ; i++)
+            {
+                /* local statistics */
+                local_calls = lwlock_get_stat_calls_local(FirstBufMappingLock+i);
+                local_waits = lwlock_get_stat_waits_local(FirstBufMappingLock+i);
+                local_time_ms = lwlock_get_stat_time_ms_local(FirstBufMappingLock+i);
+
+                /* global statistics */
+                shared_calls += lwlock_stat[FirstBufMappingLock+i].calls;
+                shared_waits += lwlock_stat[FirstBufMappingLock+i].waits;
+                shared_time_ms += lwlock_stat[FirstBufMappingLock+i].waited_time;
+            }
+
+            funcctx->call_cntr += NUM_BUFFER_PARTITIONS;
+        }
+        else if ( FirstLockMgrLock <= lockid && lockid < FirstPredicateLockMgrLock )
+        {
+            for (i=0 ; i<NUM_LOCK_PARTITIONS ; i++)
+            {
+                /* local statistics */
+                local_calls = lwlock_get_stat_calls_local(FirstLockMgrLock+i);
+                local_waits = lwlock_get_stat_waits_local(FirstLockMgrLock+i);
+                local_time_ms = lwlock_get_stat_time_ms_local(FirstLockMgrLock+i);
+
+                /* global statistics */
+                shared_calls += lwlock_stat[FirstLockMgrLock+i].calls;
+                shared_waits += lwlock_stat[FirstLockMgrLock+i].waits;
+                shared_time_ms += lwlock_stat[FirstLockMgrLock+i].waited_time;
+            }
+
+            funcctx->call_cntr += NUM_LOCK_PARTITIONS;
+        }
+        else if ( FirstPredicateLockMgrLock <= lockid && lockid < NumFixedLWLocks )
+        {
+            for (i=0 ; i<NUM_PREDICATELOCK_PARTITIONS ; i++)
+            {
+                /* local statistics */
+                local_calls = lwlock_get_stat_calls_local(FirstPredicateLockMgrLock+i);
+                local_waits = lwlock_get_stat_waits_local(FirstPredicateLockMgrLock+i);
+                local_time_ms = lwlock_get_stat_time_ms_local(FirstPredicateLockMgrLock+i);
+
+                /* global statistics */
+                shared_calls += lwlock_stat[FirstPredicateLockMgrLock+i].calls;
+                shared_waits += lwlock_stat[FirstPredicateLockMgrLock+i].waits;
+                shared_time_ms += lwlock_stat[FirstPredicateLockMgrLock+i].waited_time;
+            }
+
+            funcctx->call_cntr += NUM_PREDICATELOCK_PARTITIONS;
+        }
+        else
+        {
+            /* local statistics */
+            local_calls = lwlock_get_stat_calls_local(lockid);
+            local_waits = lwlock_get_stat_waits_local(lockid);
+            local_time_ms = lwlock_get_stat_time_ms_local(lockid);
+
+            /* global statistics */
+            shared_calls = lwlock_stat[lockid].calls;
+            shared_waits = lwlock_stat[lockid].waits;
+            shared_time_ms = lwlock_stat[lockid].waited_time;
+        }
+
+        values[0] = Int64GetDatum(lockid);
+        values[1] = Int64GetDatum(local_calls);
+        values[2] = Int64GetDatum(local_waits);
+        values[3] = Int64GetDatum(local_time_ms);
+        values[4] = Int64GetDatum(shared_calls);
+        values[5] = Int64GetDatum(shared_waits);
+        values[6] = Int64GetDatum(shared_time_ms);
+
+        tuple = heap_form_tuple(funcctx->tuple_desc, values, nulls);
+
+        SRF_RETURN_NEXT(funcctx, HeapTupleGetDatum(tuple));
+    }
+    else
+    {
+        SRF_RETURN_DONE(funcctx);
+    }
+}
+
+Datum
+pg_stat_reset_lwlocks(PG_FUNCTION_ARGS)
+{
+    LWLockId lockid;
+
+    for (lockid=0 ; lockid<NumLWLocks() ; lockid++)
+    {
+        lwlock_reset_stat_local(lockid);
+    }
+
+    PG_RETURN_VOID();
+}
+
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index f935eb1..4582b12 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -2612,6 +2612,10 @@ DATA(insert OID = 1936 (  pg_stat_get_backend_idset  PGNSP PGUID 12 1 100 0 0 f
 DESCR("statistics: currently active backend IDs");
 DATA(insert OID = 2022 (  pg_stat_get_activity  PGNSP PGUID 12 1 100 0 0 f f f f f t s 1 0 2249 "23" "{23,26,23,26,25,25,25,16,1184,1184,1184,1184,869,25,23}" "{i,o,o,o,o,o,o,o,o,o,o,o,o,o,o}" "{pid,datid,pid,usesysid,application_name,state,query,waiting,xact_start,query_start,backend_start,state_change,client_addr,client_hostname,client_port}" _null_ pg_stat_get_activity _null_ _null_ _null_ ));
 DESCR("statistics: information about currently active backends");
+DATA(insert OID = 3764 (  pg_stat_get_lwlocks  PGNSP PGUID 12 1 100 0 0 f f f f f t s 0 0 2249 "" "{20,20,20,20,20,20,20}" "{o,o,o,o,o,o,o}" "{lwlockid,local_calls,local_waits,local_time_ms,shared_calls,shared_waits,shared_time_ms}" _null_ pg_stat_get_lwlocks _null_ _null_ _null_ ));
+DESCR("statistics: light-weight lock statistics");
+DATA(insert OID = 3765 (  pg_stat_reset_lwlocks  PGNSP PGUID 12 1 0 0 0 f f f f f f v 0 0 2278 "" _null_ _null_ _null_ _null_ pg_stat_reset_lwlocks _null_ _null_ _null_ ));
+DESCR("statistics: reset light-weight lock statistics");
 DATA(insert OID = 3099 (  pg_stat_get_wal_senders  PGNSP PGUID 12 1 10 0 0 f f f f f t s 0 0 2249 "" "{23,25,25,25,25,25,23,25}" "{o,o,o,o,o,o,o,o}" "{pid,state,sent_location,write_location,flush_location,replay_location,sync_priority,sync_state}" _null_ pg_stat_get_wal_senders _null_ _null_ _null_ ));
 DESCR("statistics: information about currently active replication");
 DATA(insert OID = 2026 (  pg_backend_pid  PGNSP PGUID 12 1 0 0 0 f f f f t f s 0 0 23 "" _null_ _null_ _null_ _null_ pg_backend_pid _null_ _null_ _null_ ));
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 613c1c2..0e52534 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -15,6 +15,7 @@
 #include "fmgr.h"
 #include "libpq/pqcomm.h"
 #include "portability/instr_time.h"
+#include "storage/lwlock.h"
 #include "utils/hsearch.h"
 #include "utils/relcache.h"
 
@@ -49,6 +50,7 @@ typedef enum StatMsgType
     PGSTAT_MTYPE_FUNCPURGE,
     PGSTAT_MTYPE_RECOVERYCONFLICT,
     PGSTAT_MTYPE_TEMPFILE,
+    PGSTAT_MTYPE_LWLOCKSTAT,
     PGSTAT_MTYPE_DEADLOCK
 } StatMsgType;
 
@@ -102,7 +104,8 @@ typedef struct PgStat_TableCounts
 /* Possible targets for resetting cluster-wide shared values */
 typedef enum PgStat_Shared_Reset_Target
 {
-    RESET_BGWRITER
+    RESET_BGWRITER,
+    RESET_LWLOCKSTAT
 } PgStat_Shared_Reset_Target;
 
 /* Possible object types for resetting single counters */
@@ -605,13 +608,31 @@ typedef struct PgStat_StatFuncEntry
     PgStat_Counter f_self_time;
 } PgStat_StatFuncEntry;
 
+#define MAX_LWLOCKSTAT_ENTRIES 20
+
+typedef struct PgStat_LWLockEntry
+{
+    LWLockId lockid;
+    PgStat_Counter calls;
+    PgStat_Counter waits;
+    PgStat_Counter waited_time;    /* time in milliseconds */
+} PgStat_LWLockEntry;
+
+typedef struct PgStat_MsgLWLockstat
+{
+    PgStat_MsgHdr m_hdr;
+    int m_nentries;
+
+    /* Need to keep a msg smaller than PGSTAT_MSG_PAYLOAD */
+    PgStat_LWLockEntry m_entry[MAX_LWLOCKSTAT_ENTRIES];
+} PgStat_MsgLWLockstat;
+
 /*
- * Global statistics kept in the stats collector
+ * Global statistics for BgWriter
  */
-typedef struct PgStat_GlobalStats
+typedef struct PgStat_BgWriterGlobalStats
 {
-    TimestampTz stats_timestamp;    /* time of stats file update */
     PgStat_Counter timed_checkpoints;
     PgStat_Counter requested_checkpoints;
     PgStat_Counter checkpoint_write_time;    /* times in milliseconds */
@@ -622,6 +643,28 @@ typedef struct PgStat_GlobalStats
     PgStat_Counter buf_written_backend;
     PgStat_Counter buf_fsync_backend;
     PgStat_Counter buf_alloc;
+    TimestampTz reset_timestamp;
+} PgStat_BgWriterGlobalStats;
+
+/*
+ * Global statistics for LWLocks
+ */
+typedef struct PgStat_LWLockGlobalStats
+{
+    PgStat_LWLockEntry lwlock_stat[NumFixedLWLocks+1];
+    TimestampTz reset_timestamp;
+} PgStat_LWLockGlobalStats;
+
+/*
+ * Global statistics kept in the stats collector
+ */
+typedef struct PgStat_GlobalStats
+{
+    TimestampTz stats_timestamp;    /* time of stats file update */
+
+    PgStat_BgWriterGlobalStats bgwriterstats;
+    PgStat_LWLockGlobalStats lwlockstats;
+
     TimestampTz stat_reset_timestamp;
 } PgStat_GlobalStats;
 
diff --git a/src/include/storage/lwlock.h b/src/include/storage/lwlock.h
index 82d8ec4..a101db5 100644
--- a/src/include/storage/lwlock.h
+++ b/src/include/storage/lwlock.h
@@ -119,4 +119,14 @@ extern void CreateLWLocks(void);
 
 extern void RequestAddinLWLocks(int n);
 
+extern uint64 lwlock_get_stat_calls_local(LWLockId);
+extern uint64 lwlock_get_stat_waits_local(LWLockId);
+extern uint64 lwlock_get_stat_time_ms_local(LWLockId);
+extern void lwlock_reset_stat_local(LWLockId);
+
+extern uint64 lwlock_get_stat_calls_global(LWLockId);
+extern uint64 lwlock_get_stat_waits_global(LWLockId);
+extern uint64 lwlock_get_stat_time_ms_global(LWLockId);
+extern void lwlock_reset_stat_global(LWLockId);
+
 #endif   /* LWLOCK_H */