Hi all,
I've been working on a new system view, pg_stat_lwlocks, to observe
LWLock, and just completed my 'proof-of-concept' code that can work
with version 9.1.
Now, I'd like to know the possibility of this feature for future
release.
With this patch, DBA can easily determine a bottleneck around lwlocks.
--------------------------------------------------
postgres=# SELECT * FROM pg_stat_lwlocks ORDER BY time_ms DESC LIMIT 10;
lwlockid | calls | waits | time_ms
----------+--------+-------+---------
49 | 193326 | 32096 | 23688
8 | 3305 | 133 | 1335
2 | 21 | 0 | 0
4 | 135188 | 0 | 0
5 | 57935 | 0 | 0
6 | 141 | 0 | 0
7 | 24580 | 1 | 0
3 | 3282 | 0 | 0
1 | 41 | 0 | 0
9 | 3 | 0 | 0
(10 rows)
postgres=#
--------------------------------------------------
In this view,
'lwlockid' column represents LWLockId used in the backends.
'calls' represents how many times LWLockAcquire() was called.
'waits' represents how many times LWLockAcquire() needed to wait
within it before lock acquisition.
'time_ms' represents how long LWLockAcquire() totally waited on
a lwlock.
And lwlocks that use a LWLockId range, such as BufMappingLock or
LockMgrLock, would be grouped and summed up in a single record.
For example, lwlockid 49 in the above view represents LockMgrLock
statistics.
Now, I know there are some considerations.
(1) Performance
I've measured LWLock performance both with and without the patch,
and confirmed that this patch does not affect the LWLock perfomance
at all.
pgbench scores with the patch:
tps = 900.906658 (excluding connections establishing)
tps = 908.528422 (excluding connections establishing)
tps = 903.900977 (excluding connections establishing)
tps = 910.470595 (excluding connections establishing)
tps = 909.685396 (excluding connections establishing)
pgbench scores without the patch:
tps = 909.096785 (excluding connections establishing)
tps = 894.868712 (excluding connections establishing)
tps = 910.074669 (excluding connections establishing)
tps = 904.022770 (excluding connections establishing)
tps = 895.673830 (excluding connections establishing)
Of course, this experiment was not I/O bound, and the cache hit ratio
was >99.9%.
(2) Memory space
In this patch, I added three new members to LWLock structure
as uint64 to collect statistics.
It means that those members must be held in the shared memory,
but I'm not sure whether it's appropriate.
I think another possible option is holding those statistics
values in local (backend) process memory, and send them through
the stat collector process (like other statistics values).
(3) LWLock names (or labels)
Now, pg_stat_lwlocks view shows LWLockId itself. But LWLockId is
not easy for DBA to determine actual lock type.
So, I want to show LWLock names (or labels), like 'WALWriteLock'
or 'LockMgrLock', but how should I implement it?
Any comments?
Regards,
--
Satoshi Nagayasu <[email protected]>
Uptime Technologies, LLC. http://www.uptime.jp
diff -rc postgresql-9.1.2.orig/src/backend/catalog/postgres.bki
postgresql-9.1.2/src/backend/catalog/postgres.bki
*** postgresql-9.1.2.orig/src/backend/catalog/postgres.bki 2012-06-20
03:32:46.000000000 +0900
--- postgresql-9.1.2/src/backend/catalog/postgres.bki 2012-06-26
01:51:52.000000000 +0900
***************
*** 1553,1558 ****
--- 1553,1559 ----
insert OID = 3071 ( pg_xlog_replay_pause 11 10 12 1 0 0 f f f t f v 0 0 2278
"" _null_ _null_ _null_ _null_ pg_xlog_replay_pause _null_ _null_ _null_ )
insert OID = 3072 ( pg_xlog_replay_resume 11 10 12 1 0 0 f f f t f v 0 0 2278
"" _null_ _null_ _null_ _null_ pg_xlog_replay_resume _null_ _null_ _null_ )
insert OID = 3073 ( pg_is_xlog_replay_paused 11 10 12 1 0 0 f f f t f v 0 0
16 "" _null_ _null_ _null_ _null_ pg_is_xlog_replay_paused _null_ _null_ _null_
)
+ insert OID = 3764 ( pg_stat_get_lwlocks 11 10 12 1 100 0 f f f f t s 0 0 2249
"" "{20,20,20,20}" "{o,o,o,o}" "{lwlockid,calls,waits,time_ms}" _null_
pg_stat_get_lwlocks _null_ _null_ _null_ )
insert OID = 2621 ( pg_reload_conf 11 10 12 1 0 0 f f f t f v 0 0 16 ""
_null_ _null_ _null_ _null_ pg_reload_conf _null_ _null_ _null_ )
insert OID = 2622 ( pg_rotate_logfile 11 10 12 1 0 0 f f f t f v 0 0 16 ""
_null_ _null_ _null_ _null_ pg_rotate_logfile _null_ _null_ _null_ )
insert OID = 2623 ( pg_stat_file 11 10 12 1 0 0 f f f t f v 1 0 2249 "25"
"{25,20,1184,1184,1184,1184,16}" "{i,o,o,o,o,o,o}"
"{filename,size,access,modification,change,creation,isdir}" _null_ pg_stat_file
_null_ _null_ _null_ )
diff -rc postgresql-9.1.2.orig/src/backend/catalog/postgres.description
postgresql-9.1.2/src/backend/catalog/postgres.description
*** postgresql-9.1.2.orig/src/backend/catalog/postgres.description
2012-06-20 03:32:46.000000000 +0900
--- postgresql-9.1.2/src/backend/catalog/postgres.description 2012-06-26
01:51:52.000000000 +0900
***************
*** 947,952 ****
--- 947,953 ----
3071 pg_proc 0 pause xlog replay
3072 pg_proc 0 resume xlog replay, if it was paused
3073 pg_proc 0 true if xlog replay is paused
+ 3764 pg_proc 0 light-weight lock statistics
2621 pg_proc 0 reload configuration files
2622 pg_proc 0 rotate log file
2623 pg_proc 0 get information about file
diff -rc postgresql-9.1.2.orig/src/backend/catalog/system_views.sql
postgresql-9.1.2/src/backend/catalog/system_views.sql
*** postgresql-9.1.2.orig/src/backend/catalog/system_views.sql 2012-06-20
03:32:46.000000000 +0900
--- postgresql-9.1.2/src/backend/catalog/system_views.sql 2012-06-26
03:35:59.000000000 +0900
***************
*** 594,599 ****
--- 594,607 ----
pg_stat_get_buf_alloc() AS buffers_alloc,
pg_stat_get_bgwriter_stat_reset_time() AS stats_reset;
+ CREATE VIEW pg_stat_lwlocks AS
+ SELECT
+ S.lwlockid,
+ S.calls,
+ S.waits,
+ S.time_ms
+ FROM pg_stat_get_lwlocks() AS S;
+
CREATE VIEW pg_user_mappings AS
SELECT
U.oid AS umid,
diff -rc postgresql-9.1.2.orig/src/backend/storage/lmgr/lwlock.c
postgresql-9.1.2/src/backend/storage/lmgr/lwlock.c
*** postgresql-9.1.2.orig/src/backend/storage/lmgr/lwlock.c 2012-06-20
03:32:46.000000000 +0900
--- postgresql-9.1.2/src/backend/storage/lmgr/lwlock.c 2012-06-26
01:57:28.000000000 +0900
***************
*** 32,37 ****
--- 32,38 ----
#include "storage/proc.h"
#include "storage/spin.h"
+ #include <sys/time.h>
/* We use the ShmemLock spinlock to protect LWLockAssign */
extern slock_t *ShmemLock;
***************
*** 46,51 ****
--- 47,57 ----
PGPROC *head; /* head of list of waiting
PGPROCs */
PGPROC *tail; /* tail of list of waiting
PGPROCs */
/* tail is undefined when head is NULL */
+
+ /* statistics stuff */
+ uint64 calls;
+ uint64 waits;
+ uint64 time_ms;
} LWLock;
/*
***************
*** 268,273 ****
--- 274,282 ----
lock->lock.shared = 0;
lock->lock.head = NULL;
lock->lock.tail = NULL;
+ lock->lock.calls = 0;
+ lock->lock.waits = 0;
+ lock->lock.time_ms = 0;
}
/*
***************
*** 323,330 ****
--- 332,341 ----
PGPROC *proc = MyProc;
bool retry = false;
int extraWaits = 0;
+ struct timeval t1,t2;
PRINT_LWDEBUG("LWLockAcquire", lockid, lock);
+ lock->calls++;
#ifdef LWLOCK_STATS
/* Set up local count state first time through in a given process */
***************
*** 457,462 ****
--- 468,475 ----
#endif
TRACE_POSTGRESQL_LWLOCK_WAIT_START(lockid, mode);
+ lock->waits++;
+ gettimeofday(&t1, NULL);
for (;;)
{
***************
*** 468,473 ****
--- 481,488 ----
}
TRACE_POSTGRESQL_LWLOCK_WAIT_DONE(lockid, mode);
+ gettimeofday(&t2, NULL);
+ lock->time_ms += (t2.tv_sec-t1.tv_sec)*1000 +
(t2.tv_usec-t1.tv_usec)/1000;
LOG_LWDEBUG("LWLockAcquire", lockid, "awakened");
***************
*** 701,703 ****
--- 716,752 ----
}
return false;
}
+
+ void
+ lwlock_get_stat(LWLockId lockid, uint64 *calls, uint64 *waits, uint64
*time_ms)
+ {
+ volatile LWLock *lock = &(LWLockArray[lockid].lock);
+
+ SpinLockAcquire(&lock->mutex);
+ *calls = lock->calls;
+ *waits = lock->waits;
+ *time_ms = lock->time_ms;
+ SpinLockRelease(&lock->mutex);
+ }
+
+ void
+ lwlock_get_stat_sum(LWLockId start, LWLockId end, uint64 *calls, uint64
*waits, uint64 *time_ms)
+ {
+ LWLockId lockid;
+
+ *calls = 0;
+ *waits = 0;
+ *time_ms = 0;
+
+ for (lockid=start ; lockid<=end ; lockid++)
+ {
+ uint64 calls2, waits2, time_ms2;
+
+ lwlock_get_stat(lockid, &calls2, &waits2, &time_ms2);
+
+ *calls += calls2;
+ *waits += waits2;
+ *time_ms += time_ms2;
+ }
+ }
+
diff -rc postgresql-9.1.2.orig/src/backend/utils/adt/pgstatfuncs.c
postgresql-9.1.2/src/backend/utils/adt/pgstatfuncs.c
*** postgresql-9.1.2.orig/src/backend/utils/adt/pgstatfuncs.c 2012-06-20
03:32:50.000000000 +0900
--- postgresql-9.1.2/src/backend/utils/adt/pgstatfuncs.c 2012-06-26
01:59:17.000000000 +0900
***************
*** 109,114 ****
--- 109,116 ----
extern Datum pg_stat_reset_single_table_counters(PG_FUNCTION_ARGS);
extern Datum pg_stat_reset_single_function_counters(PG_FUNCTION_ARGS);
+ extern Datum pg_stat_get_lwlocks(PG_FUNCTION_ARGS);
+
/* Global bgwriter statistics, from bgwriter.c */
extern PgStat_MsgBgWriter bgwriterStats;
***************
*** 1572,1574 ****
--- 1574,1670 ----
PG_RETURN_VOID();
}
+
+
+ Datum
+ pg_stat_get_lwlocks(PG_FUNCTION_ARGS)
+ {
+ FuncCallContext *funcctx;
+
+ /* stuff done only on the first call of the function */
+ if (SRF_IS_FIRSTCALL())
+ {
+ MemoryContext oldcontext;
+ TupleDesc tupdesc;
+
+ /* create a function context for cross-call persistence */
+ funcctx = SRF_FIRSTCALL_INIT();
+
+ oldcontext =
MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
+
+ tupdesc = CreateTemplateTupleDesc(4, false);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 1, "lockid",
+ INT8OID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 2, "calls",
+ INT8OID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 3, "waits",
+ INT8OID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 4, "time_ms",
+ INT8OID, -1, 0);
+
+ funcctx->tuple_desc = BlessTupleDesc(tupdesc);
+ funcctx->max_calls = NumLWLocks();
+
+ MemoryContextSwitchTo(oldcontext);
+ }
+
+ /* stuff done on every call of the function */
+ funcctx = SRF_PERCALL_SETUP();
+
+ if (funcctx->call_cntr < funcctx->max_calls)
+ {
+ Datum values[4];
+ bool nulls[4];
+ HeapTuple tuple;
+ LWLockId lockid;
+ uint64 calls,waits,time_ms;
+
+ MemSet(values, 0, sizeof(values));
+ MemSet(nulls, 0, sizeof(nulls));
+
+ lockid = funcctx->call_cntr;
+
+ if ( lockid<FirstBufMappingLock )
+ {
+ lwlock_get_stat(lockid, &calls, &waits, &time_ms);
+ }
+ else if ( FirstBufMappingLock<=lockid &&
lockid<FirstLockMgrLock )
+ {
+ lwlock_get_stat_sum(FirstBufMappingLock,
FirstLockMgrLock-1,
+ &calls, &waits, &time_ms);
+ funcctx->call_cntr = FirstLockMgrLock - 1;
+ }
+ else if ( FirstLockMgrLock<=lockid &&
lockid<FirstPredicateLockMgrLock )
+ {
+ lwlock_get_stat_sum(FirstLockMgrLock,
FirstPredicateLockMgrLock-1,
+ &calls, &waits, &time_ms);
+ funcctx->call_cntr = FirstPredicateLockMgrLock - 1;
+ }
+ else if ( FirstPredicateLockMgrLock<=lockid &&
lockid<NumFixedLWLocks )
+ {
+ lwlock_get_stat_sum(FirstPredicateLockMgrLock,
NumFixedLWLocks-1,
+ &calls, &waits, &time_ms);
+ funcctx->call_cntr = NumFixedLWLocks - 1;
+ }
+ else if ( NumFixedLWLocks<=lockid && lockid<NumLWLocks() )
+ {
+ lwlock_get_stat_sum(NumFixedLWLocks, NumLWLocks()-1,
+ &calls, &waits, &time_ms);
+ funcctx->call_cntr = NumLWLocks() - 1;
+ }
+
+ values[0] = Int64GetDatum(lockid);
+ values[1] = Int64GetDatum(calls);
+ values[2] = Int64GetDatum(waits);
+ values[3] = Int64GetDatum(time_ms);
+
+ tuple = heap_form_tuple(funcctx->tuple_desc, values, nulls);
+
+ SRF_RETURN_NEXT(funcctx, HeapTupleGetDatum(tuple));
+ }
+ else
+ {
+ SRF_RETURN_DONE(funcctx);
+ }
+ }
+
diff -rc postgresql-9.1.2.orig/src/backend/utils/fmgroids.h
postgresql-9.1.2/src/backend/utils/fmgroids.h
*** postgresql-9.1.2.orig/src/backend/utils/fmgroids.h 2012-06-20
03:32:50.000000000 +0900
--- postgresql-9.1.2/src/backend/utils/fmgroids.h 2012-06-26
01:51:52.000000000 +0900
***************
*** 1977,1982 ****
--- 1977,1983 ----
#define F_GET_CURRENT_TS_CONFIG 3759
#define F_TS_MATCH_TT 3760
#define F_TS_MATCH_TQ 3761
+ #define F_PG_STAT_GET_LWLOCKS 3764
#define F_PG_TS_TEMPLATE_IS_VISIBLE 3768
#define F_REGDICTIONARYIN 3771
#define F_REGDICTIONARYOUT 3772
diff -rc postgresql-9.1.2.orig/src/backend/utils/fmgrtab.c
postgresql-9.1.2/src/backend/utils/fmgrtab.c
*** postgresql-9.1.2.orig/src/backend/utils/fmgrtab.c 2012-06-20
03:32:46.000000000 +0900
--- postgresql-9.1.2/src/backend/utils/fmgrtab.c 2012-06-26
01:51:52.000000000 +0900
***************
*** 1966,1971 ****
--- 1966,1972 ----
extern Datum get_current_ts_config (PG_FUNCTION_ARGS);
extern Datum ts_match_tt (PG_FUNCTION_ARGS);
extern Datum ts_match_tq (PG_FUNCTION_ARGS);
+ extern Datum pg_stat_get_lwlocks (PG_FUNCTION_ARGS);
extern Datum pg_ts_template_is_visible (PG_FUNCTION_ARGS);
extern Datum regdictionaryin (PG_FUNCTION_ARGS);
extern Datum regdictionaryout (PG_FUNCTION_ARGS);
***************
*** 4156,4161 ****
--- 4157,4163 ----
{ 3759, "get_current_ts_config", 0, true, false, get_current_ts_config },
{ 3760, "ts_match_tt", 2, true, false, ts_match_tt },
{ 3761, "ts_match_tq", 2, true, false, ts_match_tq },
+ { 3764, "pg_stat_get_lwlocks", 0, false, true, pg_stat_get_lwlocks },
{ 3768, "pg_ts_template_is_visible", 1, true, false,
pg_ts_template_is_visible },
{ 3771, "regdictionaryin", 1, true, false, regdictionaryin },
{ 3772, "regdictionaryout", 1, true, false, regdictionaryout },
diff -rc postgresql-9.1.2.orig/src/include/catalog/pg_proc.h
postgresql-9.1.2/src/include/catalog/pg_proc.h
*** postgresql-9.1.2.orig/src/include/catalog/pg_proc.h 2012-06-20
03:32:43.000000000 +0900
--- postgresql-9.1.2/src/include/catalog/pg_proc.h 2012-06-26
01:51:46.000000000 +0900
***************
*** 2881,2886 ****
--- 2881,2889 ----
DATA(insert OID = 3073 ( pg_is_xlog_replay_paused PGNSP PGUID 12 1 0 0 f
f f t f v 0 0 16 "" _null_ _null_ _null_ _null_ pg_is_xlog_replay_paused _null_
_null_ _null_ ));
DESCR("true if xlog replay is paused");
+ DATA(insert OID = 3764 ( pg_stat_get_lwlocks PGNSP PGUID 12 1 100 0 f f f f
t s 0 0 2249 "" "{20,20,20,20}" "{o,o,o,o}" "{lwlockid,calls,waits,time_ms}"
_null_ pg_stat_get_lwlocks _null_ _null_ _null_ ));
+ DESCR("light-weight lock statistics");
+
DATA(insert OID = 2621 ( pg_reload_conf PGNSP PGUID 12
1 0 0 f f f t f v 0 0 16 "" _null_ _null_ _null_ _null_ pg_reload_conf _null_
_null_ _null_ ));
DESCR("reload configuration files");
DATA(insert OID = 2622 ( pg_rotate_logfile PGNSP PGUID 12 1 0 0 f
f f t f v 0 0 16 "" _null_ _null_ _null_ _null_ pg_rotate_logfile _null_ _null_
_null_ ));
diff -rc postgresql-9.1.2.orig/src/include/storage/lwlock.h
postgresql-9.1.2/src/include/storage/lwlock.h
*** postgresql-9.1.2.orig/src/include/storage/lwlock.h 2012-06-20
03:32:43.000000000 +0900
--- postgresql-9.1.2/src/include/storage/lwlock.h 2012-06-26
01:57:22.000000000 +0900
***************
*** 115,118 ****
--- 115,121 ----
extern void RequestAddinLWLocks(int n);
+ extern void lwlock_get_stat(LWLockId, uint64 *, uint64 *, uint64 *);
+ extern void lwlock_get_stat_sum(LWLockId, LWLockId, uint64 *, uint64 *,
uint64 *);
+
#endif /* LWLOCK_H */
--
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers