Hi, I have a patch which is actually not commitfest-ready now, but it always better to start discussing proof of concept having some patch instead of just an idea.
Since I'am a DBA rather than C programmer, I will appreciate any suggestions/critics about the patch and code quality to make things better. What is all about. >From an Oracle DBA's point of view, currently we have a lack of performance diagnostics tools. Saying that, principally they mean an Oracle Wait Interface analogue. The Basic idea is to have counters or sensors all around database kernel to measure what a particular backend is currently waiting for and how long/how often it waits. Obviously, implementing such a complex system is not an easy task. However decomposing the task and implementing some small diagnostics tools proved to be a good solution: things like pg_stat_bgwriter, pg_stat_statements or pg_stat_archiver make life significantly easier. Implementing such histogram for LWLock tracing was my goal. Why LWLock tracing is important. Suppose we have a PostgreSQL instance under heavy write workload, but we do not know any details. We could pull from time to time pg_stat_lwlock function which would say pid n1 currently in WALWriteLock and pid n2 in WALInsertLock. That means we should think about write ahead log tuning. Or pid n1 is in some clog-related LWLock, which means we need move clog to ramdisk. This is a stupid example, but it shows how useful LWLock tracing could be for DBAs. Even better idea is to collect daily LWLock distribution, find most frequent of them etc. Problems. As far as I know, there are two major problems implementing LWLock tracing in Postgres: Performance and stability of the server. The patch https://commitfest.postgresql.org/action/patch_view?id=885 (discussion starts here I hope - http://www.postgresql.org/message-id/4fe8ca2c.3030...@uptime.jp) demonstrates performance problems; LWLOCK_STAT, LOCK_DEBUG and DTrace-like approach are slow, unsafe for production use and a bit clumsy for using by DBA. An Idea. An idea of this patch is to trace LWLocks with the lowest possible performance impact. We put integer lwLockID into procarray, then acquiring the LWLock we put its id to procarray and now we could pull procarray using a function to see if particular pid holds LWLock. Not perfect, but if we see sometimes somebody consumes a lot of particular LWLocks, we could investigate this matter in a more precise way using another tool. Something like that was implemented in the attached patch: issuing pgbench -c 50 -t 1000 -j 50 we have something like that: postgres=# select now(),* from pg_stat_lwlock ; now | lwlockid | pid -------------------------------+----------+------ 2014-10-01 15:11:43.848765+02 | 57 | 4257 (1 row) postgres=# select now(),* from pg_stat_lwlock ; now | lwlockid | pid -------------------------------+----------+------ 2014-10-01 15:11:45.892428+02 | 67 | 4269 2014-10-01 15:11:45.892428+02 | 67 | 4258 2014-10-01 15:11:45.892428+02 | 57 | 4270 2014-10-01 15:11:45.892428+02 | 67 | 4245 2014-10-01 15:11:45.892428+02 | 67 | 4271 2014-10-01 15:11:45.892428+02 | 57 | 4256 2014-10-01 15:11:45.892428+02 | 54 | 4241 (7 rows) postgres=# select now(),* from pg_stat_lwlock ; now | lwlockid | pid -------------------------------+----------+------ 2014-10-01 15:11:47.211024+02 | 58 | 4262 2014-10-01 15:11:47.211024+02 | 69 | 4243 2014-10-01 15:11:47.211024+02 | 69 | 4246 (3 rows) postgres=# select now(),* from pg_stat_lwlock ; now | lwlockid | pid -----+----------+----- (0 rows) postgres=# select now(),* from pg_stat_lwlock ; now | lwlockid | pid -------------------------------+----------+------ 2014-10-01 15:11:49.897357+02 | 55 | 4240 2014-10-01 15:11:49.897357+02 | 61 | 4264 2014-10-01 15:11:49.897357+02 | 55 | 4258 2014-10-01 15:11:49.897357+02 | 61 | 4260 2014-10-01 15:11:49.897357+02 | 61 | 4283 2014-10-01 15:11:49.897357+02 | 62 | 4242 We could collect it to some view or table on a periodic basis. Questions. 1. I've decided to put pg_stat_lwlock into extension pg_stat_lwlock (simply for test purposes). Is it OK, or better to implement it somewhere inside pg_catalog or in another extension (for example pg_stat_statements)? 2. Currently lwLockID is in main procarray (this is for PoC purpose only). I know why procarray was split into two and I know why the main one should be kept as small as possible. Anyway, which design approach better: to keep it inside the main one (this is an important feature, lwLockID is small and I use proc->lwLockId = T_ID(l) to get the id) or to put it into another procarray (create new one or use existent)? 3. Which is the best way to retrieve the name of LWLock instead of only its ID (for usability reasons, WALWriteLock looks more useful than just 8)? To get it from tranche as T_NAME(lock) at the same time as lwLockId seems to be not a nice idea, especially to put the name into procarray to pull it. It spoils whole idea. Any advice? 4. Going through procarray to get pairs pid, lwlockid I do LWLockAcquire(ProcArrayLock, LW_SHARED); I am not quite sure is a good idea, because of its performance impact. Probably such a histogram-style thing does not need strong consistency and ProcArrayLock is not needed? So, any thoughts, should I proceed implementing this feature? Best regards, Ilya PS I am thankful to Simon Riggs and Bruce Momjian for discussing this idea with me on PGCon, and especially to Simon who gave me a brief but pretty useful tutorial how LWLock-related code lives in Postgres. Also I am thankful to Heikki Linnakangas and Magnus Hagander for answering some of my stupid questions about procarray internals. -- Ilya Kosmodemiansky, PostgreSQL-Consulting.com tel. +14084142500 cell. +4915144336040 i...@postgresql-consulting.com
Description: Binary data
-- Sent via pgsql-hackers mailing list (email@example.com) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers