Re: [HACKERS] Detail part for still waiting for lock log message
Thank you for feedback. On 26.08.2013 22:20, Alvaro Herrera wrote: 1. this assumes there is only one holder, which is not correct. (Consider two backends holding shared lock on something and another one stuck trying to acquire exclusive) Hmm, true. Then it's not that simple as I thought in first place. I guess it is possible to find all those backends that are holding that shared lock, but i'm not sure about the usefulness anymore as this list can be huge. 2. I think pgstat_get_backend_current_activity() can be helpful. Yes, I saw that function, but i also wanted tx start time. 3. Doesn't this risk excessive overhead? About the overhead, i may be wrong, but i was thinking that that particular backend will be put to sleep anyway and also in normal workload such log messages are not very common (or at least shouldn't be). Can the other backends be gone (or done with the lock) before the report has completed? If this happens, is there a problem? That's why i have added check if other backend is found etc, but maybe i missed something. This is one of reasons why i wanted feedback. Regards Tarvi Pillessaar -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Detail part for still waiting for lock log message
Tarvi Pillessaar escribió: Thank you for feedback. On 26.08.2013 22:20, Alvaro Herrera wrote: 1. this assumes there is only one holder, which is not correct. (Consider two backends holding shared lock on something and another one stuck trying to acquire exclusive) Hmm, true. Then it's not that simple as I thought in first place. I guess it is possible to find all those backends that are holding that shared lock, but i'm not sure about the usefulness anymore as this list can be huge. We already do this for deadlock reports, so I think it's fine. 2. I think pgstat_get_backend_current_activity() can be helpful. Yes, I saw that function, but i also wanted tx start time. So add that. But note the existing function has checks for changecount and stuff that your code didn't. I think for robustness you need to add that too. 3. Doesn't this risk excessive overhead? About the overhead, i may be wrong, but i was thinking that that particular backend will be put to sleep anyway and also in normal workload such log messages are not very common (or at least shouldn't be). I guess there's no way to know how long the process is going to sleep after logging this message. Can the other backends be gone (or done with the lock) before the report has completed? If this happens, is there a problem? That's why i have added check if other backend is found etc, but maybe i missed something. This is one of reasons why i wanted feedback. Make sure you understand the changecount stuff and the reasons to use it in pgstat_get_backend_current_activity. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Detail part for still waiting for lock log message
Fixed patch attached. Regards, Tarvi Pillessaar On 24.08.2013 17:58, Peter Eisentraut wrote: On Tue, 2013-08-20 at 19:21 +0300, Tarvi Pillessaar wrote: About patch: Patch is tested against 9.2.4. I was not sure that i should check if the lock holder's proclock was found (as lock holder's proclock should be always there), check is there to be on the safe side, but maybe it's unnecessary. If it's not needed then fallback to old behavior (logging without detail) is not needed as well. And yes, i know that the lock holding time is not actually correct and it actually shows milliseconds since transaction start. Please fix this compiler warning: proc.c: In function ‘ProcSleep’: proc.c:1258:6: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement] diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c index 3bfdd23..d75ca93 100644 --- a/src/backend/storage/lmgr/proc.c +++ b/src/backend/storage/lmgr/proc.c @@ -49,6 +49,7 @@ #include storage/procsignal.h #include storage/spin.h #include utils/timestamp.h +#include pgstat.h /* GUC variables */ @@ -1198,9 +1199,67 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable) } if (myWaitStatus == STATUS_WAITING) -ereport(LOG, - (errmsg(process %d still waiting for %s on %s after %ld.%03d ms, - MyProcPid, modename, buf.data, msecs, usecs))); + { +PROCLOCK *proclock2; +PGPROC *proc2; +PgBackendStatus *be; +bool found = false; + +/* find lock holder */ +proclock2 = (PROCLOCK *) SHMQueueNext((lock-procLocks), (lock-procLocks), + offsetof(PROCLOCK, lockLink)); +while (proclock2) +{ + if (lockMethodTable-conflictTab[lockmode] proclock2-holdMask) + break; + + proclock2 = (PROCLOCK *) SHMQueueNext((lock-procLocks), proclock2-lockLink, + offsetof(PROCLOCK, lockLink)); +} + +if (proclock2) +{ + int numbackends; + + /* get lock holder's beentry */ + proc2 = proclock2-tag.myProc; + numbackends = pgstat_fetch_stat_numbackends(); + for (i = 1; i = numbackends; i++) + { + be = pgstat_fetch_stat_beentry(i); + if (be) + { + if (be-st_procpid == proc2-pid) + { +found = true; +break; + } + } + } +} + +if (found) +{ + long secs2; + int usecs2; + long msecs2; + + /* calculate lock holder's tx duration */ + TimestampDifference(be-st_xact_start_timestamp, GetCurrentTimestamp(), secs2, usecs2); + msecs2 = secs2 * 1000 + usecs2 / 1000; + usecs2 = usecs2 % 1000; + + ereport(LOG, + (errmsg(process %d still waiting for %s on %s after %ld.%03d ms, + MyProcPid, modename, buf.data, msecs, usecs), + errdetail_log(process %d is holding lock for %ld.%03d ms, activity: %s, + proc2-pid, msecs2, usecs2, be-st_activity))); +} +else + ereport(LOG, + (errmsg(process %d still waiting for %s on %s after %ld.%03d ms, + MyProcPid, modename, buf.data, msecs, usecs))); + } else if (myWaitStatus == STATUS_OK) ereport(LOG, (errmsg(process %d acquired %s on %s after %ld.%03d ms, -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Detail part for still waiting for lock log message
Tarvi Pillessaar escribió: Fixed patch attached. 1. this assumes there is only one holder, which is not correct. (Consider two backends holding shared lock on something and another one stuck trying to acquire exclusive) 2. I think pgstat_get_backend_current_activity() can be helpful. 3. Doesn't this risk excessive overhead? Can the other backends be gone (or done with the lock) before the report has completed? If this happens, is there a problem? -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Detail part for still waiting for lock log message
On Tue, 2013-08-20 at 19:21 +0300, Tarvi Pillessaar wrote: About patch: Patch is tested against 9.2.4. I was not sure that i should check if the lock holder's proclock was found (as lock holder's proclock should be always there), check is there to be on the safe side, but maybe it's unnecessary. If it's not needed then fallback to old behavior (logging without detail) is not needed as well. And yes, i know that the lock holding time is not actually correct and it actually shows milliseconds since transaction start. Please fix this compiler warning: proc.c: In function ‘ProcSleep’: proc.c:1258:6: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement] -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Detail part for still waiting for lock log message
Hello! From time to time when investigating different locking issues using postgres log i have thought that process x is still waiting for message could be more informative, for example at the moment it is quite painful to find out which session was actually holding that particular lock. I would like to add detail part for this message to show information about the lock holder and also show what the lock holder is actually doing (yes, i know that i could get the lock holder's statement from the log, but this not the same, maybe lock holder was idle in transaction). So, i wrote a small patch, but i am not sure that this is the best/correct way to do it. Maybe someone can take a look at my patch and give some feedback. Even if this idea won't reach to upstream, i would still like to get feedback. About patch: Patch is tested against 9.2.4. I was not sure that i should check if the lock holder's proclock was found (as lock holder's proclock should be always there), check is there to be on the safe side, but maybe it's unnecessary. If it's not needed then fallback to old behavior (logging without detail) is not needed as well. And yes, i know that the lock holding time is not actually correct and it actually shows milliseconds since transaction start. Regards, Tarvi Pillessaar diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c index 3bfdd23..4dc7b37 100644 --- a/src/backend/storage/lmgr/proc.c +++ b/src/backend/storage/lmgr/proc.c @@ -49,6 +49,7 @@ #include storage/procsignal.h #include storage/spin.h #include utils/timestamp.h +#include pgstat.h /* GUC variables */ @@ -1198,9 +1199,66 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable) } if (myWaitStatus == STATUS_WAITING) -ereport(LOG, - (errmsg(process %d still waiting for %s on %s after %ld.%03d ms, - MyProcPid, modename, buf.data, msecs, usecs))); + { +PROCLOCK *proclock2; +PGPROC *proc2; +PgBackendStatus *be; +bool found = false; + +/* find lock holder */ +proclock2 = (PROCLOCK *) SHMQueueNext((lock-procLocks), (lock-procLocks), + offsetof(PROCLOCK, lockLink)); +while (proclock2) +{ + if (lockMethodTable-conflictTab[lockmode] proclock2-holdMask) + break; + + proclock2 = (PROCLOCK *) SHMQueueNext((lock-procLocks), proclock2-lockLink, + offsetof(PROCLOCK, lockLink)); +} + +if (proclock2) +{ + proc2 = proclock2-tag.myProc; + + /* get lock holder's beentry */ + int numbackends = pgstat_fetch_stat_numbackends(); + for (i = 1; i = numbackends; i++) + { + be = pgstat_fetch_stat_beentry(i); + if (be) + { + if (be-st_procpid == proc2-pid) + { +found = true; +break; + } + } + } +} + +if (found) +{ + long secs2; + int usecs2; + long msecs2; + + /* calculate lock holder's tx duration */ + TimestampDifference(be-st_xact_start_timestamp, GetCurrentTimestamp(), secs2, usecs2); + msecs2 = secs2 * 1000 + usecs2 / 1000; + usecs2 = usecs2 % 1000; + + ereport(LOG, + (errmsg(process %d still waiting for %s on %s after %ld.%03d ms, + MyProcPid, modename, buf.data, msecs, usecs), + errdetail_log(process %d is holding lock for %ld.%03d ms, activity: %s, + proc2-pid, msecs2, usecs2, be-st_activity))); +} +else + ereport(LOG, + (errmsg(process %d still waiting for %s on %s after %ld.%03d ms, + MyProcPid, modename, buf.data, msecs, usecs))); + } else if (myWaitStatus == STATUS_OK) ereport(LOG, (errmsg(process %d acquired %s on %s after %ld.%03d ms, -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Detail part for still waiting for lock log message
Tarvi Pillessaar tar...@gmail.com wrote: Maybe someone can take a look at my patch and give some feedback. Even if this idea won't reach to upstream, i would still like to get feedback. Please add the patch to the open CommitFest. You can read about the process here: http://wiki.postgresql.org/wiki/CommitFest -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers