Re: [HACKERS] [PATCH] Log crashed backend's query v2
On Wed, Oct 5, 2011 at 02:36, gabrielle gor...@gmail.com wrote: This review was compiled from a PDXPUG group review (Dan Colish, Mark Wong, Brent Dombrowski, and Gabrielle Roth). Whaat, you marked the patch as Returned with Feedback based on this review? The only obvious change I need to make in response to your feedback is the function name fix in a comment. Most points are incorrect: there's no regression test in this patch and no requirement of plpythonu. I didn't introduce any new messages with the text unknown. The behavior of ascii_safe_strncpy is deliberate and was implemented from feedback on the first patch version. What's left is the indentation alignment in the if() statement. No way is that a a reason to delay the patch to the next CommitFest! Regards, Marti -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Log crashed backend's query v2
This review was compiled from a PDXPUG group review (Dan Colish, Mark Wong, Brent Dombrowski, and Gabrielle Roth). -- We all agree this is a really useful feature. The patch applies cleanly to the current git master with git apply, it's in context diff, and does what it's supposed to do on Ubuntu, OSX, and gentoo. We found a few problems with it, and we'd like to see the patch resubmitted with the following changes: Tests: - Regression test requires plpythonu; it needs to work without that. Docs: - The doc comment 'pgstat_get_backend_current_activity' doesn't match the function name 'pgstat_get_crashed_backend_activity'. Project coding guidelines: - There are some formatting problems, such as all newlines at the same indent level need to line up. (The author mentioned not [being] happy with the indenting depth, so I think this is not a surprise.) - Wayward tabs, line 2725 in pgstat.c specifically - Unknown is used a lot (see http://developer.postgresql.org/pgdocs/postgres/error-style-guide.html#AEN94099) We had some questions about ascii_safe_strncpy: - It doesn't convert just unknown encodings, it converts all encodings. Is that intentional? - Is there an existing function that already does this? Looking forward to the next revision! gabrielle (on behalf of PDXPUG) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Log crashed backend's query v2
On Wed, Oct 5, 2011 at 02:36, gabrielle gor...@gmail.com wrote: This review was compiled from a PDXPUG group review (Dan Colish, Mark Wong, Brent Dombrowski, and Gabrielle Roth). Hi, thanks for the review! - Regression test requires plpythonu; it needs to work without that. The patch contains no regression tests and -- as far as I can tell -- cannot be reliably tested in the current pg_regress framework. The plpythonu line is just an example to demonstrate the patch output. - The doc comment 'pgstat_get_backend_current_activity' doesn't match the function name 'pgstat_get_crashed_backend_activity'. Oops, copy-paste error. :) - There are some formatting problems, such as all newlines at the same indent level need to line up. (The author mentioned not [being] happy with the indenting depth, so I think this is not a surprise.) That was deliberate. As far as I've seen, in Postgres source, complex expressions are usually lined up to the starting parentheses, unless the expression is too long, in which case it's aligned to the 78-character right margin. I decided to use this because splitting the expression on yet one more line wouldn't improve code readability. Or have I misunderstood the coding style? - Unknown is used a lot (see http://developer.postgresql.org/pgdocs/postgres/error-style-guide.html#AEN94099) The string (unknown) in postmaster.c was there before me, I didn't change that. The other instance of unknown in the comment for ascii_safe_strncpy I believe expresses the function quite well -- the function doesn't know and doesn't care what the input encoding is. We had some questions about ascii_safe_strncpy: - It doesn't convert just unknown encodings, it converts all encodings. Is that intentional? Technically we always know the right encoding -- the query is in the backend's database encoding. The point here is being simple and foolproof -- not introducing unnecessary amounts of code into the postmaster. Since ASCII characters are valid in any encoding, we only keep ASCII characters and throw away the rest. This was added in response to concerns that ereport might attempt to convert the string to another encoding and fail -- because the command string may be corrupt or because postmaster's encoding doesn't match the backend's encoding. And while this concern doesn't seem to apply with current code, it's still prudent to add it to pre-empt future changes and to protect the log file from potentially corrupt strings. See: http://archives.postgresql.org/pgsql-hackers/2011-09/msg00418.php - Is there an existing function that already does this? The bytea-text conversion (byteaout function) does something similar when bytea_output='escape', but it doesn't preserve newline/tab characters and it doesn't currently exist as an independent function. It also uses the outdated octal representation. Regards, Marti -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] [PATCH] Log crashed backend's query v2
On Sat, Sep 24, 2011 at 22:57, Marti Raudsepp ma...@juffo.org wrote: However, I now realize that it does make sense to write a separate simpler function for the crashed backend case with no vbeentry-st_changecount check loops, no checkUser, etc. That would be more robust and easier to review. I implemented this now, but I'm not convinced anymore that it's the right way to go. I'm duplicating some amount of code that could be subject to bitrot in the future since this code path won't be excercised often. But I'll let the reviewers decide. Is there a sane way to regression-test backend crashes? I propsed replacing non-ASCII characters with '?' earlier. This is also in. I created a new function in backend/utils/adt/ascii.c. It didn't quite fit in because all other functions in this file are dealing with Datums, but I couldn't find a better place. (I'm still not sure what adt means) Regards, Marti From 0f46bb1357bafbe940e7df8fce38c01e2237f57e Mon Sep 17 00:00:00 2001 From: Marti Raudsepp ma...@juffo.org Date: Wed, 28 Sep 2011 00:46:48 +0300 Subject: [PATCH] Log crashed backend's query (activity string) The crashing query is often a good starting point in debugging the cause, and much more easily accessible than core dumps. We're extra-paranoid in reading the activity buffer since it might be corrupt. All non-ASCII characters are replaced with '?' Example output: LOG: server process (PID 31451) was terminated by signal 9: Killed DETAIL: Running query: DO LANGUAGE plpythonu 'import os;os.kill(os.getpid(),9)' --- src/backend/postmaster/pgstat.c | 59 +++ src/backend/postmaster/postmaster.c | 17 +++--- src/backend/utils/adt/ascii.c | 31 ++ src/include/pgstat.h|1 + src/include/utils/ascii.h |1 + 5 files changed, 104 insertions(+), 5 deletions(-) diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c index eb9adc8..812bf04 100644 --- a/src/backend/postmaster/pgstat.c +++ b/src/backend/postmaster/pgstat.c @@ -58,6 +58,7 @@ #include storage/pg_shmem.h #include storage/pmsignal.h #include storage/procsignal.h +#include utils/ascii.h #include utils/guc.h #include utils/memutils.h #include utils/ps_status.h @@ -2747,6 +2748,64 @@ pgstat_get_backend_current_activity(int pid, bool checkUser) return backend information not available; } +/* -- + * pgstat_get_backend_current_activity() - + * + * Return a string representing the current activity of the backend with + * the specified PID. Like the function above, but reads shared memory with + * the expectation that it may be corrupt. + * + * This function is only intended to be used by postmaster to report the + * query that crashed the backend. In particular, no attempt is made to + * follow the correct concurrency protocol when accessing the + * BackendStatusArray. But that's OK, in the worst case we get a corrupted + * message. We also must take care not to trip on ereport(ERROR). + * + * Note: return strings for special cases match pg_stat_get_backend_activity. + * -- + */ +const char * +pgstat_get_crashed_backend_activity(int pid) +{ + volatile PgBackendStatus *beentry; + int i; + + beentry = BackendStatusArray; + for (i = 1; i = MaxBackends; i++) + { + if (beentry-st_procpid == pid) + { + /* Read pointer just once, so it can't change after validation */ + const char *activity = beentry-st_activity; + char *buffer; + + /* + * We can't access activity pointer before we verify that it + * falls into BackendActivityBuffer. To make sure that the + * string's ending is contained within the buffer, the string + * can start at offset (MaxBackends - 1) at most. + */ + if (activity BackendActivityBuffer || +activity (BackendActivityBuffer + + (MaxBackends - 1) * pgstat_track_activity_query_size)) +return command string corrupt; + + if (*(activity) == '\0') +return command string not enabled; + + buffer = (char *) palloc(pgstat_track_activity_query_size); + ascii_safe_strncpy(buffer, activity, + pgstat_track_activity_query_size); + + return buffer; + } + + beentry++; + } + + /* PID not found */ + return backend information not available; +} /* * Local support functions follow diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c index 94b57fa..9ba622c 100644 --- a/src/backend/postmaster/postmaster.c +++ b/src/backend/postmaster/postmaster.c @@ -2763,6 +2763,8 @@ HandleChildCrash(int pid, int exitstatus, const char *procname) static void LogChildExit(int lev, const char *procname, int pid, int exitstatus) { + const char *activity = pgstat_get_crashed_backend_activity(pid); + if (WIFEXITED(exitstatus)) ereport(lev, @@ -2770,7 +2772,8 @@ LogChildExit(int lev, const char *procname, int pid, int exitstatus) translator: %s is a noun
Re: [HACKERS] [PATCH] Log crashed backend's query v2
On Sep28, 2011, at 00:19 , Marti Raudsepp wrote: (I'm still not sure what adt means) I always assumed it stood for abstract data type. Most of the files in this directory seem to correspond to an SQL-level data type like intX, varchar, tsquery, ..., and contain the I/O functions for that type, plus some supporting operations and functions. Over time, it seems that this directory was also used for SQL-level functions not directly related to a single type, like windowfuncs.c and pgstatfuncs.c. The fact that ri_triggers.c lives there also might be a relict from times where you'd create FK constraint with CREATE CONSTRAINT TRIGGER and specified one of the functions from ri_triggers.c as the procedure to execute. best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Log crashed backend's query v2
Florian Pflug f...@phlo.org writes: On Sep28, 2011, at 00:19 , Marti Raudsepp wrote: (I'm still not sure what adt means) I always assumed it stood for abstract data type. Yeah, that's what I think too. Over time it's been used to hold most code that is a SQL-callable function, many of which are not directly connected to any SQL datatype. Not sure if it's worth trying to clean that up. Another annoying thing is that adt should probably have been directly under src/backend/ --- dropping it under utils/ seems just weird for a category that is going to hold a ton of code. (I had once had some hope that git would allow us to move code around more easily, but my experiments with back-patching after code movement have convinced me that it doesn't work any better for that than CVS. So I'm not in favor of massive code rearrangements just to make the source tree structure cleaner.) regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers