Re: [PATCHES] log_autovacuum
In response to Alvaro Herrera [EMAIL PROTECTED]: Simon Riggs wrote: On Tue, 2007-04-17 at 14:06 -0400, Alvaro Herrera wrote: I've tinkered with this patch a bit. Sample output: LOG: automatic vacuum of table alvherre.public.foo: index scans: 0 pages: removed 0, 11226 remain tuples: 1300683 removed, 1096236 remain system usage: CPU 0.29s/0.38u sec elapsed 2.56 sec Please comment. Well, 'tis great except when you have very very frequent autovacuums. That was why I wanted it in 1 log line. Perhaps we need this as an integer, so we can log all vacuums that last for longer in seconds than the setting, 0 logs all. That would significantly reduce the volume if we set it to 5, say. That way you would get your readability and I would get my reasonable size logs. It kinda smells funny to have a setting like that. Do we have a precedent? If other people is OK with it, I'll do that. Would it work to add a separate GUC var to control the minimum autovac time? Also, why do it by time and not by amount of tuples/pages removed? When I submitted the log_temp_files stuff, there was considerable discussion on the topic of how the GUC vars should be done. IIRC, the eventual decision was to have a single GUC var, where -1 equated to off, 0 equated to log all, and numbers higher than 0 were a size limit on when things get logged. -- Bill Moran Collaborative Fusion Inc. http://people.collaborativefusion.com/~wmoran/ [EMAIL PROTECTED] Phone: 412-422-3463x4023 IMPORTANT: This message contains confidential information and is intended only for the individual named. If the reader of this message is not an intended recipient (or the individual responsible for the delivery of this message to an intended recipient), please be advised that any re-use, dissemination, distribution or copying of this message is prohibited. Please notify the sender immediately by e-mail if you have received this e-mail by mistake and delete this e-mail from your system. E-mail transmission cannot be guaranteed to be secure or error-free as information could be intercepted, corrupted, lost, destroyed, arrive late or incomplete, or contain viruses. The sender therefore does not accept liability for any errors or omissions in the contents of this message, which arise as a result of e-mail transmission. ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [pgsql-patches] [PATCHES] Patch to log usage of temporary files
In response to Guillaume Smet [EMAIL PROTECTED]: On 1/12/07, Bruce Momjian [EMAIL PROTECTED] wrote: Usually people don't want th query unless they ask for it. One nify trick would be to print the query as DETAIL unless they are already logging queries, but that just seems too complex. If you want the query, why not just log them all? Because they can't? On a highly loaded production server, people usually don't log all the queries. Anyway, if it's too complicated to implement it, perhaps it's not worth it. I'm just curious to see how people will use this information if they don't know why the temp file was created. I can only speak for myself but: * I'm already using the patch in our lab. Since the lab is the same hardware/config/etc as production, I can use the information to fine tune configs that then get migrated to production after careful testing. Since it's a lab environment, I'm free to turn on and off all sorts of stuff that would be unwise in production. Thus the lab frequently has full query logging turned on. * Currently, our production systems have plenty of spare IO. The result is that I _do_ log queries on production servers, and will continue to do so until it becomes an issue. Additionally, we have lots of room to grow with this hardware, so I can use the data collected about temp file usage to justify additional RAM. Don't know how long I'll be able to leave query logging enabled on the production systems, but I'm taking advantage of it as long as possible. * This variable can be tweaked per-session, which means if I've got queries that I suspect are causing unwarranted temp files on a production server, I can enable it on a per-connection basis to track down the problem and work on a specific query, on production systems, without too much disruption of the rest of the work that's going on: set log_temp_files = 0; run suspect query set log_temp_files = -1; investigate logs At least, those are my current plans ... -- Bill Moran Collaborative Fusion Inc. ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] [PATCHES] Patch to log usage of temporary files
In response to Bruce Momjian [EMAIL PROTECTED]: I have applied a modified version of your patch. I renamed the parameter to 'log_temp_files', for consistency, added documentation, and improved the wording, particularly mentioning that the logging happens at file deletion time. Thanks. -- Bill Moran Collaborative Fusion Inc. ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] [PATCHES] Patch to log usage of temporary files
In response to Tom Lane [EMAIL PROTECTED]: Bruce Momjian [EMAIL PROTECTED] writes: Tom Lane wrote: Surely the measurement unit should be kbytes or disk blocks. And why aren't you using that GUC UNITS infrastructure Peter put in? Agreed. I have applied the following patch to make it kilobytes, and documented it. I didn't put '-1kB' in the postgresql.conf file because the -1 value is special. (ideas?) Hmm, that could be a little bit ugly. Suggestion: redefine the value such that files *greater than* the given size are logged, It already is that way, with 0 effectively meaning log all. and then zero can be the off position, and we need not worry about whether -1 is -1 byte or -1 kbyte. All doing this does is make it impossible to log temp files of 1 byte. I thought the -1 = magic off was more intuitive. But 0 = off isn't terribly cryptic or anything. I'd considered creating two GUC variables: a boolean log_temp_files and an int log_temp_file_max_size, or something like that, but it seemed like overkill if I could just use a magic value from the int. Someone (I don't remember who) suggested that it might be useful to also have the ability to log temp files _smaller_ than a certain size. Allowing _either_ larger than _or_ smaller than will certainly complicate the config options. I had this devilish master plan that - values would be less than and positive values greater than, then I decided I was being stupid. So, that's my input. But, at this point I'm not entirely sure what the best approach is. -- Bill Moran Collaborative Fusion Inc. ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] [PATCHES] Patch to log usage of temporary files
In response to Andrew Dunstan [EMAIL PROTECTED]: Bill Moran wrote: Andrew Dunstan [EMAIL PROTECTED] wrote: Bill Moran wrote: +if (trace_temp_files != -1) Might be more robust to say if (trace_temp_files = 0) Because it would allow for the easy addition of more negative numbers with magic value? because ISTM any negative number here should mean no action is to be taken. Otherwise how else is it different from 0? ?? I specified in the GUC config that minimum allowable value is -1. /usr/local/etc/rc.d/postgresql start FATAL: -5 is outside the valid range for parameter trace_temp_files (-1 .. 2147483647) set trace_temp_files to -8; ERROR: -8 is outside the valid range for parameter trace_temp_files (-1 .. 2147483647) Perhaps there's another reason to use the = 0 check, but handling invalid values with POLA doesn't seem to be a good one. -- Bill Moran Collaborative Fusion Inc. ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] [PATCHES] Patch to log usage of temporary files
In response to Tom Lane [EMAIL PROTECTED]: Bill Moran [EMAIL PROTECTED] writes: Andrew Dunstan [EMAIL PROTECTED] wrote: Might be more robust to say if (trace_temp_files = 0) I specified in the GUC config that minimum allowable value is -1. I'd still tend to go with Andrew's suggestion because it makes this particular bit of code self-defending against bad values. Yes, it's reasonably safe given that bit of coding way over yonder in guc.c, but there's no particularly good reason why this code has to depend on that to avoid doing something stupid. And it's easier to understand too --- you don't have to go looking in guc.c to convince yourself it's safe. Ahh ... well, I've probably already argued about it more than it's worth. The patch is easy enough to adjust, find attached. -- Bill Moran Collaborative Fusion Inc. diff -c -r src.orig/backend/storage/file/fd.c src/backend/storage/file/fd.c *** src.orig/backend/storage/file/fd.c Thu Dec 7 15:44:42 2006 --- src/backend/storage/file/fd.c Wed Jan 3 15:05:54 2007 *** *** 50,55 --- 50,56 #include access/xact.h #include storage/fd.h #include storage/ipc.h + #include utils/guc.h /* *** *** 938,944 void FileClose(File file) { ! Vfd *vfdP; Assert(FileIsValid(file)); --- 939,946 void FileClose(File file) { ! Vfd *vfdP; ! struct stat filestats; Assert(FileIsValid(file)); *** *** 968,973 --- 970,992 { /* reset flag so that die() interrupt won't cause problems */ vfdP-fdstate = ~FD_TEMPORARY; + PG_TRACE1(temp__file__cleanup, vfdP-fileName); + if (trace_temp_files = 0) + { + if (stat(vfdP-fileName, filestats) == 0) + { + if (filestats.st_size = trace_temp_files) + { + ereport(LOG, + (errmsg(temp file: size %lu path \%s\, + filestats.st_size, vfdP-fileName))); + } + } + else + { + elog(LOG, Could not stat \%s\: %m, vfdP-fileName); + } + } if (unlink(vfdP-fileName)) elog(LOG, failed to unlink \%s\: %m, vfdP-fileName); diff -c -r src.orig/backend/utils/misc/guc.c src/backend/utils/misc/guc.c *** src.orig/backend/utils/misc/guc.c Wed Nov 29 09:50:07 2006 --- src/backend/utils/misc/guc.c Wed Jan 3 13:51:14 2007 *** *** 180,186 int log_min_messages = NOTICE; int client_min_messages = NOTICE; int log_min_duration_statement = -1; ! int num_temp_buffers = 1000; char *ConfigFileName; --- 180,187 int log_min_messages = NOTICE; int client_min_messages = NOTICE; int log_min_duration_statement = -1; ! int trace_temp_files = -1; ! int num_temp_buffers = 1000; char *ConfigFileName; *** *** 1471,1477 log_min_duration_statement, -1, -1, INT_MAX / 1000, NULL, NULL }, ! { {bgwriter_delay, PGC_SIGHUP, RESOURCES, gettext_noop(Background writer sleep time between rounds in milliseconds), --- 1472,1478 log_min_duration_statement, -1, -1, INT_MAX / 1000, NULL, NULL }, ! { {bgwriter_delay, PGC_SIGHUP, RESOURCES, gettext_noop(Background writer sleep time between rounds in milliseconds), *** *** 1657,1662 --- 1658,1673 }, server_version_num, PG_VERSION_NUM, PG_VERSION_NUM, PG_VERSION_NUM, NULL, NULL + }, + + { + {trace_temp_files, PGC_USERSET, LOGGING_WHAT, + gettext_noop(Log the use of temp files larger than this size.), + gettext_noop(Size and location of each temp file is reported.), + NULL + }, + trace_temp_files, + -1, -1, INT_MAX, NULL, NULL }, /* End-of-list marker */ diff -c -r src.orig/backend/utils/misc/postgresql.conf.sample src/backend/utils/misc/postgresql.conf.sample *** src.orig/backend/utils/misc/postgresql.conf.sample Mon Nov 20 20:23:37 2006 --- src/backend/utils/misc/postgresql.conf.sample Wed Jan 3 11:05:48 2007 *** *** 333,338 --- 333,341 #log_statement = 'none' # none, ddl, mod, all #log_hostname = off + #trace_temp_files = -1 # Log usage of temporary files larger than + # the specified size (in bytes). -1 disables. + # 0 effectively logs all temp file usage. #--- # RUNTIME STATISTICS diff -c -r src.orig/include/utils/guc.h src/include/utils/guc.h *** src.orig/include/utils/guc.h Thu Oct 19 14:32:47 2006 --- src/include/utils/guc.h Wed Jan 3 13:45:52 2007 *** *** 123,128 --- 123,129 extern int log_min_messages; extern int client_min_messages; extern int log_min_duration_statement; + extern int trace_temp_files; extern int num_temp_buffers; ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] [PATCHES] Patch to log usage of temporary files
In response to Simon Riggs [EMAIL PROTECTED]: On Tue, 2007-01-02 at 18:20 -0500, Tom Lane wrote: Bill Moran [EMAIL PROTECTED] writes: In response to Alvaro Herrera [EMAIL PROTECTED]: Please change things to save the stat() syscall when the feature is not in use. Do you have a suggestion on how to do that and still have the PG_TRACE1() work? That was specifically requested by Simon Riggs. Well, we are NOT paying a stat() call on every single file close, whether Simon wants it or not. Simon doesn't/wouldn't want the stat() call on each file close. If you put the PG_TRACE macro outside of the if test, yet prior to the file close, you can pass the filename through like this PG_TRACE1(temp__file__cleanup, vfdP-fileName); That way DTrace can make its own call to find out filesize, if it would like to... and we don't need to stat() before each temp file close. That's much more flexible and useful, as well as better performance. OK, I think I've managed to adjust this patch so that everyone is happy ;) and it's better as well. * PG_TRACE will work whether the GUC var is enabled or not, it sends the fileName, as suggested by Simon * stat() call is not made if trace_temp_files is disabled * trace_temp_files is now an int: -1 disables, 0 and up equate to log if the file is this size or larger * Cleaned things up a bit -- should be more in line with PostgreSQL coding standards * failed stat is reported as LOG instead of ERROR Done a bit of testing here, and everything seems to be in order. -- Bill Moran Collaborative Fusion Inc. diff -c -r src.orig/backend/storage/file/fd.c src/backend/storage/file/fd.c *** src.orig/backend/storage/file/fd.c Thu Dec 7 15:44:42 2006 --- src/backend/storage/file/fd.c Wed Jan 3 15:05:54 2007 *** *** 50,55 --- 50,56 #include access/xact.h #include storage/fd.h #include storage/ipc.h + #include utils/guc.h /* *** *** 938,944 void FileClose(File file) { ! Vfd *vfdP; Assert(FileIsValid(file)); --- 939,946 void FileClose(File file) { ! Vfd *vfdP; ! struct stat filestats; Assert(FileIsValid(file)); *** *** 968,973 --- 970,992 { /* reset flag so that die() interrupt won't cause problems */ vfdP-fdstate = ~FD_TEMPORARY; + PG_TRACE1(temp__file__cleanup, vfdP-fileName); + if (trace_temp_files != -1) + { + if (stat(vfdP-fileName, filestats) == 0) + { + if (filestats.st_size = trace_temp_files) + { + ereport(LOG, + (errmsg(temp file: size %lu path \%s\, + filestats.st_size, vfdP-fileName))); + } + } + else + { + elog(LOG, Could not stat \%s\: %m, vfdP-fileName); + } + } if (unlink(vfdP-fileName)) elog(LOG, failed to unlink \%s\: %m, vfdP-fileName); diff -c -r src.orig/backend/utils/misc/guc.c src/backend/utils/misc/guc.c *** src.orig/backend/utils/misc/guc.c Wed Nov 29 09:50:07 2006 --- src/backend/utils/misc/guc.c Wed Jan 3 13:51:14 2007 *** *** 180,186 int log_min_messages = NOTICE; int client_min_messages = NOTICE; int log_min_duration_statement = -1; ! int num_temp_buffers = 1000; char *ConfigFileName; --- 180,187 int log_min_messages = NOTICE; int client_min_messages = NOTICE; int log_min_duration_statement = -1; ! int trace_temp_files = -1; ! int num_temp_buffers = 1000; char *ConfigFileName; *** *** 1471,1477 log_min_duration_statement, -1, -1, INT_MAX / 1000, NULL, NULL }, ! { {bgwriter_delay, PGC_SIGHUP, RESOURCES, gettext_noop(Background writer sleep time between rounds in milliseconds), --- 1472,1478 log_min_duration_statement, -1, -1, INT_MAX / 1000, NULL, NULL }, ! { {bgwriter_delay, PGC_SIGHUP, RESOURCES, gettext_noop(Background writer sleep time between rounds in milliseconds), *** *** 1657,1662 --- 1658,1673 }, server_version_num, PG_VERSION_NUM, PG_VERSION_NUM, PG_VERSION_NUM, NULL, NULL + }, + + { + {trace_temp_files, PGC_USERSET, LOGGING_WHAT, + gettext_noop(Log the use of temp files larger than this size.), + gettext_noop(Size and location of each temp file is reported.), + NULL + }, + trace_temp_files, + -1, -1, INT_MAX, NULL, NULL }, /* End-of-list marker */ diff -c -r src.orig/backend/utils/misc/postgresql.conf.sample src/backend/utils/misc/postgresql.conf.sample *** src.orig/backend/utils/misc/postgresql.conf.sample Mon Nov 20 20:23:37 2006 --- src/backend/utils/misc/postgresql.conf.sample Wed Jan 3 11:05:48 2007 *** *** 333,338 --- 333,341 #log_statement = 'none' # none, ddl, mod, all #log_hostname = off + #trace_temp_files = -1 # Log usage of temporary files larger than + # the specified size (in bytes). -1 disables. + # 0
Re: [HACKERS] [PATCHES] Patch to log usage of temporary files
Andrew Dunstan [EMAIL PROTECTED] wrote: Bill Moran wrote: + if (trace_temp_files != -1) Might be more robust to say if (trace_temp_files = 0) Because it would allow for the easy addition of more negative numbers with magic value? ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[PATCHES] Patch to log usage of temporary files
Thanks to Simon Riggs and Bruce for input that helped me put this together. -- Bill Moran Collaborative Fusion Inc. diff -c -r src.orig/backend/storage/file/fd.c src/backend/storage/file/fd.c *** src.orig/backend/storage/file/fd.c Thu Dec 7 15:44:42 2006 --- src/backend/storage/file/fd.c Tue Jan 2 12:33:25 2007 *** *** 50,55 --- 50,56 #include access/xact.h #include storage/fd.h #include storage/ipc.h + #include utils/guc.h /* *** *** 938,944 void FileClose(File file) { ! Vfd *vfdP; Assert(FileIsValid(file)); --- 939,946 void FileClose(File file) { ! Vfd *vfdP; ! struct stat filestats; Assert(FileIsValid(file)); *** *** 968,973 --- 970,984 { /* reset flag so that die() interrupt won't cause problems */ vfdP-fdstate = ~FD_TEMPORARY; + if (stat(vfdP-fileName, filestats) == 0) { + if (trace_temp_files) + ereport(LOG, + (errmsg(temp file: size %lu path \%s\, + filestats.st_size, vfdP-fileName))); + PG_TRACE1(temp__file__cleanup, filestats.st_size); + } else { + elog(ERROR, Could not stat \%s\: %m, vfdP-fileName); + } if (unlink(vfdP-fileName)) elog(LOG, failed to unlink \%s\: %m, vfdP-fileName); diff -c -r src.orig/backend/utils/misc/guc.c src/backend/utils/misc/guc.c *** src.orig/backend/utils/misc/guc.c Wed Nov 29 09:50:07 2006 --- src/backend/utils/misc/guc.c Fri Dec 29 10:28:08 2006 *** *** 180,186 int log_min_messages = NOTICE; int client_min_messages = NOTICE; int log_min_duration_statement = -1; ! int num_temp_buffers = 1000; char *ConfigFileName; --- 180,187 int log_min_messages = NOTICE; int client_min_messages = NOTICE; int log_min_duration_statement = -1; ! bool trace_temp_files = false; ! int num_temp_buffers = 1000; char *ConfigFileName; *** *** 1010,1016 IgnoreSystemIndexes, false, NULL, NULL }, ! /* End-of-list marker */ { {NULL, 0, 0, NULL, NULL}, NULL, false, NULL, NULL --- 1011,1027 IgnoreSystemIndexes, false, NULL, NULL }, ! ! { ! {trace_temp_files, PGC_USERSET, LOGGING_WHAT, ! gettext_noop(Enables logging the usage of temp files.), ! gettext_noop(Size and location of each temp file is reported.), ! NULL ! }, ! trace_temp_files, ! false, NULL, NULL ! }, ! /* End-of-list marker */ { {NULL, 0, 0, NULL, NULL}, NULL, false, NULL, NULL *** *** 1471,1477 log_min_duration_statement, -1, -1, INT_MAX / 1000, NULL, NULL }, ! { {bgwriter_delay, PGC_SIGHUP, RESOURCES, gettext_noop(Background writer sleep time between rounds in milliseconds), --- 1482,1488 log_min_duration_statement, -1, -1, INT_MAX / 1000, NULL, NULL }, ! { {bgwriter_delay, PGC_SIGHUP, RESOURCES, gettext_noop(Background writer sleep time between rounds in milliseconds), diff -c -r src.orig/backend/utils/misc/postgresql.conf.sample src/backend/utils/misc/postgresql.conf.sample *** src.orig/backend/utils/misc/postgresql.conf.sample Mon Nov 20 20:23:37 2006 --- src/backend/utils/misc/postgresql.conf.sample Fri Dec 29 12:45:16 2006 *** *** 333,338 --- 333,339 #log_statement = 'none' # none, ddl, mod, all #log_hostname = off + #trace_temp_files = off # Log usage of temporary files #--- # RUNTIME STATISTICS diff -c -r src.orig/include/utils/guc.h src/include/utils/guc.h *** src.orig/include/utils/guc.h Thu Oct 19 14:32:47 2006 --- src/include/utils/guc.h Fri Dec 29 10:00:40 2006 *** *** 123,128 --- 123,129 extern int log_min_messages; extern int client_min_messages; extern int log_min_duration_statement; + extern bool trace_temp_files; extern int num_temp_buffers; ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PATCHES] Patch to log usage of temporary files
In response to Alvaro Herrera [EMAIL PROTECTED]: Bill Moran wrote: Thanks to Simon Riggs and Bruce for input that helped me put this together. Please change things to save the stat() syscall when the feature is not in use. Do you have a suggestion on how to do that and still have the PG_TRACE1() work? That was specifically requested by Simon Riggs. I'm not at all familiar with how the PG_TRACE probes work, so I'd be interested to hear suggestions on how to wrap that in an if. If I remove the PG_TRACE, it becomes a simple matter to skip the stat() if the feature is disabled. Nitpick: also note our brace placement convention (though this would be fixed by pgindent, but still). Sorry, I thought I _was_ following the convention. Must have missed something. Is there a written style guide somewhere? Might drive things home a little better than just looking at other folks code. -- Bill Moran Collaborative Fusion Inc. ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq