Re: [PATCHES] Display Pg buffer cache (WIP)
Neil Conway wrote: If you do decide to hold the BufMappingLock, it might make sense to: 1. allocate an array of NBuffers elements 2. acquire BufferMappingLock in share mode 3. sequentially scan through the buffer pool, copying data into the array 4. release the lock 5. on each subsequent call to the SRF, format and return an element of the array Which should reduce the time to lock is held. This will require allocating NBuffers * size_of_stats memory (where size_of_stats will be something like 16 bytes). That is a better approach, so I've used it in this new iteration. In addition to holding the BufMappingLock, each buffer header is (spin) locked before examining it, hopefully this is correct - BTW, I like the new buffer lock design. I'm still using BuildTupleFromCStrings, so there is considerable use of sprintf conversion and temporary char * stuff. I would like this to be a bit cleaner, so any suggestions welcome. regards Mark diff -Naur pgsql.orig/src/backend/catalog/system_views.sql pgsql/src/backend/catalog/system_views.sql --- pgsql.orig/src/backend/catalog/system_views.sql Fri Mar 4 14:23:09 2005 +++ pgsql/src/backend/catalog/system_views.sql Fri Mar 4 14:21:33 2005 @@ -277,3 +277,9 @@ DO INSTEAD NOTHING; GRANT SELECT, UPDATE ON pg_settings TO PUBLIC; + +CREATE VIEW pg_cache_dump AS + SELECT D.* FROM pg_cache_dump() AS D + (bufferid integer, relfilenode oid, reltablespace oid, reldatabase oid, +isdirty bool, refcount int4); + diff -Naur pgsql.orig/src/backend/utils/adt/cachedump.c pgsql/src/backend/utils/adt/cachedump.c --- pgsql.orig/src/backend/utils/adt/cachedump.cThu Jan 1 12:00:00 1970 +++ pgsql/src/backend/utils/adt/cachedump.c Sat Mar 5 20:21:45 2005 @@ -0,0 +1,221 @@ +/*- + * + * cachedump.c + *display some contents of the buffer cache + * + * Portions Copyright (c) 1996-2005, PostgreSQL Global Development Group + * Portions Copyright (c) 1994, Regents of the University of California + * + * + * IDENTIFICATION + * $PostgreSQL$ + *- + */ +#include postgres.h +#include funcapi.h +#include catalog/pg_type.h +#include storage/buf_internals.h +#include storage/bufmgr.h +#include utils/relcache.h +#include utils/builtins.h + + +#define NUM_CACHE_DUMP_ELEM6 + +/* + * Record structure holding the to be exposed cache data. + */ +typedef struct +{ + uint32 bufferid; + Oid relfilenode; + Oid reltablespace; + Oid reldatabase; + boolisdirty; + uint32 refcount; + BlockNumber blocknum; + +} CacheDumpRec; + + +/* + * Function context for data persisting over repeated calls. + */ +typedef struct +{ + AttInMetadata *attinmeta; + CacheDumpRec*record; + char*values[NUM_CACHE_DUMP_ELEM]; +} CacheDumpContext; + + +/* + * Function returning data from the shared buffer cache - buffer number, + * relation node/tablespace/database, dirty indicator and refcount. + */ +Datum +cache_dump(PG_FUNCTION_ARGS) +{ + FuncCallContext *funcctx; + Datum result; + MemoryContext oldcontext; + CacheDumpContext*fctx; /* User function context. */ + TupleDesc tupledesc; + HeapTuple tuple; + + if (SRF_IS_FIRSTCALL()) + { + RelFileNode rnode; + uint32 i; + BufferDesc *bufHdr; + + + funcctx = SRF_FIRSTCALL_INIT(); + + /* Switch context when allocating stuff to be used in later calls */ + oldcontext = MemoryContextSwitchTo(funcctx-multi_call_memory_ctx); + + /* construct a tuple to return */ + tupledesc = CreateTemplateTupleDesc(NUM_CACHE_DUMP_ELEM, false); + TupleDescInitEntry(tupledesc, (AttrNumber) 1, bufferid, + INT4OID, -1, 0); + TupleDescInitEntry(tupledesc, (AttrNumber) 2, relfilenode, + OIDOID, -1, 0); + TupleDescInitEntry(tupledesc, (AttrNumber) 3, reltablespace, + OIDOID, -1, 0); + TupleDescInitEntry(tupledesc, (AttrNumber) 4, reldatabase, + OIDOID, -1, 0); + TupleDescInitEntry(tupledesc, (AttrNumber) 5, isdirty, + BOOLOID, -1, 0); + TupleDescInitEntry(tupledesc, (AttrNumber) 6, refcount, +
[PATCHES] fork() refactoring
This patch moves all the common code that is usually invoked before doing a fork() into a single function, fork_process(). It is not aware of the EXEC_BACKEND machinery, so it should be used as fork() currently is -- inside an #ifndef EXEC_BACKEND block, if appropriate. I wasn't sure whether to put this under backend/port, postmaster.c, or in a separate .c file. This patch places the code in a new file postmaster/fork_process.c -- suggestions on whether this is the right location would be welcome. Barring any objections, I'll apply this to HEAD on Monday. -Neil Index: src/backend/port/beos/support.c === RCS file: /Users/neilc/local/cvs/pgsql/src/backend/port/beos/support.c,v retrieving revision 1.11 diff -c -r1.11 support.c *** src/backend/port/beos/support.c 25 Oct 2004 03:23:02 - 1.11 --- src/backend/port/beos/support.c 5 Mar 2005 06:23:42 - *** *** 265,271 ! /* The behavior of fork is borken on beos regarding shared memory. In fact all shared memory areas are clones in copy on write mode in the new process. We need to do a remapping of these areas. Just afer the fork we performe the --- 265,271 ! /* The behavior of fork is broken on beos regarding shared memory. In fact all shared memory areas are clones in copy on write mode in the new process. We need to do a remapping of these areas. Just afer the fork we performe the Index: src/backend/postmaster/Makefile === RCS file: /Users/neilc/local/cvs/pgsql/src/backend/postmaster/Makefile,v retrieving revision 1.19 diff -c -r1.19 Makefile *** src/backend/postmaster/Makefile 5 Aug 2004 23:32:10 - 1.19 --- src/backend/postmaster/Makefile 5 Mar 2005 06:37:28 - *** *** 12,18 top_builddir = ../../.. include $(top_builddir)/src/Makefile.global ! OBJS = postmaster.o bgwriter.o pgstat.o pgarch.o syslogger.o all: SUBSYS.o --- 12,18 top_builddir = ../../.. include $(top_builddir)/src/Makefile.global ! OBJS = bgwriter.o fork_process.o pgarch.o pgstat.o postmaster.o syslogger.o all: SUBSYS.o Index: src/backend/postmaster/fork_process.c === RCS file: src/backend/postmaster/fork_process.c diff -N src/backend/postmaster/fork_process.c *** /dev/null 1 Jan 1970 00:00:00 - --- src/backend/postmaster/fork_process.c 5 Mar 2005 09:02:35 - *** *** 0 --- 1,80 + /* + * fork_process.c + * A simple wrapper on top of fork(). This does not handle the + * EXEC_BACKEND case; it might be extended to do so, but it would be + * considerably more complex. + * + * Copyright (c) 1996-2005, PostgreSQL Global Development Group + * + * IDENTIFICATION + * $PostgreSQL$ + */ + #include postgres.h + #include postmaster/fork_process.h + + #include unistd.h + + /* + * Wrapper for fork(). Return values are the same as those for fork(): + * -1 if the fork failed, 0 in the child process, and the PID of the + * child in the parent process. + */ + pid_t + fork_process(void) + { + pid_t result; + #ifdef LINUX_PROFILE + struct itimerval prof_itimer; + #endif + + /* + * Flush stdio channels just before fork, to avoid double-output + * problems. Ideally we'd use fflush(NULL) here, but there are still a + * few non-ANSI stdio libraries out there (like SunOS 4.1.x) that + * coredump if we do. Presently stdout and stderr are the only stdio + * output channels used by the postmaster, so fflush'ing them should + * be sufficient. + */ + fflush(stdout); + fflush(stderr); + + #ifdef LINUX_PROFILE + /* + * Linux's fork() resets the profiling timer in the child process. If + * we want to profile child processes then we need to save and restore + * the timer setting. This is a waste of time if not profiling, + * however, so only do it if commanded by specific -DLINUX_PROFILE + * switch. + */ + getitimer(ITIMER_PROF, prof_itimer); + #endif + + #ifdef __BEOS__ + /* Specific beos actions before backend startup */ + beos_before_backend_startup(); + #endif + + result = fork(); + if (result == (pid_t) -1) + { + /* fork failed */ + #ifdef __BEOS__ + /* Specific beos backend startup actions */ + beos_backend_startup_failed(); + #endif + } + else if (result == 0) + { + /* fork succeeded, in child */ + #ifdef LINUX_PROFILE + setitimer(ITIMER_PROF, prof_itimer, NULL); + #endif + + #ifdef __BEOS__ + /* Specific beos backend startup actions */ + beos_backend_startup(); + #endif + } + + return result; + } Index: src/backend/postmaster/pgarch.c === RCS file: /Users/neilc/local/cvs/pgsql/src/backend/postmaster/pgarch.c,v retrieving revision 1.14 diff -c -r1.14 pgarch.c *** src/backend/postmaster/pgarch.c 31 Dec 2004 22:00:40 - 1.14 ---
Re: [PATCHES] fork() refactoring
Neil Conway [EMAIL PROTECTED] writes: This patch moves all the common code that is usually invoked before doing a fork() into a single function, fork_process(). It is not aware of the EXEC_BACKEND machinery, so it should be used as fork() currently is -- inside an #ifndef EXEC_BACKEND block, if appropriate. I'm worried about whether this doesn't break the EXEC_BACKEND case. Most of the code you've moved out isn't applicable to Windows, but the fflushes probably are --- and they are certainly applicable when testing EXEC_BACKEND mode on a Unix machine, which is a case you may *not* break because it will render Windows completely unsupportable. Barring any objections, I'll apply this to HEAD on Monday. Please do not apply without some further portability testing. I think it would be better to continue with your original thought of passing a token into this code so that the EXEC_BACKEND case could be handled too (the token would tell it which forkexec function to call). That would probably mean that the function has to stay within postmaster.c, but as long as it consolidates the N cases of fork decoration into one, we're still ahead of the game. regards, tom lane ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [PATCHES] Display Pg buffer cache (WIP)
Mark Kirkwood [EMAIL PROTECTED] writes: In addition to holding the BufMappingLock, each buffer header is (spin) locked before examining it, hopefully this is correct - BTW, I like the new buffer lock design. It'd be possible to dispense with the per-buffer spinlocks so long as you look only at the tag (and perhaps the TAG_VALID flag bit). The tags can't be changing while you hold the BufMappingLock. I'm dubious that there's any point in recording information as transient as the refcounts and dirtybits, and reducing the amount of time you hold the lock would be a Good Thing. Also, you're still not dealing with the case of a not-valid buffer in a reasonable way. regards, tom lane ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[PATCHES] Harmless space allocation typo
Here's a tiny fix for a harmless typo in catalog.c: Too much space is allocated for tablespace file path, I guess the directory name used to be pg_tablespaces instead of pg_tblspc at some point. - HeikkiIndex: catalog.c === RCS file: /projects/cvsroot/pgsql/src/backend/catalog/catalog.c,v retrieving revision 1.57 diff -c -r1.57 catalog.c *** catalog.c 31 Dec 2004 21:59:38 - 1.57 --- catalog.c 5 Mar 2005 22:12:21 - *** *** 58,64 else { /* All other tablespaces are accessed via symlinks */ ! pathlen = strlen(DataDir) + 16 + OIDCHARS + 1 + OIDCHARS + 1 + OIDCHARS + 1; path = (char *) palloc(pathlen); snprintf(path, pathlen, %s/pg_tblspc/%u/%u/%u, DataDir, rnode.spcNode, rnode.dbNode, rnode.relNode); --- 58,64 else { /* All other tablespaces are accessed via symlinks */ ! pathlen = strlen(DataDir) + 11 + OIDCHARS + 1 + OIDCHARS + 1 + OIDCHARS + 1; path = (char *) palloc(pathlen); snprintf(path, pathlen, %s/pg_tblspc/%u/%u/%u, DataDir, rnode.spcNode, rnode.dbNode, rnode.relNode); *** *** 99,105 else { /* All other tablespaces are accessed via symlinks */ ! pathlen = strlen(DataDir) + 16 + OIDCHARS + 1 + OIDCHARS + 1; path = (char *) palloc(pathlen); snprintf(path, pathlen, %s/pg_tblspc/%u/%u, DataDir, spcNode, dbNode); --- 99,105 else { /* All other tablespaces are accessed via symlinks */ ! pathlen = strlen(DataDir) + 11 + OIDCHARS + 1 + OIDCHARS + 1; path = (char *) palloc(pathlen); snprintf(path, pathlen, %s/pg_tblspc/%u/%u, DataDir, spcNode, dbNode); ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[PATCHES] Cleaning up unreferenced table files
Here's a patch for the TODO item Remove unreferenced table files created by transactions that were in-progress when the server terminated abruptly. It adds a new function, CleanupStaleRelFiles, that scans through the data directory and removes all table files that are not mentioned in pg_class of the corresponding database. CleanupStaleRelFiles is called after WAL recovery. Actually, the patch doesn't currently delete the files, just issues a warning. Testing is easier if the files don't keep getting deleted :). The patch also adds a GetTablespacePath function similar to GetDatabasePath that constructs the path to a tablespace symbolic link. commands/tablespace.c is modified to use it, in addition to the new CleanupStaleRelFiles function. - HeikkiIndex: src/backend/access/transam/xlog.c === RCS file: /projects/cvsroot/pgsql/src/backend/access/transam/xlog.c,v retrieving revision 1.181 diff -c -r1.181 xlog.c *** src/backend/access/transam/xlog.c 12 Feb 2005 23:53:37 - 1.181 --- src/backend/access/transam/xlog.c 5 Mar 2005 22:33:56 - *** *** 43,48 --- 43,55 #include utils/guc.h #include utils/relcache.h + #include catalog/pg_tablespace.h + #include catalog/catalog.h + #include access/skey.h + #include utils/fmgroids.h + #include access/relscan.h + #include access/heapam.h + #include utils/resowner.h /* * This chunk of hackery attempts to determine which file sync methods *** *** 465,470 --- 472,480 static bool read_backup_label(XLogRecPtr *checkPointLoc); static void remove_backup_label(void); + static void CleanupStaleRelFilesFrom(Oid tablespaceoid, Oid dboid); + static void CleanupStaleRelFilesFromTablespace(Oid tablespaceoid); + static void CleanupStaleRelFiles(void); /* * Insert an XLOG record having the specified RMID and info bytes, *** *** 4483,4488 --- 4493,4500 CreateCheckPoint(true, true); + CleanupStaleRelFiles(); + /* * Close down recovery environment */ *** *** 5656,5658 --- 5668,5852 errmsg(could not remove file \%s\: %m, labelfilepath))); } + + /* Like AllocateDir, but ereports on failure */ + static DIR * + AllocateDirChecked(char *path) + { + DIR *dirdesc = AllocateDir(path); + if (dirdesc == NULL) + ereport(ERROR, + (errcode_for_file_access(), +errmsg(could not open directory \%s\: %m, + path))); + + return dirdesc; + } + + /* + * Scan through all tablespaces for relations left over + * by aborted transactions. + * + * For example, if a transaction issues + * BEGIN; CREATE TABLE foobar (); + * and then the backend crashes, the file is left in the + * tablespace until CleanupStaleRelFiles deletes it. + */ + static void + CleanupStaleRelFiles(void) + { + DIR*dirdesc; + struct dirent *de; + char *path; + int pathlen; + + pathlen = strlen(DataDir) + 11 + 1; + path = (char *) palloc(pathlen); + snprintf(path, pathlen, %s/pg_tblspc/, DataDir); + + dirdesc = AllocateDirChecked(path); + + while ((de = readdir(dirdesc)) != NULL) + { + char *invalid; + Oid tablespaceoid; + + tablespaceoid = (Oid) strtol(de-d_name, invalid, 10); + if(invalid[0] == '\0') + CleanupStaleRelFilesFromTablespace(tablespaceoid); + } + pfree(path); + + CleanupStaleRelFilesFromTablespace(DEFAULTTABLESPACE_OID); + } + + /* Scan a specific tablespace for stale relations */ + static void + CleanupStaleRelFilesFromTablespace(Oid tablespaceoid) + { + DIR*dirdesc; + struct dirent *de; + char *path; + + path = GetTablespacePath(tablespaceoid); + + dirdesc = AllocateDirChecked(path); + while ((de = readdir(dirdesc)) != NULL) + { + char *invalid; + Oid dboid; + + dboid = (Oid) strtol(de-d_name, invalid, 10); + if(invalid[0] == '\0') + CleanupStaleRelFilesFrom(tablespaceoid, dboid); + } + pfree(path); + } + + /* Scan a specific database in a specific tablespace for stale relations. + * + * First, pg_class for the database is opened, and the relfilenodes of all + * relations mentioned there are stored in a hash table. + * + * Then the directory is scanned. Every file in the directory that's not + * found in pg_class (the hash table) is deleted. + */ + static void + CleanupStaleRelFilesFrom(Oid tablespaceoid, Oid dboid) + { + DIR *dirdesc; +
Re: [PATCHES] Cleaning up unreferenced table files
Heikki Linnakangas [EMAIL PROTECTED] writes: Here's a patch for the TODO item Remove unreferenced table files created by transactions that were in-progress when the server terminated abruptly. xlog.c is a fairly random place to put that functionality. Didn't it strike any warning bells for you when you had to add so many new #includes? I'm not entirely sure where this should go, but not there. regards, tom lane ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [PATCHES] Cleaning up unreferenced table files
On Sat, 5 Mar 2005, Tom Lane wrote: Heikki Linnakangas [EMAIL PROTECTED] writes: Here's a patch for the TODO item Remove unreferenced table files created by transactions that were in-progress when the server terminated abruptly. xlog.c is a fairly random place to put that functionality. Didn't it strike any warning bells for you when you had to add so many new #includes? I'm not entirely sure where this should go, but not there. Yeah actually it did, but I forgot about it along the way. How about putting it in a file of its own in backend/catalog? There's some code that also deals with the data directories. Or straight into backend/storage. - Heikki ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster