Re: [PATCHES] Cleaning up unreferenced table files
Maybe we should take a different approach to the problem: 1. Create new file with an extension to mark that it's not yet committed (eg. 1234.notcommitted) 2. ... 3. Take CheckpointStartLock 4. Write commit record to WAL, with list of created files. 5. rename created file (1234.notcommitted - 1234). 6. Release CheckpointStartLock This would guarantee that after successful WAL replay, all files in the data directory with .notcommitted extension can be safely deleted. No need to read pg_database or pg_class. We would take a performance hit because of the additional rename and fsync step. Also, we must somehow make sure that the new file or the directory it's in is fsynced on checkpoint to make sure that the rename is flushed to disk. A variant of the scheme would be to create two files on step 1. One would be the actual relfile (1234) and the other would an empty marker file (1234.notcommitted). That way the smgr code wouldn't have to care it the file is new or not when opening it. - Heikki On Thu, 5 May 2005, Tom Lane wrote: Bruce Momjian pgman@candle.pha.pa.us writes: Applied. Now that I've had a chance to look at it, this patch is thoroughly broken. Problems observed in a quick review: 1. It doesn't work at all for non-default tablespaces: it will claim that every file in such a tablespace is stale. The fact that it does that rather than failing entirely is accidental. It tries to read the database's pg_class in the target tablespace whether it's there or not. Because the system is still in recovery mode, the low-level routines allow the access to the nonexistent pg_class table to pass --- in fact they think they should create the file, so after it runs there's a bogus empty 1259 file in each such tablespace (which of course it complains about, too). The code then proceeds to think that pg_class is empty so of course everything draws a warning. 2. It's not robust against stale subdirectories of a tablespace (ie, subdirs corresponding to a nonexistent database) --- again, it'll try to read a nonexistent pg_class. Then it'll produce a bunch of off-target complaint messages. 3. It's assuming that relfilenode is unique database-wide, when no such assumption is safe. We only have a guarantee that it's unique tablespace-wide. 4. It fails to examine table segment files (such as nnn.1). These should be complained of when the nnn doesn't match any hash entry. 5. It will load every relfilenode value in pg_class into the hashtable whether it's meaningful or not. There should be a check on relkind. 6. I don't think relying on strtol to decide if a filename is entirely numeric is very safe. Note all the extra defenses in pg_atoi against various platform-specific misbehaviors of strtol. Personally I'd use a strspn test instead. 7. There are no checks for readdir failure (compare any other readdir loop in the backend). See also Simon Riggs' complaints that the circumstances under which it's done are pretty randomly selected. (One particular thing that I think is a bad idea is to do this in a standalone backend. Any sort of corruption in any db's pg_class would render it impossible to start up.) To fix the first three problems, and also avoid the performance problem of multiply rescanning a database's pg_class for each of its tablespaces, I would suggest that the hashtable entries be widened to RelFileNode structs (ie, db oid, tablespace oid, relfilenode oid). Then there should be one iteration over pg_database to learn the OIDs and default tablespaces of each database; with that you can read pg_class from its correct location for each database and load all the entries into the hashtable. Then you iterate through the tablespaces looking for stuff not present in the hashtable. You might also want to build a list or hashtable of known database OIDs, so that you can recognize a stale subdirectory immediately and issue a direct complaint about it without even recursing into it. regards, tom lane - Heikki ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PATCHES] Cleaning up unreferenced table files
Heikki Linnakangas [EMAIL PROTECTED] writes: Maybe we should take a different approach to the problem: 1. Create new file with an extension to mark that it's not yet committed (eg. 1234.notcommitted) This is pushing the problem into the wrong place, viz the lowest-level file access routines, which will now all have to know about .notcommitted status. It also creates race conditions --- think about backend A trying to commit file 1234 at about the same time that backend B is trying to flush some dirty buffers belonging to that file. But most importantly, it doesn't handle the file-deletion case. regards, tom lane ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [PATCHES] Cleaning up unreferenced table files
On Thu, 5 May 2005, Tom Lane wrote: Bruce Momjian pgman@candle.pha.pa.us writes: Applied. Now that I've had a chance to look at it, this patch is thoroughly broken. Problems observed in a quick review: 1. It doesn't work at all for non-default tablespaces: it will claim that every file in such a tablespace is stale. The fact that it does that rather than failing entirely is accidental. It tries to read the database's pg_class in the target tablespace whether it's there or not. Because the system is still in recovery mode, the low-level routines allow the access to the nonexistent pg_class table to pass --- in fact they think they should create the file, so after it runs there's a bogus empty 1259 file in each such tablespace (which of course it complains about, too). The code then proceeds to think that pg_class is empty so of course everything draws a warning. 2. It's not robust against stale subdirectories of a tablespace (ie, subdirs corresponding to a nonexistent database) --- again, it'll try to read a nonexistent pg_class. Then it'll produce a bunch of off-target complaint messages. 3. It's assuming that relfilenode is unique database-wide, when no such assumption is safe. We only have a guarantee that it's unique tablespace-wide. 4. It fails to examine table segment files (such as nnn.1). These should be complained of when the nnn doesn't match any hash entry. 5. It will load every relfilenode value in pg_class into the hashtable whether it's meaningful or not. There should be a check on relkind. 6. I don't think relying on strtol to decide if a filename is entirely numeric is very safe. Note all the extra defenses in pg_atoi against various platform-specific misbehaviors of strtol. Personally I'd use a strspn test instead. I'll fix 1-6 according to your suggestions, and send another patch. It shows how little experience I have with multiple database and tablespace management. 7. There are no checks for readdir failure (compare any other readdir loop in the backend). I couldn't figure out what you meant. The readdir code is the same as anywhere else. Also, man page (Linux) says that readdir returns NULL on error, and that is checked. See also Simon Riggs' complaints that the circumstances under which it's done are pretty randomly selected. (One particular thing that I think is a bad idea is to do this in a standalone backend. Any sort of corruption in any db's pg_class would render it impossible to start up.) I'd agree with Simons complaints if we actually deleted the files. But since we only report them, it's a good idea to report them on every startup, otherwise the DBA might think that the stale files are not there anymore since the system isn't complaining about them anymore. The original patch only ran the check on crash recovery, but Bruce changed it to run on startup as well, for the above reason. I agree, though, that it's a bad idea to do it in standalone mode. I'll add a check for that. Also it probably shouldn't stop the startup even if some pg_class is corrupt. Other databases could be fine. To fix the first three problems, and also avoid the performance problem of multiply rescanning a database's pg_class for each of its tablespaces, I would suggest that the hashtable entries be widened to RelFileNode structs (ie, db oid, tablespace oid, relfilenode oid). Then there should be one iteration over pg_database to learn the OIDs and default tablespaces of each database; with that you can read pg_class from its correct location for each database and load all the entries into the hashtable. Then you iterate through the tablespaces looking for stuff not present in the hashtable. You might also want to build a list or hashtable of known database OIDs, so that you can recognize a stale subdirectory immediately and issue a direct complaint about it without even recursing into it. regards, tom lane - Heikki ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PATCHES] Cleaning up unreferenced table files
Bruce Momjian pgman@candle.pha.pa.us writes: Applied. Now that I've had a chance to look at it, this patch is thoroughly broken. Problems observed in a quick review: 1. It doesn't work at all for non-default tablespaces: it will claim that every file in such a tablespace is stale. The fact that it does that rather than failing entirely is accidental. It tries to read the database's pg_class in the target tablespace whether it's there or not. Because the system is still in recovery mode, the low-level routines allow the access to the nonexistent pg_class table to pass --- in fact they think they should create the file, so after it runs there's a bogus empty 1259 file in each such tablespace (which of course it complains about, too). The code then proceeds to think that pg_class is empty so of course everything draws a warning. 2. It's not robust against stale subdirectories of a tablespace (ie, subdirs corresponding to a nonexistent database) --- again, it'll try to read a nonexistent pg_class. Then it'll produce a bunch of off-target complaint messages. 3. It's assuming that relfilenode is unique database-wide, when no such assumption is safe. We only have a guarantee that it's unique tablespace-wide. 4. It fails to examine table segment files (such as nnn.1). These should be complained of when the nnn doesn't match any hash entry. 5. It will load every relfilenode value in pg_class into the hashtable whether it's meaningful or not. There should be a check on relkind. 6. I don't think relying on strtol to decide if a filename is entirely numeric is very safe. Note all the extra defenses in pg_atoi against various platform-specific misbehaviors of strtol. Personally I'd use a strspn test instead. 7. There are no checks for readdir failure (compare any other readdir loop in the backend). See also Simon Riggs' complaints that the circumstances under which it's done are pretty randomly selected. (One particular thing that I think is a bad idea is to do this in a standalone backend. Any sort of corruption in any db's pg_class would render it impossible to start up.) To fix the first three problems, and also avoid the performance problem of multiply rescanning a database's pg_class for each of its tablespaces, I would suggest that the hashtable entries be widened to RelFileNode structs (ie, db oid, tablespace oid, relfilenode oid). Then there should be one iteration over pg_database to learn the OIDs and default tablespaces of each database; with that you can read pg_class from its correct location for each database and load all the entries into the hashtable. Then you iterate through the tablespaces looking for stuff not present in the hashtable. You might also want to build a list or hashtable of known database OIDs, so that you can recognize a stale subdirectory immediately and issue a direct complaint about it without even recursing into it. regards, tom lane ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [PATCHES] Cleaning up unreferenced table files
Heikki, Good patch. ...The patch makes no mention of temporary files, which are left there after a crash. Temp files are only removed on a normal startup, whereas the patch invokes removal on both normal startup and crash recovery. That doesn't make much sense... Also, temp file deletion happens in the postmaster, not at the end of recovery in xlog.c. Could we rationalise those two? One place, one action for both? I'd rather have this also as an administrator command rather than as something automatically run after crash recovery. That way I have my debugging opportunity, but the admin can remove them without restarting the server. Same code, just a Function instead... Best Regards, Simon Riggs reference from fd.c: (this is not a patch) /* * Remove temporary files left over from a prior postmaster session * * This should be called during postmaster startup. It will forcibly * remove any leftover files created by OpenTemporaryFile. * * NOTE: we could, but don't, call this during a post-backend-crash restart * cycle. The argument for not doing it is that someone might want to examine * the temp files for debugging purposes. This does however mean that * OpenTemporaryFile had better allow for collision with an existing temp * file name. */ void RemovePgTempFiles(void) ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [PATCHES] Cleaning up unreferenced table files
FYI, his patch is in CVS and now only _reports_ unreferenced files, and it happens on recovery and normal startup. --- Simon Riggs wrote: Heikki, Good patch. ...The patch makes no mention of temporary files, which are left there after a crash. Temp files are only removed on a normal startup, whereas the patch invokes removal on both normal startup and crash recovery. That doesn't make much sense... Also, temp file deletion happens in the postmaster, not at the end of recovery in xlog.c. Could we rationalise those two? One place, one action for both? I'd rather have this also as an administrator command rather than as something automatically run after crash recovery. That way I have my debugging opportunity, but the admin can remove them without restarting the server. Same code, just a Function instead... Best Regards, Simon Riggs reference from fd.c: (this is not a patch) /* * Remove temporary files left over from a prior postmaster session * * This should be called during postmaster startup. It will forcibly * remove any leftover files created by OpenTemporaryFile. * * NOTE: we could, but don't, call this during a post-backend-crash restart * cycle. The argument for not doing it is that someone might want to examine * the temp files for debugging purposes. This does however mean that * OpenTemporaryFile had better allow for collision with an existing temp * file name. */ void RemovePgTempFiles(void) ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster -- Bruce Momjian| http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [PATCHES] Cleaning up unreferenced table files
Uh, you forgot to add cleanup.h. --- Heikki Linnakangas wrote: On Tue, 26 Apr 2005, Alvaro Herrera wrote: You forgot the attachment? Damn. It happens time after time... - Heikki Content-Description: [ Attachment, skipping... ] -- Bruce Momjian| http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [PATCHES] Cleaning up unreferenced table files
How embarrasing... I hope it's all there now. On Thu, 28 Apr 2005, Bruce Momjian wrote: Uh, you forgot to add cleanup.h. --- Heikki Linnakangas wrote: On Tue, 26 Apr 2005, Alvaro Herrera wrote: You forgot the attachment? Damn. It happens time after time... - Heikki Content-Description: [ Attachment, skipping... ] -- Bruce Momjian| http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 - HeikkiIndex: doc/src/sgml/maintenance.sgml === RCS file: /projects/cvsroot/pgsql/doc/src/sgml/maintenance.sgml,v retrieving revision 1.41 diff -c -r1.41 maintenance.sgml *** doc/src/sgml/maintenance.sgml 20 Feb 2005 02:21:26 - 1.41 --- doc/src/sgml/maintenance.sgml 28 Apr 2005 17:32:41 - *** *** 474,479 --- 474,496 /para /sect1 + sect1 id=cleanup-after-crash + titleCleanup after crash/title + + indexterm zone=cleanup-after-crash +primarystale file/primary + /indexterm + + para +productnamePostgreSQL/productname recovers automatically after crash +using the write-ahead log (see xref linkend=wal) and no manual +operations are normally needed. However, if there was a transaction running +when the crash occured that created or dropped a relation, the +transaction might have left a stale file in the data directory. If this +happens, you will get a notice in the log file stating which files can be +deleted. + /para + /sect1 sect1 id=logfile-maintenance titleLog File Maintenance/title Index: src/backend/access/transam/xlog.c === RCS file: /projects/cvsroot/pgsql/src/backend/access/transam/xlog.c,v retrieving revision 1.188 diff -c -r1.188 xlog.c *** src/backend/access/transam/xlog.c 23 Apr 2005 18:49:54 - 1.188 --- src/backend/access/transam/xlog.c 28 Apr 2005 17:32:41 - *** *** 42,47 --- 42,48 #include utils/builtins.h #include utils/guc.h #include utils/relcache.h + #include utils/cleanup.h /* *** *** 4520,4525 --- 4521,4528 CreateCheckPoint(true, true); + CleanupStaleRelFiles(); + /* * Close down recovery environment */ Index: src/backend/catalog/catalog.c === RCS file: /projects/cvsroot/pgsql/src/backend/catalog/catalog.c,v retrieving revision 1.59 diff -c -r1.59 catalog.c *** src/backend/catalog/catalog.c 14 Apr 2005 20:03:23 - 1.59 --- src/backend/catalog/catalog.c 28 Apr 2005 17:32:41 - *** *** 106,111 --- 106,144 return path; } + /* + * GetTablespacePath - construct path to a tablespace symbolic link + * + * Result is a palloc'd string. + * + * XXX this must agree with relpath and GetDatabasePath! + */ + char * + GetTablespacePath(Oid spcNode) + { + int pathlen; + char *path; + + Assert(spcNode != GLOBALTABLESPACE_OID); + + if (spcNode == DEFAULTTABLESPACE_OID) + { + /* The default tablespace is {datadir}/base */ + pathlen = strlen(DataDir) + 5 + 1; + path = (char *) palloc(pathlen); + snprintf(path, pathlen, %s/base, +DataDir); + } + else + { + /* All other tablespaces have symlinks in pg_tblspc */ + pathlen = strlen(DataDir) + 11 + OIDCHARS + 1; + path = (char *) palloc(pathlen); + snprintf(path, pathlen, %s/pg_tblspc/%u, +DataDir, spcNode); + } + return path; + } /* * IsSystemRelation Index: src/backend/commands/tablespace.c === RCS file: /projects/cvsroot/pgsql/src/backend/commands/tablespace.c,v retrieving revision 1.17 diff -c -r1.17 tablespace.c *** src/backend/commands/tablespace.c 14 Apr 2005 20:03:24 - 1.17 --- src/backend/commands/tablespace.c 28 Apr 2005 17:32:41 - *** *** 341,348 /* * All seems well, create the symlink */ ! linkloc = (char *) palloc(strlen(DataDir) + 11 + 10 + 1); ! sprintf(linkloc, %s/pg_tblspc/%u, DataDir, tablespaceoid); if (symlink(location, linkloc) 0) ereport(ERROR, --- 341,347 /* * All seems well, create the symlink */ ! linkloc = GetTablespacePath(tablespaceoid); if (symlink(location, linkloc) 0) ereport(ERROR,
Re: [PATCHES] Cleaning up unreferenced table files
On Tue, 26 Apr 2005, Alvaro Herrera wrote: You forgot the attachment? Damn. It happens time after time... - HeikkiIndex: doc/src/sgml/maintenance.sgml === RCS file: /projects/cvsroot/pgsql/doc/src/sgml/maintenance.sgml,v retrieving revision 1.41 diff -c -r1.41 maintenance.sgml *** doc/src/sgml/maintenance.sgml 20 Feb 2005 02:21:26 - 1.41 --- doc/src/sgml/maintenance.sgml 26 Apr 2005 22:02:22 - *** *** 474,479 --- 474,496 /para /sect1 + sect1 id=cleanup-after-crash + titleCleanup after crash/title + + indexterm zone=cleanup-after-crash +primarystale file/primary + /indexterm + + para +productnamePostgreSQL/productname recovers automatically after crash +using the write-ahead log (see xref linkend=wal) and no manual +operations are normally needed. However, if there was a transaction running +when the crash occured that created or dropped a relation, the +transaction might have left a stale file in the data directory. If this +happens, you will get a notice in the log file stating which files can be +deleted. + /para + /sect1 sect1 id=logfile-maintenance titleLog File Maintenance/title Index: src/backend/access/transam/xlog.c === RCS file: /projects/cvsroot/pgsql/src/backend/access/transam/xlog.c,v retrieving revision 1.188 diff -c -r1.188 xlog.c *** src/backend/access/transam/xlog.c 23 Apr 2005 18:49:54 - 1.188 --- src/backend/access/transam/xlog.c 26 Apr 2005 22:02:22 - *** *** 42,47 --- 42,48 #include utils/builtins.h #include utils/guc.h #include utils/relcache.h + #include utils/cleanup.h /* *** *** 4520,4525 --- 4521,4528 CreateCheckPoint(true, true); + CleanupStaleRelFiles(); + /* * Close down recovery environment */ Index: src/backend/catalog/catalog.c === RCS file: /projects/cvsroot/pgsql/src/backend/catalog/catalog.c,v retrieving revision 1.59 diff -c -r1.59 catalog.c *** src/backend/catalog/catalog.c 14 Apr 2005 20:03:23 - 1.59 --- src/backend/catalog/catalog.c 26 Apr 2005 22:02:22 - *** *** 106,111 --- 106,144 return path; } + /* + * GetTablespacePath - construct path to a tablespace symbolic link + * + * Result is a palloc'd string. + * + * XXX this must agree with relpath and GetDatabasePath! + */ + char * + GetTablespacePath(Oid spcNode) + { + int pathlen; + char *path; + + Assert(spcNode != GLOBALTABLESPACE_OID); + + if (spcNode == DEFAULTTABLESPACE_OID) + { + /* The default tablespace is {datadir}/base */ + pathlen = strlen(DataDir) + 5 + 1; + path = (char *) palloc(pathlen); + snprintf(path, pathlen, %s/base, +DataDir); + } + else + { + /* All other tablespaces have symlinks in pg_tblspc */ + pathlen = strlen(DataDir) + 11 + OIDCHARS + 1; + path = (char *) palloc(pathlen); + snprintf(path, pathlen, %s/pg_tblspc/%u, +DataDir, spcNode); + } + return path; + } /* * IsSystemRelation Index: src/backend/commands/tablespace.c === RCS file: /projects/cvsroot/pgsql/src/backend/commands/tablespace.c,v retrieving revision 1.17 diff -c -r1.17 tablespace.c *** src/backend/commands/tablespace.c 14 Apr 2005 20:03:24 - 1.17 --- src/backend/commands/tablespace.c 26 Apr 2005 22:02:23 - *** *** 341,348 /* * All seems well, create the symlink */ ! linkloc = (char *) palloc(strlen(DataDir) + 11 + 10 + 1); ! sprintf(linkloc, %s/pg_tblspc/%u, DataDir, tablespaceoid); if (symlink(location, linkloc) 0) ereport(ERROR, --- 341,347 /* * All seems well, create the symlink */ ! linkloc = GetTablespacePath(tablespaceoid); if (symlink(location, linkloc) 0) ereport(ERROR, *** *** 495,502 char *subfile; struct stat st; ! location = (char *) palloc(strlen(DataDir) + 11 + 10 + 1); ! sprintf(location, %s/pg_tblspc/%u, DataDir, tablespaceoid); /* * Check if the tablespace still contains any files. We try to rmdir --- 494,500 char *subfile; struct stat st; ! location = GetTablespacePath(tablespaceoid); /* * Check if the tablespace still contains any files. We try to rmdir
Re: [PATCHES] Cleaning up unreferenced table files
On Tue, 26 Apr 2005, Tom Lane wrote: Heikki Linnakangas [EMAIL PROTECTED] writes: I feel that crashes that leaves behind stale files are rare. Indeed, and getting more so all the time ... How so? Have changes been made in those parts of the code? which makes me question the value of doing anything about this at all. What bothers me is that we currently have no means of knowing if it happens and how often it happens, so we're just guessing that it's rare. How rare? We don't know. If nobody ever runs into this issue in production, and this whole exercise turns out to be completely unnecessary, at least we'll know. That alone makes me feel better. The only drawback of the patch that I can see is the performance impact on recovery. And I think the time it takes to scan the data directories isn't much compared to WAL replay. - Heikki ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PATCHES] Cleaning up unreferenced table files
Heikki Linnakangas [EMAIL PROTECTED] writes: On Tue, 26 Apr 2005, Tom Lane wrote: Heikki Linnakangas [EMAIL PROTECTED] writes: I feel that crashes that leaves behind stale files are rare. Indeed, and getting more so all the time ... How so? Have changes been made in those parts of the code? No, just that the overall reliability of Postgres keeps improving. At the time that TODO entry was created, I don't think we even had the ability to roll back table creates/drops properly, so there were scenarios in which unreferenced files could be left behind without even assuming any software error. And the prevalence of backend crashes was way higher than it is now, too. So I think a good argument could be made that the TODO item isn't nearly as important as it was at the time. If nobody ever runs into this issue in production, and this whole exercise turns out to be completely unnecessary, at least we'll know. That alone makes me feel better. We will know no such thing, unless the patch is made to announce the problem so intrusively that people are certain to see it *and report it to us*. Which I don't think will be acceptable. regards, tom lane ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PATCHES] Cleaning up unreferenced table files
On Wed, 27 Apr 2005, Tom Lane wrote: Heikki Linnakangas [EMAIL PROTECTED] writes: On Tue, 26 Apr 2005, Tom Lane wrote: Heikki Linnakangas [EMAIL PROTECTED] writes: I feel that crashes that leaves behind stale files are rare. Indeed, and getting more so all the time ... How so? Have changes been made in those parts of the code? No, just that the overall reliability of Postgres keeps improving. At the time that TODO entry was created, I don't think we even had the ability to roll back table creates/drops properly, so there were scenarios in which unreferenced files could be left behind without even assuming any software error. And the prevalence of backend crashes was way higher than it is now, too. So I think a good argument could be made that the TODO item isn't nearly as important as it was at the time. *shrug*. I won't push any harder if you feel strongly against the patch. Should we just remove the TODO item? And/or document the issue? - Heikki ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PATCHES] Cleaning up unreferenced table files
Tom Lane wrote: Heikki Linnakangas [EMAIL PROTECTED] writes: On Tue, 26 Apr 2005, Tom Lane wrote: Heikki Linnakangas [EMAIL PROTECTED] writes: I feel that crashes that leaves behind stale files are rare. Indeed, and getting more so all the time ... How so? Have changes been made in those parts of the code? No, just that the overall reliability of Postgres keeps improving. At the time that TODO entry was created, I don't think we even had the ability to roll back table creates/drops properly, so there were scenarios in which unreferenced files could be left behind without even assuming any software error. And the prevalence of backend crashes was way higher than it is now, too. So I think a good argument could be made that the TODO item isn't nearly as important as it was at the time. If nobody ever runs into this issue in production, and this whole exercise turns out to be completely unnecessary, at least we'll know. That alone makes me feel better. We will know no such thing, unless the patch is made to announce the problem so intrusively that people are certain to see it *and report it to us*. Which I don't think will be acceptable. Well, if putting it in the server logs isn't enough, I don't know what is. I think we do need the patch, at least to find out if there is an issue we don't know about. I don't see the hard in it. Heikki, would you time startup and tell us what percentage of time is taken by the routines? Or I can do it. -- Bruce Momjian| http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [PATCHES] Cleaning up unreferenced table files
Bruce Momjian pgman@candle.pha.pa.us writes: I think we do need the patch, at least to find out if there is an issue we don't know about. My point is that we won't find out anything, because we will have no idea if people are noticing the log entries at all, much less telling us about 'em. Of course if they do tell us then we'd know something, but what I am expecting is a vast silence. We will not be able to tell the difference between no problem and very infrequent problem any better than we can now. regards, tom lane ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [PATCHES] Cleaning up unreferenced table files
Tom Lane wrote: Bruce Momjian pgman@candle.pha.pa.us writes: I think we do need the patch, at least to find out if there is an issue we don't know about. My point is that we won't find out anything, because we will have no idea if people are noticing the log entries at all, much less telling us about 'em. Of course if they do tell us then we'd know something, but what I am expecting is a vast silence. We will not be able to tell the difference between no problem and very infrequent problem any better than we can now. I think people do look at their logs. We are certainly better off finding it than having no reporting at all. -- Bruce Momjian| http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [PATCHES] Cleaning up unreferenced table files
On Mon, 25 Apr 2005, Bruce Momjian wrote: Tom Lane wrote: ... I think though that we ought to first consider the question of whether we *want* this functionality. On reflection I'm fairly nervous about the idea of actually deleting anything during startup --- seems like a good recipe for turning small failures into large failures. But if we're not going to delete anything then it's questionable whether we need to code it like this at all. It'd certainly be easier and safer to examine these tables after the system is up and running normally. Let's discuss this. The patch as submitted checks for unreferenced files on bootup and reports them in the log file, but does not delete them. That seems like the proper behavior. I think we delete from pgsql_tmp on bootup, but we _know_ those aren't referenced. What other user interface would trigger this if we did it after startup? Wouldn't we have to lock pg_class against VACUUM while we scan the file system, and are we sure we do things in pg_class or the file system first consistently? It seems much more prone to error doing it while the system is running. I agree. Also, you can only have stale files after a backend crash, since they are normally cleaned up at the end of transaction. If it was a separate program or command, the administrator would have to be aware of the issue. Otherwise, he wouldn't know he needs to run it after a crash. I feel that crashes that leaves behind stale files are rare. You would need an application that creates/drops tables as part of normal operation. Some kind of a large batch load might do that: BEGIN, CREATE TABLE foo, COPY 1 GB of data, COMMIT. The nasty thing right now is, you might end up with 1 GB of wasted disk space, and never even know it. I guess I am happy with just reporting during startup like the patch does now. Ok. I'll fix the design issues Tom addressed earlier, add documentation, and resubmit. We can come back to this after a release or two, when we have more confidence in the feature. Maybe we'll also get some feedback on how often those stale files occur in practice. - Heikki ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [PATCHES] Cleaning up unreferenced table files
On Tue, 26 Apr 2005, Heikki Linnakangas wrote: On Mon, 25 Apr 2005, Bruce Momjian wrote: ... I guess I am happy with just reporting during startup like the patch does now. Ok. I'll fix the design issues Tom addressed earlier, add documentation, and resubmit. Here you go. The new functionality is now separated in a new file backend/utils/init/cleanup.c. There was code in many places that constructs the path to a tablespace directory. I refactored that into a new function called GetTablespacePath and put it next to GetDatabasePath in catalog.c. I added a section under the Routine Database Maintenance Tasks that basically gives a heads up that these notifications can appear in the log after a crash. - Heikki ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PATCHES] Cleaning up unreferenced table files
Heikki Linnakangas [EMAIL PROTECTED] writes: I feel that crashes that leaves behind stale files are rare. Indeed, and getting more so all the time ... which makes me question the value of doing anything about this at all. regards, tom lane ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [PATCHES] Cleaning up unreferenced table files
Tom Lane wrote: Heikki Linnakangas [EMAIL PROTECTED] writes: On Sat, 5 Mar 2005, Tom Lane wrote: xlog.c is a fairly random place to put that functionality. Didn't it strike any warning bells for you when you had to add so many new #includes? I'm not entirely sure where this should go, but not there. Yeah actually it did, but I forgot about it along the way. How about putting it in a file of its own in backend/catalog? There's some code that also deals with the data directories. Or straight into backend/storage. Actually, you could make some case for putting it in utils/init/ beside flatfiles.c, which has got much the same sort of issues to deal with. I think though that we ought to first consider the question of whether we *want* this functionality. On reflection I'm fairly nervous about the idea of actually deleting anything during startup --- seems like a good recipe for turning small failures into large failures. But if we're not going to delete anything then it's questionable whether we need to code it like this at all. It'd certainly be easier and safer to examine these tables after the system is up and running normally. Let's discuss this. The patch as submitted checks for unreferenced files on bootup and reports them in the log file, but does not delete them. That seems like the proper behavior. I think we delete from pgsql_tmp on bootup, but we _know_ those aren't referenced. What other user interface would trigger this if we did it after startup? Wouldn't we have to lock pg_class against VACUUM while we scan the file system, and are we sure we do things in pg_class or the file system first consistently? It seems much more prone to error doing it while the system is running. I guess I am happy with just reporting during startup like the patch does now. -- Bruce Momjian| http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [PATCHES] Cleaning up unreferenced table files
Heikki Linnakangas [EMAIL PROTECTED] writes: On Sat, 5 Mar 2005, Tom Lane wrote: xlog.c is a fairly random place to put that functionality. Didn't it strike any warning bells for you when you had to add so many new #includes? I'm not entirely sure where this should go, but not there. Yeah actually it did, but I forgot about it along the way. How about putting it in a file of its own in backend/catalog? There's some code that also deals with the data directories. Or straight into backend/storage. Actually, you could make some case for putting it in utils/init/ beside flatfiles.c, which has got much the same sort of issues to deal with. I think though that we ought to first consider the question of whether we *want* this functionality. On reflection I'm fairly nervous about the idea of actually deleting anything during startup --- seems like a good recipe for turning small failures into large failures. But if we're not going to delete anything then it's questionable whether we need to code it like this at all. It'd certainly be easier and safer to examine these tables after the system is up and running normally. regards, tom lane ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PATCHES] Cleaning up unreferenced table files
Heikki Linnakangas [EMAIL PROTECTED] writes: Here's a patch for the TODO item Remove unreferenced table files created by transactions that were in-progress when the server terminated abruptly. xlog.c is a fairly random place to put that functionality. Didn't it strike any warning bells for you when you had to add so many new #includes? I'm not entirely sure where this should go, but not there. regards, tom lane ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [PATCHES] Cleaning up unreferenced table files
On Sat, 5 Mar 2005, Tom Lane wrote: Heikki Linnakangas [EMAIL PROTECTED] writes: Here's a patch for the TODO item Remove unreferenced table files created by transactions that were in-progress when the server terminated abruptly. xlog.c is a fairly random place to put that functionality. Didn't it strike any warning bells for you when you had to add so many new #includes? I'm not entirely sure where this should go, but not there. Yeah actually it did, but I forgot about it along the way. How about putting it in a file of its own in backend/catalog? There's some code that also deals with the data directories. Or straight into backend/storage. - Heikki ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster