Re: [HACKERS] pg_upgrade bug found!
bricklen wrote: On Fri, Apr 8, 2011 at 10:01 PM, bricklen brick...@gmail.com wrote: Update on the status of the steps we took, which were: - test on a hot standby by bringing it live, running the script, determing the missing clog files, copying them into the live (hot standby) pg_clog dir Now, on the master, copied the same old clog files into the production *master*, ran vacuumdb -a -v -F. The step I should have taken on the master before the vacuumdb -F would have been to run the http://wiki.postgresql.org/wiki/20110408pg_upgrade_fix script to see if I was missing any clog files on the master. That vacuum freeze step pointed out a clog file, I copied that into the master pg_clog dir, ran the aforementioned script. It didn't fail on any of the clog files this time, so now I am rerunning the vacuum freeze command and hoping like hell it works! If the current run of the vacuum freeze fails, I'll report back. Thanks again for everyone's help. The vacuumdb -a -v F completed successfully this time. YEA! -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade bug found!
Alvaro Herrera wrote: Why is it important to have the original pg_clog files around? Since the transactions in question are below the freeze horizon, surely the tuples that involve those transaction have all been visited by vacuum and thus removed if they were leftover from aborted transactions or deleted, no? So you could just fill those files with the 0x55 pattern (signalling all transactions are committed) and the net result should be the same. No? Forgive me if I'm missing something. I haven't been following this thread and I'm more than a little tired (but wanted to shoot this today because I'm gonna be able to, until Monday). Well, in most cases vacuum (or SELECT?) is going to set those xids as committed on the tuple, but if there have been few deletes in the toast table, it is possible vacuum did not run. I think the fact we only have three report query error cases is because in most cases vacuum is already taking care of this as part of space reuse. relfrozenxid is not going to cause freeze to run and therefore those xids, even though marked as committed, still are on the tuple, so we need this script to be run. In fact, if the tuple is marked as committed, do we even bother to mark the xids as fixed via vacuum freeze? Seems we don't have to. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade bug found!
Bruce Momjian wrote: Alvaro Herrera wrote: Why is it important to have the original pg_clog files around? Since the transactions in question are below the freeze horizon, surely the tuples that involve those transaction have all been visited by vacuum and thus removed if they were leftover from aborted transactions or deleted, no? So you could just fill those files with the 0x55 pattern (signalling all transactions are committed) and the net result should be the same. No? Forgive me if I'm missing something. I haven't been following this thread and I'm more than a little tired (but wanted to shoot this today because I'm gonna be able to, until Monday). To answer your other question, it is true we _probably_ could assume all the rows were committed, except that again, vacuum might not have run and the pages might not be full so single-page cleanup wasn't done either. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade bug found!
On Sat, Apr 9, 2011 at 7:03 AM, Bruce Momjian br...@momjian.us wrote: Bruce Momjian wrote: Alvaro Herrera wrote: Why is it important to have the original pg_clog files around? Since the transactions in question are below the freeze horizon, surely the tuples that involve those transaction have all been visited by vacuum and thus removed if they were leftover from aborted transactions or deleted, no? So you could just fill those files with the 0x55 pattern (signalling all transactions are committed) and the net result should be the same. No? Forgive me if I'm missing something. I haven't been following this thread and I'm more than a little tired (but wanted to shoot this today because I'm gonna be able to, until Monday). To answer your other question, it is true we _probably_ could assume all the rows were committed, except that again, vacuum might not have run and the pages might not be full so single-page cleanup wasn't done either. OK, continuing the thought of just making all the old clog files as all committed... Since it only affects toast tables, the only time the system (with normal queries) would check for a particular toast tuple, the tuple referring to it would have been committed, right? So forcing all transactions committed for the older clog segments might mean a scan on a *toast* heap might return tuples as committed when they might have been aborted, but the real table heap would never refer to those, right? a. -- Aidan Van Dyk Create like a god, ai...@highrise.ca command like a king, http://www.highrise.ca/ work like a slave. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade bug found!
On Sat, Apr 09, 2011 at 09:05:42AM -0400, Aidan Van Dyk wrote: On Sat, Apr 9, 2011 at 7:03 AM, Bruce Momjian br...@momjian.us wrote: Bruce Momjian wrote: Alvaro Herrera wrote: Why is it important to have the original pg_clog files around? ?Since the transactions in question are below the freeze horizon, surely the tuples that involve those transaction have all been visited by vacuum and thus removed if they were leftover from aborted transactions or deleted, no? ?So you could just fill those files with the 0x55 pattern (signalling all transactions are committed) and the net result should be the same. ?No? Forgive me if I'm missing something. ?I haven't been following this thread and I'm more than a little tired (but wanted to shoot this today because I'm gonna be able to, until Monday). To answer your other question, it is true we _probably_ could assume all the rows were committed, except that again, vacuum might not have run and the pages might not be full so single-page cleanup wasn't done either. OK, continuing the thought of just making all the old clog files as all committed... Since it only affects toast tables, the only time the system (with normal queries) would check for a particular toast tuple, the tuple referring to it would have been committed, right? So forcing all transactions committed for the older clog segments might mean a scan on a *toast* heap might return tuples as committed when they might have been aborted, but the real table heap would never refer to those, right? Yes; it would be relatively harmless to retain some unreferenced TOAST chunks. However, all xacts committed is not equivalent to all tuples visible. If the user rolled back a DELETE shortly before the pg_upgrade run, we need to recognize that outcome to keep any deleted TOAST entries visible. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade bug found!
Aidan Van Dyk wrote: On Sat, Apr 9, 2011 at 7:03 AM, Bruce Momjian br...@momjian.us wrote: Bruce Momjian wrote: Alvaro Herrera wrote: Why is it important to have the original pg_clog files around? ?Since the transactions in question are below the freeze horizon, surely the tuples that involve those transaction have all been visited by vacuum and thus removed if they were leftover from aborted transactions or deleted, no? ?So you could just fill those files with the 0x55 pattern (signalling all transactions are committed) and the net result should be the same. ?No? Forgive me if I'm missing something. ?I haven't been following this thread and I'm more than a little tired (but wanted to shoot this today because I'm gonna be able to, until Monday). To answer your other question, it is true we _probably_ could assume all the rows were committed, except that again, vacuum might not have run and the pages might not be full so single-page cleanup wasn't done either. OK, continuing the thought of just making all the old clog files as all committed... Since it only affects toast tables, the only time the system (with normal queries) would check for a particular toast tuple, the tuple referring to it would have been committed, right? So forcing all transactions committed for the older clog segments might mean a scan on a *toast* heap might return tuples as committed when they might have been aborted, but the real table heap would never refer to those, right? Uh, good point. I think you are right that you only get to a toast row from a non-aborted heap row. I think the problem might be in following the toast chain but even then I am unclear how that works. Anyone? -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade bug found!
On Thu, Apr 07, 2011 at 10:21:06PM -0400, Bruce Momjian wrote: Noah Misch wrote: 1) The pg_class.relfrozenxid that the TOAST table should have received (true relfrozenxid) is still covered by available clog files. Fixable with some combination of pg_class.relfrozenxid twiddling and SET vacuum_freeze_table_age = 0; VACUUM toasttbl. Right, VACUUM FREEZE. I now see I don't need to set vacuum_freeze_table_age if I use the FREEZE keyword, e.g. gram.y has: if (n-options VACOPT_FREEZE) n-freeze_min_age = n-freeze_table_age = 0; True; it just performs more work than strictly necessary. We don't actually need earlier-than-usual freezing. We need only ensure that the relfrozenxid will guide future VACUUMs to do that freezing early enough. However, I'm not sure how to do that without directly updating relfrozenxid, so it's probably just as well to cause some extra work and stick to the standard interface. 2) The true relfrozenxid is no longer covered by available clog files. The fix for case 1 will get file foo doesn't exist, reading as zeroes log messages, and we will treat all transactions as uncommitted. Uh, are you sure? I think it would return an error message about a missing clog file for the query; here is a report of a case not related to pg_upgrade: http://archives.postgresql.org/pgsql-admin/2010-09/msg00109.php My statement was indeed incorrect. (Was looking at the reading as zeroes message in slru.c, but it only applies during recovery.) Not generally fixable after that has happened. We could probably provide a recipe for checking whether it could have happened given access to a backup from just before the upgrade. The IRC folks pulled the clog files off of backups. Since we do get the error after all, that should always be enough. One concern I have is that existing heap tables are protecting clog files, but once those are frozen, the system might remove clog files not realizing it has to freeze the heap tables too. Yes. On a similar note, would it be worth having your prototype fixup script sort the VACUUM FREEZE calls in descending relfrozenxid order? Thanks, nm -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade bug found!
Noah Misch wrote: On Thu, Apr 07, 2011 at 10:21:06PM -0400, Bruce Momjian wrote: Noah Misch wrote: 1) The pg_class.relfrozenxid that the TOAST table should have received (true relfrozenxid) is still covered by available clog files. Fixable with some combination of pg_class.relfrozenxid twiddling and SET vacuum_freeze_table_age = 0; VACUUM toasttbl. Right, VACUUM FREEZE. I now see I don't need to set vacuum_freeze_table_age if I use the FREEZE keyword, e.g. gram.y has: if (n-options VACOPT_FREEZE) n-freeze_min_age = n-freeze_table_age = 0; True; it just performs more work than strictly necessary. We don't actually need earlier-than-usual freezing. We need only ensure that the relfrozenxid will guide future VACUUMs to do that freezing early enough. However, I'm not sure how to do that without directly updating relfrozenxid, so it's probably just as well to cause some extra work and stick to the standard interface. Looking at the code, it seems VACUUM FREEZE will freeze any row it can, and because we restarted the cluster after the upgrade, all the rows we care about are visible or dead to all transactions. pg_upgrade certainly relies on that fact already. 2) The true relfrozenxid is no longer covered by available clog files. The fix for case 1 will get file foo doesn't exist, reading as zeroes log messages, and we will treat all transactions as uncommitted. Uh, are you sure? I think it would return an error message about a missing clog file for the query; here is a report of a case not related to pg_upgrade: http://archives.postgresql.org/pgsql-admin/2010-09/msg00109.php My statement was indeed incorrect. (Was looking at the reading as zeroes message in slru.c, but it only applies during recovery.) No problem. Thanks for the review. Not generally fixable after that has happened. We could probably provide a recipe for checking whether it could have happened given access to a backup from just before the upgrade. The IRC folks pulled the clog files off of backups. Since we do get the error after all, that should always be enough. That was my thought too. One concern I have is that existing heap tables are protecting clog files, but once those are frozen, the system might remove clog files not realizing it has to freeze the heap tables too. Yes. On a similar note, would it be worth having your prototype fixup script sort the VACUUM FREEZE calls in descending relfrozenxid order? Good question. I don't think the relfrozenxid ordering would make sense because I think it is unlikely problems will happen while they are running the script. The other problem is that because of wraparound it is unclear which xid is earliest. What I did add was to order by the oid, so at least the toast numbers are in order and people can more easily track its progress. New version; I made some other small adjustments: -- This script fixes data in pre-PG 9.0.4 and pre-8.4.8 -- servers that was upgraded by pg_upgrade and pg_migrator. -- Run the script using psql for every database in the cluster -- except 'template0', e.g.: -- psql -U postgres -a -f pg_upgrade_fix.sql dbname -- This must be run from a writable directory. -- It will not lock any tables but will generate I/O. -- CREATE TEMPORARY TABLE pg_upgrade_fix AS SELECT 'VACUUM FREEZE pg_toast.' || c.relname || ';' FROMpg_class c, pg_namespace n WHERE c.relnamespace = n.oid AND n.nspname = 'pg_toast' AND c.relkind = 't' ORDER by c.oid; \copy pg_upgrade_fix TO 'pg_upgrade_tmp.sql'; \i pg_upgrade_tmp.sql -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade bug found!
On Thu, 2011-04-07 at 22:21 -0400, Bruce Momjian wrote: One concern I have is that existing heap tables are protecting clog files, but once those are frozen, the system might remove clog files not realizing it has to freeze the heap tables too. I don't understand. Can you elaborate? Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade bug found!
On Fri, 2011-04-08 at 07:08 -0400, Noah Misch wrote: Right, VACUUM FREEZE. I now see I don't need to set vacuum_freeze_table_age if I use the FREEZE keyword, e.g. gram.y has: if (n-options VACOPT_FREEZE) n-freeze_min_age = n-freeze_table_age = 0; True; it just performs more work than strictly necessary. We don't actually need earlier-than-usual freezing. We need only ensure that the relfrozenxid will guide future VACUUMs to do that freezing early enough. However, I'm not sure how to do that without directly updating relfrozenxid, so it's probably just as well to cause some extra work and stick to the standard interface. If there are tuples in a toast table containing xids that are older than the toast table's relfrozenxid, then there are only two options: 1. Make relfrozenxid go backward to the right value. There is currently no mechanism to do this without compiling C code into the server, because (a) VACUUM FREEZE will never move the relfrozenxid backward; and (b) there is no way to find the oldest xid in a table with a normal snapshot. 2. Get rid of those xids older than relfrozenxid (i.e. VACUUM FREEZE). I don't know what you mean about VACUUM FREEZE doing extra work. I suppose you could set the vacuum_freeze_min_age to be exactly the right value such that it freezes everything before the existing (and wrong) relfrozenxid, but in practice I think it would be the same amount of work. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade bug found!
Bruce Momjian wrote: Bruce Momjian wrote: OK, thanks to RhodiumToad on IRC, I was able to determine the cause of the two reported pg_upgrade problems he saw via IRC. It seems toast tables have xids and pg_dump is not preserving the toast relfrozenxids as it should. Heap tables have preserved relfrozenxids, but if you update a heap row but don't change the toast value, and the old heap row is later removed, the toast table can have an older relfrozenxids than the heap table. The fix for this is to have pg_dump preserve toast relfrozenxids, which can be easily added and backpatched. We might want to push a 9.0.4 for this. Second, we need to find a way for people to detect and fix existing systems that have this problem, perhaps looming when the pg_class relfrozenxid passes the toast relfrozenxid, and thirdly, we need to figure out how to get this information to users. Perhaps the communication comes through the 9.0.4 release announcement. I am not sure how to interpret the lack of replies to this email. Either it is confidence, shock, or we told you so. ;-) Anyway, the attached patch fixes the problem. The fix is for pg_dump's binary upgrade mode. This would need to be backpatched back to 8.4 because pg_migrator needs this too. OK, I have applied the attached three patches to 8.4, 9.0, and 9.1. They are all slightly different because of code drift, and I created a unified diff which I find is clearer for single-line changes. I was very careful about the patching of queries because many of these queries are only activated when dumping an older database, and are therefore hard to test for SQL query errors. I included all the version patches in case someone sees something I missed. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c index 3842895..c5057f7 100644 --- a/src/bin/pg_dump/pg_dump.c +++ b/src/bin/pg_dump/pg_dump.c @@ -3185,6 +3185,8 @@ getTables(int *numTables) int i_relhasrules; int i_relhasoids; int i_relfrozenxid; + int i_toastoid; + int i_toastfrozenxid; int i_owning_tab; int i_owning_col; int i_reltablespace; @@ -3226,7 +3228,8 @@ getTables(int *numTables) (%s c.relowner) AS rolname, c.relchecks, c.relhastriggers, c.relhasindex, c.relhasrules, c.relhasoids, - c.relfrozenxid, + c.relfrozenxid, tc.oid AS toid, + tc.relfrozenxid AS tfrozenxid, d.refobjid AS owning_tab, d.refobjsubid AS owning_col, (SELECT spcname FROM pg_tablespace t WHERE t.oid = c.reltablespace) AS reltablespace, @@ -3259,6 +3262,8 @@ getTables(int *numTables) relchecks, (reltriggers 0) AS relhastriggers, relhasindex, relhasrules, relhasoids, relfrozenxid, + 0 AS toid, + 0 AS tfrozenxid, d.refobjid AS owning_tab, d.refobjsubid AS owning_col, (SELECT spcname FROM pg_tablespace t WHERE t.oid = c.reltablespace) AS reltablespace, @@ -3290,6 +3295,8 @@ getTables(int *numTables) relchecks, (reltriggers 0) AS relhastriggers, relhasindex, relhasrules, relhasoids, 0 AS relfrozenxid, + 0 AS toid, + 0 AS tfrozenxid, d.refobjid AS owning_tab, d.refobjsubid AS owning_col, (SELECT spcname FROM pg_tablespace t WHERE t.oid = c.reltablespace) AS reltablespace, @@ -3321,6 +3328,8 @@ getTables(int *numTables) relchecks, (reltriggers 0) AS relhastriggers, relhasindex, relhasrules, relhasoids, 0 AS relfrozenxid, + 0 AS toid, + 0 AS tfrozenxid, d.refobjid AS owning_tab, d.refobjsubid AS owning_col, NULL AS reltablespace, @@ -3348,6 +3357,8 @@ getTables(int *numTables) relchecks, (reltriggers 0) AS relhastriggers, relhasindex, relhasrules, relhasoids, 0 AS relfrozenxid, + 0 AS toid, + 0 AS tfrozenxid, NULL::oid AS owning_tab, NULL::int4 AS owning_col, NULL AS reltablespace, @@ -3370,6 +3381,8 @@ getTables(int *numTables) relhasindex, relhasrules, 't'::bool AS relhasoids, 0 AS relfrozenxid, + 0 AS toid, + 0 AS tfrozenxid, NULL::oid AS owning_tab, NULL::int4 AS owning_col, NULL AS reltablespace, @@ -3446,6 +3459,8 @@ getTables(int *numTables) i_relhasrules = PQfnumber(res, relhasrules); i_relhasoids = PQfnumber(res, relhasoids); i_relfrozenxid = PQfnumber(res, relfrozenxid); + i_toastoid = PQfnumber(res, toid); + i_toastfrozenxid = PQfnumber(res, tfrozenxid); i_owning_tab = PQfnumber(res, owning_tab); i_owning_col = PQfnumber(res, owning_col); i_reltablespace = PQfnumber(res, reltablespace); @@ -3484,6 +3499,8
Re: [HACKERS] pg_upgrade bug found!
Jeff Davis wrote: On Thu, 2011-04-07 at 22:21 -0400, Bruce Momjian wrote: One concern I have is that existing heap tables are protecting clog files, but once those are frozen, the system might remove clog files not realizing it has to freeze the heap tables too. I don't understand. Can you elaborate? Well, when you initially run pg_upgrade, your heap relfrozenxid is preserved, and we only remove clog files when _all_ relations in all database do not need them, so for a time the heap tables will keep the clogs around. Over time, the heap files will be vacuum frozen, and their relfrozenxid advanced. Once that happens to all heaps, the system thinks it can remove clog files, and doesn't realize the toast tables also need vacuuming. This is the it might become more of a problem in the future concern I have. The script I posted does fix this, and the code changes prevent it from happening completely. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade bug found!
Jeff Davis wrote: On Fri, 2011-04-08 at 07:08 -0400, Noah Misch wrote: Right, VACUUM FREEZE. I now see I don't need to set vacuum_freeze_table_age if I use the FREEZE keyword, e.g. gram.y has: if (n-options VACOPT_FREEZE) n-freeze_min_age = n-freeze_table_age = 0; True; it just performs more work than strictly necessary. We don't actually need earlier-than-usual freezing. We need only ensure that the relfrozenxid will guide future VACUUMs to do that freezing early enough. However, I'm not sure how to do that without directly updating relfrozenxid, so it's probably just as well to cause some extra work and stick to the standard interface. If there are tuples in a toast table containing xids that are older than the toast table's relfrozenxid, then there are only two options: 1. Make relfrozenxid go backward to the right value. There is currently no mechanism to do this without compiling C code into the server, because (a) VACUUM FREEZE will never move the relfrozenxid backward; and (b) there is no way to find the oldest xid in a table with a normal snapshot. Right, this is all to complicated. 2. Get rid of those xids older than relfrozenxid (i.e. VACUUM FREEZE). I don't know what you mean about VACUUM FREEZE doing extra work. I suppose you could set the vacuum_freeze_min_age to be exactly the right value such that it freezes everything before the existing (and wrong) relfrozenxid, but in practice I think it would be the same amount of work. We don't know how far back to go with freezing, so we just have to freeze it all. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade bug found!
On Fri, Apr 08, 2011 at 10:05:01AM -0700, Jeff Davis wrote: On Fri, 2011-04-08 at 07:08 -0400, Noah Misch wrote: Right, VACUUM FREEZE. I now see I don't need to set vacuum_freeze_table_age if I use the FREEZE keyword, e.g. gram.y has: if (n-options VACOPT_FREEZE) n-freeze_min_age = n-freeze_table_age = 0; True; it just performs more work than strictly necessary. We don't actually need earlier-than-usual freezing. We need only ensure that the relfrozenxid will guide future VACUUMs to do that freezing early enough. However, I'm not sure how to do that without directly updating relfrozenxid, so it's probably just as well to cause some extra work and stick to the standard interface. If there are tuples in a toast table containing xids that are older than the toast table's relfrozenxid, then there are only two options: 1. Make relfrozenxid go backward to the right value. There is currently no mechanism to do this without compiling C code into the server, because (a) VACUUM FREEZE will never move the relfrozenxid backward; and (b) there is no way to find the oldest xid in a table with a normal snapshot. Couldn't you set relfrozenxid and datfrozenxid to txid_current() - 11 (the highest possible vacuum_freeze_min_age, plus some slop), then run SET vacuum_freeze_table_age = 0; VACUUM tbl on all tables for which you did this? There's no need to set relfrozenxid back to a particular right value. Not suggesting we recommend this, but I can't think offhand why it wouldn't suffice. 2. Get rid of those xids older than relfrozenxid (i.e. VACUUM FREEZE). I don't know what you mean about VACUUM FREEZE doing extra work. I suppose you could set the vacuum_freeze_min_age to be exactly the right value such that it freezes everything before the existing (and wrong) relfrozenxid, but in practice I think it would be the same amount of work. Suppose that your next xid at pg_upgrade time was 500M, and it's now 505M. If you're using the default vacuum_freeze_min_age = 50M, SET vacuum_freeze_table_age = 0; VACUUM tbl will only freeze tuples covering 5M transaction ids. VACUUM FREEZE tbl (a.k.a SET vacuum_freeze_table_age = 0; SET vacuum_freeze_min_age = 0; VACUUM tbl) will freeze tuples covering 55M transaction ids. nm -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade bug found!
Bruce Momjian wrote: New version; I made some other small adjustments: -- This script fixes data in pre-PG 9.0.4 and pre-8.4.8 -- servers that was upgraded by pg_upgrade and pg_migrator. -- Run the script using psql for every database in the cluster -- except 'template0', e.g.: -- psql -U postgres -a -f pg_upgrade_fix.sql dbname -- This must be run from a writable directory. -- It will not lock any tables but will generate I/O. -- CREATE TEMPORARY TABLE pg_upgrade_fix AS SELECT 'VACUUM FREEZE pg_toast.' || c.relname || ';' FROMpg_class c, pg_namespace n WHERE c.relnamespace = n.oid AND n.nspname = 'pg_toast' AND c.relkind = 't' ORDER by c.oid; \copy pg_upgrade_fix TO 'pg_upgrade_tmp.sql'; \i pg_upgrade_tmp.sql OK, now that I have committed the fixes to git, I think it is time to consider how we are going to handle this fix for people who have already used pg_upgrade, or are using it in currently released versions. I am thinking an announce list email with this query would be enough, and state that we are planning a minor release with this fix in the next few weeks. I can provide details on the cause and behavior of the bug. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade bug found!
Bruce Momjian wrote: OK, now that I have committed the fixes to git, I think it is time to consider how we are going to handle this fix for people who have already used pg_upgrade, or are using it in currently released versions. I am thinking an announce list email with this query would be enough, and state that we are planning a minor release with this fix in the next few weeks. I can provide details on the cause and behavior of the bug. OK, here is a draft email announcement: --- A bug has been discovered in all released Postgres versions that performed major version upgrades using pg_upgrade and (formerly) pg_migrator. The bug can cause queries to return the following error: ERROR: could not access status of transaction ## DETAIL: could not open file pg_clog/: No such file or directory This error prevents access to very wide values stored in the database. To prevent such failures, users are encourage to run the following psql script in all upgraded databases as soon as possible; a fix will be included in upcoming Postgres releases 8.4.8 and 9.0.4: -- This script fixes data in pre-PG 9.0.4 and pre-8.4.8 -- servers that was upgraded by pg_upgrade and pg_migrator. -- Run the script using psql for every database in the cluster -- except 'template0', e.g.: -- psql -U postgres -a -f pg_upgrade_fix.sql dbname -- This must be run from a writable directory. -- It will not lock any tables but will generate I/O. -- CREATE TEMPORARY TABLE pg_upgrade_fix AS SELECT 'VACUUM FREEZE pg_toast.' || c.relname || ';' FROMpg_class c, pg_namespace n WHERE c.relnamespace = n.oid AND n.nspname = 'pg_toast' AND c.relkind = 't' ORDER by c.oid; \copy pg_upgrade_fix TO 'pg_upgrade_tmp.sql'; \i pg_upgrade_tmp.sql -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade bug found!
Bruce, * Bruce Momjian (br...@momjian.us) wrote: OK, here is a draft email announcement: Couple suggestions (also on IRC): --- A bug has been discovered in all released versions of pg_upgrade and (formerly) pg_migrator. Anyone who has used pg_upgrade or pg_migrator should take the following corrective actions as soon as possible. This bug can cause queries to return the following error: ERROR: could not access status of transaction ## DETAIL: could not open file pg_clog/: No such file or directory This error prevents access to very wide values stored in the database. To prevent such failures users need to run the following psql script, as the superuser, in all upgraded databases as soon as possible: -- This script fixes data in pre-PG 9.0.4 and pre-8.4.8 -- servers that was upgraded by pg_upgrade and pg_migrator. -- Run the script using psql for every database in the cluster -- except 'template0', e.g.: -- psql -U postgres -a -f pg_upgrade_fix.sql dbname -- This must be run from a writable directory. -- It will not lock any tables but will generate I/O. -- CREATE TEMPORARY TABLE pg_upgrade_fix AS SELECT 'VACUUM FREEZE pg_toast.' || c.relname || ';' FROMpg_class c, pg_namespace n WHERE c.relnamespace = n.oid AND n.nspname = 'pg_toast' AND c.relkind = 't' ORDER by c.oid; \copy pg_upgrade_fix TO 'pg_upgrade_tmp.sql'; \i pg_upgrade_tmp.sql A fix will be included in upcoming Postgres releases 8.4.8 and 9.0.4. The fixed version of pg_uprade will remove the need for the above script by correctly updating the TOAST tables in the migrated database. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] pg_upgrade bug found!
Stephen Frost wrote: -- Start of PGP signed section. Bruce, * Bruce Momjian (br...@momjian.us) wrote: OK, here is a draft email announcement: Couple suggestions (also on IRC): Yes, I like your version better; I did adjust the wording of the last sentence to mention it is really the release, not the new pg_upgrade, which fixes the problem because the fixes are in pg_dump, and hence a new pg_upgrade binary will not work; you need a new install. --- A bug has been discovered in all released versions of pg_upgrade and (formerly) pg_migrator. Anyone who has used pg_upgrade or pg_migrator should take the following corrective actions as soon as possible. This bug can cause queries to return the following error: ERROR: could not access status of transaction ## DETAIL: could not open file pg_clog/: No such file or directory=20 This error prevents access to very wide values stored in the database. To prevent such failures users need to run the following psql script, as the superuser, in all upgraded databases as soon as possible: -- This script fixes data in pre-PG 9.0.4 and pre-8.4.8 -- servers that was upgraded by pg_upgrade and pg_migrator. -- Run the script using psql for every database in the cluster -- except 'template0', e.g.: -- psql -U postgres -a -f pg_upgrade_fix.sql dbname -- This must be run from a writable directory. -- It will not lock any tables but will generate I/O. -- CREATE TEMPORARY TABLE pg_upgrade_fix AS SELECT 'VACUUM FREEZE pg_toast.' || c.relname || ';' FROMpg_class c, pg_namespace n=20 WHERE c.relnamespace =3D n.oid AND=20 n.nspname =3D 'pg_toast' AND c.relkind =3D 't' ORDER by c.oid; \copy pg_upgrade_fix TO 'pg_upgrade_tmp.sql'; \i pg_upgrade_tmp.sql A fix will be included in upcoming Postgres releases 8.4.8 and 9.0.4. These releases will remove the need for the above script by correctly updating all TOAST tables in the migrated databases. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade bug found!
* Bruce Momjian (br...@momjian.us) wrote: Yes, I like your version better; I did adjust the wording of the last sentence to mention it is really the release, not the new pg_upgrade, which fixes the problem because the fixes are in pg_dump, and hence a new pg_upgrade binary will not work; you need a new install. Err, right, good point. You might even want to call that out specifically, so no one is confused. Also this: -- This script fixes data in pre-PG 9.0.4 and pre-8.4.8 -- servers that was upgraded by pg_upgrade and pg_migrator. 'that was' should be 'that were'. A fix will be included in upcoming Postgres releases 8.4.8 and 9.0.4. These releases will remove the need for the above script by correctly updating all TOAST tables in the migrated databases. My suggestion: A fix will be included in upcoming Postgres releases 8.4.8 and 9.0.4. These releases will include an updated pg_dump which will remove the need for the above script by correctly dumping all TOAST tables in the migrated databases. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] pg_upgrade bug found!
On Fri, 2011-04-08 at 13:35 -0400, Noah Misch wrote: 1. Make relfrozenxid go backward to the right value. There is currently no mechanism to do this without compiling C code into the server, because (a) VACUUM FREEZE will never move the relfrozenxid backward; and (b) there is no way to find the oldest xid in a table with a normal snapshot. Couldn't you set relfrozenxid and datfrozenxid to txid_current() - 11 (the highest possible vacuum_freeze_min_age, plus some slop), then run SET vacuum_freeze_table_age = 0; VACUUM tbl on all tables for which you did this? There's no need to set relfrozenxid back to a particular right value. That's a good point that we don't need relfrozenxid to really be the right value; we just need it to be less than or equal to the right value. I don't think you need to mess around with vacuum_freeze_table_age though -- that looks like it's taken care of in the logic for deciding when to do a full table vacuum. This has the additional merit that transaction IDs are not needlessly removed; therefore leaving some forensic information if there are further problems. Suppose that your next xid at pg_upgrade time was 500M, and it's now 505M. If you're using the default vacuum_freeze_min_age = 50M, SET vacuum_freeze_table_age = 0; VACUUM tbl will only freeze tuples covering 5M transaction ids. If the pg_upgrade time was at txid 500M, then the relfrozenxid of the toast table will be about 500M. That means you need to get rid of all xids less than about 500M (unless you already fixed relfrozenxid, perhaps using the process you mention above). So if you only freeze tuples less than about 455M (505M - 50M), then that is wrong. The only difference really is that you don't really need to freeze those last 5M transactions since the upgrade happened. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade bug found!
Stephen Frost wrote: -- Start of PGP signed section. * Bruce Momjian (br...@momjian.us) wrote: Yes, I like your version better; I did adjust the wording of the last sentence to mention it is really the release, not the new pg_upgrade, which fixes the problem because the fixes are in pg_dump, and hence a new pg_upgrade binary will not work; you need a new install. Err, right, good point. You might even want to call that out specifically, so no one is confused. Also this: -- This script fixes data in pre-PG 9.0.4 and pre-8.4.8 -- servers that was upgraded by pg_upgrade and pg_migrator. 'that was' should be 'that were'. A fix will be included in upcoming Postgres releases 8.4.8 and 9.0.4. These releases will remove the need for the above script by correctly updating all TOAST tables in the migrated databases. My suggestion: A fix will be included in upcoming Postgres releases 8.4.8 and 9.0.4. These releases will include an updated pg_dump which will remove the need for the above script by correctly dumping all TOAST tables in the migrated databases. I am worried if I mention pg_dump that people will think pg_dump is broken, when in fact it is only the --binary-upgrade mode of pg_dump that is broken. I adjusted the wording of the last paragraph slighly to be clearer, but hopefully not confuse. We don't actually check the pg_dump version and I am hesistant to add such a check. I was thinking of sending this out on Monday, but now think people might like to have the weekend to fix this so I am thinking of sending it to announce tonight, in 8 hours. OK? --- A bug has been discovered in all released versions of pg_upgrade and (formerly) pg_migrator. Anyone who has used pg_upgrade or pg_migrator should take the following corrective actions as soon as possible. This bug can cause queries to return the following error: ERROR: could not access status of transaction ## DETAIL: could not open file pg_clog/: No such file or directory=20 This error prevents access to very wide values stored in the database. To prevent such failures users need to run the following psql script, as the superuser, in all upgraded databases as soon as possible: -- This script fixes data in pre-PG 9.0.4 and pre-8.4.8 -- servers that was upgraded by pg_upgrade and pg_migrator. -- Run the script using psql for every database in the cluster -- except 'template0', e.g.: -- psql -U postgres -a -f pg_upgrade_fix.sql dbname -- This must be run from a writable directory. -- It will not lock any tables but will generate I/O. -- CREATE TEMPORARY TABLE pg_upgrade_fix AS SELECT 'VACUUM FREEZE pg_toast.' || c.relname || ';' FROMpg_class c, pg_namespace n=20 WHERE c.relnamespace =3D n.oid AND=20 n.nspname =3D 'pg_toast' AND c.relkind =3D 't' ORDER by c.oid; \copy pg_upgrade_fix TO 'pg_upgrade_tmp.sql'; \i pg_upgrade_tmp.sql A fix will be included in upcoming Postgres releases 8.4.8 and 9.0.4. These releases will remove the need for the above script by correctly dumping all TOAST tables in the migrated databases. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade bug found!
Bruce Momjian wrote: I am worried if I mention pg_dump that people will think pg_dump is broken, when in fact it is only the --binary-upgrade mode of pg_dump that is broken. I adjusted the wording of the last paragraph slighly to be clearer, but hopefully not confuse. We don't actually check the pg_dump version and I am hesistant to add such a check. I was thinking of sending this out on Monday, but now think people might like to have the weekend to fix this so I am thinking of sending it to announce tonight, in 8 hours. OK? Updated version with IRC user suggestions: --- Critical Fix for pg_upgrade/pg_migrator Users - A bug has been discovered in all released versions of pg_upgrade and (formerly) pg_migrator. Anyone who has used pg_upgrade or pg_migrator should take the following corrective actions as soon as possible. This bug can cause queries to return the following error: ERROR: could not access status of transaction ## DETAIL: could not open file pg_clog/: No such file or directory=20 This error prevents access to very wide values stored in the database. To prevent such failures users need to run the following psql script, as the superuser, in all upgraded databases as soon as possible: -- This script fixes data in pre-PG 9.0.4 and pre-8.4.8 -- servers that were upgraded by pg_upgrade and pg_migrator. -- Run the script using psql for every database in the cluster -- except 'template0', e.g.: -- psql -U postgres -a -f pg_upgrade_fix.sql dbname -- This must be run from a writable directory. -- It will not lock any tables but will generate I/O. -- CREATE TEMPORARY TABLE pg_upgrade_fix AS SELECT 'VACUUM FREEZE pg_toast.' || c.relname || ';' FROMpg_class c, pg_namespace n WHERE c.relnamespace = n.oid AND n.nspname = 'pg_toast' AND c.relkind = 't' ORDER by c.oid; \copy pg_upgrade_fix TO 'pg_upgrade_tmp.sql'; \i pg_upgrade_tmp.sql A fix will be included in upcoming Postgres releases 8.4.8 and 9.0.4. These releases will remove the need for the above script by correctly restoring all TOAST tables in the migrated databases. 2011-04-08 -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade bug found!
On Fri, Apr 08, 2011 at 12:16:50PM -0700, Jeff Davis wrote: On Fri, 2011-04-08 at 13:35 -0400, Noah Misch wrote: 1. Make relfrozenxid go backward to the right value. There is currently no mechanism to do this without compiling C code into the server, because (a) VACUUM FREEZE will never move the relfrozenxid backward; and (b) there is no way to find the oldest xid in a table with a normal snapshot. Couldn't you set relfrozenxid and datfrozenxid to txid_current() - 11 (the highest possible vacuum_freeze_min_age, plus some slop), then run SET vacuum_freeze_table_age = 0; VACUUM tbl on all tables for which you did this? There's no need to set relfrozenxid back to a particular right value. That's a good point that we don't need relfrozenxid to really be the right value; we just need it to be less than or equal to the right value. I don't think you need to mess around with vacuum_freeze_table_age though -- that looks like it's taken care of in the logic for deciding when to do a full table vacuum. Actually, I think the only reason to VACUUM at all after hacking relfrozenxid is to visit every tuple and see whether you need to restore any clog segments from backup. Suppose your postgresql.conf overrides vacuum_freeze_table_age to the maximum of 2B. If you hacked relfrozenxid and just VACUUMed without modifying vacuum_freeze_table_age, you wouldn't get a full table scan. In another ~1B transactions, you'll get that full-table VACUUM, and it might then discover missing clog segments. Though you wouldn't risk any new clog loss in the mean time, by doing the VACUUM with vacuum_freeze_table_age=0 on each affected table, you can go away confident that any clog restoration is behind you. This has the additional merit that transaction IDs are not needlessly removed; therefore leaving some forensic information if there are further problems. Suppose that your next xid at pg_upgrade time was 500M, and it's now 505M. If you're using the default vacuum_freeze_min_age = 50M, SET vacuum_freeze_table_age = 0; VACUUM tbl will only freeze tuples covering 5M transaction ids. If the pg_upgrade time was at txid 500M, then the relfrozenxid of the toast table will be about 500M. That means you need to get rid of all xids less than about 500M (unless you already fixed relfrozenxid, perhaps using the process you mention above). So if you only freeze tuples less than about 455M (505M - 50M), then that is wrong. Agreed. If you don't fix relfrozenxid, you can't win much in that example. The only difference really is that you don't really need to freeze those last 5M transactions since the upgrade happened. But change the numbers somewhat. Say you ran pg_upgrade at xid 110M. Your TOAST table had relfrozenxid = 100M before pg_upgrade and 110M+epsilon after. The next xid now sits at 170M. Without any manual relfrozenxid changes, any full-table VACUUM will bump the relfrozenxid to 120M. A VACUUM FREEZE would freeze tuples covering 70M transactions, while a VACUUM with vacuum_freeze_table_age = 0 would freeze tuples across only 20M transactions. An unadorned VACUUM wouldn't even perform a full-table scan. All that being said, recommending VACUUM FREEZE seems sensibly conservative. Thanks, nm -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade bug found!
-- It will not lock any tables but will generate I/O. add: IMPORTANT: Depending on the size and configuration of your database, this script may generate a lot of I/O and degrade database performance. Users should execute this script during a low traffic period and watch the database load. -- CREATE TEMPORARY TABLE pg_upgrade_fix AS SELECT 'VACUUM FREEZE pg_toast.' || c.relname || ';' FROMpg_class c, pg_namespace n WHERE c.relnamespace = n.oid AND n.nspname = 'pg_toast' AND c.relkind = 't' ORDER by c.oid; \copy pg_upgrade_fix TO 'pg_upgrade_tmp.sql'; \i pg_upgrade_tmp.sql A fix will be included in upcoming Postgres releases 8.4.8 and 9.0.4. These releases will remove the need for the above script by correctly restoring all TOAST tables in the migrated databases. add: However, users of databases which have been already migrated still need to run the script, even if they upgrade to 9.0.4. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade bug found!
Josh Berkus j...@agliodbs.com writes: -- It will not lock any tables but will generate I/O. add: IMPORTANT: Depending on the size and configuration of your database, this script may generate a lot of I/O and degrade database performance. Users should execute this script during a low traffic period and watch the database load. It might be worth suggesting that people can adjust their vacuum delay parameters to control the I/O load. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade bug found!
Tom Lane wrote: Josh Berkus j...@agliodbs.com writes: -- It will not lock any tables but will generate I/O. add: IMPORTANT: Depending on the size and configuration of your database, this script may generate a lot of I/O and degrade database performance. Users should execute this script during a low traffic period and watch the database load. It might be worth suggesting that people can adjust their vacuum delay parameters to control the I/O load. I talked to Tom about this and I am worried people will adjust it so it takes days to complete. Is that a valid concern? Does anyone have a suggested value? -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade bug found!
Bruce Momjian wrote: Tom Lane wrote: Josh Berkus j...@agliodbs.com writes: -- It will not lock any tables but will generate I/O. add: IMPORTANT: Depending on the size and configuration of your database, this script may generate a lot of I/O and degrade database performance. Users should execute this script during a low traffic period and watch the database load. It might be worth suggesting that people can adjust their vacuum delay parameters to control the I/O load. I talked to Tom about this and I am worried people will adjust it so it takes days to complete. Is that a valid concern? Does anyone have a suggested value? Josh Berkus helped me get lots more details on a wiki page for this: http://wiki.postgresql.org/wiki/20110408pg_upgrade_fix I will reference the wiki in the email announcement. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade bug found!
On Fri, 2011-04-08 at 15:03 -0400, Bruce Momjian wrote: A fix will be included in upcoming Postgres releases 8.4.8 and 9.0.4. These releases will remove the need for the above script by correctly updating all TOAST tables in the migrated databases. You might want to clarify that the fix may be required if you ever used pg_upgrade before. Using the new version of pg_upgrade/dump when you still have a bad relfrozenxid doesn't help. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade bug found!
On Fri, Apr 8, 2011 at 4:00 PM, Jeff Davis pg...@j-davis.com wrote: On Fri, 2011-04-08 at 15:03 -0400, Bruce Momjian wrote: A fix will be included in upcoming Postgres releases 8.4.8 and 9.0.4. These releases will remove the need for the above script by correctly updating all TOAST tables in the migrated databases. You might want to clarify that the fix may be required if you ever used pg_upgrade before. Using the new version of pg_upgrade/dump when you still have a bad relfrozenxid doesn't help. Regards, Jeff Davis I've been noticing in my logs for the past few days the message you note in the wiki. It seems to occur during a vacuum around 7:30am every day. I will be running the suggested script shortly, but can anyone tell me in how bad of shape my db is in? This is our production db with two hot standby's running off it. grep -i 'could not access status of transaction' postgresql-2011-04*.log postgresql-2011-04-06.log:2011-04-06 07:28:27 PDT [15882]: [1-1] (user=postgres) (rhost=[local]) ERROR: could not access status of transaction 1273385235 postgresql-2011-04-07.log:2011-04-07 07:27:14 PDT [29790]: [1-1] (user=postgres) (rhost=[local]) ERROR: could not access status of transaction 1273385235 postgresql-2011-04-08.log:2011-04-08 07:26:35 PDT [2402]: [1-1] (user=postgres) (rhost=[local]) ERROR: could not access status of transaction 1273385235 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade bug found!
On Fri, Apr 8, 2011 at 4:51 PM, bricklen brick...@gmail.com wrote: I've been noticing in my logs for the past few days the message you note in the wiki. It seems to occur during a vacuum around 7:30am every day. I will be running the suggested script shortly, but can anyone tell me in how bad of shape my db is in? This is our production db with two hot standby's running off it. grep -i 'could not access status of transaction' postgresql-2011-04*.log postgresql-2011-04-06.log:2011-04-06 07:28:27 PDT [15882]: [1-1] (user=postgres) (rhost=[local]) ERROR: could not access status of transaction 1273385235 postgresql-2011-04-07.log:2011-04-07 07:27:14 PDT [29790]: [1-1] (user=postgres) (rhost=[local]) ERROR: could not access status of transaction 1273385235 postgresql-2011-04-08.log:2011-04-08 07:26:35 PDT [2402]: [1-1] (user=postgres) (rhost=[local]) ERROR: could not access status of transaction 1273385235 version 9.03, if that helps -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade bug found!
bricklen, * bricklen (brick...@gmail.com) wrote: I've been noticing in my logs for the past few days the message you note in the wiki. It seems to occur during a vacuum around 7:30am every day. I will be running the suggested script shortly, but can anyone tell me in how bad of shape my db is in? This is our production db with two hot standby's running off it. Unfortunately, I don't think the script that Bruce posted will help if the clog files have been removed (which appears to be the case here). Do you have a backup which includes older files which existed under the 'pg_clog' directory under your database's root? Hopefully you do and can restore those and restart the database. If you restore and then restart then Bruce's script could be run and hopefully would clear out these errors. Bruce, please correct me if I got any part of this wrong. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] pg_upgrade bug found!
Hi Stephen, On Fri, Apr 8, 2011 at 6:57 PM, Stephen Frost sfr...@snowman.net wrote: bricklen, * bricklen (brick...@gmail.com) wrote: I've been noticing in my logs for the past few days the message you note in the wiki. It seems to occur during a vacuum around 7:30am every day. I will be running the suggested script shortly, but can anyone tell me in how bad of shape my db is in? This is our production db with two hot standby's running off it. Unfortunately, I don't think the script that Bruce posted will help if the clog files have been removed (which appears to be the case here). Do you have a backup which includes older files which existed under the 'pg_clog' directory under your database's root? Hopefully you do and can restore those and restart the database. If you restore and then restart then Bruce's script could be run and hopefully would clear out these errors. Bruce, please correct me if I got any part of this wrong. Thanks, Stephen I looked deeper into our backup archives, and it appears that I do have the clog file reference in the error message DETAIL: Could not open file pg_clog/04BE: No such file or directory. It exists in an untouched backup directory that I originally made when I set up the backup and ran pg_upgrade. I'm not sure if it is from version 8.4 or 9.0.2 though. Is it safe to just copy it into my production pg_clog dir and restart? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade bug found!
bricklen, * bricklen (brick...@gmail.com) wrote: I looked deeper into our backup archives, and it appears that I do have the clog file reference in the error message DETAIL: Could not open file pg_clog/04BE: No such file or directory. Great! And there's no file in pg_clog which matches that name (or exist which are smaller in value), right? It exists in an untouched backup directory that I originally made when I set up the backup and ran pg_upgrade. I'm not sure if it is from version 8.4 or 9.0.2 though. Is it safe to just copy it into my production pg_clog dir and restart? It should be, provided you're not overwriting any files or putting a clog file in place which is greater than the other clog files in that directory. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] pg_upgrade bug found!
On Fri, Apr 8, 2011 at 7:11 PM, Stephen Frost sfr...@snowman.net wrote: bricklen, * bricklen (brick...@gmail.com) wrote: I looked deeper into our backup archives, and it appears that I do have the clog file reference in the error message DETAIL: Could not open file pg_clog/04BE: No such file or directory. Great! And there's no file in pg_clog which matches that name (or exist which are smaller in value), right? It exists in an untouched backup directory that I originally made when I set up the backup and ran pg_upgrade. I'm not sure if it is from version 8.4 or 9.0.2 though. Is it safe to just copy it into my production pg_clog dir and restart? It should be, provided you're not overwriting any files or putting a clog file in place which is greater than the other clog files in that directory. It appears that there are no files lower. Missing clog: 04BE production pg_clog dir: ls -lhrt 9.0/data/pg_clog total 38M -rw--- 1 postgres postgres 256K Jan 25 21:04 04BF -rw--- 1 postgres postgres 256K Jan 26 12:35 04C0 -rw--- 1 postgres postgres 256K Jan 26 20:58 04C1 -rw--- 1 postgres postgres 256K Jan 27 13:02 04C2 -rw--- 1 postgres postgres 256K Jan 28 01:00 04C3 ... old backup pg_clog dir (possibly v8.4) ... -rw--- 1 postgres postgres 256K Jan 23 21:11 04BB -rw--- 1 postgres postgres 256K Jan 24 08:56 04BC -rw--- 1 postgres postgres 256K Jan 25 06:32 04BD -rw--- 1 postgres postgres 256K Jan 25 10:58 04BE -rw--- 1 postgres postgres 256K Jan 25 20:44 04BF -rw--- 1 postgres postgres 8.0K Jan 25 20:54 04C0 So, if I have this right, my steps to take are: - copy the backup 04BE to production pg_clog dir - restart the database - run Bruce's script Does that sound right? Has anyone else experienced this? I'm leery of testing this on my production db, as our last pg_dump was from early this morning, so I apologize for being so cautious. Thanks, Bricklen -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade bug found!
On Fri, Apr 8, 2011 at 7:20 PM, bricklen brick...@gmail.com wrote: On Fri, Apr 8, 2011 at 7:11 PM, Stephen Frost sfr...@snowman.net wrote: bricklen, * bricklen (brick...@gmail.com) wrote: I looked deeper into our backup archives, and it appears that I do have the clog file reference in the error message DETAIL: Could not open file pg_clog/04BE: No such file or directory. Great! And there's no file in pg_clog which matches that name (or exist which are smaller in value), right? It exists in an untouched backup directory that I originally made when I set up the backup and ran pg_upgrade. I'm not sure if it is from version 8.4 or 9.0.2 though. Is it safe to just copy it into my production pg_clog dir and restart? It should be, provided you're not overwriting any files or putting a clog file in place which is greater than the other clog files in that directory. It appears that there are no files lower. Missing clog: 04BE production pg_clog dir: ls -lhrt 9.0/data/pg_clog total 38M -rw--- 1 postgres postgres 256K Jan 25 21:04 04BF -rw--- 1 postgres postgres 256K Jan 26 12:35 04C0 -rw--- 1 postgres postgres 256K Jan 26 20:58 04C1 -rw--- 1 postgres postgres 256K Jan 27 13:02 04C2 -rw--- 1 postgres postgres 256K Jan 28 01:00 04C3 ... old backup pg_clog dir (possibly v8.4) ... -rw--- 1 postgres postgres 256K Jan 23 21:11 04BB -rw--- 1 postgres postgres 256K Jan 24 08:56 04BC -rw--- 1 postgres postgres 256K Jan 25 06:32 04BD -rw--- 1 postgres postgres 256K Jan 25 10:58 04BE -rw--- 1 postgres postgres 256K Jan 25 20:44 04BF -rw--- 1 postgres postgres 8.0K Jan 25 20:54 04C0 So, if I have this right, my steps to take are: - copy the backup 04BE to production pg_clog dir - restart the database - run Bruce's script Does that sound right? Has anyone else experienced this? I'm leery of testing this on my production db, as our last pg_dump was from early this morning, so I apologize for being so cautious. Thanks, Bricklen What I've tested and current status: When I saw the announcement a few hours ago, I started setting up a 9.0.3 hot standby. I brought it live a few minutes ago. - I copied over the 04BE clog from the original backup, - restarted the standby cluster - ran the script against the main database and turned up a bunch of other transactions that were missing: psql:pg_upgrade_tmp.sql:539: ERROR: could not access status of transaction 1248683931 DETAIL: Could not open file pg_clog/04A6: No such file or directory. psql:pg_upgrade_tmp.sql:540: ERROR: could not access status of transaction 1249010987 DETAIL: Could not open file pg_clog/04A7: No such file or directory. psql:pg_upgrade_tmp.sql:541: ERROR: could not access status of transaction 1250325059 DETAIL: Could not open file pg_clog/04A8: No such file or directory. psql:pg_upgrade_tmp.sql:542: ERROR: could not access status of transaction 1252759918 DETAIL: Could not open file pg_clog/04AA: No such file or directory. psql:pg_upgrade_tmp.sql:543: ERROR: could not access status of transaction 1254527871 DETAIL: Could not open file pg_clog/04AC: No such file or directory. psql:pg_upgrade_tmp.sql:544: ERROR: could not access status of transaction 1256193334 DETAIL: Could not open file pg_clog/04AD: No such file or directory. psql:pg_upgrade_tmp.sql:556: ERROR: could not access status of transaction 1268739471 DETAIL: Could not open file pg_clog/04B9: No such file or directory. I checked, and found that each one of those files exists in the original backup location. - scp'd those files to the hot standby clog directory, - pg_ctl stop -m fast - pg_ctl start - ran the script Hit a bunch of missing clog file errors like above, repeated the scp + bounce + script process 4 or 5 more times until no more missing clog file messages surfaced. Now, is this safe to run against my production database? **Those steps again, to run against prod: cp the clog files from the original backup to dir to my production pg_clog dir bounce the database run the script against all database in the cluster Anyone have any suggestions or changes before I commit myself to this course of action? Thanks, Bricklen -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade bug found!
bricklen, * bricklen (brick...@gmail.com) wrote: Now, is this safe to run against my production database? Yes, with a few caveats. One recommendation is to also increase autovacuum_freeze_max_age to 5 (500m), which will hopefully prevent autovacuum from 'butting in' and causing issues during the process. Also, a database-wide 'VACUUM FREEZE;' should be lower-risk, if you can afford it (it will cause a lot of i/o on the system). The per-table 'VACUUM FREEZE table;' that the script does can end up removing clog files prematurely. Anyone have any suggestions or changes before I commit myself to this course of action? If you run into problems, and perhaps even before starting, you may want to pop in to #postgresql on irc.freenode.net, there are people there who can help you with this process who are very familiar with PG. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] pg_upgrade bug found!
Stephen Frost wrote: -- Start of PGP signed section. bricklen, * bricklen (brick...@gmail.com) wrote: Now, is this safe to run against my production database? Yes, with a few caveats. One recommendation is to also increase autovacuum_freeze_max_age to 5 (500m), which will hopefully prevent autovacuum from 'butting in' and causing issues during the process. Also, a database-wide 'VACUUM FREEZE;' should be lower-risk, if you can afford it (it will cause a lot of i/o on the system). The per-table 'VACUUM FREEZE table;' that the script does can end up removing clog files prematurely. Anyone have any suggestions or changes before I commit myself to this course of action? If you run into problems, and perhaps even before starting, you may want to pop in to #postgresql on irc.freenode.net, there are people there who can help you with this process who are very familiar with PG. Stephen is 100% correct and we have updated the wiki to explain recovery details: http://wiki.postgresql.org/wiki/20110408pg_upgrade_fix -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade bug found!
On Fri, Apr 8, 2011 at 8:07 PM, Bruce Momjian br...@momjian.us wrote: Stephen Frost wrote: -- Start of PGP signed section. bricklen, * bricklen (brick...@gmail.com) wrote: Now, is this safe to run against my production database? Yes, with a few caveats. One recommendation is to also increase autovacuum_freeze_max_age to 5 (500m), which will hopefully prevent autovacuum from 'butting in' and causing issues during the process. Also, a database-wide 'VACUUM FREEZE;' should be lower-risk, if you can afford it (it will cause a lot of i/o on the system). The per-table 'VACUUM FREEZE table;' that the script does can end up removing clog files prematurely. Anyone have any suggestions or changes before I commit myself to this course of action? If you run into problems, and perhaps even before starting, you may want to pop in to #postgresql on irc.freenode.net, there are people there who can help you with this process who are very familiar with PG. Stephen is 100% correct and we have updated the wiki to explain recovery details: http://wiki.postgresql.org/wiki/20110408pg_upgrade_fix Thanks guys, I really appreciate your help. For the vacuum freeze, you say database-wide, should I run vacuumdb -a -v -F ? Will freezing the other tables in the cluster help (not sure how that works with template0/1 databases?) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade bug found!
bricklen, * bricklen (brick...@gmail.com) wrote: Thanks guys, I really appreciate your help. For the vacuum freeze, you say database-wide, should I run vacuumdb -a -v -F ? Will freezing the other tables in the cluster help (not sure how that works with template0/1 databases?) Yes, using the command-line 'vacuumdb -a -v -F' would work. It won't try to vacuum template0, and doing template1 is correct. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] pg_upgrade bug found!
bricklen wrote: On Fri, Apr 8, 2011 at 8:07 PM, Bruce Momjian br...@momjian.us wrote: Stephen Frost wrote: -- Start of PGP signed section. bricklen, * bricklen (brick...@gmail.com) wrote: Now, is this safe to run against my production database? Yes, with a few caveats. ?One recommendation is to also increase autovacuum_freeze_max_age to 5 (500m), which will hopefully prevent autovacuum from 'butting in' and causing issues during the process. ?Also, a database-wide 'VACUUM FREEZE;' should be lower-risk, if you can afford it (it will cause a lot of i/o on the system). ?The per-table 'VACUUM FREEZE table;' that the script does can end up removing clog files prematurely. Anyone have any suggestions or changes before I commit myself to this course of action? If you run into problems, and perhaps even before starting, you may want to pop in to #postgresql on irc.freenode.net, there are people there who can help you with this process who are very familiar with PG. Stephen is 100% correct and we have updated the wiki to explain recovery details: ? ? ? ?http://wiki.postgresql.org/wiki/20110408pg_upgrade_fix Thanks guys, I really appreciate your help. For the vacuum freeze, you say database-wide, should I run vacuumdb -a -v -F ? Will freezing the other tables in the cluster help (not sure how that works with template0/1 databases?) Exactly. Internally pg_upgrade uses: vacuumdb --all --freeze in the empty new cluster to remove any links to pg_clog files (because we copy the old pg_clog files into the new cluster directory). (This is proof that the old and new clog files are the same format.) If you run vacuumdb as above in the new cluster, it will again remove any requirement on pg_clog, which is our goal. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade bug found!
Why is it important to have the original pg_clog files around? Since the transactions in question are below the freeze horizon, surely the tuples that involve those transaction have all been visited by vacuum and thus removed if they were leftover from aborted transactions or deleted, no? So you could just fill those files with the 0x55 pattern (signalling all transactions are committed) and the net result should be the same. No? Forgive me if I'm missing something. I haven't been following this thread and I'm more than a little tired (but wanted to shoot this today because I'm gonna be able to, until Monday). -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade bug found!
On Fri, Apr 8, 2011 at 9:28 PM, Alvaro Herrera alvhe...@commandprompt.com wrote: Why is it important to have the original pg_clog files around? Since the transactions in question are below the freeze horizon, surely the tuples that involve those transaction have all been visited by vacuum and thus removed if they were leftover from aborted transactions or deleted, no? So you could just fill those files with the 0x55 pattern (signalling all transactions are committed) and the net result should be the same. No? Forgive me if I'm missing something. I haven't been following this thread and I'm more than a little tired (but wanted to shoot this today because I'm gonna be able to, until Monday). -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support Update on the status of the steps we took, which were: - test on a hot standby by bringing it live, running the script, determing the missing clog files, copying them into the live (hot standby) pg_clog dir Now, on the master, copied the same old clog files into the production *master*, ran vacuumdb -a -v -F. The step I should have taken on the master before the vacuumdb -F would have been to run the http://wiki.postgresql.org/wiki/20110408pg_upgrade_fix script to see if I was missing any clog files on the master. That vacuum freeze step pointed out a clog file, I copied that into the master pg_clog dir, ran the aforementioned script. It didn't fail on any of the clog files this time, so now I am rerunning the vacuum freeze command and hoping like hell it works! If the current run of the vacuum freeze fails, I'll report back. Thanks again for everyone's help. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade bug found!
On Fri, Apr 8, 2011 at 10:01 PM, bricklen brick...@gmail.com wrote: Update on the status of the steps we took, which were: - test on a hot standby by bringing it live, running the script, determing the missing clog files, copying them into the live (hot standby) pg_clog dir Now, on the master, copied the same old clog files into the production *master*, ran vacuumdb -a -v -F. The step I should have taken on the master before the vacuumdb -F would have been to run the http://wiki.postgresql.org/wiki/20110408pg_upgrade_fix script to see if I was missing any clog files on the master. That vacuum freeze step pointed out a clog file, I copied that into the master pg_clog dir, ran the aforementioned script. It didn't fail on any of the clog files this time, so now I am rerunning the vacuum freeze command and hoping like hell it works! If the current run of the vacuum freeze fails, I'll report back. Thanks again for everyone's help. The vacuumdb -a -v F completed successfully this time. Cheers! -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade bug found!
Bruce Momjian wrote: OK, thanks to RhodiumToad on IRC, I was able to determine the cause of the two reported pg_upgrade problems he saw via IRC. It seems toast tables have xids and pg_dump is not preserving the toast relfrozenxids as it should. Heap tables have preserved relfrozenxids, but if you update a heap row but don't change the toast value, and the old heap row is later removed, the toast table can have an older relfrozenxids than the heap table. The fix for this is to have pg_dump preserve toast relfrozenxids, which can be easily added and backpatched. We might want to push a 9.0.4 for this. Second, we need to find a way for people to detect and fix existing systems that have this problem, perhaps looming when the pg_class relfrozenxid passes the toast relfrozenxid, and thirdly, we need to figure out how to get this information to users. Perhaps the communication comes through the 9.0.4 release announcement. I am not sure how to interpret the lack of replies to this email. Either it is confidence, shock, or we told you so. ;-) Anyway, the attached patch fixes the problem. The fix is for pg_dump's binary upgrade mode. This would need to be backpatched back to 8.4 because pg_migrator needs this too. I have added a personal regression test to show which pg_class.relfrozenxid values are not preserved, and with this patch the only ones not preserved are toast tables used by system tables, which are not copied from the old cluster (FirstNormalObjectId = 16384). I am attaching that old/new pg_class.relfrozenxid diff as well. Any idea how to correct existing systems? Would VACUUM FREEZE of just the toast tables work? I perhaps could create a short DO block that would vacuum freeze just toast tables; it would have to be run in every database. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c new file mode 100644 index 3f6e77b..1ccdb4d *** a/src/bin/pg_dump/pg_dump.c --- b/src/bin/pg_dump/pg_dump.c *** getTables(int *numTables) *** 3812,3817 --- 3812,3819 int i_relhasrules; int i_relhasoids; int i_relfrozenxid; + int i_toastoid; + int i_toastfrozenxid; int i_relpersistence; int i_owning_tab; int i_owning_col; *** getTables(int *numTables) *** 3855,3861 (%s c.relowner) AS rolname, c.relchecks, c.relhastriggers, c.relhasindex, c.relhasrules, c.relhasoids, ! c.relfrozenxid, c.relpersistence, CASE WHEN c.reloftype 0 THEN c.reloftype::pg_catalog.regtype ELSE NULL END AS reloftype, d.refobjid AS owning_tab, d.refobjsubid AS owning_col, --- 3857,3865 (%s c.relowner) AS rolname, c.relchecks, c.relhastriggers, c.relhasindex, c.relhasrules, c.relhasoids, ! c.relfrozenxid, tc.oid AS toid, ! tc.relfrozenxid AS tfrozenxid, ! c.relpersistence, CASE WHEN c.reloftype 0 THEN c.reloftype::pg_catalog.regtype ELSE NULL END AS reloftype, d.refobjid AS owning_tab, d.refobjsubid AS owning_col, *** getTables(int *numTables) *** 3889,3895 (%s c.relowner) AS rolname, c.relchecks, c.relhastriggers, c.relhasindex, c.relhasrules, c.relhasoids, ! c.relfrozenxid, 'p' AS relpersistence, CASE WHEN c.reloftype 0 THEN c.reloftype::pg_catalog.regtype ELSE NULL END AS reloftype, d.refobjid AS owning_tab, d.refobjsubid AS owning_col, --- 3893,3901 (%s c.relowner) AS rolname, c.relchecks, c.relhastriggers, c.relhasindex, c.relhasrules, c.relhasoids, ! c.relfrozenxid, tc.oid AS toid, ! tc.relfrozenxid AS tfrozenxid, ! 'p' AS relpersistence, CASE WHEN c.reloftype 0 THEN c.reloftype::pg_catalog.regtype ELSE NULL END AS reloftype, d.refobjid AS owning_tab, d.refobjsubid AS owning_col, *** getTables(int *numTables) *** 3922,3928 (%s c.relowner) AS rolname, c.relchecks, c.relhastriggers, c.relhasindex, c.relhasrules, c.relhasoids, ! c.relfrozenxid, 'p' AS relpersistence, NULL AS reloftype, d.refobjid AS owning_tab, d.refobjsubid AS owning_col, --- 3928,3936 (%s c.relowner) AS rolname, c.relchecks, c.relhastriggers, c.relhasindex, c.relhasrules, c.relhasoids, ! c.relfrozenxid, tc.oid AS toid, ! tc.relfrozenxid AS tfrozenxid, ! 'p' AS relpersistence, NULL AS reloftype, d.refobjid AS owning_tab, d.refobjsubid AS owning_col, *** getTables(int *numTables) *** 3955,3961 (%s relowner)
Re: [HACKERS] pg_upgrade bug found!
On 4/7/11 9:16 AM, Bruce Momjian wrote: OK, thanks to RhodiumToad on IRC, I was able to determine the cause of the two reported pg_upgrade problems he saw via IRC. BTW, just for the release notes, RhodiumToad == Andrew Gierth. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade bug found!
On Thu, 2011-04-07 at 12:16 -0400, Bruce Momjian wrote: Bruce Momjian wrote: OK, thanks to RhodiumToad on IRC, I was able to determine the cause of the two reported pg_upgrade problems he saw via IRC. It seems toast tables have xids and pg_dump is not preserving the toast relfrozenxids as it should. Heap tables have preserved relfrozenxids, but if you update a heap row but don't change the toast value, and the old heap row is later removed, the toast table can have an older relfrozenxids than the heap table. The fix for this is to have pg_dump preserve toast relfrozenxids, which can be easily added and backpatched. We might want to push a 9.0.4 for this. Second, we need to find a way for people to detect and fix existing systems that have this problem, perhaps looming when the pg_class relfrozenxid passes the toast relfrozenxid, and thirdly, we need to figure out how to get this information to users. Perhaps the communication comes through the 9.0.4 release announcement. I am not sure how to interpret the lack of replies to this email. Either it is confidence, shock, or we told you so. ;-) Anyway, the attached patch fixes the problem. The fix is for pg_dump's binary upgrade mode. This would need to be backpatched back to 8.4 because pg_migrator needs this too. I have added a personal regression test to show which pg_class.relfrozenxid values are not preserved, and with this patch the only ones not preserved are toast tables used by system tables, which are not copied from the old cluster (FirstNormalObjectId = 16384). I am attaching that old/new pg_class.relfrozenxid diff as well. Any idea how to correct existing systems? Would VACUUM FREEZE of just the toast tables work? VACUUM FREEZE will never set the relfrozenxid backward. If it was never preserved to begin with, I assume that the existing value could be arbitrarily before or after, so it might not be updated. I think that after you VACUUM FREEZE the toast table, then the real oldest frozen xid (as opposed to the bad value in relfrozenxid for the toast table) would have to be the same or newer than that of the heap. Right? That means you could safely copy the heap's relfrozenxid to the relfrozenxid of its toast table. I perhaps could create a short DO block that would vacuum freeze just toast tables; it would have to be run in every database. Well, that won't work, because VACUUM can't be executed in a transaction block or function. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade bug found!
On Thu, Apr 07, 2011 at 12:16:55PM -0400, Bruce Momjian wrote: Bruce Momjian wrote: OK, thanks to RhodiumToad on IRC, I was able to determine the cause of the two reported pg_upgrade problems he saw via IRC. It seems toast tables have xids and pg_dump is not preserving the toast relfrozenxids as it should. Heap tables have preserved relfrozenxids, but if you update a heap row but don't change the toast value, and the old heap row is later removed, the toast table can have an older relfrozenxids than the heap table. The fix for this is to have pg_dump preserve toast relfrozenxids, which can be easily added and backpatched. We might want to push a 9.0.4 for this. Second, we need to find a way for people to detect and fix existing systems that have this problem, perhaps looming when the pg_class relfrozenxid passes the toast relfrozenxid, and thirdly, we need to figure out how to get this information to users. Perhaps the communication comes through the 9.0.4 release announcement. I am not sure how to interpret the lack of replies to this email. Either it is confidence, shock, or we told you so. ;-) Your explanation and patch make sense. Seems all too clear in retrospect. Any idea how to correct existing systems? Would VACUUM FREEZE of just the toast tables work? I perhaps could create a short DO block that would vacuum freeze just toast tables; it would have to be run in every database. I see three cases: 1) The pg_class.relfrozenxid that the TOAST table should have received (true relfrozenxid) is still covered by available clog files. Fixable with some combination of pg_class.relfrozenxid twiddling and SET vacuum_freeze_table_age = 0; VACUUM toasttbl. 2) The true relfrozenxid is no longer covered by available clog files. The fix for case 1 will get file foo doesn't exist, reading as zeroes log messages, and we will treat all transactions as uncommitted. Not generally fixable after that has happened. We could probably provide a recipe for checking whether it could have happened given access to a backup from just before the upgrade. 3) Enough transaction xids have elapsed such that the true relfrozenxid is again covered by clog files, but those records are unrelated to the original transactions. Actually, I don't think this can happen, even with the maximum autovacuum_freeze_max_age. I haven't tested those, so I'm sure there's some error in that assessment. nm -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade bug found!
Jeff Davis wrote: I have added a personal regression test to show which pg_class.relfrozenxid values are not preserved, and with this patch the only ones not preserved are toast tables used by system tables, which are not copied from the old cluster (FirstNormalObjectId = 16384). I am attaching that old/new pg_class.relfrozenxid diff as well. Any idea how to correct existing systems? Would VACUUM FREEZE of just the toast tables work? VACUUM FREEZE will never set the relfrozenxid backward. If it was never preserved to begin with, I assume that the existing value could be arbitrarily before or after, so it might not be updated. I think that after you VACUUM FREEZE the toast table, then the real oldest frozen xid (as opposed to the bad value in relfrozenxid for the toast table) would have to be the same or newer than that of the heap. Right? That means you could safely copy the heap's relfrozenxid to the relfrozenxid of its toast table. OK, so the only other idea I have is to write some pretty complicated query function that does a sequential scan of each toast table and pulls the earliest xmin/xmax from the tables and use that to set the relfrozenxid (pretty complicated because it has to deal with the freeze horizon and wraparound). I perhaps could create a short DO block that would vacuum freeze just toast tables; it would have to be run in every database. Well, that won't work, because VACUUM can't be executed in a transaction block or function. Good point. The only bright part of this is that missing clog will throw an error so we are not returning incorrect data, and hopefully people will report problems to us when it happens. Ideally I would like to get this patch and correction code out into the field in case more people run into this problem. I know some will, I just don't know how many. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade bug found!
On Thu, Apr 7, 2011 at 3:46 PM, Bruce Momjian br...@momjian.us wrote: Jeff Davis wrote: I have added a personal regression test to show which pg_class.relfrozenxid values are not preserved, and with this patch the only ones not preserved are toast tables used by system tables, which are not copied from the old cluster (FirstNormalObjectId = 16384). I am attaching that old/new pg_class.relfrozenxid diff as well. Any idea how to correct existing systems? Would VACUUM FREEZE of just the toast tables work? VACUUM FREEZE will never set the relfrozenxid backward. If it was never preserved to begin with, I assume that the existing value could be arbitrarily before or after, so it might not be updated. I think that after you VACUUM FREEZE the toast table, then the real oldest frozen xid (as opposed to the bad value in relfrozenxid for the toast table) would have to be the same or newer than that of the heap. Right? That means you could safely copy the heap's relfrozenxid to the relfrozenxid of its toast table. OK, so the only other idea I have is to write some pretty complicated query function that does a sequential scan of each toast table and pulls the earliest xmin/xmax from the tables and use that to set the relfrozenxid (pretty complicated because it has to deal with the freeze horizon and wraparound). I perhaps could create a short DO block that would vacuum freeze just toast tables; it would have to be run in every database. Well, that won't work, because VACUUM can't be executed in a transaction block or function. Good point. The only bright part of this is that missing clog will throw an error so we are not returning incorrect data, and hopefully people will report problems to us when it happens. Ideally I would like to get this patch and correction code out into the field in case more people run into this problem. I know some will, I just don't know how many. ISTM we need to force a minor release once we are sure this has been corrected. We had also probably put out an announcement warning people that have already used pg_upgrade of possible data corruption. I'm not sure exactly what the language around that should be, but this does seem pretty bad. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade bug found!
Robert Haas robertmh...@gmail.com wrote: ISTM we need to force a minor release once we are sure this has been corrected. We had also probably put out an announcement warning people that have already used pg_upgrade of possible data corruption. I'm not sure exactly what the language around that should be, but this does seem pretty bad. We just used this to upgrade all of our databases to 9.0. Most of those (particularly the databases where data originates) have VACUUM FREEZE ANALYZE run nightly, and we ran this against all databases right after each pg_upgrade. Will that have offered us some protection from this bug? -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade bug found!
On Thu, 2011-04-07 at 15:46 -0400, Bruce Momjian wrote: OK, so the only other idea I have is to write some pretty complicated query function that does a sequential scan of each toast table and pulls the earliest xmin/xmax from the tables and use that to set the relfrozenxid (pretty complicated because it has to deal with the freeze horizon and wraparound). That sounds like the correct way to fix the situation, although it's a little more work to install another function just for this one-time purpose. TransactionIdPrecedes() should already account for wraparound, so I don't think that it will be too complicated (make sure to read every tuple though, not just the ones currently visible). Stepping back a second to make sure I understand the problem: the only problem is that relfrozenxid on the toast table after an upgrade is wrong. Correct? Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade bug found!
Jeff Davis wrote: On Thu, 2011-04-07 at 15:46 -0400, Bruce Momjian wrote: OK, so the only other idea I have is to write some pretty complicated query function that does a sequential scan of each toast table and pulls the earliest xmin/xmax from the tables and use that to set the relfrozenxid (pretty complicated because it has to deal with the freeze horizon and wraparound). That sounds like the correct way to fix the situation, although it's a little more work to install another function just for this one-time purpose. TransactionIdPrecedes() should already account for wraparound, so I don't think that it will be too complicated (make sure to read every tuple though, not just the ones currently visible). I want to avoid anything that requires a compile because they are hard for many sites to install so TransactionIdPrecedes() is out. We will need to do this in PL/pgSQL probably. Stepping back a second to make sure I understand the problem: the only problem is that relfrozenxid on the toast table after an upgrade is wrong. Correct? Yes, it was not restored from the old cluster. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade bug found!
Jeff Davis wrote: On Thu, 2011-04-07 at 15:46 -0400, Bruce Momjian wrote: OK, so the only other idea I have is to write some pretty complicated query function that does a sequential scan of each toast table and pulls the earliest xmin/xmax from the tables and use that to set the relfrozenxid (pretty complicated because it has to deal with the freeze horizon and wraparound). That sounds like the correct way to fix the situation, although it's a little more work to install another function just for this one-time purpose. TransactionIdPrecedes() should already account for wraparound, so I don't think that it will be too complicated (make sure to read every tuple though, not just the ones currently visible). Stepping back a second to make sure I understand the problem: the only problem is that relfrozenxid on the toast table after an upgrade is wrong. Correct? One minimal solution might be to set the toast relfozenxid to match the heap frozenxid? Ideas? It is not 100% accurate but it might help. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade bug found!
Robert Haas wrote: Well, that won't work, because VACUUM can't be executed in a transaction block or function. Good point. The only bright part of this is that missing clog will throw an error so we are not returning incorrect data, and hopefully people will report problems to us when it happens. Ideally I would like to get this patch and correction code out into the field in case more people run into this problem. ?I know some will, I just don't know how many. ISTM we need to force a minor release once we are sure this has been corrected. We had also probably put out an announcement warning people that have already used pg_upgrade of possible data corruption. I'm not sure exactly what the language around that should be, but this does seem pretty bad. Yep, pretty bad it is. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade bug found!
Bruce Momjian wrote: Robert Haas wrote: Well, that won't work, because VACUUM can't be executed in a transaction block or function. Good point. The only bright part of this is that missing clog will throw an error so we are not returning incorrect data, and hopefully people will report problems to us when it happens. Ideally I would like to get this patch and correction code out into the field in case more people run into this problem. ?I know some will, I just don't know how many. ISTM we need to force a minor release once we are sure this has been corrected. We had also probably put out an announcement warning people that have already used pg_upgrade of possible data corruption. I'm not sure exactly what the language around that should be, but this does seem pretty bad. Yep, pretty bad it is. The bug exists because I did not realize that the toast relfrozenxid is tracked independently of the heap, until the IRC report diagnosis. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade bug found!
On Thu, 2011-04-07 at 17:06 -0400, Bruce Momjian wrote: I want to avoid anything that requires a compile because they are hard for many sites to install so TransactionIdPrecedes() is out. We will need to do this in PL/pgSQL probably. PL/pgSQL can't see dead rows, so that would not be correct. It's guaranteed to be the same value you see from the heap or newer; because if it's not visible in the heap, it's not going to be visible in the toast table. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade bug found!
Jeff Davis wrote: On Thu, 2011-04-07 at 17:06 -0400, Bruce Momjian wrote: I want to avoid anything that requires a compile because they are hard for many sites to install so TransactionIdPrecedes() is out. We will need to do this in PL/pgSQL probably. PL/pgSQL can't see dead rows, so that would not be correct. It's guaranteed to be the same value you see from the heap or newer; because if it's not visible in the heap, it's not going to be visible in the toast table. Well, frankly all we need to do is set those hint bits before the clog gets remove, so maybe just a SELECT * would do the trick! That and maybe set the relfrozenxid to match the heap. It is there now or more people would be reporting problems. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade bug found!
Bruce Momjian wrote: Jeff Davis wrote: On Thu, 2011-04-07 at 17:06 -0400, Bruce Momjian wrote: I want to avoid anything that requires a compile because they are hard for many sites to install so TransactionIdPrecedes() is out. We will need to do this in PL/pgSQL probably. PL/pgSQL can't see dead rows, so that would not be correct. It's guaranteed to be the same value you see from the heap or newer; because if it's not visible in the heap, it's not going to be visible in the toast table. Well, frankly all we need to do is set those hint bits before the clog gets remove, so maybe just a SELECT * would do the trick! That and maybe set the relfrozenxid to match the heap. It is there now or more people would be reporting problems. Clarification, the clog is there now or more people would be reporting problems. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade bug found!
Bruce Momjian br...@momjian.us wrote: all we need to do is set those hint bits before the clog gets remove, so maybe just a SELECT * would do the trick! Does that mean that those experiencing the problem are failing to do the vacuumdb run which is recommended in the pg_upgrade instructions? -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade bug found!
Kevin Grittner wrote: Bruce Momjian br...@momjian.us wrote: all we need to do is set those hint bits before the clog gets remove, so maybe just a SELECT * would do the trick! Does that mean that those experiencing the problem are failing to do the vacuumdb run which is recommended in the pg_upgrade instructions? You know, I looked at that, but I don't think that is going to save me. :-( It says: Upgrade complete | Optimizer statistics are not transferred by pg_upgrade | so consider running: | vacuumdb --all --analyze-only | on the newly-upgraded cluster. | Running this script will delete the old cluster's data files: | /usr/var/local/pgdev/pgfoundry/pg_migrator/pg_migrator/delete_old_cluster.sh We recommend 'vacuumdb --all --analyze-only' which I assume only samples random pages and does not set all the hint bits. In fact, you can't even analyze TOAST tables: test= ANALYZE pg_toast.pg_toast_3596; WARNING: skipping pg_toast_3596 --- cannot analyze non-tables or special system tables ANALYZE but you can SELECT from them: chunk_id | chunk_seq | chunk_data --+---+ (0 rows) Also, if we force VACUUM FREEZE on the toast tables we would have no need to advance their relfrozenxids because all the xids would be fixed. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade bug found!
On Thu, 2011-04-07 at 12:38 -0700, Jeff Davis wrote: Any idea how to correct existing systems? Would VACUUM FREEZE of just the toast tables work? VACUUM FREEZE will never set the relfrozenxid backward. If it was never preserved to begin with, I assume that the existing value could be arbitrarily before or after, so it might not be updated. Now that I understand the problem a little better, I think VACUUM FREEZE might work, after all. Originally, I thought that the toast table's relfrozenxid could be some arbitrarily wrong value. But actually, the CREATE TABLE is issued after the xid of the new cluster has already been advanced to the xid of the old cluster, so it should be a somewhat reasonable value. That means that VACUUM FREEZE of the toast table, if there are no concurrent transactions, will freeze all of the tuples; and the newFrozenXid should always be seen as newer than the existing (and wrong) relfrozenxid. Then, it will set relfrozenxid to newFrozenXid and everything should be fine. Right? Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade bug found!
Jeff Davis wrote: On Thu, 2011-04-07 at 12:38 -0700, Jeff Davis wrote: Any idea how to correct existing systems? Would VACUUM FREEZE of just the toast tables work? VACUUM FREEZE will never set the relfrozenxid backward. If it was never preserved to begin with, I assume that the existing value could be arbitrarily before or after, so it might not be updated. Now that I understand the problem a little better, I think VACUUM FREEZE might work, after all. Good. I don't want to be inventing something complex if I can avoid it. Simple is good, espeically if admins panic. I would rather simple and longer than short but complex :-) Originally, I thought that the toast table's relfrozenxid could be some arbitrarily wrong value. But actually, the CREATE TABLE is issued after the xid of the new cluster has already been advanced to the xid of the old cluster, so it should be a somewhat reasonable value. Yes, it will be reasonable. That means that VACUUM FREEZE of the toast table, if there are no concurrent transactions, will freeze all of the tuples; and the newFrozenXid should always be seen as newer than the existing (and wrong) relfrozenxid. Then, it will set relfrozenxid to newFrozenXid and everything should be fine. Right? Right. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade bug found!
On Thu, Apr 7, 2011 at 5:52 PM, Bruce Momjian br...@momjian.us wrote: Jeff Davis wrote: On Thu, 2011-04-07 at 12:38 -0700, Jeff Davis wrote: Any idea how to correct existing systems? Would VACUUM FREEZE of just the toast tables work? VACUUM FREEZE will never set the relfrozenxid backward. If it was never preserved to begin with, I assume that the existing value could be arbitrarily before or after, so it might not be updated. Now that I understand the problem a little better, I think VACUUM FREEZE might work, after all. Good. I don't want to be inventing something complex if I can avoid it. Simple is good, espeically if admins panic. I would rather simple and longer than short but complex :-) Originally, I thought that the toast table's relfrozenxid could be some arbitrarily wrong value. But actually, the CREATE TABLE is issued after the xid of the new cluster has already been advanced to the xid of the old cluster, so it should be a somewhat reasonable value. Yes, it will be reasonable. That means that VACUUM FREEZE of the toast table, if there are no concurrent transactions, will freeze all of the tuples; and the newFrozenXid should always be seen as newer than the existing (and wrong) relfrozenxid. Then, it will set relfrozenxid to newFrozenXid and everything should be fine. Right? Right. This depends on how soon after the upgrade VACUUM FREEZE is run, doesn't it? If the XID counter has advanced too far... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade bug found!
Robert Haas wrote: On Thu, Apr 7, 2011 at 5:52 PM, Bruce Momjian br...@momjian.us wrote: Jeff Davis wrote: On Thu, 2011-04-07 at 12:38 -0700, Jeff Davis wrote: Any idea how to correct existing systems? ?Would VACUUM FREEZE of just the toast tables work? VACUUM FREEZE will never set the relfrozenxid backward. If it was never preserved to begin with, I assume that the existing value could be arbitrarily before or after, so it might not be updated. Now that I understand the problem a little better, I think VACUUM FREEZE might work, after all. Good. ?I don't want to be inventing something complex if I can avoid it. Simple is good, espeically if admins panic. ?I would rather simple and longer than short but complex ?:-) Originally, I thought that the toast table's relfrozenxid could be some arbitrarily wrong value. But actually, the CREATE TABLE is issued after the xid of the new cluster has already been advanced to the xid of the old cluster, so it should be a somewhat reasonable value. Yes, it will be reasonable. That means that VACUUM FREEZE of the toast table, if there are no concurrent transactions, will freeze all of the tuples; and the newFrozenXid should always be seen as newer than the existing (and wrong) relfrozenxid. Then, it will set relfrozenxid to newFrozenXid and everything should be fine. Right? Right. This depends on how soon after the upgrade VACUUM FREEZE is run, doesn't it? If the XID counter has advanced too far... Well, I assume VACUUM FREEZE is going to sequential scan the table and replace every xid. If the clog is gone, well, we have problems. I think the IRC reporter pulled the clog files from a backup. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade bug found!
Bruce Momjian wrote: Yes, it will be reasonable. That means that VACUUM FREEZE of the toast table, if there are no concurrent transactions, will freeze all of the tuples; and the newFrozenXid should always be seen as newer than the existing (and wrong) relfrozenxid. Then, it will set relfrozenxid to newFrozenXid and everything should be fine. Right? Right. This depends on how soon after the upgrade VACUUM FREEZE is run, doesn't it? If the XID counter has advanced too far... Well, I assume VACUUM FREEZE is going to sequential scan the table and replace every xid. If the clog is gone, well, we have problems. I think the IRC reporter pulled the clog files from a backup. So I think we have four possible approaches to correct databases: 1) SELECT * to set the hint bits 2) VACUUM to set the hint bits 3) VACUUM FREEZE to remove the old xids 4) some complicated function I don't like #4, and I think I can script #2 and #3 in psql by using COPY to create a VACUUM script and then run it with \i. #1 is easy in a DO block with PL/pgSQL. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade bug found!
On Thu, 2011-04-07 at 20:14 -0400, Bruce Momjian wrote: So I think we have four possible approaches to correct databases: 1) SELECT * to set the hint bits 2) VACUUM to set the hint bits 3) VACUUM FREEZE to remove the old xids 4) some complicated function I don't like #4, and I think I can script #2 and #3 in psql by using COPY to create a VACUUM script and then run it with \i. #1 is easy in a DO block with PL/pgSQL. The only one that sounds very reasonable to me is #3. If there are any xids older than the relfrozenxid, we need to get rid of them. If there is some reason that doesn't work, I suppose we can consider the alternatives. But I don't like the hint-bit-setting approach much. What if the xmax is really a transaction that got an exclusive lock on the tuple, rather than actually deleting it? Are you sure that a SELECT (or even a normal VACUUM) would get rid of that xid, or might something still try to look it up in the clog later? Not only that, but hint-bit-setting is not WAL-logged, so you'd really have to do a checkpoint afterward. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade bug found!
Jeff Davis wrote: On Thu, 2011-04-07 at 20:14 -0400, Bruce Momjian wrote: So I think we have four possible approaches to correct databases: 1) SELECT * to set the hint bits 2) VACUUM to set the hint bits 3) VACUUM FREEZE to remove the old xids 4) some complicated function I don't like #4, and I think I can script #2 and #3 in psql by using COPY to create a VACUUM script and then run it with \i. #1 is easy in a DO block with PL/pgSQL. The only one that sounds very reasonable to me is #3. If there are any xids older than the relfrozenxid, we need to get rid of them. If there is some reason that doesn't work, I suppose we can consider the alternatives. But I don't like the hint-bit-setting approach much. What if the xmax is really a transaction that got an exclusive lock on the tuple, rather than actually deleting it? Are you sure that a SELECT (or even a normal VACUUM) would get rid of that xid, or might something still try to look it up in the clog later? Not only that, but hint-bit-setting is not WAL-logged, so you'd really have to do a checkpoint afterward. Glad you said that! Here is a script which does what we want: -- This script fixes data in pre-PG 9.0.4 and pre-8.4.8 -- servers that were upgraded by pg_upgrade and pg_migrator. -- Run the script using psql for every database in the cluster, -- except 'template0', e.g. -- psql -f pg_upgrade_fix dbname -- It will not lock any tables but will generate I/O. -- SET vacuum_freeze_min_age = 0; SET vacuum_freeze_table_age = 0; CREATE TEMPORARY TABLE pg_upgrade_fix AS SELECT 'VACUUM FREEZE pg_toast.' || quote_ident(relname) || ';' FROMpg_class c, pg_namespace n WHERE c.relnamespace = n.oid AND n.nspname = 'pg_toast' AND c.relkind = 't'; \copy pg_upgrade_fix TO 'pg_upgrade_fix.sql'; \i pg_upgrade_fix.sql DROP TABLE pg_upgrade_fix; Looks pretty simple to copy/paste and use. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade bug found!
Noah Misch wrote: On Thu, Apr 07, 2011 at 12:16:55PM -0400, Bruce Momjian wrote: Bruce Momjian wrote: OK, thanks to RhodiumToad on IRC, I was able to determine the cause of the two reported pg_upgrade problems he saw via IRC. It seems toast tables have xids and pg_dump is not preserving the toast relfrozenxids as it should. Heap tables have preserved relfrozenxids, but if you update a heap row but don't change the toast value, and the old heap row is later removed, the toast table can have an older relfrozenxids than the heap table. The fix for this is to have pg_dump preserve toast relfrozenxids, which can be easily added and backpatched. We might want to push a 9.0.4 for this. Second, we need to find a way for people to detect and fix existing systems that have this problem, perhaps looming when the pg_class relfrozenxid passes the toast relfrozenxid, and thirdly, we need to figure out how to get this information to users. Perhaps the communication comes through the 9.0.4 release announcement. I am not sure how to interpret the lack of replies to this email. Either it is confidence, shock, or we told you so. ;-) Your explanation and patch make sense. Seems all too clear in retrospect. Yeah, like duh for me. Any idea how to correct existing systems? Would VACUUM FREEZE of just the toast tables work? I perhaps could create a short DO block that would vacuum freeze just toast tables; it would have to be run in every database. I see three cases: 1) The pg_class.relfrozenxid that the TOAST table should have received (true relfrozenxid) is still covered by available clog files. Fixable with some combination of pg_class.relfrozenxid twiddling and SET vacuum_freeze_table_age = 0; VACUUM toasttbl. Right, VACUUM FREEZE. I now see I don't need to set vacuum_freeze_table_age if I use the FREEZE keyword, e.g. gram.y has: if (n-options VACOPT_FREEZE) n-freeze_min_age = n-freeze_table_age = 0; 2) The true relfrozenxid is no longer covered by available clog files. The fix for case 1 will get file foo doesn't exist, reading as zeroes log messages, and we will treat all transactions as uncommitted. Uh, are you sure? I think it would return an error message about a missing clog file for the query; here is a report of a case not related to pg_upgrade: http://archives.postgresql.org/pgsql-admin/2010-09/msg00109.php Not generally fixable after that has happened. We could probably provide a recipe for checking whether it could have happened given access to a backup from just before the upgrade. The IRC folks pulled the clog files off of backups. 3) Enough transaction xids have elapsed such that the true relfrozenxid is again covered by clog files, but those records are unrelated to the original transactions. Actually, I don't think this can happen, even with the maximum autovacuum_freeze_max_age. Yes, I don't think that can happen either. One concern I have is that existing heap tables are protecting clog files, but once those are frozen, the system might remove clog files not realizing it has to freeze the heap tables too. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] pg_upgrade bug found!
OK, thanks to RhodiumToad on IRC, I was able to determine the cause of the two reported pg_upgrade problems he saw via IRC. It seems toast tables have xids and pg_dump is not preserving the toast relfrozenxids as it should. Heap tables have preserved relfrozenxids, but if you update a heap row but don't change the toast value, and the old heap row is later removed, the toast table can have an older relfrozenxids than the heap table. The fix for this is to have pg_dump preserve toast relfrozenxids, which can be easily added and backpatched. We might want to push a 9.0.4 for this. Second, we need to find a way for people to detect and fix existing systems that have this problem, perhaps looming when the pg_class relfrozenxid passes the toast relfrozenxid, and thirdly, we need to figure out how to get this information to users. Perhaps the communication comes through the 9.0.4 release announcement. Yes, this is not good! :-( I will still add a special flag to postgres to turn off autovacuum, but as we suspected, this is only a marginal improvement and not the cause of the 9.0.X failures. The good news is that only two people have seen this problem and it only happens when the hint bits have not been set on the toast rows and the oldest heap rows have been updated. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers