Re: [HACKERS] pg_upgrade - link mode and transaction-wraparound data loss

2010-05-19 Thread Jesper Krogh

On 2010-05-18 18:57, Bruce Momjian wrote:

jes...@krogh.cc wrote:
   

Hi

I tried running pg_upgrade from the current snapshot of postgresql and
upgrading from 8.4.4 to the snapshot version. Everything seem to look fine
in the process and all that came out was only ok's but when I tried a
simple query on the databse it keeps throwing these message out of the back
side.

DETAIL:  You might have already suffered transaction-wraparound data loss.
WARNING:  some databases have not been vacuumed in over 2 billion
transactions


The database was around 600GB and it took a couple of minutes to run
pg_upgrade after I had all the binaries in the correct place.

It is not really an easy task to throw around 600GB of data, so I cannot
gaurantee that the above is reproducible, but I'll see if I can get time
and try to reproduce it.
 

This certainly should never have happened, so I am guessing it is a bug.
pg_upgrade tries hard to make sure all your datfrozenxid and
relfrozenxid are properly migrated from the old server, and the
transaction id is set properly.  Unfortunately this is the first time I
have heard of such a problem, so I am unclear on its cause.
   


Other people are typically way faster than I am looking into it.
Depesz has produced a full trace to reproduce the problem here:
http://www.depesz.com/index.php/2010/05/19/waiting-for-9-0-pg_upgrade/

Jesper
--
Jesper

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_upgrade - link mode and transaction-wraparound data loss

2010-05-19 Thread Bruce Momjian
Bruce Momjian wrote:
  This is the production system. I have absolutely no indications that
  anything should be wrong in there. It has run rock-solid since it got
  migrated (dump/restore) to 8.4 for about 7 months now. So I am a bit
  scared about you telling that it seems wrong. (but that cannot be
  attributed to pg_upgrade)
 
 I am on chat with Alvaro now and it seems we do somehow connect to
 template0 for transaction id wraparound.  I think Alvaro will post
 shortly on this.
 
   OK, thanks.  This does seem odd.  Frankly, having template0's
   datfrozenxid be wrong would not cause any kind of instability because
   template0 is used only by pg_dump, so I am wondering if something else
   is seriously wrong.
  
  I also think that something was seriously wrong with the pg_upgrade'd
  version. I'll try to reproduce and be a bit more carefull in tracking 
  the steps
  this time.
 
 Thanks, but I think the entire problem might be this template0 xid issue
 that Alvaro and I are researching.  I can now see how invalid template0
 xids could cause the instability you saw in the new database.  Odd no
 one has seen this bug before.

OK, after talking to Alvaro and Heikki, the problem is that while you
cannot connect to template0, it is accessed by autovacuum for vacuum
freeze, even if autovacuum is turned off.  I think the reason you are
seeing this bug is that your xid counter is near 2 billion (50% to
wraparound) and the original template0 xids are the maximum distance
from your counter.

I am attaching the newest patch which fixes this issue.  I did modify
this code yesterday with another patch, and I am unclear exactly if you
need that patch as well.  CVS now has all these changes.

If you could test with this and the earlier patch, I think it will now
work fine.  Thanks for the valuable testing, and quick feedback.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com
Index: contrib/pg_upgrade/pg_upgrade.c
===
RCS file: /cvsroot/pgsql/contrib/pg_upgrade/pg_upgrade.c,v
retrieving revision 1.3
diff -c -c -r1.3 pg_upgrade.c
*** contrib/pg_upgrade/pg_upgrade.c	18 May 2010 18:40:51 -	1.3
--- contrib/pg_upgrade/pg_upgrade.c	19 May 2010 18:20:03 -
***
*** 164,170 
  	check_ok(ctx);
  
  	/*
! 	 * We do freeze after analyze so pg_statistic is also frozen
  	 */
  	prep_status(ctx, Freezing all rows on the new cluster);
  	exec_prog(ctx, true,
--- 164,173 
  	check_ok(ctx);
  
  	/*
! 	 * We do freeze after analyze so pg_statistic is also frozen.
! 	 * template0 is not frozen here, but data rows were frozen by initdb,
! 	 * and we set its datfrozenxid and relfrozenxids later to match the
! 	 * new xid counter later.
  	 */
  	prep_status(ctx, Freezing all rows on the new cluster);
  	exec_prog(ctx, true,
***
*** 292,339 
  set_frozenxids(migratorContext *ctx)
  {
  	int			dbnum;
! 	PGconn	   *conn;
  	PGresult   *dbres;
  	int			ntups;
  
  	prep_status(ctx, Setting frozenxid counters in new cluster);
  
! 	conn = connectToServer(ctx, template1, CLUSTER_NEW);
  
  	/* set pg_database.datfrozenxid */
! 	PQclear(executeQueryOrDie(ctx, conn,
  			  UPDATE pg_catalog.pg_database 
! 			  SET	datfrozenxid = '%u' 
! 			  WHERE datallowconn = true,
  			  ctx-old.controldata.chkpnt_nxtxid));
  
  	/* get database names */
! 	dbres = executeQueryOrDie(ctx, conn,
! 			  SELECT	datname 
! 			  FROM	pg_catalog.pg_database 
! 			  WHERE datallowconn = true);
  
! 	/* free dbres below */
! 	PQfinish(conn);
  
  	ntups = PQntuples(dbres);
  	for (dbnum = 0; dbnum  ntups; dbnum++)
  	{
! 		conn = connectToServer(ctx, PQgetvalue(dbres, dbnum, 0), CLUSTER_NEW);
  
  		/* set pg_class.relfrozenxid */
  		PQclear(executeQueryOrDie(ctx, conn,
    UPDATE	pg_catalog.pg_class 
    SET	relfrozenxid = '%u' 
  		/* only heap and TOAST are vacuumed */
!   WHERE	relkind = 'r' OR 
!   		relkind = 't',
    ctx-old.controldata.chkpnt_nxtxid));
  		PQfinish(conn);
  	}
  
  	PQclear(dbres);
  
  	check_ok(ctx);
  }
  
--- 295,366 
  set_frozenxids(migratorContext *ctx)
  {
  	int			dbnum;
! 	PGconn	   *conn, *conn_template1;
  	PGresult   *dbres;
  	int			ntups;
+ 	int			i_datname;
+ 	int			i_datallowconn;
  
  	prep_status(ctx, Setting frozenxid counters in new cluster);
  
! 	conn_template1 = connectToServer(ctx, template1, CLUSTER_NEW);
  
  	/* set pg_database.datfrozenxid */
! 	PQclear(executeQueryOrDie(ctx, conn_template1,
  			  UPDATE pg_catalog.pg_database 
! 			  SET	datfrozenxid = '%u',
  			  ctx-old.controldata.chkpnt_nxtxid));
  
  	/* get database names */
! 	dbres = executeQueryOrDie(ctx, conn_template1,
! 			  SELECT	datname, datallowconn 
! 			  FROM	pg_catalog.pg_database);
  
! 	i_datname = PQfnumber(dbres, datname);
! 	i_datallowconn = 

Re: [HACKERS] pg_upgrade - link mode and transaction-wraparound data loss

2010-05-19 Thread Bruce Momjian
Jesper Krogh wrote:
 On 2010-05-18 18:57, Bruce Momjian wrote:
  jes...@krogh.cc wrote:
 
  Hi
 
  I tried running pg_upgrade from the current snapshot of postgresql and
  upgrading from 8.4.4 to the snapshot version. Everything seem to look fine
  in the process and all that came out was only ok's but when I tried a
  simple query on the databse it keeps throwing these message out of the back
  side.
 
  DETAIL:  You might have already suffered transaction-wraparound data loss.
  WARNING:  some databases have not been vacuumed in over 2 billion
  transactions
 
 
  The database was around 600GB and it took a couple of minutes to run
  pg_upgrade after I had all the binaries in the correct place.
 
  It is not really an easy task to throw around 600GB of data, so I cannot
  gaurantee that the above is reproducible, but I'll see if I can get time
  and try to reproduce it.
   
  This certainly should never have happened, so I am guessing it is a bug.
  pg_upgrade tries hard to make sure all your datfrozenxid and
  relfrozenxid are properly migrated from the old server, and the
  transaction id is set properly.  Unfortunately this is the first time I
  have heard of such a problem, so I am unclear on its cause.
 
 
 Other people are typically way faster than I am looking into it.
 Depesz has produced a full trace to reproduce the problem here:
 http://www.depesz.com/index.php/2010/05/19/waiting-for-9-0-pg_upgrade/

Thanks. I have commented on the blog to mention that we have fixed the
bug reported there.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] pg_upgrade - link mode and transaction-wraparound data loss

2010-05-18 Thread jesper
Hi

I tried running pg_upgrade from the current snapshot of postgresql and
upgrading from 8.4.4 to the snapshot version. Everything seem to look fine
in the process and all that came out was only ok's but when I tried a
simple query on the databse it keeps throwing these message out of the back
side.

DETAIL:  You might have already suffered transaction-wraparound data loss.
WARNING:  some databases have not been vacuumed in over 2 billion
transactions


The database was around 600GB and it took a couple of minutes to run
pg_upgrade after I had all the binaries in the correct place.

It is not really an easy task to throw around 600GB of data, so I cannot
gaurantee that the above is reproducible, but I'll see if I can get time
and try to reproduce it.

Jesper


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_upgrade - link mode and transaction-wraparound data loss

2010-05-18 Thread Bruce Momjian
jes...@krogh.cc wrote:
 Hi
 
 I tried running pg_upgrade from the current snapshot of postgresql and
 upgrading from 8.4.4 to the snapshot version. Everything seem to look fine
 in the process and all that came out was only ok's but when I tried a
 simple query on the databse it keeps throwing these message out of the back
 side.
 
 DETAIL:  You might have already suffered transaction-wraparound data loss.
 WARNING:  some databases have not been vacuumed in over 2 billion
 transactions
 
 
 The database was around 600GB and it took a couple of minutes to run
 pg_upgrade after I had all the binaries in the correct place.
 
 It is not really an easy task to throw around 600GB of data, so I cannot
 gaurantee that the above is reproducible, but I'll see if I can get time
 and try to reproduce it.

This certainly should never have happened, so I am guessing it is a bug.
pg_upgrade tries hard to make sure all your datfrozenxid and
relfrozenxid are properly migrated from the old server, and the
transaction id is set properly.  Unfortunately this is the first time I
have heard of such a problem, so I am unclear on its cause.

The warning is issued from vacuum.c::vac_truncate_clog().  Can you run
this query and show us the output:

SELECT datname, datfrozenxid FROM pg_database;

It would be good to see these numbers on both the old and new servers.
I would also like to see:

SELECT txid_current();

on the old and new servers, but if you can only provide these values on
one of the two servers, it is still useful.  Thanks.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_upgrade - link mode and transaction-wraparound data loss

2010-05-18 Thread jesper
 jes...@krogh.cc wrote:
 Hi

 I tried running pg_upgrade from the current snapshot of postgresql and
 upgrading from 8.4.4 to the snapshot version. Everything seem to look
 fine
 in the process and all that came out was only ok's but when I tried a
 simple query on the databse it keeps throwing these message out of the
 back
 side.

 DETAIL:  You might have already suffered transaction-wraparound data
 loss.
 WARNING:  some databases have not been vacuumed in over 2 billion
 transactions


 The database was around 600GB and it took a couple of minutes to run
 pg_upgrade after I had all the binaries in the correct place.

 It is not really an easy task to throw around 600GB of data, so I cannot
 gaurantee that the above is reproducible, but I'll see if I can get time
 and try to reproduce it.

 This certainly should never have happened, so I am guessing it is a bug.
 pg_upgrade tries hard to make sure all your datfrozenxid and
 relfrozenxid are properly migrated from the old server, and the
 transaction id is set properly.  Unfortunately this is the first time I
 have heard of such a problem, so I am unclear on its cause.

 The warning is issued from vacuum.c::vac_truncate_clog().  Can you run
 this query and show us the output:

   SELECT datname, datfrozenxid FROM pg_database;

 It would be good to see these numbers on both the old and new servers.
 I would also like to see:

   SELECT txid_current();

 on the old and new servers, but if you can only provide these values on
 one of the two servers, it is still useful.  Thanks.

Hi Bruce, thanks for your prompt response.

First the new one..

j...@pal:~$ psql -p 5433
psql (9.0beta1)
Type help for help.

data=# SELECT datname, datfrozenxid FROM pg_database;
  datname  | datfrozenxid
---+--
 template0 |  654
 postgres  |   2374592801
 data  |   2023782337
 jk|   2023822188
 template1 |   2374592801
 workqueue |   2023822188
(6 rows)

data=# SELECT txid_current();
 txid_current
--
   2375384556
(1 row)

data=# \q

Then the old one.

j...@pal:~$ psql data
psql (9.0beta1, server 8.4.1)
WARNING: psql version 9.0, server version 8.4.
 Some psql features might not work.
Type help for help.

data# SELECT datname, datfrozenxid FROM pg_database;
  datname  | datfrozenxid
---+--
 template0 |   2073823552
 postgres  |   2023820521
 data  |   2023782337
 jk|   2023822188
 template1 |   2073823552
 workqueue |   2023822188
(6 rows)

data=# SELECT txid_current();
 txid_current
--
   2390524243
(1 row)


The old database has been copied over using rsync and
pg_start_backup()/pg_stop_backup() procecures and started up
using a recovery.conf file.

Jesper



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_upgrade - link mode and transaction-wraparound data loss

2010-05-18 Thread Bruce Momjian
jes...@krogh.cc wrote:
  on the old and new servers, but if you can only provide these values on
  one of the two servers, it is still useful.  Thanks.
 
 Hi Bruce, thanks for your prompt response.
 
 First the new one..

Great.

 j...@pal:~$ psql -p 5433
 psql (9.0beta1)
 Type help for help.
 
 data=# SELECT datname, datfrozenxid FROM pg_database;
   datname  | datfrozenxid
 ---+--
  template0 |  654
  postgres  |   2374592801
  data  |   2023782337
  jk|   2023822188
  template1 |   2374592801
  workqueue |   2023822188
 (6 rows)
 
 data=# SELECT txid_current();
  txid_current
 --
2375384556
 (1 row)
 
 data=# \q
 
 Then the old one.
 
 j...@pal:~$ psql data
 psql (9.0beta1, server 8.4.1)
 WARNING: psql version 9.0, server version 8.4.
  Some psql features might not work.
 Type help for help.
 
 data# SELECT datname, datfrozenxid FROM pg_database;
   datname  | datfrozenxid
 ---+--
  template0 |   2073823552

This line above looks very odd because I didn't think the template0
datfrozenxid could be advanced.  Can I see the output of this query:

SELECT datname, datfrozenxid, datallowconn FROM pg_database;

I am wondering if you set datallowconn for template0 to 'true'.

  postgres  |   2023820521
  data  |   2023782337
  jk|   2023822188
  template1 |   2073823552
  workqueue |   2023822188
 (6 rows)
 
 data=# SELECT txid_current();
  txid_current
 --
2390524243
 (1 row)
 
 
 The old database has been copied over using rsync and
 pg_start_backup()/pg_stop_backup() procecures and started up
 using a recovery.conf file.

My other idea is that somehow recovery touches datallowconn for
template0.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_upgrade - link mode and transaction-wraparound data loss

2010-05-18 Thread Jesper Krogh

On 2010-05-18 20:52, Bruce Momjian wrote:

This line above looks very odd because I didn't think the template0
datfrozenxid could be advanced.  Can I see the output of this query:

SELECT datname, datfrozenxid, datallowconn FROM pg_database;

   


Only from the old database:
data=# SELECT datname, datfrozenxid, datallowconn FROM pg_database;
  datname  | datfrozenxid | datallowconn
---+--+--
 template0 |   2073823552 | f
 postgres  |   2023820521 | t
 data  |   2023782337 | t
 jk|   2023822188 | t
 template1 |   2073823552 | t
 workqueue |   2023822188 | t
(6 rows)


I am wondering if you set datallowconn for template0 to 'true'.


From this database, I cannot give any more results, I ran some other
queries and then restarted postgres, subsequently it seemed
totally broken. I'm in the process of running the test over again, but
it'll take a while before data is in.  I'll report back.

--
Jesper

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_upgrade - link mode and transaction-wraparound data loss

2010-05-18 Thread Bruce Momjian
Jesper Krogh wrote:
 On 2010-05-18 20:52, Bruce Momjian wrote:
  This line above looks very odd because I didn't think the template0
  datfrozenxid could be advanced.  Can I see the output of this query:
 
  SELECT datname, datfrozenxid, datallowconn FROM pg_database;
 
 
 
 Only from the old database:
 data=# SELECT datname, datfrozenxid, datallowconn FROM pg_database;
datname  | datfrozenxid | datallowconn
 ---+--+--
   template0 |   2073823552 | f
   postgres  |   2023820521 | t
   data  |   2023782337 | t
   jk|   2023822188 | t
   template1 |   2073823552 | t
   workqueue |   2023822188 | t
 (6 rows)

OK, datallowconn = false is right for template0, but I am still confused
how it got set to that high value.

  I am wondering if you set datallowconn for template0 to 'true'.
 
  From this database, I cannot give any more results, I ran some other
 queries and then restarted postgres, subsequently it seemed
 totally broken. I'm in the process of running the test over again, but
 it'll take a while before data is in.  I'll report back.

OK, thanks.  This does seem odd.  Frankly, having template0's
datfrozenxid be wrong would not cause any kind of instability because
template0 is used only by pg_dump, so I am wondering if something else
is seriously wrong.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_upgrade - link mode and transaction-wraparound data loss

2010-05-18 Thread Bruce Momjian
jes...@krogh.cc wrote:
 First the new one..
 
 j...@pal:~$ psql -p 5433
 psql (9.0beta1)
 Type help for help.
 
 data=# SELECT datname, datfrozenxid FROM pg_database;
   datname  | datfrozenxid
 ---+--
  template0 |  654
  postgres  |   2374592801
  data  |   2023782337
  jk|   2023822188
  template1 |   2374592801
  workqueue |   2023822188
 (6 rows)
 
 data=# SELECT txid_current();
  txid_current
 --
2375384556
 (1 row)

I just ran a test and all the datfrozenxids are less than the current
xid, so the only database that could be generating a wraparound warning
is 'template0'.  But, again, I though that template0 was not touched for
wraparound protection --- I am starting to think I am wrong.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_upgrade - link mode and transaction-wraparound data loss

2010-05-18 Thread Alvaro Herrera
Excerpts from jesper's message of mar may 18 13:22:12 -0400 2010:

 j...@pal:~$ psql data
 psql (9.0beta1, server 8.4.1)
 WARNING: psql version 9.0, server version 8.4.
  Some psql features might not work.
 Type help for help.
 
 data# SELECT datname, datfrozenxid FROM pg_database;
   datname  | datfrozenxid
 ---+--
  template0 |   2073823552
  postgres  |   2023820521
  data  |   2023782337
  jk|   2023822188
  template1 |   2073823552
  workqueue |   2023822188
 (6 rows)

Does the old server have pg_database.datallowconn = true for template0?
-- 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_upgrade - link mode and transaction-wraparound data loss

2010-05-18 Thread Bruce Momjian
Alvaro Herrera wrote:
 Excerpts from jesper's message of mar may 18 13:22:12 -0400 2010:
 
  j...@pal:~$ psql data
  psql (9.0beta1, server 8.4.1)
  WARNING: psql version 9.0, server version 8.4.
   Some psql features might not work.
  Type help for help.
  
  data# SELECT datname, datfrozenxid FROM pg_database;
datname  | datfrozenxid
  ---+--
   template0 |   2073823552
   postgres  |   2023820521
   data  |   2023782337
   jk|   2023822188
   template1 |   2073823552
   workqueue |   2023822188
  (6 rows)
 
 Does the old server have pg_database.datallowconn = true for template0?

The user reported back that it did not:

Only from the old database:
data=# SELECT datname, datfrozenxid, datallowconn FROM pg_database;
   datname  | datfrozenxid | datallowconn
---+--+--
  template0 |   2073823552 | f
  postgres  |   2023820521 | t
  data  |   2023782337 | t
  jk|   2023822188 | t
  template1 |   2073823552 | t
  workqueue |   2023822188 | t
(6 rows)

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_upgrade - link mode and transaction-wraparound data loss

2010-05-18 Thread Jesper Krogh

On 2010-05-18 21:56, Bruce Momjian wrote:

Jesper Krogh wrote:
   

On 2010-05-18 20:52, Bruce Momjian wrote:
 

This line above looks very odd because I didn't think the template0
datfrozenxid could be advanced.  Can I see the output of this query:

SELECT datname, datfrozenxid, datallowconn FROM pg_database;


   

Only from the old database:
data=# SELECT datname, datfrozenxid, datallowconn FROM pg_database;
datname  | datfrozenxid | datallowconn
---+--+--
   template0 |   2073823552 | f
   postgres  |   2023820521 | t
   data  |   2023782337 | t
   jk|   2023822188 | t
   template1 |   2073823552 | t
   workqueue |   2023822188 | t
(6 rows)
 

OK, datallowconn = false is right for template0, but I am still confused
how it got set to that high value.
   


This is the production system. I have absolutely no indications that
anything should be wrong in there. It has run rock-solid since it got
migrated (dump/restore) to 8.4 for about 7 months now. So I am a bit
scared about you telling that it seems wrong. (but that cannot be
attributed to pg_upgrade)


OK, thanks.  This does seem odd.  Frankly, having template0's
datfrozenxid be wrong would not cause any kind of instability because
template0 is used only by pg_dump, so I am wondering if something else
is seriously wrong.
   

I also think that something was seriously wrong with the pg_upgrade'd
version. I'll try to reproduce and be a bit more carefull in tracking 
the steps

this time.

--
Jesper

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_upgrade - link mode and transaction-wraparound data loss

2010-05-18 Thread Bruce Momjian
Jesper Krogh wrote:
 On 2010-05-18 21:56, Bruce Momjian wrote:
  Jesper Krogh wrote:
 
  On 2010-05-18 20:52, Bruce Momjian wrote:
   
  This line above looks very odd because I didn't think the template0
  datfrozenxid could be advanced.  Can I see the output of this query:
 
SELECT datname, datfrozenxid, datallowconn FROM pg_database;
 
 
 
  Only from the old database:
  data=# SELECT datname, datfrozenxid, datallowconn FROM pg_database;
  datname  | datfrozenxid | datallowconn
  ---+--+--
 template0 |   2073823552 | f
 postgres  |   2023820521 | t
 data  |   2023782337 | t
 jk|   2023822188 | t
 template1 |   2073823552 | t
 workqueue |   2023822188 | t
  (6 rows)
   
  OK, datallowconn = false is right for template0, but I am still confused
  how it got set to that high value.
 
 
 This is the production system. I have absolutely no indications that
 anything should be wrong in there. It has run rock-solid since it got
 migrated (dump/restore) to 8.4 for about 7 months now. So I am a bit
 scared about you telling that it seems wrong. (but that cannot be
 attributed to pg_upgrade)

I am on chat with Alvaro now and it seems we do somehow connect to
template0 for transaction id wraparound.  I think Alvaro will post
shortly on this.

  OK, thanks.  This does seem odd.  Frankly, having template0's
  datfrozenxid be wrong would not cause any kind of instability because
  template0 is used only by pg_dump, so I am wondering if something else
  is seriously wrong.
 
 I also think that something was seriously wrong with the pg_upgrade'd
 version. I'll try to reproduce and be a bit more carefull in tracking 
 the steps
 this time.

Thanks, but I think the entire problem might be this template0 xid issue
that Alvaro and I are researching.  I can now see how invalid template0
xids could cause the instability you saw in the new database.  Odd no
one has seen this bug before.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers