Hello,
Apologies in advance for asking here and not on the greenplum forum, but  I'm 
asking here because I think the PG community understands pg better than 
Greenplum.

We run Greenplum 4.1 which is based on PG 8.2.15
We recently had a problem where the transaction ID overflowed an internal 
buffer,  causing queries to  hang and randomly fail.
Greenplum provided a fix where they increased the size of the buffer.
Greenplum has disabled the Autovacuum features. I suspect because they have no 
method of coordinating the autovacuums across the nodes.

I've reviewed the PG doc on transaction id and
I'm trying to understand if there's something we are doing wrong.
Practices concerning me

1.        We never vacuum the entire db - we're doing the system catalog tables 
3x/day and user tables at least 1x/w- although we don't have a method of 
identifying tables that have not been vacuumed so it's possible we're missing 
some table.

2.       In the pg_clog directories we have 25 files dating back to aug 2011.  
When the instances start, the logs are filled with twophase location message.  
Are these expected?

3.       Is the overflow really being caused by transaction wrap around?

Would someone comment on these concerns.
Thanks


Guk setting
vacuum_freeze_min_age            100,000,000        Minimum age at which VACUUM 
should freeze a table row.


Log failure message

2012-05-10 16:26:24.425750 
CDT,"cognos_cube","p1gp1",p25497,th2121975200,"172.28.8.250","18959",2012-05-10 
16:25:03 
CDT,100000245,con2170,,seg24,,,x100000245,sx1,"LOG","00000","FinishPreparedTransaction:
 found TwoPhaseState entry for 99999948",,,,,,,0,,"twophase.c",1368,
2012-05-10 16:26:24.739536 
CDT,"infa_mktldr","p1gp1",p27945,th2121975200,"172.28.8.250","27576",2012-05-10 
16:26:15 
CDT,100000246,con2240,,seg24,,,x100000246,sx1,"LOG","00000","FinishPreparedTransaction:
 found TwoPhaseState entry for 100000230",,,,,,,0,,"twophase.c",1368,
2012-05-10 16:26:33.859255 
CDT,"infa_read","p1gp1",p27981,th2121975200,"172.28.8.250","27866",2012-05-10 
16:26:18 
CDT,100000256,con2244,,seg24,,,x100000256,sx1,"LOG","00000","FinishPreparedTransaction:
 found TwoPhaseState entry for 100000236",,,,,,,0,,"twophase.c",1368,
2012-05-10 16:26:38.564872 
CDT,"ods","p1gp1",p27416,th2121975200,"172.28.8.250","23337",2012-05-10 
16:26:06 CDT,100000257,con2212,,seg24,,dx4860103,x100000257,sx1,"LOG","00000","
Unexpected internal error: Segment process 27416 received signal SIGSEGV

",,,,,,,0,,,,
2012-05-10 16:26:41.359615 
CDT,"ods","p1gp1",p27416,th2121975200,"172.28.8.250","23337",2012-05-10 
16:26:06 
CDT,100000257,con2212,,seg24,,dx4860103,x100000257,sx1,"PANIC","XX000","Unexpected
 internal error: Segment process received signal SIGSEGV 
(postgres.c:3384)",,,,,,,0,,"postgres.c",3384,"Stack trace:
1    0xa39555 postgres errstart (elog.c:454)
2    0x8e26e5 postgres <symbol not found> (postgres.c:3380)
3    0x38c70302d0 libc.so.6 <symbol not found> (??:0)
4    0x8b1578 postgres ProcArrayAdd (procarray.c:149)
5    0x5258a5 postgres EndPrepare (twophase.c:1240)
6    0x4f3462 postgres <symbol not found> (xact.c:3094)
7    0x4f3a75 postgres CommitTransactionCommand (xact.c:3523)
8    0xb944a5 postgres performDtxProtocolCommand (cdbtm.c:3891)
9    0x8e6552 postgres PostgresMain (postgres.c:1414)
10   0x8516b1 postgres <symbol not found> (postmaster.c:6443)
11   0x8589a5 postgres PostmasterMain (postmaster.c:2272)
12   0x76713a postgres main (main.c:212)
13   0x38c701d994 libc.so.6 __libc_start_main (??:0)
14   0x475089 postgres <symbol not found> (??:0)
"
2012-05-10 16:26:46.580196 
CDT,,,p25872,th2121975200,,,,0,,,seg-1,,,,,"LOG","00000","server process (PID 
27416) was terminated by signal 6: Aborted",,,,,,,0,,"postmaster.c",5275,

Log messages regarding twophase
2012-05-18 00:05:49.849381 
CDT,"infa_write","p1gp1",p15767,th-1760766560,"172.28.8.250","33445",2012-05-18 
00:05:29 
CDT,104342361,con385826,,seg0,,,x104342361,sx1,"LOG","00000","FinishPreparedTransaction:
 found TwoPhaseState entry for 104342242",,,,,,,0,,"twophase.c",1368,
2012-05-18 00:05:50.018299 
CDT,"infa_write","p1gp1",p15887,th-1760766560,"172.28.8.250","34675",2012-05-18 
00:05:32 
CDT,104342362,con385833,,seg0,,,x104342362,sx1,"LOG","00000","FinishPreparedTransaction:
 found TwoPhaseState entry for 104342257",,,,,,,0,,"twophase.c",1368,
2012-05-18 00:05:50.029302 
CDT,"infa_write","p1gp1",p15839,th-1760766560,"172.28.8.250","34335",2012-05-18 
00:05:30 
CDT,104342363,con385830,,seg0,,,x104342363,sx1,"LOG","00000","FinishPreparedTransaction:
 found TwoPhaseState entry for 104342250",,,,,,,0,,"twophase.c",1368,
2012-05-18 00:05:52.274591 
CDT,"infa_write","p1gp1",p15959,th-1760766560,"172.28.8.250","35303",2012-05-18 
00:05:36 
CDT,104342369,con385840,,seg0,,,x104342369,sx1,"LOG","00000","FinishPreparedTransaction:
 found TwoPhaseState entry for 104342277",,,,,,,0,,"twophase.c",1368,
2012-05-18 00:05:52.277127 
CDT,"infa_write","p1gp1",p15478,th-1760766560,"172.28.8.250","31184",2012-05-18 
00:05:24 
CDT,104342370,con385813,,seg0,,,x104342370,sx1,"LOG","00000","FinishPreparedTransaction:
 found TwoPhaseState entry for 104342346",,,,,,,0,,"twophase.c",1368,
2012-05-18 00:05:52.280285 
CDT,"infa_write","p1gp1",p15535,th-1760766560,"172.28.8.250","31304",2012-05-18 
00:05:25 
CDT,104342371,con385816,,seg0,,,x104342371,sx1,"LOG","00000","FinishPreparedTransaction:
 found TwoPhaseState entry for 104342347",,,,,,,0,,"twophase.c",1368,

Thanks
Doug Little


Reply via email to