On Thu, Jan 24, 2013 at 11:53 PM, MauMau <maumau...@gmail.com> wrote:
> From: "Fujii Masao" <masao.fu...@gmail.com>
>>
>> On Thu, Jan 24, 2013 at 7:42 AM, MauMau <maumau...@gmail.com> wrote:
>>>
>>> I searched through PostgreSQL mailing lists with "WAL contains references
>>> to invalid pages", and i found 19 messages. Some people encountered similar
>>> problem. There were some discussions regarding those problems (Tom and
>>> Simon Riggs commented), but those discussions did not reach a solution.
>>>
>>> I also found a discussion which might relate to this problem. Does this
>>> fix the problem?
>>>
>>> [BUG] lag of minRecoveryPont in archive recovery
>>>
>>> http://www.postgresql.org/message-id/20121206.130458.170549097.horiguchi.kyot...@lab.ntt.co.jp
>>
>> Yes. Could you check whether you can reproduce the problem on the
>> latest REL9_2_STABLE?
>
> I tried to produce the problem by doing "pg_ctl stop -mi" against the
> primary more than ten times on REL9_2_STABLE, but the problem did not
> appear. However, I encountered the crash only once out of dozens of
> failovers, possibly more than a hundred times, on PostgreSQL 9.1.6. So, I'm
> not sure the problem is fixed in REL9_2_STABLE.

Can you reproduce the problem on REL9_1_STABLE?

Sorry, I pointed you to the wrong branch, i.e., REL9_2_STABLE. Since you
encountered the problem in 9.1.6, you need to retry the test on REL9_1_STABLE.

> I'm wondering if the fix discussed in the above thread solves my problem. I
> found the following differences between Horiguchi-san's case and my case:
>
> (1)
> Horiguchi-san says the bug outputs the message:
>
> WARNING: page 0 of relation base/16384/16385 does not exist
>
> On the other hand, I got the message:
>
> WARNING: page 506747 of relation base/482272/482304 was uninitialized
>
> (2)
> Horiguchi-san produced the problem when he shut the standby immediately and
> restarted it. However, I saw the problem during failover.
>
> (3)
> Horiguchi-san did not use any index, but in my case the WARNING message
> refers to an index.
>
> But there's a similar point. Horiguchi-san says the problem occurs after
> DELETE+VACUUM. In my case, I shut the primary down while the application
> was doing INSERT/UPDATE. As the below messages show, some vacuuming was
> running just before the immediate shutdown:
>
> ...
> LOG: automatic vacuum of table "GOLD.scm1.tbl1": index scans: 0
> pages: 0 removed, 36743 remain
> tuples: 0 removed, 73764 remain
> system usage: CPU 0.09s/0.11u sec elapsed 0.66 sec
> LOG: automatic analyze of table "GOLD.scm1.tbl1" system usage: CPU 0.00s/0.14u sec elapsed 0.32 sec
> LOG: automatic vacuum of table "GOLD.scm1.tbl2": index scans: 0
> pages: 0 removed, 12101 remain
> tuples: 40657 removed, 44142 remain
> system usage: CPU 0.06s/0.06u sec elapsed 0.30 sec
> LOG: automatic analyze of table "GOLD.scm1.tbl2" system usage: CPU 0.00s/0.06u sec elapsed 0.14 sec
> LOG: received immediate shutdown request
> ...
>
> Could you tell me the details of the problem discussed and fixed in the
> upcoming minor release? I would to like to know the phenomenon and its
> conditions, and whether it applies to my case.

http://www.postgresql.org/message-id/20121206.130458.170549097.horiguchi.kyot...@lab.ntt.co.jp

Could you read the discussion in the above thread?

Regards,

--
Fujii Masao