On 2013-04-04 02:28:32 +0200, Andres Freund wrote:
> On 2013-04-04 01:52:41 +0200, Andres Freund wrote:
> > On 2013-04-03 15:57:49 -0700, Jeff Janes wrote:
> > > I've changed the subject from "regression test failed when enabling
> > > checksum" because I now know they are totally unrelated.
> > > 
> > > My test case didn't need to depend on archiving being on, and so with a
> > > simple tweak I rendered the two issues orthogonal.
> > > 
> > > 
> > > On Wed, Apr 3, 2013 at 12:15 PM, Jeff Davis <pg...@j-davis.com> wrote:
> > > 
> > > > On Mon, 2013-04-01 at 19:51 -0700, Jeff Janes wrote:
> > > >
> > > > > What I would probably really want is the data as it existed after the
> > > > > crash but before recovery started, but since the postmaster
> > > > > immediately starts recovery after the crash, I don't know of a good
> > > > > way to capture this.
> > > >
> > > > Can you just turn off the restart_after_crash GUC? I had a chance to
> > > > look at this, and seeing the block before and after recovery would be
> > > > nice. I didn't see a log file in the data directory, but it didn't go
> > > > through recovery, so I assume it already did that.
> > > >
> > > 
> > > You don't know that the cluster is in the bad state until after it goes
> > > through recovery, because most crashes recover perfectly fine.  So it would
> > > have to make a side-copy of the cluster after the crash, then recover the
> > > original and see how things go, then either retain or delete the side-copy.
> > > Unfortunately my testing harness can't do this at the moment, because the
> > > perl script storing the consistency info needs to survive over the crash
> > > and recovery.  It might take me a while to figure out how to make it do
> > > this.
> > > 
> > > I have the server log just go to stderr, where it gets mingled together
> > > with messages from my testing harness.  I had not uploaded that file.
> > > 
> > > Here is a new upload. It contains both a data_dir tarball including xlog,
> > > and the mingled stderr ("do_new.out")
> > > 
> > > https://drive.google.com/folderview?id=0Bzqrh1SO9FcEQmVzSjlmdWZvUHc&usp=sharing
> > > 
> > > The other 3 files in it constitute the harness.  It is quite a mess, with
> > > some hard-coded paths.  The just-posted fix for wal_keep_segments will
> > > also have to be applied.
> > > 
> > > 
> > > 
> > > >
> > > > The block is corrupt as far as I can tell. The first third is written,
> > > > and the remainder is all zeros. The header looks like this:
> > > >
> > > 
> > > Yes, that part is by my design.  Why it didn't get fixed from a FPI is
> > > not by my design, of course.
> > 
> > There are no FPIs (if you mean full-page images by that) afaics:
> > 
> > Your logs tell us about recovery:
> > 27692  2013-04-03 10:09:15.621 PDT:LOG:  redo starts at 1/31000090
> > 27692  2013-04-03 10:09:15.647 PDT:LOG:  incorrect resource manager data checksum in record at 1/31169A68
> > 27692  2013-04-03 10:09:15.647 PDT:LOG:  redo done at 1/31169A38

> Looking at the page LSNs with dd I noticed something peculiar:
> 
> page 0:
> 01 00 00 00 18 c2 00 31 => 1/3100C218
> page 1:
> 01 00 00 00 80 44 01 31 => 1/31014480
> page 10:
> 01 00 00 00 60 ce 05 31 => 1/3105ce60
> page 43:
> 01 00 00 00 58 7a 16 31 => 1/31167a58
> page 44:
> 01 00 00 00 f0 99 16 31 => 1/311699f0
> page 45:
> 00 00 00 00 00 00 00 00 => 0/0
> page 90:
> 01 00 00 00 90 17 1d 32 => 1/321d1790
> page 91:
> 01 00 00 00 38 ef 1b 32 => 1/321bef38
> 
> So we have written out pages (90, 91) that come after a page without any
> LSN (45), and that carry an LSN that's *beyond* the point up to which
> XLOG has successfully been written to disk (1/31169A38)?
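
(For reference, a minimal sketch of how such a pd_lsn can be read out: the
first 8 bytes of each 8kB page header are the LSN, stored as two
native-endian uint32s on this branch. The relation file path below is an
assumption, taken from the rel 1663/16384/835589 references in the WAL dump
further down; skip selects the block number:)

dd if=data2/base/16384/835589 bs=8192 skip=45 count=1 2>/dev/null | \
    od -A n -N 8 -t x1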

By now I am pretty sure the issue is that the end-of-WAL is detected too
early. Above, the end is detected at 1/31169A38, but the next WAL file
contains valid data:
pg_xlogdump /tmp/tmp/data2/pg_xlog/000000010000000100000032 -n 10
rmgr: Heap2       len (rec/tot):     26/    58, tx:          0, lsn: 1/32000030, prev 1/31FFFFD8, bkp: 0000, desc: clean: rel 1663/16384/835589; blk 122 remxid 26328926
rmgr: Heap        len (rec/tot):     51/    83, tx:   26328964, lsn: 1/32000070, prev 1/32000030, bkp: 0000, desc: update: rel 1663/16384/835589; tid 122/191 xmax 26328964 ; new tid 129/140 xmax 0
rmgr: Heap2       len (rec/tot):     30/    62, tx:          0, lsn: 1/320000C8, prev 1/32000070, bkp: 0000, desc: clean: rel 1663/16384/835589; blk 63 remxid 26328803
rmgr: Btree       len (rec/tot):     34/    66, tx:   26328964, lsn: 1/32000108, prev 1/320000C8, bkp: 0000, desc: insert: rel 1663/16384/835590; tid 25/14

That would explain exactly those symptoms, wouldn't it? What's causing
that, I am not sure yet.
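
(As a cross-check, a sketch using pg_xlogdump's --start/--limit switches
with the same paths as above: start at the reported end of redo and see
whether decoding really stops at 1/31169A68 with the checksum complaint
from the recovery log:)

pg_xlogdump -s 1/31169A38 -n 5 /tmp/tmp/data2/pg_xlog/000000010000000100000031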

Greetings,

Andres Freund

-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

