On 2013-04-01 19:51:19 -0700, Jeff Janes wrote:
> On Mon, Apr 1, 2013 at 10:37 AM, Jeff Janes <jeff.ja...@gmail.com> wrote:
>
> > On Tue, Mar 26, 2013 at 4:23 PM, Jeff Davis <pg...@j-davis.com> wrote:
> >
> >>
> >> Patch attached. Only brief testing done, so I might have missed
> >> something. I will look more closely later.
> >>
> >
> > After applying your patch, I could run the stress test described here:
> >
> > http://archives.postgresql.org/pgsql-hackers/2012-02/msg01227.php
> >
> > But altered to make use of initdb -k, of course.
> >
> > Over 10,000 cycles of crash and recovery, I encountered two cases of
> > checksum failures after recovery, example:
> > ...
> >
> >
> > Unfortunately I already cleaned up the data directory before noticing the
> > problem, so I have nothing to post for forensic analysis. I'll try to
> > reproduce the problem.
> >
> >
> I've reproduced the problem, this time in block 74 of relation
> base/16384/4931589, and a tarball of the data directory is here:
>
> https://docs.google.com/file/d/0Bzqrh1SO9FcELS1majlFcTZsR0k/edit?usp=sharing
>
> (the table is in database jjanes under role jjanes, the binary is commit
> 9ad27c215362df436f8c)
>
> What I would probably really want is the data as it existed after the crash
> but before recovery started, but since the postmaster immediately starts
> recovery after the crash, I don't know of a good way to capture this.
>
> I guess one thing to do would be to extract from the WAL the most recent
> FPW for block 74 of relation base/16384/4931589 (assuming it hasn't been
> recycled already) and see if it matches what is actually in that block of
> that data file, but I don't currently know how to do that.

Since I bragged somewhere else recently that this should be easy to do now
that we have pg_xlogdump, I hacked it up so that it dumps all full page
writes into the directory specified by --dump-bkp=PATH. It currently
overwrites previous full page writes to the same page, but that should be
trivial to change, if you want, by adding %X.%X for the LSN into the path
sprintf.

Greetings,

Andres Freund

-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

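The overwrite behaviour mentioned in the message could be avoided by putting the record's LSN into the file name. The following is only a minimal sketch of that idea, not part of the patch: build_fpw_path is a hypothetical helper, plain integer types stand in for PostgreSQL's Oid, BlockNumber and XLogRecPtr, the "_" separator before the LSN is an arbitrary choice, and it assumes the LSN is a 64-bit value (for example the ReadRecPtr argument that XLogDumpDisplayRecord receives).

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper (not in the patch): build one output path per WAL
 * record, so a later image of the same page does not overwrite an earlier
 * one. Plain integers stand in for Oid, BlockNumber and XLogRecPtr.
 *
 * Returns what snprintf returns: the length the full path needs, which is
 * >= buflen if the path was truncated.
 */
static int
build_fpw_path(char *buf, size_t buflen, const char *dir,
               uint32_t spc, uint32_t db, uint32_t rel,
               uint32_t block, const char *fork, uint64_t lsn)
{
    /* %X.%X prints the high and low 32 bits of the LSN in hex */
    return snprintf(buf, buflen, "%s/%u-%u-%u:%u_%s_%X.%X",
                    dir,
                    (unsigned int) spc, (unsigned int) db,
                    (unsigned int) rel, (unsigned int) block, fork,
                    (unsigned int) (lsn >> 32), (unsigned int) lsn);
}
```

Using snprintf with the buffer size, rather than sprintf, also guards against an overlong --dump-bkp path; callers should treat a return value >= buflen as an error.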
From 09cdce611dc74082901ca1a646135a5ea1af709c Mon Sep 17 00:00:00 2001
From: Andres Freund <and...@anarazel.de>
Date: Tue, 2 Apr 2013 11:43:23 +0200
Subject: [PATCH] pg_xlogdump: add option for dumping full page writes into a directory

---
 contrib/pg_xlogdump/pg_xlogdump.c |   56 ++++++++++++++++++++++++++++++++++++-
 1 file changed, 55 insertions(+), 1 deletion(-)

diff --git a/contrib/pg_xlogdump/pg_xlogdump.c b/contrib/pg_xlogdump/pg_xlogdump.c
index d6d5498..a5e4186 100644
--- a/contrib/pg_xlogdump/pg_xlogdump.c
+++ b/contrib/pg_xlogdump/pg_xlogdump.c
@@ -40,6 +40,7 @@ typedef struct XLogDumpConfig
 	bool		bkp_details;
 	int			stop_after_records;
 	int			already_displayed_records;
+	char	   *dump_bkp;
 
 	/* filter options */
 	int			filter_by_rmgr;
@@ -373,6 +374,46 @@ XLogDumpDisplayRecord(XLogDumpConfig *config, XLogRecPtr ReadRecPtr, XLogRecord
 				   bkpb.block, bkpb.hole_offset, bkpb.hole_length);
 		}
 	}
+
+	if (config->dump_bkp != NULL)
+	{
+		int			bkpnum;
+		char	   *blk = (char *) XLogRecGetData(record) + record->xl_len;
+
+		for (bkpnum = 0; bkpnum < XLR_MAX_BKP_BLOCKS; bkpnum++)
+		{
+			BkpBlock	bkpb;
+			char		bkpbdata[BLCKSZ];
+			int			outfd;
+			char		outpath[MAXPGPATH];
+
+			if (!(XLR_BKP_BLOCK(bkpnum) & record->xl_info))
+				continue;
+
+			memcpy(&bkpb, blk, sizeof(BkpBlock));
+			blk += sizeof(BkpBlock);
+
+			memcpy(bkpbdata, blk, bkpb.hole_offset);
+			memset(bkpbdata + bkpb.hole_offset, 0, bkpb.hole_length);
+			memcpy(bkpbdata + bkpb.hole_offset + bkpb.hole_length,
+				   blk + bkpb.hole_offset,
+				   BLCKSZ - bkpb.hole_offset - bkpb.hole_length);
+
+			blk += BLCKSZ - bkpb.hole_length;
+
+			sprintf(outpath, "%s/%u-%u-%u:%u_%s", config->dump_bkp,
+					bkpb.node.spcNode, bkpb.node.dbNode, bkpb.node.relNode,
+					bkpb.block, forkNames[bkpb.fork]);
+			outfd = open(outpath, O_WRONLY|O_TRUNC|O_CREAT, S_IRUSR|S_IWUSR);
+			if (outfd < 0)
+				fatal_error("could not open output file %s: %s",
+							outpath, strerror(errno));
+			if (write(outfd, bkpbdata, BLCKSZ) != BLCKSZ)
+				fatal_error("could not successfully write output block %s: %s",
+							outpath, strerror(errno));
+			close(outfd);
+		}
+	}
 }
 
 static void
@@ -387,6 +428,7 @@ usage(void)
 	printf("  -?, --help             show this help, then exit\n");
 	printf("\nContent options:\n");
 	printf("  -b, --bkp-details      output detailed information about backup blocks\n");
+	printf("  -d, --dump-bkp=path    dump all full page images into PATH\n");
 	printf("  -e, --end=RECPTR       stop reading at log position RECPTR\n");
 	printf("  -n, --limit=N          number of records to display\n");
 	printf("  -p, --path=PATH        directory in which to find log segment files\n");
@@ -413,6 +455,7 @@ main(int argc, char **argv)
 
 	static struct option long_options[] = {
 		{"bkp-details", no_argument, NULL, 'b'},
+		{"dump-bkp", required_argument, NULL, 'd'},
 		{"end", required_argument, NULL, 'e'},
 		{"help", no_argument, NULL, '?'},
 		{"limit", required_argument, NULL, 'n'},
@@ -438,6 +481,7 @@ main(int argc, char **argv)
 	private.endptr = InvalidXLogRecPtr;
 
 	config.bkp_details = false;
+	config.dump_bkp = NULL;
 	config.stop_after_records = -1;
 	config.already_displayed_records = 0;
 	config.filter_by_rmgr = -1;
@@ -450,7 +494,7 @@ main(int argc, char **argv)
 		goto bad_argument;
 	}
 
-	while ((option = getopt_long(argc, argv, "be:?n:p:r:s:t:Vx:",
+	while ((option = getopt_long(argc, argv, "bd:e:?n:p:r:s:t:Vx:",
 								 long_options, &optindex)) != -1)
 	{
 		switch (option)
@@ -458,6 +502,16 @@ main(int argc, char **argv)
 			case 'b':
 				config.bkp_details = true;
 				break;
+			case 'd':
+				config.dump_bkp = pg_strdup(optarg);
+				if (!verify_directory(config.dump_bkp))
+				{
+					fprintf(stderr,
+							"%s: path \"%s\" cannot be opened: %s",
+							progname, config.dump_bkp, strerror(errno));
+					goto bad_argument;
+				}
+				break;
 			case 'e':
 				if (sscanf(optarg, "%X/%X", &xlogid, &xrecoff) != 2)
 				{
-- 
1.7.10.4
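Once a page image has been dumped, checking whether it matches the block in the data file comes down to a byte-wise comparison. The sketch below is one way to do that under stated assumptions, not an established tool: compare_fpw and read_block are hypothetical names, BLCKSZ is assumed to be the default 8192, and the block is assumed to live in the relation's first segment file. Note also that the dump zero-fills the page's hole, and that WAL records replayed after the image was taken may legitimately have changed the page, so a mismatch is a starting point for inspection rather than proof of corruption.

```c
#include <stdio.h>

#define BLCKSZ 8192             /* assumed default PostgreSQL block size */

/* Read block "blkno" of the file at "path" into buf; 0 on success, -1 on failure. */
static int
read_block(const char *path, long blkno, unsigned char *buf)
{
    FILE   *fp = fopen(path, "rb");
    int     ok;

    if (fp == NULL)
        return -1;
    ok = fseek(fp, blkno * BLCKSZ, SEEK_SET) == 0 &&
         fread(buf, 1, BLCKSZ, fp) == BLCKSZ;
    fclose(fp);
    return ok ? 0 : -1;
}

/*
 * Compare a dumped full page image (a file holding exactly one block) with
 * block "blkno" of a relation file.
 *
 * Returns 0 if identical, 1 if they differ (*first_diff is set to the
 * offset of the first differing byte), -1 on I/O error or short read.
 */
static int
compare_fpw(const char *fpw_path, const char *rel_path, long blkno,
            long *first_diff)
{
    static unsigned char image[BLCKSZ];
    static unsigned char ondisk[BLCKSZ];
    long    i;

    if (read_block(fpw_path, 0, image) != 0 ||
        read_block(rel_path, blkno, ondisk) != 0)
        return -1;

    for (i = 0; i < BLCKSZ; i++)
    {
        if (image[i] != ondisk[i])
        {
            *first_diff = i;
            return 1;
        }
    }
    return 0;
}
```

With the patch's naming scheme, the image file to pass in would be the one named spcNode-dbNode-relNode:block_fork inside the --dump-bkp directory, and the relation file would be the one under base/ in the data directory.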