Re: [HACKERS] pg_xlogdump fails to handle WAL file with multi-page XLP_FIRST_IS_CONTRECORD data

2016-08-29 Thread Fujii Masao
On Fri, Aug 26, 2016 at 10:03 PM, Fujii Masao  wrote:
> On Wed, Mar 23, 2016 at 7:04 PM, Pavan Deolasee
>  wrote:
>>
>>
>> On Wed, Mar 23, 2016 at 1:13 PM, Michael Paquier 
>> wrote:
>>>
>>>
>>> +   /*
>>> +* Compute targetRecOff. It should typically be greater than short
>>> +* page-header since a valid record can't , but can also be zero
>>> when
>>> +* caller has supplied a page-aligned address or when we are
>>> skipping
>>> +* multi-page continuation record. It doesn't matter though
>>> because
>>> +* ReadPageInternal() will read at least short page-header worth
>>> of
>>> +* data
>>> +*/
>>> This needs some polishing, there is an unfinished sentence here.
>>>
>>> +   targetRecOff = tmpRecPtr % XLOG_BLCKSZ;
>>> targetRecOff, pageHeaderSize and targetPagePtr could be declared
>>> inside directly the new while loop.
>>
>>
>> Thanks Michael for reviewing the patch. I've fixed these issues and new
>> version is attached.
>
> The patch looks good to me. Barring any objections,
> I'll push this and back-patch to 9.3 where pg_xlogdump was added.

Pushed. Thanks!

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_xlogdump fails to handle WAL file with multi-page XLP_FIRST_IS_CONTRECORD data

2016-08-26 Thread Fujii Masao
On Wed, Mar 23, 2016 at 7:04 PM, Pavan Deolasee
 wrote:
>
>
> On Wed, Mar 23, 2016 at 1:13 PM, Michael Paquier 
> wrote:
>>
>>
>> +   /*
>> +* Compute targetRecOff. It should typically be greater than short
>> +* page-header since a valid record can't , but can also be zero
>> when
>> +* caller has supplied a page-aligned address or when we are
>> skipping
>> +* multi-page continuation record. It doesn't matter though
>> because
>> +* ReadPageInternal() will read at least short page-header worth
>> of
>> +* data
>> +*/
>> This needs some polishing, there is an unfinished sentence here.
>>
>> +   targetRecOff = tmpRecPtr % XLOG_BLCKSZ;
>> targetRecOff, pageHeaderSize and targetPagePtr could be declared
>> inside directly the new while loop.
>
>
> Thanks Michael for reviewing the patch. I've fixed these issues and new
> version is attached.

The patch looks good to me. Barring any objections,
I'll push this and back-patch to 9.3 where pg_xlogdump was added.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_xlogdump fails to handle WAL file with multi-page XLP_FIRST_IS_CONTRECORD data

2016-04-05 Thread Craig Ringer
On 1 April 2016 at 21:30, Craig Ringer  wrote:


> I'll attach the new testcase once I either get it to reproduce this bug or
> give up and leave the basic xlogdump testcase alone.
>

I had another bash at this and I still can't reproduce it on master using
the giant commit record approach Andres suggested. In fact I generated a
commit record larger than an entire xlog segment and it was still fine.

The DO procedure I posted upthread, when run on 9.4, reliably produces
segments that the xlogreader cannot decode with the symptoms Pavan
reported. It's fine on 9.6.

So I can't reproduce this on 9.6, but there might be a separate bug on 9.4.

I've attached a patch with the simple tests I added for pg_xlogdump as part
of this. I doubt it'd be desirable to commit the ridiculous commit record
part, but that's trivially removed, and I left it in place in case someone
else wanted to fiddle with other ways to reproduce this.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
From bf05b552dd9eb5d2e9b50c77928426817033685d Mon Sep 17 00:00:00 2001
From: Craig Ringer 
Date: Wed, 6 Apr 2016 13:57:32 +0800
Subject: [PATCH] Tests for pg_xlogdump

---
 src/bin/pg_xlogdump/Makefile |   6 +
 src/bin/pg_xlogdump/t/010_pg_xlogdump.pl | 249 +++
 2 files changed, 255 insertions(+)
 create mode 100644 src/bin/pg_xlogdump/t/010_pg_xlogdump.pl

diff --git a/src/bin/pg_xlogdump/Makefile b/src/bin/pg_xlogdump/Makefile
index 11df47d..c1138d2 100644
--- a/src/bin/pg_xlogdump/Makefile
+++ b/src/bin/pg_xlogdump/Makefile
@@ -38,3 +38,9 @@ uninstall:
 
 clean distclean maintainer-clean:
 	rm -f pg_xlogdump$(X) $(OBJS) $(RMGRDESCSOURCES) xlogreader.c
+
+check:
+	$(prove_check)
+
+installcheck:
+	$(prove_installcheck)
diff --git a/src/bin/pg_xlogdump/t/010_pg_xlogdump.pl b/src/bin/pg_xlogdump/t/010_pg_xlogdump.pl
new file mode 100644
index 000..93b2063
--- /dev/null
+++ b/src/bin/pg_xlogdump/t/010_pg_xlogdump.pl
@@ -0,0 +1,249 @@
+use strict;
+use warnings;
+use bigint qw(hex);
+use Cwd;
+use Config;
+use File::Basename qw(basename);
+use List::Util qw(minstr maxstr);
+use PostgresNode;
+use TestLib;
+use Test::More;
+use Data::Dumper;
+
+my $verbose = 1;
+
+sub slurp_xlogdump_lines
+{
+	my ($node, $timeline, $firstwal, $lastwal) = @_;
+	my ($stdout, $stderr) = ('', '');
+
+	if (defined($firstwal) && $firstwal =~ /^[[:xdigit:]]{24}$/)
+	{
+		$firstwal = $node->data_dir . "/pg_xlog/" . $firstwal;
+	}
+
+	if (defined($lastwal) && $lastwal !~ /^[[:xdigit:]]{24}$/)
+	{
+		die("pg_xlogdump expects the last WAL seg to be a bare filename, not '$lastwal'");
+	}
+
+	if (!defined($firstwal) || !defined($lastwal))
+	{
+		my $wal_pattern = sprintf("%s/pg_xlog/%08X" . ("?" x 16), $node->data_dir, $timeline);
+		my @wal = glob $wal_pattern;
+		$firstwal = List::Util::minstr(@wal) if !defined($firstwal);
+		$lastwal = basename(List::Util::maxstr(@wal)) if !defined($lastwal);
+	}
+
+	diag("decoding from " . $firstwal . " to " . $lastwal)
+		if $verbose;
+
+	IPC::Run::run ['pg_xlogdump', $firstwal, $lastwal], '>', \$stdout, '2>', \$stderr;
+
+	like($stderr, qr/(?:invalid record length at [0-9A-F]+\/[0-9A-F]+: wanted 24, got 0|^\s*$)/,
+		'pg_xlogdump exits with expected error or silence');
+
+	diag "xlogdump exited with: '$stderr'"
+		if $verbose;
+
+	my $lineno = 1;
+
+	my @xld_lines = split("\n", $stdout);
+
+	return \@xld_lines;
+}
+
+sub match_xlogdump_lines
+{
+	my ($node, $timeline, $firstwal, $lastwal) = @_;
+
+	my $xld_lines = slurp_xlogdump_lines($node, $timeline, $firstwal, $lastwal);
+
+	my @records;
+	my $lineno = 1;
+
+	for my $xld_line (@$xld_lines)
+	{
+		chomp $xld_line;
+
+		# We don't use Test::More's 'like' here because it'd run a variable number
+		# of tests. Instead do our own diagnostics and fail.
+		if ($xld_line =~ /^rmgr: (\w+)\s+len \(rec\/tot\):\s*([[:digit:]]+)\/\s*([[:digit:]]+), tx:\s*([[:digit:]]*), lsn: ([[:xdigit:]]{1,8})\/([[:xdigit:]]{1,8}), prev ([[:xdigit:]]{1,8})\/([[:xdigit:]]{1,8}), desc: (.*)$/)
+		{
+			my %record = (
+'_raw' => length($xld_line) >= 100 ? substr($xld_line,0,100) . "..." : $xld_line,
+'rmgr' => $1,
+'len_rec' => int($2),
+'len_tot' => int($3),
+'tx' => defined($4) ? int($4) : undef,
+'lsn' => (hex($5)<<32) + hex($6),
+'prev' => (hex($7)<<32) + hex($8),
+'lsn_str' => "$5/$6",
+'prev_str' => "$7/$8",
+'desc' => length($9) >= 100 ? substr($9,0,100) . "..." : $9,
+			);
+
+			if ($record{'prev'} >= $record{'lsn'})
+			{
+diag("in xlog record on line $lineno:\n$xld_line\n   ... prev ptr $record{prev_str} is greater than or equal to lsn ptr $record{lsn_str}");
+cmp_ok($record{prev}, 'lt', $record{lsn}, 'previous pointer less than lsn ptr');
+			}
+
+			push @records, \%record;
+		}
+		else
+		{
+			diag("xlog record on line $lineno:\n$xld_line\n   ...does not match the test pattern");
+			

Re: [HACKERS] pg_xlogdump fails to handle WAL file with multi-page XLP_FIRST_IS_CONTRECORD data

2016-04-01 Thread Craig Ringer
On 31 March 2016 at 16:13, Andres Freund  wrote:


> It's probably easier to just generate a humongous commit record. You can
> do so by having a *lot* of subtransactions. Relatively easy to do with
> plpgsql by creating them in a loop (SELECT txid_current() in EXCEPTION
> bearing block ought to suffice).
>
>
This does the trick and does it quickly on 9.4:

CREATE TABLE trivial(padding text);
ALTER TABLE trivial ALTER COLUMN padding SET STORAGE plain;

DO
LANGUAGE plpgsql
$$
DECLARE
  wal_seg_size integer;
  remaining_size integer;
BEGIN
  wal_seg_size := (select setting from pg_settings where name =
'wal_segment_size')::integer
* (select setting from pg_settings where name =
'wal_block_size')::integer;
  LOOP
SELECT wal_seg_size - file_offset FROM
pg_xlogfile_name_offset(pg_current_xlog_insert_location()) INTO
remaining_size;
IF remaining_size < 8192
THEN
-- Make a nice big commit record right at the end of the segment
EXIT;
ELSE
BEGIN
  -- About 4k
  INSERT INTO trivial(padding) VALUES (repeat('0123456789abcdef',
256));
EXCEPTION
  WHEN division_by_zero THEN
CONTINUE;
END;
END IF;
  END LOOP;
END;
$$;


I had no issue reproducing the bug on 9.4, but I don't see it in 9.6. At
least, not this one, not yet.

Still, we might want to backpatch the fix; it seems clear enough even if I
can't reproduce the issue yet. It doesn't look like it'll affect logical
decoding since the snapshot builder has its own unrelated logic for finding
the startpoint for decoding, and it certainly doesn't affect WAL replay
otherwise we would've seen the fireworks much earlier. The upside is that
also makes the backpatch much safer.

I was surprised to see that there are no tests for pg_xlogdump to add this
on to, so I've added a trivial test. I've got some more complete
xlogdump-helper code in the failover slots tests that I should extract and
add to PostgresNode.pm but this'll do in the mean time and is much simpler.

In the process I noticed that pg_xlogdump doesn't follow the rules of the
rest of the codebase when it comes to command line handling with --version,
returning nonzero on bad options, etc. It might be good to fix that;
there's a small BC impact, but I doubt anyone cares when it comes to
pg_xlogdump.

I'll attach the new testcase once I either get it to reproduce this bug or
give up and leave the basic xlogdump testcase alone.


-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] pg_xlogdump fails to handle WAL file with multi-page XLP_FIRST_IS_CONTRECORD data

2016-03-31 Thread Andres Freund
On 2016-03-31 09:41:46 +0530, Pavan Deolasee wrote:
> On Thu, Mar 31, 2016 at 6:27 AM, Craig Ringer  wrote:
> >
> >
> > Can you describe the process used to generate the sample WAL segment?
> >
> >
> Shame that I can't find the sql file used to create the problematic WAL
> segment. But this is what I did.

It's probably easier to just generate a humongous commit record. You can
do so by having a *lot* of subtransactions. Relatively easy to do with
plpgsql by creating them in a loop (SELECT txid_current() in EXCEPTION
bearing block ought to suffice).

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_xlogdump fails to handle WAL file with multi-page XLP_FIRST_IS_CONTRECORD data

2016-03-30 Thread Pavan Deolasee
On Thu, Mar 31, 2016 at 6:27 AM, Craig Ringer  wrote:
>
>
> Can you describe the process used to generate the sample WAL segment?
>
>
Shame that I can't find the sql file used to create the problematic WAL
segment. But this is what I did.

I wrote a plpgsql function which inserts rows in a loop, every time
checking if the remaining space in the WAL segment  has fallen to less than
couple of hundred bytes. Of course, I used pg_settings to get the WAL
segment size, WAL page size and pg_current_xlog_insert_location() to
correctly compute remaining bytes in the WAL segment. At this point, do a
non-HOT update, preferably to table which is already fully vacuumed and
CHECKPOINTed to avoid getting any other WAL records in between. Assuming
FPW is turned ON, the UPDATE should generate enough WAL to cross over the
first page in the new WAL segment.

Let me know if you would like to me to put together a sql based on this
description.

Thanks,
Pavan

-- 
 Pavan Deolasee   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] pg_xlogdump fails to handle WAL file with multi-page XLP_FIRST_IS_CONTRECORD data

2016-03-30 Thread Craig Ringer
On 23 March 2016 at 18:04, Pavan Deolasee  wrote:

>
>
> On Wed, Mar 23, 2016 at 1:13 PM, Michael Paquier <
> michael.paqu...@gmail.com> wrote:
>
>>
>> +   /*
>> +* Compute targetRecOff. It should typically be greater than short
>> +* page-header since a valid record can't , but can also be zero
>> when
>> +* caller has supplied a page-aligned address or when we are
>> skipping
>> +* multi-page continuation record. It doesn't matter though
>> because
>> +* ReadPageInternal() will read at least short page-header worth
>> of
>> +* data
>> +*/
>> This needs some polishing, there is an unfinished sentence here.
>>
>> +   targetRecOff = tmpRecPtr % XLOG_BLCKSZ;
>> targetRecOff, pageHeaderSize and targetPagePtr could be declared
>> inside directly the new while loop.
>>
>
> Thanks Michael for reviewing the patch. I've fixed these issues and new
> version is attached.
>

Looks sensible to me based on a reading of "git diff -w" of the applied
patch. It passes make check and make -C src/test/recovery check . Marked
ready for committer; I'd like to add a TAP test for it, but it's ready to
go without that.



-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] pg_xlogdump fails to handle WAL file with multi-page XLP_FIRST_IS_CONTRECORD data

2016-03-30 Thread Craig Ringer
On 23 March 2016 at 18:04, Pavan Deolasee  wrote:

>
>
> On Wed, Mar 23, 2016 at 1:13 PM, Michael Paquier <
> michael.paqu...@gmail.com> wrote:
>
>>
>> +   /*
>> +* Compute targetRecOff. It should typically be greater than short
>> +* page-header since a valid record can't , but can also be zero
>> when
>> +* caller has supplied a page-aligned address or when we are
>> skipping
>> +* multi-page continuation record. It doesn't matter though
>> because
>> +* ReadPageInternal() will read at least short page-header worth
>> of
>> +* data
>> +*/
>> This needs some polishing, there is an unfinished sentence here.
>>
>> +   targetRecOff = tmpRecPtr % XLOG_BLCKSZ;
>> targetRecOff, pageHeaderSize and targetPagePtr could be declared
>> inside directly the new while loop.
>>
>
> Thanks Michael for reviewing the patch. I've fixed these issues and new
> version is attached.
>
>
Can you describe the process used to generate the sample WAL segment?

I'd like to turn it into a TAP test to go along with the patch.



-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] pg_xlogdump fails to handle WAL file with multi-page XLP_FIRST_IS_CONTRECORD data

2016-03-30 Thread Michael Paquier
On Mon, Mar 28, 2016 at 10:15 PM, Andres Freund  wrote:
> It's definitely too late for that; they're going to be wrapped in a
> couple hours.

I have added this patch to the next CF so as we do not lose track of this bug:
https://commitfest.postgresql.org/10/593/
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_xlogdump fails to handle WAL file with multi-page XLP_FIRST_IS_CONTRECORD data

2016-03-28 Thread Andres Freund
On 2016-03-28 13:21:41 +0530, Pavan Deolasee wrote:
> On Wed, Mar 23, 2016 at 6:16 PM, Michael Paquier 
> wrote:
> 
> >
> >
> > I'd just add dots at the end of the sentences in the comment blocks
> > because that's project style, but I'm being picky, except that the
> > logic looks quite good.
> >
> 
> Since this is a bug affecting all stable branches, IMHO it will be a good
> idea to fix this before the upcoming minor releases.

It's definitely too late for that; they're going to be wrapped in a
couple hours.

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_xlogdump fails to handle WAL file with multi-page XLP_FIRST_IS_CONTRECORD data

2016-03-28 Thread Pavan Deolasee
On Wed, Mar 23, 2016 at 6:16 PM, Michael Paquier 
wrote:

>
>
> I'd just add dots at the end of the sentences in the comment blocks
> because that's project style, but I'm being picky, except that the
> logic looks quite good.
>

Since this is a bug affecting all stable branches, IMHO it will be a good
idea to fix this before the upcoming minor releases.

Thanks,
Pavan

-- 
 Pavan Deolasee   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] pg_xlogdump fails to handle WAL file with multi-page XLP_FIRST_IS_CONTRECORD data

2016-03-23 Thread Michael Paquier
On Wed, Mar 23, 2016 at 7:04 PM, Pavan Deolasee
 wrote:
>
>
> On Wed, Mar 23, 2016 at 1:13 PM, Michael Paquier 
> wrote:
>>
>>
>> +   /*
>> +* Compute targetRecOff. It should typically be greater than short
>> +* page-header since a valid record can't , but can also be zero
>> when
>> +* caller has supplied a page-aligned address or when we are
>> skipping
>> +* multi-page continuation record. It doesn't matter though
>> because
>> +* ReadPageInternal() will read at least short page-header worth
>> of
>> +* data
>> +*/
>> This needs some polishing, there is an unfinished sentence here.
>>
>> +   targetRecOff = tmpRecPtr % XLOG_BLCKSZ;
>> targetRecOff, pageHeaderSize and targetPagePtr could be declared
>> inside directly the new while loop.
>
>
> Thanks Michael for reviewing the patch. I've fixed these issues and new
> version is attached.

I'd just add dots at the end of the sentences in the comment blocks
because that's project style, but I'm being picky, except that the
logic looks quite good.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_xlogdump fails to handle WAL file with multi-page XLP_FIRST_IS_CONTRECORD data

2016-03-23 Thread Pavan Deolasee
On Wed, Mar 23, 2016 at 1:13 PM, Michael Paquier 
wrote:

>
> +   /*
> +* Compute targetRecOff. It should typically be greater than short
> +* page-header since a valid record can't , but can also be zero
> when
> +* caller has supplied a page-aligned address or when we are
> skipping
> +* multi-page continuation record. It doesn't matter though because
> +* ReadPageInternal() will read at least short page-header worth of
> +* data
> +*/
> This needs some polishing, there is an unfinished sentence here.
>
> +   targetRecOff = tmpRecPtr % XLOG_BLCKSZ;
> targetRecOff, pageHeaderSize and targetPagePtr could be declared
> inside directly the new while loop.
>

Thanks Michael for reviewing the patch. I've fixed these issues and new
version is attached.

Thanks,
Pavan

-- 
 Pavan Deolasee   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


xlogdump_multipage_cont_record_v2.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_xlogdump fails to handle WAL file with multi-page XLP_FIRST_IS_CONTRECORD data

2016-03-23 Thread Michael Paquier
On Wed, Mar 23, 2016 at 3:43 PM, Pavan Deolasee
 wrote:
> While investigating some issue, I found that pg_xlogdump fails to dump
> contents from a WAL file if the file has continuation data from previous WAL
> record and the data spans more than one page. In such cases,
> XLogFindNextRecord() fails to take into account that there will be more than
> one xlog page headers (long and short) and thus tries to read from an offset
> where no valid record exists. That results in pg_xlogdump throwing error
> such as:

OK. That makes sense, that is indeed a possible scenario.

> Attached WAL file from master branch demonstrates the issue, generated using
> synthetic data. Also, attached patch fixes it for me.

So does it for me.

> While we could have deduced the number of short and long headers and skipped
> directly to the offset, I found reading one page at a time and using
> XLogPageHeaderSize() to find header size of each page separately, a much
> cleaner way. Also, the continuation data is not going to span many pages. So
> I don't see any harm in doing it that way.

I have to agree, the new code is pretty clean this way by calculating
the next page LSN directly in the loop should the record be longer
than it.

> I encountered this on 9.3, but the patch applies to both 9.3 and master. I
> haven't tested it on other branches, but I have no reason to believe it
> won't apply or work. I believe we should back patch it all supported
> branches.

Agreed, that's a bug, and the logic of your patch looks good to me.

+   /*
+* Compute targetRecOff. It should typically be greater than short
+* page-header since a valid record can't , but can also be zero when
+* caller has supplied a page-aligned address or when we are skipping
+* multi-page continuation record. It doesn't matter though because
+* ReadPageInternal() will read at least short page-header worth of
+* data
+*/
This needs some polishing, there is an unfinished sentence here.

+   targetRecOff = tmpRecPtr % XLOG_BLCKSZ;
targetRecOff, pageHeaderSize and targetPagePtr could be declared
inside directly the new while loop.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers