Re: [HACKERS] SR fails to send existing WAL file after off-line copy

2010-11-02 Thread Heikki Linnakangas

On 02.11.2010 00:47, Tom Lane wrote:

Greg Starkgsst...@mit.edu  writes:

On Mon, Nov 1, 2010 at 12:37 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com  wrote:

Yes, indeed there is a corner-case bug when you try to stream the very first
WAL segment, with log==seg==0.



This smells very much like
http://article.gmane.org/gmane.comp.db.postgresql.devel.general/137052



I wonder if there's some defensive programming way to avoid bugs of this sort.


It strikes me that it's not good if there isn't a recognizable invalid
value for WAL locations.  These bits of code show that there is reason
to have one.  Maybe we should teach initdb to start the WAL one segment
later, and then 0/0 *would* mean invalid, and we could revert these
other hacks.


Good idea. That can even be back-patched to 9.0, it should have no 
effect on existing installations,


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SR fails to send existing WAL file after off-line copy

2010-11-01 Thread Heikki Linnakangas

On 01.11.2010 05:21, Robert Haas wrote:

There seem to be two cases in the code that can generate that error.
One, attempting to open the file returns ENOENT.  Two, after the data
has been read, the last-removed position returned by
XLogGetLastRemoved precedes the data we think we just read, implying
that it was overwritten while we were in the process of reading it.
Does your installation have debugging symbols?  Can you figure out
which case is triggering (inside XLogRead) and what the values of log,
seg, lastRemovedLog, and lastRemovedSeg are?


An easy way to find out which ereport() it is is to set 
log_error_verbosity='verbose' and re-run the test. You then get the 
file and line number of the ereport in the log.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SR fails to send existing WAL file after off-line copy

2010-11-01 Thread Heikki Linnakangas

On 31.10.2010 23:31, Greg Smith wrote:

LOG: replication connection authorized: user=rep host=127.0.0.1 port=52571
FATAL: requested WAL segment 0001 has already been
removed

Which is confusing because that file is certainly on the master still,
and hasn't even been considered archived yet much less removed:

[mas...@pyramid pg_log]$ ls -l $PGDATA/pg_xlog
-rw--- 1 master master 16777216 Oct 31 16:29 0001
drwx-- 2 master master 4096 Oct 4 12:28 archive_status
[mas...@pyramid pg_log]$ ls -l $PGDATA/pg_xlog/archive_status/
total 0

So why isn't SR handing that data over? Is there some weird unhandled
corner case this exposes, but that wasn't encountered by the systems the
tutorial was tried out on?


Yes, indeed there is a corner-case bug when you try to stream the very 
first WAL segment, with log==seg==0. We keep track of the last removed 
WAL segment, and before a piece of WAL is sent to the standby, walsender 
checks that the requested WAL segment is  the last removed. Before any 
WAL segments have been removed since postmaster startup, the latest 
removed segment is initialized to 0/0, with the idea that 0/0 precedes 
any valid WAL segment. That's clearly not true though, it does not 
precede the very first WAL segment after initdb, 0/0.


Seems that we need to change the meaning of the last removed WAL segment 
to avoid the ambiguity of 0/0. Let's store the (last removed)+1 in the 
global variable instead.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SR fails to send existing WAL file after off-line copy

2010-11-01 Thread Heikki Linnakangas

On 01.11.2010 09:37, Heikki Linnakangas wrote:

On 31.10.2010 23:31, Greg Smith wrote:

LOG: replication connection authorized: user=rep host=127.0.0.1
port=52571
FATAL: requested WAL segment 0001 has already been
removed

Which is confusing because that file is certainly on the master still,
and hasn't even been considered archived yet much less removed:

[mas...@pyramid pg_log]$ ls -l $PGDATA/pg_xlog
-rw--- 1 master master 16777216 Oct 31 16:29 0001
drwx-- 2 master master 4096 Oct 4 12:28 archive_status
[mas...@pyramid pg_log]$ ls -l $PGDATA/pg_xlog/archive_status/
total 0

So why isn't SR handing that data over? Is there some weird unhandled
corner case this exposes, but that wasn't encountered by the systems the
tutorial was tried out on?


Yes, indeed there is a corner-case bug when you try to stream the very
first WAL segment, with log==seg==0. We keep track of the last removed
WAL segment, and before a piece of WAL is sent to the standby, walsender
checks that the requested WAL segment is  the last removed. Before any
WAL segments have been removed since postmaster startup, the latest
removed segment is initialized to 0/0, with the idea that 0/0 precedes
any valid WAL segment. That's clearly not true though, it does not
precede the very first WAL segment after initdb, 0/0.

Seems that we need to change the meaning of the last removed WAL segment
to avoid the ambiguity of 0/0. Let's store the (last removed)+1 in the
global variable instead.


Committed that. Thanks for the report, both of you. I'm not subscribed 
to pgsql-admin which is why I didn't see Matt's original report.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SR fails to send existing WAL file after off-line copy

2010-11-01 Thread Fujii Masao
On Mon, Nov 1, 2010 at 5:17 PM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 Committed that. Thanks for the report, both of you. I'm not subscribed to
 pgsql-admin which is why I didn't see Matt's original report.

Thanks!

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SR fails to send existing WAL file after off-line copy

2010-11-01 Thread Greg Smith

Heikki Linnakangas wrote:
Yes, indeed there is a corner-case bug when you try to stream the very 
first WAL segment, with log==seg==0.


I confirmed that the bug exists in only this case by taking my problem 
install and doing this:


psql -d postgres -c checkpoint; select pg_switch_xlog();

To force it to the next xlog file.  With only that change, everything 
else then works.  So we'll just need to warn people about this issue and 
provide that workaround, as something that only trivial new installs 
without much data loaded into them are likely to run into, until 9.0.2 
ships with your fix in it.  I'll update the docs on the wiki 
accordingly, once I've recovered from this morning's flight out West.


I forgot to credit Robert Noles here for rediscovering this bug on one 
of our systems and bringing it to my attention.


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services and Supportwww.2ndQuadrant.us



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SR fails to send existing WAL file after off-line copy

2010-11-01 Thread Greg Stark
On Mon, Nov 1, 2010 at 12:37 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 Yes, indeed there is a corner-case bug when you try to stream the very first
 WAL segment, with log==seg==0.

This smells very much like
http://article.gmane.org/gmane.comp.db.postgresql.devel.general/137052

I wonder if there's some defensive programming way to avoid bugs of this sort.

-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SR fails to send existing WAL file after off-line copy

2010-11-01 Thread Tom Lane
Greg Stark gsst...@mit.edu writes:
 On Mon, Nov 1, 2010 at 12:37 AM, Heikki Linnakangas
 heikki.linnakan...@enterprisedb.com wrote:
 Yes, indeed there is a corner-case bug when you try to stream the very first
 WAL segment, with log==seg==0.

 This smells very much like
 http://article.gmane.org/gmane.comp.db.postgresql.devel.general/137052

 I wonder if there's some defensive programming way to avoid bugs of this sort.

It strikes me that it's not good if there isn't a recognizable invalid
value for WAL locations.  These bits of code show that there is reason
to have one.  Maybe we should teach initdb to start the WAL one segment
later, and then 0/0 *would* mean invalid, and we could revert these
other hacks.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] SR fails to send existing WAL file after off-line copy

2010-10-31 Thread Greg Smith
Last week we got this report from Matt Chesler:  
http://archives.postgresql.org/pgsql-admin/2010-10/msg00221.php that he 
was getting errors when trying to do a simple binary replication test.  
The problem is that what appears to be a perfectly good WAL segment 
doesn't get streamed to the standby.  No one responded. 

One of our testers just ran into the same thing.  I just investigated, 
and I'm baffled as to what's going on myself.  Can't tell if this is a 
bug or an under-documented restriction, but this makes two reports of 
the problem now.  (Mine is happening on a standard 9.0.0 RPM set, didn't 
notice any changes in 9.0.1 that would impact this; afraid to upgrade 
while I have a repeatable test case for this sitting here)


The setup is intended to get a simple test replication setup going 
without even having to do the whole pg_start_backup shuffle, by copying 
the whole server directory when it's down.  Basic steps are:


-Follow the first set of instructions at 
http://wiki.postgresql.org/wiki/Binary_Replication_Tutorial to setup a 
master compatible with replication, then duplicate it after stopping it 
using rsync.  Note that you may have to manually create an empty pg_xlog 
directory on the standby, depending on what was there before you started 
replication. 

To rule out one possible source of problems here, I made an additional 
change on the master not listed there:


[mas...@pyramid pg_log]$ psql -d postgres -c show wal_keep_segments
wal_keep_segments
---
10

I wondered if having this set to 0 (the default) was causing the issue, 
thinking perhaps it doesn't do any looking for existing segments at all 
in that situation.  Problem still happens for me.


-Create a recovery.conf pointing to the master as described in the tutorial.

-Start the standby.  Make sure that it has reached the point where it's 
requesting WAL segments from the master; you want to see it looping 
doing periodic FATAL:  could not connect to the primary server: could 
not connect to server: Connection refused before touching the master.


-Start the master

What I expect to happen now is that the current WAL file that was in 
progress at the point the data was copied over gets streamed over.  That 
doesn't seem to happen.  Instead, I see this on the standby:


LOG:  streaming replication successfully connected to primary
FATAL:  could not receive data from WAL stream: FATAL:  requested WAL 
segment 0001 has already been removed


This on the master:

LOG:  replication connection authorized: user=rep host=127.0.0.1 port=52571
FATAL:  requested WAL segment 0001 has already been 
removed


Which is confusing because that file is certainly on the master still, 
and hasn't even been considered archived yet much less removed:


[mas...@pyramid pg_log]$ ls -l $PGDATA/pg_xlog
-rw--- 1 master master 16777216 Oct 31 16:29 0001
drwx-- 2 master master 4096 Oct  4 12:28 archive_status
[mas...@pyramid pg_log]$ ls -l $PGDATA/pg_xlog/archive_status/
total 0

So why isn't SR handing that data over?  Is there some weird unhandled 
corner case this exposes, but that wasn't encountered by the systems the 
tutorial was tried out on?  I'm not familiar enough with the SR 
internals to reason out what's going wrong myself yet.  Wanted to 
validate that Matt's report wasn't a unique one though, with a bit more 
detail included about the state the system gets into, and one potential 
fix (increasing wal_keep_segments) already tried without improvement.


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services and Supportwww.2ndQuadrant.us



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SR fails to send existing WAL file after off-line copy

2010-10-31 Thread Robert Haas
On Sun, Oct 31, 2010 at 5:31 PM, Greg Smith g...@2ndquadrant.com wrote:
 Which is confusing because that file is certainly on the master still, and
 hasn't even been considered archived yet much less removed:

 [mas...@pyramid pg_log]$ ls -l $PGDATA/pg_xlog
 -rw--- 1 master master 16777216 Oct 31 16:29 0001
 drwx-- 2 master master     4096 Oct  4 12:28 archive_status
 [mas...@pyramid pg_log]$ ls -l $PGDATA/pg_xlog/archive_status/
 total 0

 So why isn't SR handing that data over?  Is there some weird unhandled
 corner case this exposes, but that wasn't encountered by the systems the
 tutorial was tried out on?  I'm not familiar enough with the SR internals to
 reason out what's going wrong myself yet.  Wanted to validate that Matt's
 report wasn't a unique one though, with a bit more detail included about the
 state the system gets into, and one potential fix (increasing
 wal_keep_segments) already tried without improvement.

There seem to be two cases in the code that can generate that error.
One, attempting to open the file returns ENOENT.  Two, after the data
has been read, the last-removed position returned by
XLogGetLastRemoved precedes the data we think we just read, implying
that it was overwritten while we were in the process of reading it.
Does your installation have debugging symbols?  Can you figure out
which case is triggering (inside XLogRead) and what the values of log,
seg, lastRemovedLog, and lastRemovedSeg are?

Even if you lack debugging symbols, if you have gdb, you might be able
figure out which case is triggering by looking at whether
XLogGetLastRemoved gets called before the error message is printed
(put a breakpoint on that function).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers