Re: [HACKERS] SR fails to send existing WAL file after off-line copy

2010-11-02 Thread Heikki Linnakangas
On 02.11.2010 00:47, Tom Lane wrote:
Greg Stark writes:
On Mon, Nov 1, 2010 at 12:37 AM, Heikki Linnakangas wrote:
Yes, indeed there is a corner-case bug when you try to stream the very first WAL segment, with log==seg==0.
This smells very much like http://article.gmane.org/gmane.comp.db.

Re: [HACKERS] SR fails to send existing WAL file after off-line copy

2010-11-01 Thread Tom Lane
Greg Stark writes:
> On Mon, Nov 1, 2010 at 12:37 AM, Heikki Linnakangas wrote:
>> Yes, indeed there is a corner-case bug when you try to stream the very first
>> WAL segment, with log==seg==0.
> This smells very much like
> http://article.gmane.org/gmane.comp.db.postgresql.devel.general/13705

Re: [HACKERS] SR fails to send existing WAL file after off-line copy

2010-11-01 Thread Greg Stark
On Mon, Nov 1, 2010 at 12:37 AM, Heikki Linnakangas wrote:
> Yes, indeed there is a corner-case bug when you try to stream the very first
> WAL segment, with log==seg==0.
This smells very much like http://article.gmane.org/gmane.comp.db.postgresql.devel.general/137052
I wonder if there's some de

Re: [HACKERS] SR fails to send existing WAL file after off-line copy

2010-11-01 Thread Greg Smith
Heikki Linnakangas wrote:
Yes, indeed there is a corner-case bug when you try to stream the very first WAL segment, with log==seg==0.
I confirmed that the bug exists in only this case by taking my problem install and doing this:
psql -d postgres -c "checkpoint; select pg_switch_xlog();"
To

Re: [HACKERS] SR fails to send existing WAL file after off-line copy

2010-11-01 Thread Fujii Masao
On Mon, Nov 1, 2010 at 5:17 PM, Heikki Linnakangas wrote:
> Committed that. Thanks for the report, both of you. I'm not subscribed to
> pgsql-admin which is why I didn't see Matt's original report.
Thanks!
Regards,
-- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Softw

Re: [HACKERS] SR fails to send existing WAL file after off-line copy

2010-11-01 Thread Heikki Linnakangas
On 01.11.2010 09:37, Heikki Linnakangas wrote:
On 31.10.2010 23:31, Greg Smith wrote:
LOG: replication connection authorized: user=rep host=127.0.0.1 port=52571
FATAL: requested WAL segment 0001 has already been removed
Which is confusing because that file is certainly on th

Re: [HACKERS] SR fails to send existing WAL file after off-line copy

2010-11-01 Thread Heikki Linnakangas
On 31.10.2010 23:31, Greg Smith wrote:
LOG: replication connection authorized: user=rep host=127.0.0.1 port=52571
FATAL: requested WAL segment 0001 has already been removed
Which is confusing because that file is certainly on the master still, and hasn't even been considered

Re: [HACKERS] SR fails to send existing WAL file after off-line copy

2010-10-31 Thread Heikki Linnakangas
On 01.11.2010 05:21, Robert Haas wrote:
There seem to be two cases in the code that can generate that error.
One, attempting to open the file returns ENOENT.
Two, after the data has been read, the last-removed position returned by XLogGetLastRemoved precedes the data we think we just read, imply
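The second case described above, combined with the log==seg==0 corner case reported elsewhere in this thread, can be illustrated with a small sketch. It is not the PostgreSQL source. It assumes (this is not stated in the thread) that the "last removed" position starts out as 0/0 when nothing has been removed, and that a segment counts as removed when it is at or before that position. The names SegNo, NOTHING_REMOVED, looks_removed and looks_removed_guarded are hypothetical, and the guarded variant is only one possible way to avoid the false positive, not necessarily the fix that was committed.

```python
from typing import NamedTuple, Optional


class SegNo(NamedTuple):
    """A WAL segment position as a (log, seg) pair; compares log first, then seg."""
    log: int
    seg: int


# Assumption: "nothing removed yet" is stored as 0/0, which is also
# the position of the very first WAL segment.
NOTHING_REMOVED = SegNo(0, 0)


def looks_removed(requested: SegNo, last_removed: SegNo) -> bool:
    """Post-read check: treat a segment at or before last_removed as gone."""
    return requested <= last_removed


def looks_removed_guarded(requested: SegNo,
                          last_removed: Optional[SegNo]) -> bool:
    """Hypothetical variant: None means no segment has ever been removed."""
    if last_removed is None:
        return False
    return requested <= last_removed


if __name__ == "__main__":
    # Streaming the very first segment with the 0/0 sentinel: false positive.
    print(looks_removed(SegNo(0, 0), NOTHING_REMOVED))
    # Any later segment passes the same check.
    print(looks_removed(SegNo(0, 1), NOTHING_REMOVED))
    # With an explicit "nothing removed" marker the first segment passes too.
    print(looks_removed_guarded(SegNo(0, 0), None))
```

Under these assumptions only segment 0/0 trips the check, which would be consistent with the report in this thread that the error disappears once the server has switched to a later segment.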

Re: [HACKERS] SR fails to send existing WAL file after off-line copy

2010-10-31 Thread Robert Haas
On Sun, Oct 31, 2010 at 5:31 PM, Greg Smith wrote:
> Which is confusing because that file is certainly on the master still, and
> hasn't even been considered archived yet much less removed:
>
> [mas...@pyramid pg_log]$ ls -l $PGDATA/pg_xlog
> -rw--- 1 master master 16777216 Oct 31 16:29 00