On 01.11.2010 09:37, Heikki Linnakangas wrote:
On 31.10.2010 23:31, Greg Smith wrote:
LOG: replication connection authorized: user=rep host=
FATAL: requested WAL segment 000000010000000000000000 has already been

Which is confusing because that file is certainly on the master still,
and hasn't even been considered archived yet much less removed:

[mas...@pyramid pg_log]$ ls -l $PGDATA/pg_xlog
-rw------- 1 master master 16777216 Oct 31 16:29 000000010000000000000000
drwx------ 2 master master 4096 Oct 4 12:28 archive_status
[mas...@pyramid pg_log]$ ls -l $PGDATA/pg_xlog/archive_status/
total 0

So why isn't SR handing that data over? Is there some weird unhandled
corner case this exposes, but that wasn't encountered by the systems the
tutorial was tried out on?

Yes, indeed there is a corner-case bug when you try to stream the very
first WAL segment, with log==seg==0. We keep track of the last removed
WAL segment, and before a piece of WAL is sent to the standby, walsender
checks that the requested WAL segment is > the last removed. Before any
WAL segments have been removed since postmaster startup, the latest
removed segment is initialized to 0/0, with the idea that 0/0 precedes
any valid WAL segment. That's clearly not true, though: it does not
precede the very first WAL segment after initdb, which is itself 0/0.
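
To make the ambiguity concrete, here is a minimal sketch of the check as described above. It is not the actual walsender code: the SegNo type and the function names are hypothetical and exist only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a WAL segment identified by (log, seg). */
typedef struct
{
    uint32_t log;
    uint32_t seg;
} SegNo;

/* Returns true if segment a comes strictly after segment b. */
static bool
seg_gt(SegNo a, SegNo b)
{
    return a.log > b.log || (a.log == b.log && a.seg > b.seg);
}

/*
 * The check as described: a segment may be sent only if it is
 * strictly greater than the last removed one. With last_removed
 * initialized to 0/0, a request for segment 0/0 is rejected even
 * though nothing has been removed yet.
 */
static bool
can_send_old(SegNo requested, SegNo last_removed)
{
    return seg_gt(requested, last_removed);
}
```

With both arguments equal to 0/0 the function returns false, which matches the FATAL message in the report; any later segment, such as 0/1, passes.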

Seems that we need to change the meaning of the last removed WAL segment
to avoid the ambiguity of 0/0. Let's store the (last removed)+1 in the
global variable instead.
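
A rough sketch of that idea, under stated assumptions: the SegNo type, the function names, and the SEGS_PER_LOG constant are all hypothetical and only illustrate the "(last removed)+1" convention, not the committed patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption for illustration only: segments per log file. */
#define SEGS_PER_LOG 255u

/* Hypothetical stand-in for a WAL segment identified by (log, seg). */
typedef struct
{
    uint32_t log;
    uint32_t seg;
} SegNo;

/* Returns the segment that follows s, rolling over into the next log. */
static SegNo
next_seg(SegNo s)
{
    if (s.seg + 1 >= SEGS_PER_LOG)
    {
        s.log++;
        s.seg = 0;
    }
    else
        s.seg++;
    return s;
}

/* Value to store after removing a segment: (last removed)+1. */
static SegNo
record_removed(SegNo removed)
{
    return next_seg(removed);
}

/*
 * A segment may be sent if it is >= the stored value. The initial
 * value 0/0 now unambiguously means "nothing removed yet", so a
 * request for segment 0/0 is allowed.
 */
static bool
can_send(SegNo requested, SegNo removed_plus_one)
{
    return requested.log > removed_plus_one.log ||
           (requested.log == removed_plus_one.log &&
            requested.seg >= removed_plus_one.seg);
}
```

In this sketch, segment 0/0 is refused only once it has actually been removed, because the stored value then becomes 0/1.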

Committed that. Thanks for the report, both of you. I'm not subscribed to pgsql-admin which is why I didn't see Matt's original report.

  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)