On 2021/01/14 13:59, Michael Paquier wrote:
Hi Fujii-san,
On Thu, Jan 14, 2021 at 03:32:52AM +0000, Fujii Masao wrote:
Ensure that a standby is able to follow a primary on a newer timeline.
Commit 709d003fbd refactored WAL-reading code, but accidentally caused
WalSndSegmentOpen() to fail to follow a timeline switch while reading from
a historic timeline. This issue caused a standby to fail to follow a primary
on a newer timeline when WAL archiving is enabled.
florican reports that this test has some stability problems:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=florican&dt=2021-01-14%2003%3A55%3A45
Here I can see that replication keeps asking for a segment that's
already gone:
2021-01-13 23:34:52.104 EST [64611:1] LOG: started streaming WAL from primary at 0/3000000 on timeline 1
2021-01-13 23:34:52.104 EST [64611:2] FATAL: could not receive data from WAL stream: ERROR: requested WAL segment 000000010000000000000003 has already been removed
Thanks for reporting this! I'm looking at this issue.
My guess is that the requested WAL file was unfortunately removed by a
checkpoint, because no replication slot is used and wal_keep_size is not set.
So an easy fix is to set wal_keep_size to 512MB or some other value in that
test. Thoughts?
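For illustration, a minimal sketch of that setting as it might appear in
postgresql.conf for the test. The 512MB value is only the figure suggested
above, not a tested one, and which node needs it is an assumption:

```
# Assumed placement: the configuration of the node the standby streams from.
# Keep at least this much past WAL in pg_wal, so that a checkpoint does not
# remove segments the standby may still request.
wal_keep_size = 512MB
```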
Regards,
--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION