RE: prevent immature WAL streaming

Jakub Wartak Wed, 25 Aug 2021 05:00:13 -0700

Hi Álvaro, -hackers, 

> I attach the patch with the change you suggested.


I gave the v02 patch a shot on top of REL_12_STABLE (which already includes 
5065aeafb0b7593c04d3bc5bc2a86037f32143fc). Previously (yesterday), without the 
v02 patch, my simulation always produced standby corruption. The setup was a 
separate fs dedicated to /pg_xlog, archive_mode=on, wal_keep_segments=120, 
archive_command set to rsync to a different dir on the same fs, and 
wal_init_zero at its default (true). 

Today (with v02) I got corruption on the standby in only the initial 2 runs out 
of more than ~30 tries. Probably those 2 failures were somehow my fault (?) or 
some rare condition (and in 1 of those 2 cases simply restarting the standby 
helped). To be honest, I tried to force this error, but with v02 I simply cannot 
force it anymore, so that's good! :)

> I didn't have a lot of luck with a reliable reproducer script.  I was able to
> reproduce the problem starting with Ryo Matsumura's script and attaching
> a replica; most of the time the replica would recover by restarting from a
> streaming position earlier than where the problem occurred; but a few
> times it would just get stuck with a WAL segment containing a bogus
> record.  

To get a reliable reproducer with proper fault injection, instead of really 
filling up the fs, one could apparently substitute the fd with an fd of 
/dev/full using e.g. dup2(), so that every write fails with ENOSPC too:

root@hive:~# ./t & # simple while(1) { fprintf() flush () } testcase
root@hive:~# ls -l /proc/27296/fd/3
lrwx------ 1 root root 64 Aug 25 06:22 /proc/27296/fd/3 -> /tmp/testwrite
root@hive:~# gdb -q -p 27296
-- 1089 is bitmask O_WRONLY|..
(gdb) p dup2(open("/dev/full", 1089, 0777), 3)
$1 = 3
(gdb) c
Continuing.
==>
fflush/write(): : No space left on device

So I also tried to be malicious while writing to the DB and inject ENOSPC 
near places like:
 
a) XLogWrite()->XLogFileInit() near line 3322 // assuming: if (wal_init_zero) 
is true, one gets classic "PANIC:  could not write to file 
"pg_wal/xlogtemp.90670": No space left on device"
b) XLogWrite() near line 2547 just after pg_pwrite // one can get "PANIC:  
could not write to log file 000000010000003B000000A8 at offset 0, length 
15466496: No space left on device" (that would be possible with 
wal_init_zero=false?)
c) XLogWrite() near line 2592 // just before issue_xlog_fsync to get "PANIC:  
could not fdatasync file "000000010000004300000004": Invalid argument" that 
would pretty much mean same as above but with last possible offset near end of 
WAL? 

This was done with gdb voodoo:
handle SIGUSR1 noprint nostop
break xlog.c:<LINE> // 
https://github.com/postgres/postgres/blob/REL_12_STABLE/src/backend/access/transam/xlog.c#L3311
c
print fd or openLogFile -- to verify it is 3
p dup2(open("/dev/full", 1089, 0777), 3) -- during most of walwriter runtime it 
has current log as fd=3

After restarting the master and inspecting the standby, in all 3 of the above 
cases the standby didn't exhibit the "invalid contrecord length" error, at 
least here, while without this v02 patch the corruption is notorious. So if it 
passes the worst-case code review assumptions, I wonder whether it shouldn't be 
committed as it stands right now.

-J.
