On Mon, Feb 26, 2018 at 07:25:46AM +0000, Tsunakawa, Takayuki wrote:
> From: Michael Paquier [mailto:mich...@paquier.xyz]
>> The WAL receiver approach also has a drawback.  If WAL is streamed at
>> full speed, then the primary sends data with a maximum of 6 WAL pages.
>> When beginning streaming with a new segment, the WAL sent stops at a
>> page boundary.  But if you stop in the middle of a page, then you need
>> to zero-fill the rest of the page until the current segment is
>> finished streaming.  So if the workload generates spiky WAL, then the
>> WAL receiver would do a lot of extra lseek() calls with the patch
>> applied, while all the writes would be sequential on HEAD, so that's
>> not good performance-wise IMO.
>
> Does even the non-cascading standby stop in the middle of a page?  I
> thought the master always sends whole WAL blocks without stopping in
> the middle of a page.
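To make the quoted point about stopping in the middle of a WAL page
concrete, here is a small sketch of the page-boundary arithmetic
involved.  It is only an illustration, not code from either patch:
XLOG_BLCKSZ defaults to 8kB, and bytes_to_page_boundary() is a made-up
helper name.

/*
 * Illustrative sketch only (not code from the patch): if streaming
 * stops at an arbitrary LSN, this is how many bytes of the current WAL
 * page would still need to be zero-filled up to the next page boundary.
 */
#include <stddef.h>
#include <stdint.h>

#define XLOG_BLCKSZ 8192            /* WAL page size, 8kB by default */

typedef uint64_t XLogRecPtr;        /* a WAL location, byte position */

static size_t
bytes_to_page_boundary(XLogRecPtr ptr)
{
    size_t  off = (size_t) (ptr % XLOG_BLCKSZ);

    return (off == 0) ? 0 : (size_t) XLOG_BLCKSZ - off;
}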
You can even have problems on normal standbys.  I have a small script
which is able to reproduce that if you want (it needs a small rewrite
as it is adapted to my test framework); it introduces a set of garbage
WAL segments on a stopped standby.  With the small monitoring patch I
mentioned upthread you can then see the XLOG reader finding garbage
data before validating the record header.  With any of the fixes on
the WAL receiver, your first patch included, the garbage read goes
away and the XLOG reader instead complains about a record with an
incorrect length (invalid record length at XX/YYY: wanted 24, got 0)
rather than failing in the header validation part.  One key point is
to cleanly stop the primary, as that forces the standby's WAL receiver
to finish writing to its WAL segment in the middle of a page.

>> Another idea I am thinking about would be to zero-fill the segments
>> when recycled instead of just renaming them in
>> InstallXLogFileSegment().  This would also have the advantage of
>> making the segments ahead more compressible, which is a gain for
>> custom backups, and the WAL receiver does not need any tweaks as it
>> would write the data on a clean file.  Zero-filling the segments is
>> currently done only when a new segment is created (see XLogFileInit).
>
> Yes, I was (and am) inclined to take this approach; this is easy and
> clean, but not good for performance...  I hope there's something which
> justifies this approach.

InstallXLogFileSegment() uses a plain durable_link_or_rename() to
recycle the past segment, which syncs the old segment before the
rename anyway, so the I/O effort will be there, no?  This was
mentioned back in 2001 by the way, though it did not count for much in
the case discussed here:
https://www.postgresql.org/message-id/24901.995381770%40sss.pgh.pa.us

The issue here is that the streaming case makes it easier to hit the
problem, as it more easily exposes not-completely-written WAL pages
depending on the message frequency during replication.  At the same
time, we are discussing a very low-probability issue.  Note that if
the XLOG reader bumps into this problem, then at the next WAL receiver
wakeup recovery would begin from the beginning of the last segment,
and if the primary has produced some more WAL in the meantime the
standby would be able to avoid the random junk.  It is also possible
to work around the problem by manually zeroing the areas in question,
or to wait for the standby to receive more WAL so that the garbage is
overwritten automatically.  And you really need to be very, very
unlucky to have random garbage able to bypass the header validation
checks.
--
Michael
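For reference, a rough sketch of what zero-filling a recycled segment
amounts to in terms of extra I/O, written against plain POSIX calls.
This is only an illustration under assumed defaults (16MB segments, a
hypothetical zero_fill_segment() helper), not the actual
InstallXLogFileSegment()/XLogFileInit() code paths, which also deal
with wal_segment_size, error reporting and the fsync strategy.

/*
 * Illustrative sketch only (hypothetical helper, not backend code):
 * overwrite a recycled segment with zeros before reuse, the way
 * XLogFileInit() zero-fills a brand-new segment.  The full-segment
 * write plus the final fsync() is the extra I/O cost being discussed.
 */
#include <fcntl.h>
#include <unistd.h>

#define WAL_SEGMENT_SIZE    (16 * 1024 * 1024)  /* assuming default 16MB */
#define ZBUF_SIZE           (1024 * 1024)

static int
zero_fill_segment(const char *path)
{
    static char zbuf[ZBUF_SIZE];    /* static, hence zero-initialized */
    size_t      written = 0;
    int         fd;

    fd = open(path, O_WRONLY, 0);
    if (fd < 0)
        return -1;

    while (written < WAL_SEGMENT_SIZE)
    {
        if (write(fd, zbuf, ZBUF_SIZE) != ZBUF_SIZE)
        {
            close(fd);
            return -1;
        }
        written += ZBUF_SIZE;
    }

    if (fsync(fd) < 0)
    {
        close(fd);
        return -1;
    }

    return close(fd);
}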