On Wed, Jul 18, 2018 at 5:54 PM, Thomas Munro <thomas.mu...@enterprisedb.com> wrote:
> On Wed, Jul 18, 2018 at 5:41 PM, Andrey Borodin <x4...@yandex-team.ru> wrote:
>>> I think we'd want pg_upgrade tests showing an example of each SLRU
>>> growing past one segment, and then being upgraded, and then being
>>> accessed in various different pages and segment files, so that we can
>>> see that we're able to shift the data to the right place successfully.
>>> For example I think I'd want to see that a single aborted transaction
>>> surrounded by many committed transactions shows up in the right place
>>> after an upgrade.
>>
>> Can you elaborate a bit on how to implement this test. I've searched for
>> some automated pg_upgrade tests but didn't found one.
>> Should it be one-time test script or something "make check-world"-able?

Hmm. This proposal doesn't seem to deal with torn writes. If someone
modifies an 8KB SLRU page and it is partially written out (say, because
your disk has 4KB sectors, and the power cuts out after only one sector
is modified), then during recovery we'll try to read that block back in
and the checksum will be wrong.

The way PostgreSQL usually deals with this problem (and higher level
problems caused by torn writes) is by putting a full page image into the
WAL the first time each page is modified after each checkpoint. (There
are other approaches used by other databases, such as MySQL's
write-two-copies strategy with a barrier in between, since they can't
both be torn, and the problem goes away if your filesystem somehow
magically provides atomic 8KB blocks so you can turn full page writes
off.)

To reuse the existing machinery, in theory I think you'd call
XLogRegisterBuffer() in every place that modifies an SLRU page, and
PageSetLSN() after inserting the WAL. The problem is that these pages
are not in the regular buffer pool and don't have an LSN in the standard
place, so that won't work. I heard about a project to put SLRUs into the
regular buffer pool, but I don't know the status. Without that I think
you might need to invent equivalent machinery that can register SLRU
buffers with xloginsert.c.

To avoid writing full page images for every SLRU page, you'd probably
want to use something like REGBUF_WILL_INIT to skip FPW for pages you're
zero-initialising (eg in ZeroCLOGPage()). I haven't studied the
synchronisation problems lurking there.

-- 
Thomas Munro
http://www.enterprisedb.com
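
P.S. To make the torn-write scenario above concrete, here is a toy model
in Python. It is only a sketch, not PostgreSQL code: the page layout
(4-byte checksum followed by payload), the use of CRC-32, and the helper
names make_page(), torn_write() and recover() are all assumptions made up
for illustration. It shows an 8KB page of which only the first 4KB sector
reaches disk, the resulting checksum failure, and how a full page image
kept in the log lets recovery overwrite the torn page.

```python
import zlib
from typing import Optional

PAGE_SIZE = 8192    # SLRU page size in bytes
SECTOR_SIZE = 4096  # hypothetical disk sector size


def checksum(payload: bytes) -> int:
    """Toy page checksum (CRC-32); not PostgreSQL's real algorithm."""
    return zlib.crc32(payload)


def make_page(fill: int) -> bytes:
    """Build a page: a 4-byte checksum followed by the payload."""
    payload = bytes([fill]) * (PAGE_SIZE - 4)
    return checksum(payload).to_bytes(4, "big") + payload


def page_is_valid(page: bytes) -> bool:
    """Recompute the checksum and compare it with the stored one."""
    return int.from_bytes(page[:4], "big") == checksum(page[4:])


def torn_write(old: bytes, new: bytes, sectors_written: int) -> bytes:
    """Simulate power loss after only some sectors reached the disk."""
    cut = sectors_written * SECTOR_SIZE
    return new[:cut] + old[cut:]


def recover(on_disk: bytes, full_page_image: Optional[bytes]) -> bytes:
    """Toy recovery: prefer the full page image from the log, if any."""
    if full_page_image is not None:
        # The image replaces the page wholesale, so tearing is harmless.
        return full_page_image
    if not page_is_valid(on_disk):
        raise ValueError("checksum failure: torn page, no full page image")
    return on_disk


old_page = make_page(0x00)
new_page = make_page(0x01)
torn = torn_write(old_page, new_page, sectors_written=1)

print(page_is_valid(old_page), page_is_valid(new_page))    # True True
print(page_is_valid(torn))                                 # False
print(recover(torn, full_page_image=new_page) == new_page)  # True
```

Calling recover(torn, None) raises ValueError, which corresponds to the
case where no full page image was logged for the page.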