Hi, Am Dienstag, den 19.03.2019, 11:48 +0100 schrieb Fabien COELHO: > Moving the controlfile looks like an effective way to prevent any > concurrent start, as the fs operation is probably atomic and especially if > external tools uses the same trick. However this is not the case yet, eg > "pg_resetwal" uses a "postmaster.pid" hack instead. Probably the method > could be unified, possibly with some functions in "controlfile_utils.c". > > However, I think that there still is a race condition because of the order > in which it is implemented: > > pg_checksums reads control file > pg_checksums checks control file contents... > ** cluster may be started and the control file updated > pg_checksums moves the (updated) control file > pg_checksums proceeds on a running cluster > pg_checksums moves back the control file > pg_checksums updates the control file contents, overriding updates > > I think that the correct way to handle this for enable/disable is: > > pg_checksums moves the control file > pg_checksums reads, checks, proceeds, updates > pg_checksums moves back the control file > > This probably means extending a little bit the update_controlfile function > to allow a suffix. No big deal. > > Ok, this might not work, because of the following, less likely, race > condition: > > postmaster opens control file RW > pg_checksums moves control file, posmater open file handle follows > ... > > So ISTM that we really need some locking to have something clean.
I think starting the postmaster during offline maintenance is already quite some pilot error. As pg_checksums can potentially run for hours though, I agree it is important to disable the cluster in the meantime. There's really not a lot going on between pg_checksums reading the control file and moving it away - what you propose above sounds a bit like overengineering to me. If anything, we could include the postmaster.pid check from pg_resetwal after we have renamed the control file to make absolutely sure that the cluster is offline. Once the control file is gone and there is no postmaster.pid, it surely cannot be pg_checksums' problem anymore if a postmaster is started regardless of maintenance. I leave that to Michael to decide whether he thinks the above is warranted. I think the more important open issue is what to do about PITR and streaming replication, see my replies to Magnus elsewhere in the thread. > Why not always use "pgrename" instead of the strange pg_mv_file macro? pg_ugprade does it the same way, possibly both could be converted to pgrename, dunno. Michael -- Michael Banck Projektleiter / Senior Berater Tel.: +49 2166 9901-171 Fax: +49 2166 9901-100 Email: michael.ba...@credativ.de credativ GmbH, HRB Mönchengladbach 12080 USt-ID-Nummer: DE204566209 Trompeterallee 108, 41189 Mönchengladbach Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer Unser Umgang mit personenbezogenen Daten unterliegt folgenden Bestimmungen: https://www.credativ.de/datenschutz