On Thu, Oct 13, 2016 at 2:49 PM, Michael Paquier <michael.paqu...@gmail.com> wrote:
> In my quest of making the backup tools more compliant to data
> durability, here is a thread for pg_dump and pg_dumpall. Here is in a
> couple of lines my proposal:
> - Addition in _archiveHandle of a field to track if the dump generated
> should be synced or not.
> - This is effective for all modes, when the user specifies an output
> file. In short that's when fileSpec is not NULL.
> - Actually do the sync in _EndData and _EndBlob[s] if appropriate.
> There is for example nothing to do for pg_backup_null.c
> - Addition of --nosync option to allow users to disable it. By default
> it is enabled.
> Note that to make the data durable, the file needs to be sync'ed as
> well as its parent folder. So with pg_dump we can only make that
> really durable with -Fd. I think that in the case where the user
> specifies an output file for the other modes we should sync it, that's
> the best we can do. This last statement applies as well for
> pg_dumpall.
>
> Thoughts? I'd like to prepare a patch according to those lines for the next
> CF.

Okay, here is a patch doing the above. I have added a new --nosync option to pg_dump and pg_dumpall to switch back to the pre-10 behavior.

I have come to the conclusion that it is better not to touch _EndData and _EndBlob, and to just issue the fsync in CloseArchive once all the write operations are done. In the case of the directory format, the fsync is done on all the entries recursively. This also makes the patch simpler.

The regression tests calling pg_dump don't use --nosync yet in this patch; that's a change that could be made afterwards. I have added this to the next CF:
https://commitfest.postgresql.org/11/823/
--
Michael
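
For readers following along, the durability requirement quoted above (sync the file as well as its parent folder) can be sketched in plain POSIX C. This is only an illustration and not code from the attached patch: the helper names sync_path and sync_file_and_parent are hypothetical, and it assumes a platform where fsync() on a directory opened read-only is permitted (Linux allows it; some systems do not).

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <fcntl.h>
#include <libgen.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the patch): fsync a single path, which
 * may be a regular file or a directory.  Returns 0 on success, -1 on error.
 */
int
sync_path(const char *path)
{
    int         fd;
    int         rc;

    assert(path != NULL);

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    rc = fsync(fd);
    if (close(fd) != 0)
        rc = -1;
    return rc;
}

/*
 * Hypothetical helper (not from the patch): make a file durable by
 * syncing the file itself and then the directory that contains it.
 */
int
sync_file_and_parent(const char *path)
{
    char       *copy;
    int         rc;

    if (sync_path(path) != 0)
        return -1;

    /* dirname() may modify its argument, so work on a private copy. */
    copy = strdup(path);
    if (copy == NULL)
        return -1;

    rc = sync_path(dirname(copy));
    free(copy);
    return rc;
}
```

A caller would invoke sync_file_and_parent() once, after the last write to the output file has completed, which mirrors the idea of doing the sync in one place at the end rather than after each data block.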
pgdump-sync-v1.patch
Description: application/download