Re: [HACKERS] pg_dump, pg_dumpall and data durability

2017-03-23 Thread Michael Paquier
On Wed, Mar 22, 2017 at 7:00 AM, Michael Paquier
 wrote:
> On Wed, Mar 22, 2017 at 6:24 AM, Andrew Dunstan
>  wrote:
>> This is really a pretty small patch all things considered, and pretty
>> low-risk (although I haven;t been threough the code in fine detail yet).
>> In the end I'm persuaded by Andres' point that there's actually no
>> practical alternative way to make sure the data is actually synced to disk.
>>
>> If nobody else wants to pick it up I will, unless there is a strong
>> objection.
>
> Thanks!

Thanks Andrew, I can see that this has been committed as 96a7128b.

I also saw that:
https://www.postgresql.org/message-id/75e1b6ff-c3d5-9a26-e38b-3cb22a099...@2ndquadrant.com
I'll send a patch in a bit for the regression tests.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dump, pg_dumpall and data durability

2017-03-21 Thread Michael Paquier
On Wed, Mar 22, 2017 at 6:24 AM, Andrew Dunstan
 wrote:
> This is really a pretty small patch all things considered, and pretty
> low-risk (although I haven;t been threough the code in fine detail yet).
> In the end I'm persuaded by Andres' point that there's actually no
> practical alternative way to make sure the data is actually synced to disk.
>
> If nobody else wants to pick it up I will, unless there is a strong
> objection.

Thanks!
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dump, pg_dumpall and data durability

2017-03-21 Thread Andrew Dunstan


On 03/04/2017 01:08 AM, Robert Haas wrote:
> On Thu, Mar 2, 2017 at 5:02 AM, Michael Paquier
>  wrote:
>> On Thu, Mar 2, 2017 at 2:26 AM, David Steele  wrote:
>>> This patch is in need of a committer.  Any takers?
>>> I didn't see a lot of enthusiasm from committers on the thread
>> Stephen at least has showed interest.
>>
>>> so if nobody picks it up by the end of the CF I'm going to mark the patch 
>>> RWF.
>> Yes, that makes sense. If no committer is willing to have a look at
>> code-level and/or has room for this patch then it is as good as
>> doomed. Except for bug fixes, I have a couple of other patches that
>> are piling up so they would likely get the same treatment. There is
>> nothing really we can do about that.
> Before we reach the question of whether committers have time to look
> at this, we should first consider the question of whether it has
> achieved consensus.  I'll try to summarize the votes:
>
> Tom Lane: premise pretty dubious
> Robert Haas: do we even want this?
> Peter Eisentraut: I had voiced a similar concern [to Robert's] previously
> Albe Laurenz: I think we should have that
> Andres Freund: [Tom's counterproposal won't work]
> Robert Haas: [Andres has a good point, still nervous] I'm not sure
> there's any better alternative to what's being proposed, though.
> Fujii Masao: One idea is to provide the utility or extension which
> fsync's the specified files and directories
>
> Here's an attempt to translate those words into numerical votes.  As
> per my usual practice, I will count the patch author as +1:
>
> Michael Paquier: +1
> Tom Lane: -1
> Peter Eisentraut: -1
> Albe Laurenz: +1
> Andres Freund: +1
> Robert Haas: +0.5
> Fujii Masao: -0.5
>
> So, my interpretation is that, out of 7 votes, we have -2.5 and +3.5.
> Perhaps that is a consensus for proceeding, but if so it's a pretty
> marginal one.  I think the next step for this patch is to consider why
> we shouldn't, in lieu of what's proposed here, add a pg_fsync utility
> that fsyncs a file or recursively fsyncs a directory, ship that, and
> let people use it on their pg_dump files and/or base backups if they
> wish.  I am not altogether convinced that's a better option, but I am
> also not altogether convinced that it's worse.  Also, if anyone else
> wishes to vote, or if anyone to whom I've attributed a vote wishes to
> assign a numerical value to their vote other than the one I've
> assigned, please feel free.
>
> The issue with this patch isn't that nobody is interested, but that
> not everybody thinks it's a good idea.
>


This is really a pretty small patch all things considered, and pretty
low-risk (although I haven;t been threough the code in fine detail yet).
In the end I'm persuaded by Andres' point that there's actually no
practical alternative way to make sure the data is actually synced to disk.

If nobody else wants to pick it up I will, unless there is a strong
objection.

cheers

andrew

-- 
Andrew Dunstanhttps://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dump, pg_dumpall and data durability

2017-03-04 Thread Michael Paquier
On Sat, Mar 4, 2017 at 3:08 PM, Robert Haas  wrote:
> On Thu, Mar 2, 2017 at 5:02 AM, Michael Paquier
>  wrote:
>> On Thu, Mar 2, 2017 at 2:26 AM, David Steele  wrote:
>>> This patch is in need of a committer.  Any takers?
>>> I didn't see a lot of enthusiasm from committers on the thread
>>
>> Stephen at least has showed interest.
>>
>>> so if nobody picks it up by the end of the CF I'm going to mark the patch 
>>> RWF.
>>
>> Yes, that makes sense. If no committer is willing to have a look at
>> code-level and/or has room for this patch then it is as good as
>> doomed. Except for bug fixes, I have a couple of other patches that
>> are piling up so they would likely get the same treatment. There is
>> nothing really we can do about that.
>
> Before we reach the question of whether committers have time to look
> at this, we should first consider the question of whether it has
> achieved consensus.  I'll try to summarize the votes:
>
> Tom Lane: premise pretty dubious
> Robert Haas: do we even want this?
> Peter Eisentraut: I had voiced a similar concern [to Robert's] previously
> Albe Laurenz: I think we should have that
> Andres Freund: [Tom's counterproposal won't work]
> Robert Haas: [Andres has a good point, still nervous] I'm not sure
> there's any better alternative to what's being proposed, though.
> Fujii Masao: One idea is to provide the utility or extension which
> fsync's the specified files and directories
>
> Here's an attempt to translate those words into numerical votes.  As
> per my usual practice, I will count the patch author as +1:
>
> Michael Paquier: +1
> Tom Lane: -1
> Peter Eisentraut: -1
> Albe Laurenz: +1
> Andres Freund: +1
> Robert Haas: +0.5
> Fujii Masao: -0.5
>
> So, my interpretation is that, out of 7 votes, we have -2.5 and +3.5.
> Perhaps that is a consensus for proceeding, but if so it's a pretty
> marginal one.  I think the next step for this patch is to consider why
> we shouldn't, in lieu of what's proposed here, add a pg_fsync utility
> that fsyncs a file or recursively fsyncs a directory, ship that, and
> let people use it on their pg_dump files and/or base backups if they
> wish.  I am not altogether convinced that's a better option, but I am
> also not altogether convinced that it's worse.  Also, if anyone else
> wishes to vote, or if anyone to whom I've attributed a vote wishes to
> assign a numerical value to their vote other than the one I've
> assigned, please feel free.

Not completely exact by including this message:
https://www.postgresql.org/message-id/20170123050248.go18...@tamriel.snowman.net
If I interpret this message correctly Stephen Frost can be counted
with either +1 or +0.5.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dump, pg_dumpall and data durability

2017-03-03 Thread Robert Haas
On Thu, Mar 2, 2017 at 5:02 AM, Michael Paquier
 wrote:
> On Thu, Mar 2, 2017 at 2:26 AM, David Steele  wrote:
>> This patch is in need of a committer.  Any takers?
>> I didn't see a lot of enthusiasm from committers on the thread
>
> Stephen at least has showed interest.
>
>> so if nobody picks it up by the end of the CF I'm going to mark the patch 
>> RWF.
>
> Yes, that makes sense. If no committer is willing to have a look at
> code-level and/or has room for this patch then it is as good as
> doomed. Except for bug fixes, I have a couple of other patches that
> are piling up so they would likely get the same treatment. There is
> nothing really we can do about that.

Before we reach the question of whether committers have time to look
at this, we should first consider the question of whether it has
achieved consensus.  I'll try to summarize the votes:

Tom Lane: premise pretty dubious
Robert Haas: do we even want this?
Peter Eisentraut: I had voiced a similar concern [to Robert's] previously
Albe Laurenz: I think we should have that
Andres Freund: [Tom's counterproposal won't work]
Robert Haas: [Andres has a good point, still nervous] I'm not sure
there's any better alternative to what's being proposed, though.
Fujii Masao: One idea is to provide the utility or extension which
fsync's the specified files and directories

Here's an attempt to translate those words into numerical votes.  As
per my usual practice, I will count the patch author as +1:

Michael Paquier: +1
Tom Lane: -1
Peter Eisentraut: -1
Albe Laurenz: +1
Andres Freund: +1
Robert Haas: +0.5
Fujii Masao: -0.5

So, my interpretation is that, out of 7 votes, we have -2.5 and +3.5.
Perhaps that is a consensus for proceeding, but if so it's a pretty
marginal one.  I think the next step for this patch is to consider why
we shouldn't, in lieu of what's proposed here, add a pg_fsync utility
that fsyncs a file or recursively fsyncs a directory, ship that, and
let people use it on their pg_dump files and/or base backups if they
wish.  I am not altogether convinced that's a better option, but I am
also not altogether convinced that it's worse.  Also, if anyone else
wishes to vote, or if anyone to whom I've attributed a vote wishes to
assign a numerical value to their vote other than the one I've
assigned, please feel free.

The issue with this patch isn't that nobody is interested, but that
not everybody thinks it's a good idea.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dump, pg_dumpall and data durability

2017-03-01 Thread Michael Paquier
On Thu, Mar 2, 2017 at 2:26 AM, David Steele  wrote:
> This patch is in need of a committer.  Any takers?
> I didn't see a lot of enthusiasm from committers on the thread

Stephen at least has showed interest.

> so if nobody picks it up by the end of the CF I'm going to mark the patch RWF.

Yes, that makes sense. If no committer is willing to have a look at
code-level and/or has room for this patch then it is as good as
doomed. Except for bug fixes, I have a couple of other patches that
are piling up so they would likely get the same treatment. There is
nothing really we can do about that.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dump, pg_dumpall and data durability

2017-03-01 Thread David Steele
On 1/22/17 11:02 PM, Michael Paquier wrote:
> On Tue, Nov 29, 2016 at 1:30 PM, Michael Paquier
>  wrote:
>> On Mon, Nov 14, 2016 at 6:07 PM, Michael Paquier
>>  wrote:
>>> On Mon, Nov 14, 2016 at 6:04 PM, Albe Laurenz  
>>> wrote:
 Michael Paquier wrote:
> Meh. I forgot docs and --help output updates.

 Looks good, except that you left the "N" option in the getopt_long
 call for pg_dumpall.  I fixed that in the attached patch.
>>>
>>> No, v5 has removed it, but it does not matter much now...
>>>
 I'll mark the patch "ready for committer".
>>>
>>> Thanks!
>>
>> Moved to CF 2017-01.
> 
> Moved to CF 2017-03 with the same status, ready for committer. Perhaps
> there is some interest in this feature? v5 of the patch still applies,
> with a couple of hunks, so v6 is attached.

This patch is in need of a committer.  Any takers?

I didn't see a lot of enthusiasm from committers on the thread so if
nobody picks it up by the end of the CF I'm going to mark the patch RWF.

-- 
-David
da...@pgmasters.net


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dump, pg_dumpall and data durability

2017-01-22 Thread Michael Paquier
On Tue, Nov 29, 2016 at 1:30 PM, Michael Paquier
 wrote:
> On Mon, Nov 14, 2016 at 6:07 PM, Michael Paquier
>  wrote:
>> On Mon, Nov 14, 2016 at 6:04 PM, Albe Laurenz  
>> wrote:
>>> Michael Paquier wrote:
 Meh. I forgot docs and --help output updates.
>>>
>>> Looks good, except that you left the "N" option in the getopt_long
>>> call for pg_dumpall.  I fixed that in the attached patch.
>>
>> No, v5 has removed it, but it does not matter much now...
>>
>>> I'll mark the patch "ready for committer".
>>
>> Thanks!
>
> Moved to CF 2017-01.

Moved to CF 2017-03 with the same status, ready for committer. Perhaps
there is some interest in this feature? v5 of the patch still applies,
with a couple of hunks, so v6 is attached.
-- 
Michael


pgdump-sync-v6.patch
Description: application/download

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dump, pg_dumpall and data durability

2016-12-06 Thread Michael Paquier
On Tue, Dec 6, 2016 at 6:00 PM, Fujii Masao  wrote:
> One idea is to provide the utility or extension which fsync's the specified
> files and directories (including all the files under them). It may be 
> overkill,
> though. But it would be useful for other purposes, for example, we can
> improve the durability of archived WAL files by setting this utility in
> archive_command, and we can flush any output files generated by psql
> by using it.

That's the pg_copy utility that we talked a couple of months
(year(s)?) back, which would really be useful for archiving, and the
main advantage is that it would be cross-platform. Still the patch
discussed here is more targeted, this makes sure that once pg_dump
leaves, things created are on disk, facilitating the life of users by
not involving an extra step and by making the safe behavior the
default.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dump, pg_dumpall and data durability

2016-12-06 Thread Fujii Masao
On Wed, Nov 16, 2016 at 1:27 AM, Robert Haas  wrote:
> On Sun, Nov 13, 2016 at 4:18 AM, Andres Freund  wrote:
>> On 2016-11-08 18:18:01 -0500, Tom Lane wrote:
>>> I think this might be better addressed by adding something to backup.sgml
>>> pointing out that you'd better fsync or sync your backups before assuming
>>> that they can't be lost.
>>
>> How does a normal user do that? I don't think there's a cross-platform
>> advice we can give, and even on *nix the answer basically is 'sync;
>> sync;' which is a pretty big hammer, and might be completely
>> unacceptable on a busy server.
>
> Yeah, that's a pretty fair point.  I see the point of this patch
> pretty clearly but somehow it makes me nervous anyway.  I'm not sure
> there's any better alternative to what's being proposed, though.

One idea is to provide the utility or extension which fsync's the specified
files and directories (including all the files under them). It may be overkill,
though. But it would be useful for other purposes, for example, we can
improve the durability of archived WAL files by setting this utility in
archive_command, and we can flush any output files generated by psql
by using it.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dump, pg_dumpall and data durability

2016-11-28 Thread Michael Paquier
On Mon, Nov 14, 2016 at 6:07 PM, Michael Paquier
 wrote:
> On Mon, Nov 14, 2016 at 6:04 PM, Albe Laurenz  wrote:
>> Michael Paquier wrote:
>>> Meh. I forgot docs and --help output updates.
>>
>> Looks good, except that you left the "N" option in the getopt_long
>> call for pg_dumpall.  I fixed that in the attached patch.
>
> No, v5 has removed it, but it does not matter much now...
>
>> I'll mark the patch "ready for committer".
>
> Thanks!

Moved to CF 2017-01.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dump, pg_dumpall and data durability

2016-11-15 Thread Robert Haas
On Sun, Nov 13, 2016 at 4:18 AM, Andres Freund  wrote:
> On 2016-11-08 18:18:01 -0500, Tom Lane wrote:
>> I think this might be better addressed by adding something to backup.sgml
>> pointing out that you'd better fsync or sync your backups before assuming
>> that they can't be lost.
>
> How does a normal user do that? I don't think there's a cross-platform
> advice we can give, and even on *nix the answer basically is 'sync;
> sync;' which is a pretty big hammer, and might be completely
> unacceptable on a busy server.

Yeah, that's a pretty fair point.  I see the point of this patch
pretty clearly but somehow it makes me nervous anyway.  I'm not sure
there's any better alternative to what's being proposed, though.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dump, pg_dumpall and data durability

2016-11-14 Thread Michael Paquier
On Mon, Nov 14, 2016 at 6:04 PM, Albe Laurenz  wrote:
> Michael Paquier wrote:
>> Meh. I forgot docs and --help output updates.
>
> Looks good, except that you left the "N" option in the getopt_long
> call for pg_dumpall.  I fixed that in the attached patch.

No, v5 has removed it, but it does not matter much now...

> I'll mark the patch "ready for committer".

Thanks!
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dump, pg_dumpall and data durability

2016-11-14 Thread Albe Laurenz
Michael Paquier wrote:
> Meh. I forgot docs and --help output updates.

Looks good, except that you left the "N" option in the getopt_long
call for pg_dumpall.  I fixed that in the attached patch.

I'll mark the patch "ready for committer".

Yours,
Laurenz Albe


pgdump-sync-v5.patch
Description: pgdump-sync-v5.patch

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dump, pg_dumpall and data durability

2016-11-13 Thread Andres Freund
Hi,

On 2016-11-08 18:18:01 -0500, Tom Lane wrote:
> I think this might be better addressed by adding something to backup.sgml
> pointing out that you'd better fsync or sync your backups before assuming
> that they can't be lost.

How does a normal user do that? I don't think there's a cross-platform
advice we can give, and even on *nix the answer basically is 'sync;
sync;' which is a pretty big hammer, and might be completely
unacceptable on a busy server.

Regards,

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dump, pg_dumpall and data durability

2016-11-12 Thread Peter Eisentraut
On 11/8/16 3:48 PM, Robert Haas wrote:
> First question: Do we even want this?  Generally, when a program
> writes to a file, we rely on the operating system to decide when that
> data should be written back to disk.  We have to override that
> distinction for things internal to PostgreSQL because we need certain
> bits of data to reach the disk in a certain order, but it's unclear to
> me how far outside the core database system we want to extend that.

I had voiced a similar concern previously:
https://www.postgresql.org/message-id/f8dff810-f5f4-77c3-933b-127df4ed9...@2ndquadrant.com

At the time, there were no other comments, so we went ahead with it,
which presumably gave encouragement to pursue the current patch.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dump, pg_dumpall and data durability

2016-11-12 Thread Michael Paquier
On Sun, Nov 13, 2016 at 12:17 AM, Dilip Kumar  wrote:
> On Sat, Nov 12, 2016 at 10:16 AM, Michael Paquier
>  wrote:
>>> - For pg_dumpall, a short option "-N" is added for "--no-sync", but not for
>>>   pg_dump (because -N is already taken there).
>>>   I'd opt for either using the same short option for both or (IMO better)
>>>   only offering a long option for both.
>>
>> Okay. For consistency's sake let's do that. I was a bit hesitant
>> regarding that to be honest.
>
> Seems like you have missed to remove -N at some places, as mentioned below.
>
> 1.
> --- a/doc/src/sgml/ref/pg_dumpall.sgml
> +++ b/doc/src/sgml/ref/pg_dumpall.sgml
> @@ -365,6 +365,21 @@ PostgreSQL documentation
>   
>
>   
> +  -N
>
> @@ -543,6 +557,7 @@ help(void)
>
>
> 2.
>   printf(_("\nGeneral options:\n"));
>   printf(_("  -f, --file=FILENAME  output file name\n"));
> + printf(_("  -N, --no-syncdo not wait for changes to
> be written safely to disk\n"));

v4 fixed those two places.

> 3.
> - while ((c = getopt_long(argc, argv, "acd:f:gh:l:oOp:rsS:tU:vwWx",
> long_options, )) != -1)
> + while ((c = getopt_long(argc, argv, "acd:f:gh:l:NoOp:rsS:tU:vwWx",
> long_options, )) != -1)

But not this one...
-- 
Michael
diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index 371a614..dcad095 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -815,6 +815,20 @@ PostgreSQL documentation
  
 
  
+  --no-sync
+  
+   
+By default, pg_dump will wait for all files
+to be written safely to disk.  This option causes
+pg_dump to return without waiting, which is
+faster, but means that a subsequent operating system crash can leave
+the dump corrupt.  Generally, this option is useful for testing
+but should not be used when dumping data from production installation.
+   
+  
+ 
+
+ 
   --quote-all-identifiers
   

diff --git a/doc/src/sgml/ref/pg_dumpall.sgml b/doc/src/sgml/ref/pg_dumpall.sgml
index 97168a0..70c5332 100644
--- a/doc/src/sgml/ref/pg_dumpall.sgml
+++ b/doc/src/sgml/ref/pg_dumpall.sgml
@@ -342,6 +342,20 @@ PostgreSQL documentation
  
 
  
+  --no-sync
+  
+   
+By default, pg_dumpall will wait for all files
+to be written safely to disk.  This option causes
+pg_dumpall to return without waiting, which is
+faster, but means that a subsequent operating system crash can leave
+the dump corrupt.  Generally, this option is useful for testing
+but should not be used when dumping data from production installation.
+   
+  
+ 
+
+ 
   --no-tablespaces
   

diff --git a/src/bin/pg_dump/pg_backup.h b/src/bin/pg_dump/pg_backup.h
index 0a28124..becbc56 100644
--- a/src/bin/pg_dump/pg_backup.h
+++ b/src/bin/pg_dump/pg_backup.h
@@ -268,7 +268,7 @@ extern Archive *OpenArchive(const char *FileSpec, const ArchiveFormat fmt);
 
 /* Create a new archive */
 extern Archive *CreateArchive(const char *FileSpec, const ArchiveFormat fmt,
-			  const int compression, ArchiveMode mode,
+			  const int compression, bool dosync, ArchiveMode mode,
 			  SetupWorkerPtr setupDumpWorker);
 
 /* The --list option */
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 0e20985..5b60da6 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -56,7 +56,8 @@ static const char *modulename = gettext_noop("archiver");
 
 
 static ArchiveHandle *_allocAH(const char *FileSpec, const ArchiveFormat fmt,
-	 const int compression, ArchiveMode mode, SetupWorkerPtr setupWorkerPtr);
+	 const int compression, bool dosync, ArchiveMode mode,
+	 SetupWorkerPtr setupWorkerPtr);
 static void _getObjectDescription(PQExpBuffer buf, TocEntry *te,
 	  ArchiveHandle *AH);
 static void _printTocEntry(ArchiveHandle *AH, TocEntry *te, bool isData, bool acl_pass);
@@ -202,10 +203,12 @@ setupRestoreWorker(Archive *AHX)
 /* Public */
 Archive *
 CreateArchive(const char *FileSpec, const ArchiveFormat fmt,
-	 const int compression, ArchiveMode mode, SetupWorkerPtr setupDumpWorker)
+			  const int compression, bool dosync, ArchiveMode mode,
+			  SetupWorkerPtr setupDumpWorker)
 
 {
-	ArchiveHandle *AH = _allocAH(FileSpec, fmt, compression, mode, setupDumpWorker);
+	ArchiveHandle *AH = _allocAH(FileSpec, fmt, compression, dosync,
+ mode, setupDumpWorker);
 
 	return (Archive *) AH;
 }
@@ -215,7 +218,7 @@ CreateArchive(const char *FileSpec, const ArchiveFormat fmt,
 Archive *
 OpenArchive(const char *FileSpec, const ArchiveFormat fmt)
 {
-	ArchiveHandle *AH = _allocAH(FileSpec, fmt, 0, archModeRead, setupRestoreWorker);
+	ArchiveHandle *AH = _allocAH(FileSpec, fmt, 0, true, archModeRead, setupRestoreWorker);
 
 	return (Archive *) AH;
 }
@@ -2223,7 +2226,8 @@ 

Re: [HACKERS] pg_dump, pg_dumpall and data durability

2016-11-12 Thread Dilip Kumar
On Sat, Nov 12, 2016 at 10:16 AM, Michael Paquier
 wrote:
>> - For pg_dumpall, a short option "-N" is added for "--no-sync", but not for
>>   pg_dump (because -N is already taken there).
>>   I'd opt for either using the same short option for both or (IMO better)
>>   only offering a long option for both.
>
> Okay. For consistency's sake let's do that. I was a bit hesitant
> regarding that to be honest.

Seems like you have missed to remove -N at some places, as mentioned below.

1.
--- a/doc/src/sgml/ref/pg_dumpall.sgml
+++ b/doc/src/sgml/ref/pg_dumpall.sgml
@@ -365,6 +365,21 @@ PostgreSQL documentation
  

  
+  -N

@@ -543,6 +557,7 @@ help(void)


2.
  printf(_("\nGeneral options:\n"));
  printf(_("  -f, --file=FILENAME  output file name\n"));
+ printf(_("  -N, --no-syncdo not wait for changes to
be written safely to disk\n"));

3.
- while ((c = getopt_long(argc, argv, "acd:f:gh:l:oOp:rsS:tU:vwWx",
long_options, )) != -1)
+ while ((c = getopt_long(argc, argv, "acd:f:gh:l:NoOp:rsS:tU:vwWx",
long_options, )) != -1)

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dump, pg_dumpall and data durability

2016-11-12 Thread Michael Paquier
On Sat, Nov 12, 2016 at 1:46 PM, Michael Paquier
 wrote:
> On Fri, Nov 11, 2016 at 11:03 PM, Albe Laurenz  
> wrote:
>>   This would avoid confusion, and we expect that few people will want to use
>>   this option anyway, right?
>
> Definitely a good point.

Meh. I forgot docs and --help output updates.
-- 
Michael
diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index 371a614..dcad095 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -815,6 +815,20 @@ PostgreSQL documentation
  
 
  
+  --no-sync
+  
+   
+By default, pg_dump will wait for all files
+to be written safely to disk.  This option causes
+pg_dump to return without waiting, which is
+faster, but means that a subsequent operating system crash can leave
+the dump corrupt.  Generally, this option is useful for testing
+but should not be used when dumping data from production installation.
+   
+  
+ 
+
+ 
   --quote-all-identifiers
   

diff --git a/doc/src/sgml/ref/pg_dumpall.sgml b/doc/src/sgml/ref/pg_dumpall.sgml
index 97168a0..70c5332 100644
--- a/doc/src/sgml/ref/pg_dumpall.sgml
+++ b/doc/src/sgml/ref/pg_dumpall.sgml
@@ -342,6 +342,20 @@ PostgreSQL documentation
  
 
  
+  --no-sync
+  
+   
+By default, pg_dumpall will wait for all files
+to be written safely to disk.  This option causes
+pg_dumpall to return without waiting, which is
+faster, but means that a subsequent operating system crash can leave
+the dump corrupt.  Generally, this option is useful for testing
+but should not be used when dumping data from production installation.
+   
+  
+ 
+
+ 
   --no-tablespaces
   

diff --git a/src/bin/pg_dump/pg_backup.h b/src/bin/pg_dump/pg_backup.h
index 0a28124..becbc56 100644
--- a/src/bin/pg_dump/pg_backup.h
+++ b/src/bin/pg_dump/pg_backup.h
@@ -268,7 +268,7 @@ extern Archive *OpenArchive(const char *FileSpec, const ArchiveFormat fmt);
 
 /* Create a new archive */
 extern Archive *CreateArchive(const char *FileSpec, const ArchiveFormat fmt,
-			  const int compression, ArchiveMode mode,
+			  const int compression, bool dosync, ArchiveMode mode,
 			  SetupWorkerPtr setupDumpWorker);
 
 /* The --list option */
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 0e20985..5b60da6 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -56,7 +56,8 @@ static const char *modulename = gettext_noop("archiver");
 
 
 static ArchiveHandle *_allocAH(const char *FileSpec, const ArchiveFormat fmt,
-	 const int compression, ArchiveMode mode, SetupWorkerPtr setupWorkerPtr);
+	 const int compression, bool dosync, ArchiveMode mode,
+	 SetupWorkerPtr setupWorkerPtr);
 static void _getObjectDescription(PQExpBuffer buf, TocEntry *te,
 	  ArchiveHandle *AH);
 static void _printTocEntry(ArchiveHandle *AH, TocEntry *te, bool isData, bool acl_pass);
@@ -202,10 +203,12 @@ setupRestoreWorker(Archive *AHX)
 /* Public */
 Archive *
 CreateArchive(const char *FileSpec, const ArchiveFormat fmt,
-	 const int compression, ArchiveMode mode, SetupWorkerPtr setupDumpWorker)
+			  const int compression, bool dosync, ArchiveMode mode,
+			  SetupWorkerPtr setupDumpWorker)
 
 {
-	ArchiveHandle *AH = _allocAH(FileSpec, fmt, compression, mode, setupDumpWorker);
+	ArchiveHandle *AH = _allocAH(FileSpec, fmt, compression, dosync,
+ mode, setupDumpWorker);
 
 	return (Archive *) AH;
 }
@@ -215,7 +218,7 @@ CreateArchive(const char *FileSpec, const ArchiveFormat fmt,
 Archive *
 OpenArchive(const char *FileSpec, const ArchiveFormat fmt)
 {
-	ArchiveHandle *AH = _allocAH(FileSpec, fmt, 0, archModeRead, setupRestoreWorker);
+	ArchiveHandle *AH = _allocAH(FileSpec, fmt, 0, true, archModeRead, setupRestoreWorker);
 
 	return (Archive *) AH;
 }
@@ -2223,7 +2226,8 @@ _discoverArchiveFormat(ArchiveHandle *AH)
  */
 static ArchiveHandle *
 _allocAH(const char *FileSpec, const ArchiveFormat fmt,
-	  const int compression, ArchiveMode mode, SetupWorkerPtr setupWorkerPtr)
+		 const int compression, bool dosync, ArchiveMode mode,
+		 SetupWorkerPtr setupWorkerPtr)
 {
 	ArchiveHandle *AH;
 
@@ -2277,6 +2281,7 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
 
 	AH->mode = mode;
 	AH->compression = compression;
+	AH->dosync = dosync;
 
 	memset(&(AH->sqlparse), 0, sizeof(AH->sqlparse));
 
diff --git a/src/bin/pg_dump/pg_backup_archiver.h b/src/bin/pg_dump/pg_backup_archiver.h
index 50cf452..4492542 100644
--- a/src/bin/pg_dump/pg_backup_archiver.h
+++ b/src/bin/pg_dump/pg_backup_archiver.h
@@ -307,6 +307,7 @@ struct _archiveHandle
  * values for compression: -1
  * Z_DEFAULT_COMPRESSION 0	COMPRESSION_NONE
  * 1-9 levels for gzip compression */
+	bool		dosync;			/* 

Re: [HACKERS] pg_dump, pg_dumpall and data durability

2016-11-11 Thread Michael Paquier
On Fri, Nov 11, 2016 at 11:03 PM, Albe Laurenz  wrote:
> - In pg_dumpall.c, the result of fsync_fname() is cast to "void" to show that
>   the return code is ignored, but not anywhere else.  Is that by design?

Right. The patch is lacking consistency in this area. The main thought
regarding this design is to not consider a fsync failure as critical,
and just issue a warning in stderr. I have updated the two other call
sites with a (void) cast.

> - For pg_dumpall, a short option "-N" is added for "--no-sync", but not for
>   pg_dump (because -N is already taken there).
>   I'd opt for either using the same short option for both or (IMO better)
>   only offering a long option for both.

Okay. For consistency's sake let's do that. I was a bit hesitant
regarding that to be honest.

>   This would avoid confusion, and we expect that few people will want to use
>   this option anyway, right?

Definitely a good point.
-- 
Michael
diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index 371a614..dcad095 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -815,6 +815,20 @@ PostgreSQL documentation
  
 
  
+  --no-sync
+  
+   
+By default, pg_dump will wait for all files
+to be written safely to disk.  This option causes
+pg_dump to return without waiting, which is
+faster, but means that a subsequent operating system crash can leave
+the dump corrupt.  Generally, this option is useful for testing
+but should not be used when dumping data from production installation.
+   
+  
+ 
+
+ 
   --quote-all-identifiers
   

diff --git a/doc/src/sgml/ref/pg_dumpall.sgml b/doc/src/sgml/ref/pg_dumpall.sgml
index 97168a0..4e6839b 100644
--- a/doc/src/sgml/ref/pg_dumpall.sgml
+++ b/doc/src/sgml/ref/pg_dumpall.sgml
@@ -365,6 +365,21 @@ PostgreSQL documentation
  
 
  
+  -N
+  --no-sync
+  
+   
+By default, pg_dumpall will wait for all files
+to be written safely to disk.  This option causes
+pg_dumpall to return without waiting, which is
+faster, but means that a subsequent operating system crash can leave
+the dump corrupt.  Generally, this option is useful for testing
+but should not be used when dumping data from production installation.
+   
+  
+ 
+
+ 
   --quote-all-identifiers
   

diff --git a/src/bin/pg_dump/pg_backup.h b/src/bin/pg_dump/pg_backup.h
index 0a28124..becbc56 100644
--- a/src/bin/pg_dump/pg_backup.h
+++ b/src/bin/pg_dump/pg_backup.h
@@ -268,7 +268,7 @@ extern Archive *OpenArchive(const char *FileSpec, const 
ArchiveFormat fmt);
 
 /* Create a new archive */
 extern Archive *CreateArchive(const char *FileSpec, const ArchiveFormat fmt,
- const int compression, ArchiveMode mode,
+ const int compression, bool dosync, ArchiveMode mode,
  SetupWorkerPtr setupDumpWorker);
 
 /* The --list option */
diff --git a/src/bin/pg_dump/pg_backup_archiver.c 
b/src/bin/pg_dump/pg_backup_archiver.c
index 0e20985..5b60da6 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -56,7 +56,8 @@ static const char *modulename = gettext_noop("archiver");
 
 
 static ArchiveHandle *_allocAH(const char *FileSpec, const ArchiveFormat fmt,
-const int compression, ArchiveMode mode, SetupWorkerPtr 
setupWorkerPtr);
+const int compression, bool dosync, ArchiveMode mode,
+SetupWorkerPtr setupWorkerPtr);
 static void _getObjectDescription(PQExpBuffer buf, TocEntry *te,
  ArchiveHandle *AH);
 static void _printTocEntry(ArchiveHandle *AH, TocEntry *te, bool isData, bool 
acl_pass);
@@ -202,10 +203,12 @@ setupRestoreWorker(Archive *AHX)
 /* Public */
 Archive *
 CreateArchive(const char *FileSpec, const ArchiveFormat fmt,
-const int compression, ArchiveMode mode, SetupWorkerPtr 
setupDumpWorker)
+ const int compression, bool dosync, ArchiveMode mode,
+ SetupWorkerPtr setupDumpWorker)
 
 {
-   ArchiveHandle *AH = _allocAH(FileSpec, fmt, compression, mode, 
setupDumpWorker);
+   ArchiveHandle *AH = _allocAH(FileSpec, fmt, compression, dosync,
+mode, 
setupDumpWorker);
 
return (Archive *) AH;
 }
@@ -215,7 +218,7 @@ CreateArchive(const char *FileSpec, const ArchiveFormat fmt,
 Archive *
 OpenArchive(const char *FileSpec, const ArchiveFormat fmt)
 {
-   ArchiveHandle *AH = _allocAH(FileSpec, fmt, 0, archModeRead, 
setupRestoreWorker);
+   ArchiveHandle *AH = _allocAH(FileSpec, fmt, 0, true, archModeRead, 
setupRestoreWorker);
 
return (Archive *) AH;
 }
@@ -2223,7 +2226,8 @@ _discoverArchiveFormat(ArchiveHandle *AH)
  */
 static ArchiveHandle *
 

Re: [HACKERS] pg_dump, pg_dumpall and data durability

2016-11-11 Thread Albe Laurenz
Michael Paquier wrote:
> A typo s/pre_fsync_fname/pre_sync_fname, and a mistake from me because
> I did not compile with -DPG_FLUSH_DATA_WORKS to check this code.
> 
> v2 is attached, fixing those issues.

The patch applies and compiles fine.

I have tested it on Linux and MinGW and could see the fsync(2) and
FlushFileBuffers calls I expected.
This adds crash safety for a reasonable price, and I think we should have that.

The documentation additions are sufficient.

Looking through the patch, I had two questions that are more about
style and consistency than anything else:

- In pg_dumpall.c, the result of fsync_fname() is cast to "void" to show that
  the return code is ignored, but not anywhere else.  Is that by design?

- For pg_dumpall, a short option "-N" is added for "--no-sync", but not for
  pg_dump (because -N is already taken there).
  I'd opt for either using the same short option for both or (IMO better)
  only offering a long option for both.
  This would avoid confusion, and we expect that few people will want to use
  this option anyway, right?

Yours,
Laurenz Albe

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dump, pg_dumpall and data durability

2016-11-08 Thread Tom Lane
Robert Haas  writes:
> First question: Do we even want this?  Generally, when a program
> writes to a file, we rely on the operating system to decide when that
> data should be written back to disk.  We have to override that
> distinction for things internal to PostgreSQL because we need certain
> bits of data to reach the disk in a certain order, but it's unclear to
> me how far outside the core database system we want to extend that.
> Are we going to have psql fsync() data it writes to a file when \o is
> used, for example?  That would seem to me to be beyond insane, because
> we have no idea whether the user actually needs that file to be
> durable.  It is a better bet that a pg_dump command's output needs
> durability, of course, but I feel that we shouldn't just go disabling
> the filesystem cache one program at a time without some guiding
> principle.

FWIW, I find the premise pretty dubious.  People don't normally expect
programs to fsync their standard output, and the argument that pg_dump's
output is always critical data doesn't withstand inspection.  Also,
I don't understand what pg_dump should do if it fails to fsync.  There
are too many cases where that would fail (eg, output piped to a program)
for it to be treated as an error condition.  But if it isn't reported as
an error, then how much durability guarantee are we really adding?

I think this might be better addressed by adding something to backup.sgml
pointing out that you'd better fsync or sync your backups before assuming
that they can't be lost.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dump, pg_dumpall and data durability

2016-11-08 Thread Michael Paquier
On Wed, Nov 9, 2016 at 5:48 AM, Robert Haas  wrote:
> Kyotaro Horiguchi set this patch to "Ready for Committer" in the
> CommitFest application, but that seems entirely premature to me
> considering that v2 has had no review at all, or at least not that I
> see on this thread.  I'm setting it back to "Needs Review".

That's indeed too early. Thanks for correcting.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dump, pg_dumpall and data durability

2016-11-08 Thread Michael Paquier
On Wed, Nov 9, 2016 at 8:18 AM, Tom Lane  wrote:
> Robert Haas  writes:
>> First question: Do we even want this?  Generally, when a program
>> writes to a file, we rely on the operating system to decide when that
>> data should be written back to disk.  We have to override that
>> distinction for things internal to PostgreSQL because we need certain
>> bits of data to reach the disk in a certain order, but it's unclear to
>> me how far outside the core database system we want to extend that.
>> Are we going to have psql fsync() data it writes to a file when \o is
>> used, for example?  That would seem to me to be beyond insane, because
>> we have no idea whether the user actually needs that file to be
>> durable.  It is a better bet that a pg_dump command's output needs
>> durability, of course, but I feel that we shouldn't just go disabling
>> the filesystem cache one program at a time without some guiding
>> principle.
>
> FWIW, I find the premise pretty dubious.  People don't normally expect
> programs to fsync their standard output, and the argument that pg_dump's
> output is always critical data doesn't withstand inspection.  Also,
> I don't understand what pg_dump should do if it fails to fsync.  There
> are too many cases where that would fail (eg, output piped to a program)
> for it to be treated as an error condition.  But if it isn't reported as
> an error, then how much durability guarantee are we really adding?

If the output is piped to a program, there is no way to guarantee that
the data will be flushed and the user is responsible for that. We
cannot control all the use cases. The same applies for example with
pg_basebackup where the data is sent to stdout. IMO, the limit set is
that tools aimed at taking physical and logical backups should do a
better effort in the cases where they can do it. That's a cheap
insurance.

Based on this past thread, it seems to me that Magnus, Andres and Jim
Nasby are of the opinion that making things is useful:
https://www.postgresql.org/message-id/20160327233033.gd20...@awork2.anarazel.de
And so do I.

> I think this might be better addressed by adding something to backup.sgml
> pointing out that you'd better fsync or sync your backups before assuming
> that they can't be lost.

Perhaps. That would be better than nothing at least, but that won't
help for cases where we can help a bit.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dump, pg_dumpall and data durability

2016-11-08 Thread Robert Haas
On Mon, Nov 7, 2016 at 7:52 PM, Michael Paquier
 wrote:
> On Tue, Nov 8, 2016 at 1:24 AM, Albe Laurenz  wrote:
>> The patch does not apply, I had to change the hunk for
>> src/include/common/file_utils.h.
>
> Yes, the patch has rotten a bit because of f82ec32a. 5d58c07a has also
> made the --noxxx option names appearing as --no-xxx.
>
>> Also, compilation fails because function "pre_fsync_fname" cannot be
>> resolved during linking. Is that a typo for "pre_fsync_fname" or is
>> the patch incomplete?
>
> A typo s/pre_fsync_fname/pre_sync_fname, and a mistake from me because
> I did not compile with -DPG_FLUSH_DATA_WORKS to check this code.
>
> v2 is attached, fixing those issues.

Kyotaro Horiguchi set this patch to "Ready for Committer" in the
CommitFest application, but that seems entirely premature to me
considering that v2 has had no review at all, or at least not that I
see on this thread.  I'm setting it back to "Needs Review".

First question: Do we even want this?  Generally, when a program
writes to a file, we rely on the operating system to decide when that
data should be written back to disk.  We have to override that
distinction for things internal to PostgreSQL because we need certain
bits of data to reach the disk in a certain order, but it's unclear to
me how far outside the core database system we want to extend that.
Are we going to have psql fsync() data it writes to a file when \o is
used, for example?  That would seem to me to be beyond insane, because
we have no idea whether the user actually needs that file to be
durable.  It is a better bet that a pg_dump command's output needs
durability, of course, but I feel that we shouldn't just go disabling
the filesystem cache one program at a time without some guiding
principle.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dump, pg_dumpall and data durability

2016-11-07 Thread Michael Paquier
On Tue, Nov 8, 2016 at 1:24 AM, Albe Laurenz  wrote:
> The patch does not apply, I had to change the hunk for
> src/include/common/file_utils.h.

Yes, the patch has rotten a bit because of f82ec32a. 5d58c07a has also
made the --noxxx option names appearing as --no-xxx.

> Also, compilation fails because function "pre_fsync_fname" cannot be
> resolved during linking. Is that a typo for "pre_fsync_fname" or is
> the patch incomplete?

A typo s/pre_fsync_fname/pre_sync_fname, and a mistake from me because
I did not compile with -DPG_FLUSH_DATA_WORKS to check this code.

v2 is attached, fixing those issues.
-- 
Michael
diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index 371a614..dcad095 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -815,6 +815,20 @@ PostgreSQL documentation
  
 
  
+  --no-sync
+  
+   
+By default, pg_dump will wait for all files
+to be written safely to disk.  This option causes
+pg_dump to return without waiting, which is
+faster, but means that a subsequent operating system crash can leave
+the dump corrupt.  Generally, this option is useful for testing
+but should not be used when dumping data from production installation.
+   
+  
+ 
+
+ 
   --quote-all-identifiers
   

diff --git a/doc/src/sgml/ref/pg_dumpall.sgml b/doc/src/sgml/ref/pg_dumpall.sgml
index 97168a0..4e6839b 100644
--- a/doc/src/sgml/ref/pg_dumpall.sgml
+++ b/doc/src/sgml/ref/pg_dumpall.sgml
@@ -365,6 +365,21 @@ PostgreSQL documentation
  
 
  
+  -N
+  --no-sync
+  
+   
+By default, pg_dumpall will wait for all files
+to be written safely to disk.  This option causes
+pg_dumpall to return without waiting, which is
+faster, but means that a subsequent operating system crash can leave
+the dump corrupt.  Generally, this option is useful for testing
+but should not be used when dumping data from production installation.
+   
+  
+ 
+
+ 
   --quote-all-identifiers
   

diff --git a/src/bin/pg_dump/pg_backup.h b/src/bin/pg_dump/pg_backup.h
index 0a28124..becbc56 100644
--- a/src/bin/pg_dump/pg_backup.h
+++ b/src/bin/pg_dump/pg_backup.h
@@ -268,7 +268,7 @@ extern Archive *OpenArchive(const char *FileSpec, const ArchiveFormat fmt);
 
 /* Create a new archive */
 extern Archive *CreateArchive(const char *FileSpec, const ArchiveFormat fmt,
-			  const int compression, ArchiveMode mode,
+			  const int compression, bool dosync, ArchiveMode mode,
 			  SetupWorkerPtr setupDumpWorker);
 
 /* The --list option */
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 0e20985..5b60da6 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -56,7 +56,8 @@ static const char *modulename = gettext_noop("archiver");
 
 
 static ArchiveHandle *_allocAH(const char *FileSpec, const ArchiveFormat fmt,
-	 const int compression, ArchiveMode mode, SetupWorkerPtr setupWorkerPtr);
+	 const int compression, bool dosync, ArchiveMode mode,
+	 SetupWorkerPtr setupWorkerPtr);
 static void _getObjectDescription(PQExpBuffer buf, TocEntry *te,
 	  ArchiveHandle *AH);
 static void _printTocEntry(ArchiveHandle *AH, TocEntry *te, bool isData, bool acl_pass);
@@ -202,10 +203,12 @@ setupRestoreWorker(Archive *AHX)
 /* Public */
 Archive *
 CreateArchive(const char *FileSpec, const ArchiveFormat fmt,
-	 const int compression, ArchiveMode mode, SetupWorkerPtr setupDumpWorker)
+			  const int compression, bool dosync, ArchiveMode mode,
+			  SetupWorkerPtr setupDumpWorker)
 
 {
-	ArchiveHandle *AH = _allocAH(FileSpec, fmt, compression, mode, setupDumpWorker);
+	ArchiveHandle *AH = _allocAH(FileSpec, fmt, compression, dosync,
+ mode, setupDumpWorker);
 
 	return (Archive *) AH;
 }
@@ -215,7 +218,7 @@ CreateArchive(const char *FileSpec, const ArchiveFormat fmt,
 Archive *
 OpenArchive(const char *FileSpec, const ArchiveFormat fmt)
 {
-	ArchiveHandle *AH = _allocAH(FileSpec, fmt, 0, archModeRead, setupRestoreWorker);
+	ArchiveHandle *AH = _allocAH(FileSpec, fmt, 0, true, archModeRead, setupRestoreWorker);
 
 	return (Archive *) AH;
 }
@@ -2223,7 +2226,8 @@ _discoverArchiveFormat(ArchiveHandle *AH)
  */
 static ArchiveHandle *
 _allocAH(const char *FileSpec, const ArchiveFormat fmt,
-	  const int compression, ArchiveMode mode, SetupWorkerPtr setupWorkerPtr)
+		 const int compression, bool dosync, ArchiveMode mode,
+		 SetupWorkerPtr setupWorkerPtr)
 {
 	ArchiveHandle *AH;
 
@@ -2277,6 +2281,7 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
 
 	AH->mode = mode;
 	AH->compression = compression;
+	AH->dosync = dosync;
 
 	memset(&(AH->sqlparse), 0, sizeof(AH->sqlparse));
 
diff --git a/src/bin/pg_dump/pg_backup_archiver.h b/src/bin/pg_dump/pg_backup_archiver.h
index 

Re: [HACKERS] pg_dump, pg_dumpall and data durability

2016-11-07 Thread Albe Laurenz
Michael Paquier wrote:
>> In my quest of making the backup tools more compliant to data
>> durability, here is a thread for pg_dump and pg_dumpall.
> 
> Okay, here is a patch doing the above. I have added a new --nosync
> option to pg_dump and pg_dumpall to switch to the pre-10 behavior. I
> have arrived at the conclusion that it is better not to touch at
> _EndData and _EndBlob, and just issue the fsync in CloseArchive when
> all the write operations are done. In the case of the directory
> format, the fsync is done on all the entries recursively. This makes
> as well the patch more simple. The regression tests calling pg_dump
> don't use --nosync yet in this patch, that's a move that could be done
> afterwards.

The patch does not apply, I had to change the hunk for
src/include/common/file_utils.h.

Also, compilation fails because function "pre_fsync_fname" cannot be
resolved during linking. Is that a typo for "pre_fsync_fname" or is
the patch incomplete?

Yours,
Laurenz Albe

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dump, pg_dumpall and data durability

2016-10-14 Thread Michael Paquier
On Thu, Oct 13, 2016 at 2:49 PM, Michael Paquier
 wrote:
> In my quest of making the backup tools more compliant to data
> durability, here is a thread for pg_dump and pg_dumpall. Here is in a
> couple of lines my proposal:
> - Addition in _archiveHandle of a field to track if the dump generated
> should be synced or not.
> - This is effective for all modes, when the user specifies an output
> file. In short that's when fileSpec is not NULL.
> - Actually do the the sync in _EndData and _EndBlob[s] if appropriate.
> There is for example nothing to do for pg_backup_null.c
> - Addition of --nosync option to allow users to disable it. By default
> it is enabled.
> Note that to make the data durable, the file need to be sync'ed as
> well as its parent folder. So with pg_dump we can only make that
> really durable with -Fd. I think that in the case where the user
> specifies an output file for the other modes we should sync it, that's
> the best we can do. This last statement applies as well for
> pg_dumpall.
>
> Thoughts? I'd like to prepare a patch according to those lines for the next 
> CF.

Okay, here is a patch doing the above. I have added a new --nosync
option to pg_dump and pg_dumpall to switch to the pre-10 behavior. I
have arrived at the conclusion that it is better not to touch at
_EndData and _EndBlob, and just issue the fsync in CloseArchive when
all the write operations are done. In the case of the directory
format, the fsync is done on all the entries recursively. This makes
as well the patch more simple. The regression tests calling pg_dump
don't use --nosync yet in this patch, that's a move that could be done
afterwards.

I have added that to next CF:
https://commitfest.postgresql.org/11/823/
-- 
Michael


pgdump-sync-v1.patch
Description: application/download

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dump, pg_dumpall and data durability

2016-10-13 Thread Michael Paquier
On Thu, Oct 13, 2016 at 2:49 PM, Michael Paquier
 wrote:
> - This is effective for all modes, when the user specifies an output
> file. In short that's when fileSpec is not NULL.

I have sent the email too early here... In this case the sync is a
no-op. It is missing a sentence.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers