Re: [HACKERS] Potential data loss of 2PC files

2017-03-27 Thread Michael Paquier
On Tue, Mar 28, 2017 at 9:37 AM, Michael Paquier wrote: > On Tue, Mar 28, 2017 at 8:38 AM, Tsunakawa, Takayuki > wrote: >> From: pgsql-hackers-ow...@postgresql.org >>> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Michael

Re: [HACKERS] Potential data loss of 2PC files

2017-03-27 Thread Michael Paquier
On Tue, Mar 28, 2017 at 8:38 AM, Tsunakawa, Takayuki wrote: > From: pgsql-hackers-ow...@postgresql.org >> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Michael Paquier >> Do you think that this qualifies as a bug fix for a backpatch? I would think >> so,

Re: [HACKERS] Potential data loss of 2PC files

2017-03-27 Thread Tsunakawa, Takayuki
From: pgsql-hackers-ow...@postgresql.org > [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Michael Paquier > Do you think that this qualifies as a bug fix for a backpatch? I would think > so, but I would not mind waiting for some dust to be on it before considering > applying that on

Re: [HACKERS] Potential data loss of 2PC files

2017-03-27 Thread Michael Paquier
On Tue, Mar 28, 2017 at 1:34 AM, Teodor Sigaev wrote: > Thank you, pushed. Thanks! Do you think that this qualifies as a bug fix for a backpatch? I would think so, but I would not mind waiting for some dust to be on it before considering applying that on back-branches.

Re: [HACKERS] Potential data loss of 2PC files

2017-03-27 Thread Teodor Sigaev
Thank you, pushed Michael Paquier wrote: On Fri, Mar 24, 2017 at 11:36 PM, Teodor Sigaev wrote: And the renaming of pg_clog to pg_xact is also my fault. Attached is an updated patch. Thank you. One more question: what about symlinks? If DBA moves, for example, pg_xact to

Re: [HACKERS] Potential data loss of 2PC files

2017-03-27 Thread Teodor Sigaev
Thank you. One more question: what about symlinks? If a DBA moves, for example, pg_xact to another disk and leaves a symlink in the data directory, I suppose fsync on the symlink will actually do nothing. I did not think of this case, but is that really common? There is even I saw a lot of such cases.

Re: [HACKERS] Potential data loss of 2PC files

2017-03-24 Thread Michael Paquier
On Fri, Mar 24, 2017 at 11:36 PM, Teodor Sigaev wrote: >> And the renaming of pg_clog to pg_xact is also my fault. Attached is >> an updated patch. > > > Thank you. One more question: what about symlinks? If a DBA moves, for > example, pg_xact to another disk and leaves the

Re: [HACKERS] Potential data loss of 2PC files

2017-03-24 Thread Teodor Sigaev
And the renaming of pg_clog to pg_xact is also my fault. Attached is an updated patch. Thank you. One more question: what about symlinks? If a DBA moves, for example, pg_xact to another disk and leaves a symlink in the data directory, I suppose fsync on the symlink will actually do nothing. --

Re: [HACKERS] Potential data loss of 2PC files

2017-03-23 Thread Michael Paquier
On Fri, Mar 24, 2017 at 5:08 AM, Teodor Sigaev wrote: > Hmm, it doesn't work (but applies) on current HEAD: > [...] > Data page checksums are disabled. > > fixing permissions on existing directory /spool/pg_data ... ok > creating subdirectories ... ok > selecting default

Re: [HACKERS] Potential data loss of 2PC files

2017-03-23 Thread Teodor Sigaev
Hmm, it doesn't work (but applies) on current HEAD: % uname -a FreeBSD *** 11.0-RELEASE-p8 FreeBSD 11.0-RELEASE-p8 #0 r315651: Tue Mar 21 02:44:23 MSK 2017 teodor@***:/usr/obj/usr/src/sys/XOR amd64 % pg_config --configure '--enable-depend' '--enable-cassert' '--enable-debug'

Re: [HACKERS] Potential data loss of 2PC files

2017-03-21 Thread Michael Paquier
On Wed, Mar 22, 2017 at 12:46 AM, Teodor Sigaev wrote: If that can happen, don't we have the same problem in many other places? Like, all the SLRUs? They don't fsync the directory either. >>> >>> Right, pg_commit_ts and pg_clog enter in this category. >> >> >>

Re: [HACKERS] Potential data loss of 2PC files

2017-03-21 Thread Teodor Sigaev
If that can happen, don't we have the same problem in many other places? Like, all the SLRUs? They don't fsync the directory either. Right, pg_commit_ts and pg_clog enter in this category. Implemented as attached. Is unlink() guaranteed to be durable, without fsyncing the directory? If not,
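
The question of whether unlink() is durable without fsyncing the directory can be illustrated with a small sketch. This is not the patch discussed in the thread; it is a Python illustration assuming a POSIX system where a directory can be opened read-only and passed to fsync(). The helper name `durable_unlink` is hypothetical. The conservative approach shown here is to flush the parent directory after removing the entry, so that the removal itself reaches stable storage.

```python
import os
import tempfile


def durable_unlink(path):
    """Remove a file, then fsync its parent directory (POSIX only).

    Hypothetical helper: flushing the directory is what makes the
    removal of the directory entry itself persistent.
    """
    parent = os.path.dirname(os.path.abspath(path))
    os.unlink(path)
    dir_fd = os.open(parent, os.O_RDONLY)
    try:
        os.fsync(dir_fd)  # flush the directory entry change
    finally:
        os.close(dir_fd)


# Small demonstration in a throwaway directory.
demo_dir = tempfile.mkdtemp()
demo_file = os.path.join(demo_dir, "segment")
with open(demo_file, "wb") as f:
    f.write(b"x")
durable_unlink(demo_file)
```

The same pattern would apply to any directory holding files that are created and removed between checkpoints, which is why the thread extends the concern from pg_twophase to the SLRU directories.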

Re: [HACKERS] Potential data loss of 2PC files

2017-03-17 Thread Tsunakawa, Takayuki
From: pgsql-hackers-ow...@postgresql.org > [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Ashutosh Bapat > The scope of this work has expanded, since last time I reviewed and marked > it as RFC. Right now I am busy with partition-wise joins and do not have > sufficient time to take a

Re: [HACKERS] Potential data loss of 2PC files

2017-03-16 Thread Ashutosh Bapat
On Thu, Mar 16, 2017 at 10:17 PM, David Steele wrote: > On 2/13/17 12:10 AM, Michael Paquier wrote: >> On Tue, Jan 31, 2017 at 11:07 AM, Michael Paquier >> wrote: >>> On Mon, Jan 30, 2017 at 10:52 PM, Heikki Linnakangas >>>

Re: [HACKERS] Potential data loss of 2PC files

2017-03-16 Thread David Steele
On 2/13/17 12:10 AM, Michael Paquier wrote: > On Tue, Jan 31, 2017 at 11:07 AM, Michael Paquier > wrote: >> On Mon, Jan 30, 2017 at 10:52 PM, Heikki Linnakangas wrote: >>> If that can happen, don't we have the same problem in many other places? >>>

Re: [HACKERS] Potential data loss of 2PC files

2017-02-12 Thread Michael Paquier
On Tue, Jan 31, 2017 at 11:07 AM, Michael Paquier wrote: > On Mon, Jan 30, 2017 at 10:52 PM, Heikki Linnakangas wrote: >> If that can happen, don't we have the same problem in many other places? >> Like, all the SLRUs? They don't fsync the directory

Re: [HACKERS] Potential data loss of 2PC files

2017-01-30 Thread Michael Paquier
On Fri, Jan 6, 2017 at 9:26 PM, Ashutosh Bapat wrote: > On Wed, Jan 4, 2017 at 12:16 PM, Michael Paquier > wrote: >> On Wed, Jan 4, 2017 at 1:23 PM, Ashutosh Bapat >> wrote: >>> I don't have anything

Re: [HACKERS] Potential data loss of 2PC files

2017-01-30 Thread Michael Paquier
On Mon, Jan 30, 2017 at 10:52 PM, Heikki Linnakangas wrote: > So, if I understood correctly, the problem scenario is: > > 1. Create and write to a file. > 2. fsync() the file. > 3. Crash. > 4. After restart, the file is gone. Yes, that's a problem with fsync's durability, and we
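
The scenario above (create and write a file, fsync() it, crash, and find the file gone after restart) comes from the fact that fsync() on a file does not necessarily flush the entry in the directory containing it. A minimal sketch of the usual remedy follows, written in Python rather than PostgreSQL's C and assuming a POSIX system where directories can be opened and fsynced. The names `fsync_dir` and `durable_write` are hypothetical and are not PostgreSQL functions.

```python
import os
import tempfile


def fsync_dir(path):
    """Flush a directory's entries to stable storage (POSIX only)."""
    fd = os.open(path, os.O_RDONLY)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)


def durable_write(path, data):
    """Write data to path, fsync the file, then fsync its parent directory."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()              # push Python's buffer to the kernel
        os.fsync(f.fileno())   # step 2 of the scenario: flush file contents
    # Extra step: without it, the new directory entry may be lost in a crash.
    fsync_dir(os.path.dirname(os.path.abspath(path)))


# Small demonstration in a throwaway directory.
demo_dir = tempfile.mkdtemp()
durable_write(os.path.join(demo_dir, "state_file"), b"prepared")
```

Syncing the directory once after a batch of files is written, rather than after each file, is one way to keep the extra cost low.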

Re: [HACKERS] Potential data loss of 2PC files

2017-01-30 Thread Heikki Linnakangas
On 12/27/2016 01:31 PM, Andres Freund wrote: On 2016-12-27 14:09:05 +0900, Michael Paquier wrote: On Fri, Dec 23, 2016 at 3:02 AM, Andres Freund wrote: Not quite IIRC: that doesn't deal with file size increase. All this would be easier if hardlinks wouldn't exist IIUC.

Re: [HACKERS] Potential data loss of 2PC files

2017-01-06 Thread Ashutosh Bapat
Marking this as ready for committer. On Wed, Jan 4, 2017 at 12:16 PM, Michael Paquier wrote: > On Wed, Jan 4, 2017 at 1:23 PM, Ashutosh Bapat > wrote: >> I don't have anything more to review in this patch. I will leave that >>

Re: [HACKERS] Potential data loss of 2PC files

2017-01-03 Thread Michael Paquier
On Wed, Jan 4, 2017 at 1:23 PM, Ashutosh Bapat wrote: > I don't have anything more to review in this patch. I will leave that > commitfest entry in "needs review" status for a few days in case anyone > else wants to review it. If none is going to review it, we can

Re: [HACKERS] Potential data loss of 2PC files

2017-01-03 Thread Ashutosh Bapat
On Tue, Jan 3, 2017 at 5:38 PM, Michael Paquier wrote: > On Tue, Jan 3, 2017 at 8:41 PM, Ashutosh Bapat > wrote: >> Are you talking about >> /* >> * Now we can mark ourselves as out of the commit critical section: a >> *

Re: [HACKERS] Potential data loss of 2PC files

2017-01-03 Thread Michael Paquier
On Tue, Jan 3, 2017 at 8:41 PM, Ashutosh Bapat wrote: > Are you talking about > /* > * Now we can mark ourselves as out of the commit critical section: a > * checkpoint starting after this will certainly see the gxact as a > * candidate for

Re: [HACKERS] Potential data loss of 2PC files

2017-01-03 Thread Ashutosh Bapat
On Tue, Jan 3, 2017 at 2:50 PM, Michael Paquier wrote: > On Tue, Jan 3, 2017 at 3:32 PM, Ashutosh Bapat > wrote: >> I am wondering what happens if a 2PC file gets created, at the time of >> checkpoint we flush the pg_twophase directory,

Re: [HACKERS] Potential data loss of 2PC files

2017-01-03 Thread Michael Paquier
On Tue, Jan 3, 2017 at 3:32 PM, Ashutosh Bapat wrote: > I am wondering what happens if a 2PC file gets created, at the time of > checkpoint we flush the pg_twophase directory, then the file gets > removed. Do we need to flush the directory to ensure that the

Re: [HACKERS] Potential data loss of 2PC files

2017-01-02 Thread Ashutosh Bapat
On Sat, Dec 31, 2016 at 5:53 AM, Michael Paquier wrote: > On Fri, Dec 30, 2016 at 10:59 PM, Ashutosh Bapat > wrote: >>> >>> Well, flushing the meta-data of pg_twophase is really going to be far >>> cheaper than the many pages done until

Re: [HACKERS] Potential data loss of 2PC files

2016-12-30 Thread Michael Paquier
On Fri, Dec 30, 2016 at 10:59 PM, Ashutosh Bapat wrote: >> >> Well, flushing the meta-data of pg_twophase is really going to be far >> cheaper than the many pages done until CheckpointTwoPhase is reached. >> There should really be a check on serialized_xacts for

Re: [HACKERS] Potential data loss of 2PC files

2016-12-30 Thread Ashutosh Bapat
> > Well, flushing the meta-data of pg_twophase is really going to be far > cheaper than the many pages done until CheckpointTwoPhase is reached. > There should really be a check on serialized_xacts for the > non-recovery code path, but considering how cheap that's going to be > compared to the

Re: [HACKERS] Potential data loss of 2PC files

2016-12-30 Thread Michael Paquier
On Fri, Dec 30, 2016 at 5:20 PM, Ashutosh Bapat wrote: > As per the prologue of the function, it doesn't expect any 2PC files > to be written out in the function i.e. between two checkpoints. Most > of those are created and deleted between two checkpoints. Same

Re: [HACKERS] Potential data loss of 2PC files

2016-12-30 Thread Ashutosh Bapat
On Fri, Dec 30, 2016 at 11:22 AM, Michael Paquier wrote: > On Thu, Dec 29, 2016 at 6:41 PM, Ashutosh Bapat > wrote: >> I agree with this. >> If no prepared transactions were required to be fsynced in >> CheckPointTwoPhase(), do we want to

Re: [HACKERS] Potential data loss of 2PC files

2016-12-29 Thread Michael Paquier
On Thu, Dec 29, 2016 at 6:41 PM, Ashutosh Bapat wrote: > I agree with this. > If no prepared transactions were required to be fsynced in > CheckPointTwoPhase(), do we want to still fsync the directory? > Probably not. > > Maybe you want to call

Re: [HACKERS] Potential data loss of 2PC files

2016-12-29 Thread Ashutosh Bapat
On Thu, Dec 22, 2016 at 7:00 AM, Michael Paquier wrote: > Hi all, > > 2PC files are created using RecreateTwoPhaseFile() in two places currently: > - at replay on a XLOG_XACT_PREPARE record. > - At checkpoint with CheckPointTwoPhase(). > > Now RecreateTwoPhaseFile() is

Re: [HACKERS] Potential data loss of 2PC files

2016-12-27 Thread Andres Freund
On 2016-12-27 14:09:05 +0900, Michael Paquier wrote: > On Fri, Dec 23, 2016 at 3:02 AM, Andres Freund wrote: > > Not quite IIRC: that doesn't deal with file size increase. All this would > > be easier if hardlinks wouldn't exist IIUC. It's basically a question > > whether

Re: [HACKERS] Potential data loss of 2PC files

2016-12-26 Thread Michael Paquier
On Fri, Dec 23, 2016 at 3:02 AM, Andres Freund wrote: > Not quite IIRC: that doesn't deal with file size increase. All this would be > easier if hardlinks wouldn't exist IIUC. It's basically a question whether > dentry, inode or contents need to be synced. Yes, it sucks.

Re: [HACKERS] Potential data loss of 2PC files

2016-12-22 Thread Michael Paquier
On Fri, Dec 23, 2016 at 6:33 AM, Jim Nasby wrote: > On 12/22/16 12:02 PM, Andres Freund wrote: >> >> >> On December 22, 2016 6:44:22 PM GMT+01:00, Robert Haas >> wrote: >>> >>> On Thu, Dec 22, 2016 at 12:39 PM, Andres Freund

Re: [HACKERS] Potential data loss of 2PC files

2016-12-22 Thread Jim Nasby
On 12/22/16 12:02 PM, Andres Freund wrote: On December 22, 2016 6:44:22 PM GMT+01:00, Robert Haas wrote: On Thu, Dec 22, 2016 at 12:39 PM, Andres Freund wrote: It makes more sense if you mentally separate between filename(s) and file contents.

Re: [HACKERS] Potential data loss of 2PC files

2016-12-22 Thread Andres Freund
On December 22, 2016 6:44:22 PM GMT+01:00, Robert Haas wrote: >On Thu, Dec 22, 2016 at 12:39 PM, Andres Freund >wrote: >> It makes more sense if you mentally separate between filename(s) and >file contents. Having to do filesystem metadata

Re: [HACKERS] Potential data loss of 2PC files

2016-12-22 Thread Robert Haas
On Thu, Dec 22, 2016 at 12:39 PM, Andres Freund wrote: > It makes more sense if you mentally separate between filename(s) and file > contents. Having to do filesystem metadata transactions for an fsync > intended to sync contents would be annoying... I thought that's why

Re: [HACKERS] Potential data loss of 2PC files

2016-12-22 Thread Andres Freund
On December 22, 2016 5:50:38 PM GMT+01:00, Robert Haas wrote: >On Wed, Dec 21, 2016 at 8:30 PM, Michael Paquier > wrote: >> Hi all, >> >> 2PC files are created using RecreateTwoPhaseFile() in two places >currently: >> - at replay on a

Re: [HACKERS] Potential data loss of 2PC files

2016-12-22 Thread Robert Haas
On Wed, Dec 21, 2016 at 8:30 PM, Michael Paquier wrote: > Hi all, > > 2PC files are created using RecreateTwoPhaseFile() in two places currently: > - at replay on a XLOG_XACT_PREPARE record. > - At checkpoint with CheckPointTwoPhase(). > > Now RecreateTwoPhaseFile() is

[HACKERS] Potential data loss of 2PC files

2016-12-21 Thread Michael Paquier
Hi all, 2PC files are created using RecreateTwoPhaseFile() in two places currently: - at replay on a XLOG_XACT_PREPARE record. - At checkpoint with CheckPointTwoPhase(). Now RecreateTwoPhaseFile() is careful to call pg_fsync() to be sure that the 2PC files find their way into disk. But one piece