Re: [HACKERS] silent data loss with ext4 / all current versions

2016-05-12 Thread Michael Paquier
On Thu, May 12, 2016 at 2:58 PM, Michael Paquier wrote: > On Mon, Mar 28, 2016 at 8:25 AM, Andres Freund wrote: >> I've also noticed that > > Coming back to this issue because... > >> a) pg_basebackup doesn't do anything about durability (it

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-05-11 Thread Michael Paquier
On Mon, Mar 28, 2016 at 8:25 AM, Andres Freund wrote: > I've also noticed that Coming back to this issue because... > a) pg_basebackup doesn't do anything about durability (it probably needs >a very similar patch to the one pg_rewind just received). I think that one of

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-03-27 Thread Michael Paquier
On Mon, Mar 28, 2016 at 8:25 AM, Andres Freund wrote: > On 2016-03-18 15:08:32 +0900, Michael Paquier wrote: >> + fprintf(stderr, _("%s: could not rename file \"%s\": %s\n"), >> + progname, current_walfile_name, >> strerror(errno)); > >

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-03-27 Thread Andres Freund
Hi, On 2016-03-18 15:08:32 +0900, Michael Paquier wrote: > +/* > + * Sync data directory to ensure that what has been generated up to now is > + * persistent in case of a crash, and this is done once globally for > + * performance reasons as sync requests on individual files would be > + * a

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-03-19 Thread Michael Paquier
On Wed, Mar 16, 2016 at 2:46 AM, Andres Freund wrote: > On 2016-03-15 15:39:50 +0100, Michael Paquier wrote: >> Yeah, true. We definitely need to do something for that, even for HEAD >> it seems like an overkill to have something in for example src/common >> to allow frontends

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-03-19 Thread Andres Freund
On 2016-03-17 23:05:42 +0900, Michael Paquier wrote: > > Are you working on a fix for pg_rewind? Let's go with initdb -S in a > > first iteration, then we can, if somebody is interest enough, work on > > making this nicer in master. > > I am really -1 for this approach. Wrapping initdb -S with >

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-03-19 Thread Michael Paquier
On Fri, Mar 18, 2016 at 12:03 AM, Andres Freund wrote: > This is a *much* more expensive approach though. Doing the fsync > directly after modifying the file. One file by one file. Will usually > result in each fsync blocking for a while. > > In comparison of doing a flush and

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-03-18 Thread Robert Haas
On Thu, Mar 17, 2016 at 11:03 AM, Andres Freund wrote: > On 2016-03-17 23:05:42 +0900, Michael Paquier wrote: >> > Are you working on a fix for pg_rewind? Let's go with initdb -S in a >> > first iteration, then we can, if somebody is interest enough, work on >> > making this

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-03-15 Thread David Steele
On 3/15/16 10:39 AM, Michael Paquier wrote: > On Thu, Mar 10, 2016 at 4:25 AM, Andres Freund wrote: > >> Note that we currently have some frontend programs with the equivalent >> problem. Most importantly receivelog.c (pg_basebackup/pg_receivexlog) >> are missing pretty much the same directory

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-03-15 Thread Andres Freund
On 2016-03-15 15:39:50 +0100, Michael Paquier wrote: > I have finally been able to spend some time reviewing what you pushed > on back-branches, and things are in correct shape I think. One small > issue that I have is that for EXEC_BACKEND builds, in > write_nondefault_variables we still use one

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-03-15 Thread Michael Paquier
On Thu, Mar 10, 2016 at 4:25 AM, Andres Freund wrote: > I've finally pushed these, after making a number of mostly cosmetic > fixes. The only of real consequence is that I've removed the durable_* > call from the renames to .deleted in xlog[archive].c - these don't need > to be durable, and are

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-03-09 Thread Andres Freund
On 2016-03-07 21:55:52 -0800, Andres Freund wrote: > Here's my updated version. > > Note that I've split the patch into two. One for the infrastructure, and > one for the callsites. I've finally pushed these, after making a number of mostly cosmetic fixes. The only of real consequence is that

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-03-08 Thread Joshua D. Drake
On 03/08/2016 02:16 PM, Robert Haas wrote: On Mon, Mar 7, 2016 at 10:18 PM, Andres Freund wrote: Instead of "durable" I think that "persistent" makes more sense. I find durable a lot more descriptive. persistent could refer to retrying the rename or something. Yeah, I

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-03-08 Thread Andres Freund
On 2016-03-08 23:47:48 +0100, Tomas Vondra wrote: > I've repeated the power-loss testing today. With the patches applied I'm > no longer able to reproduce the issue (despite trying about 10x), while > without them I've hit it on the first try. This is on kernel 4.4.2. Yay, thanks for testing!

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-03-08 Thread Tomas Vondra
Hi, On Mon, 2016-03-07 at 21:55 -0800, Andres Freund wrote: > On 2016-03-08 12:26:34 +0900, Michael Paquier wrote: > > On Tue, Mar 8, 2016 at 12:18 PM, Andres Freund wrote: > > > On 2016-03-08 12:01:18 +0900, Michael Paquier wrote: > > >> I have spent a couple of hours

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-03-08 Thread Robert Haas
On Mon, Mar 7, 2016 at 10:18 PM, Andres Freund wrote: >> Instead of "durable" I think that "persistent" makes more sense. > > I find durable a lot more descriptive. persistent could refer to > retrying the rename or something. Yeah, I like durable, too. -- Robert Haas

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-03-07 Thread Andres Freund
Hi, On 2016-03-08 16:21:45 +0900, Michael Paquier wrote: > + durable_link_or_rename(tmppath, path, ERROR); > + durable_rename(path, xlogfpath, ERROR); > You may want to add a (void) cast in front of those calls for correctness. "correctness"? This is neatnikism, not correctness. I've

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-03-07 Thread Michael Paquier
On Tue, Mar 8, 2016 at 2:55 PM, Andres Freund wrote: > On 2016-03-08 12:26:34 +0900, Michael Paquier wrote: >> On Tue, Mar 8, 2016 at 12:18 PM, Andres Freund wrote: >> > On 2016-03-08 12:01:18 +0900, Michael Paquier wrote: >> >> I have spent a couple of

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-03-07 Thread Andres Freund
On 2016-03-08 12:26:34 +0900, Michael Paquier wrote: > On Tue, Mar 8, 2016 at 12:18 PM, Andres Freund wrote: > > On 2016-03-08 12:01:18 +0900, Michael Paquier wrote: > >> I have spent a couple of hours looking at that in details, and the > >> patch is neat. > > > > Cool. Doing

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-03-07 Thread Michael Paquier
On Tue, Mar 8, 2016 at 12:18 PM, Andres Freund wrote: > On 2016-03-08 12:01:18 +0900, Michael Paquier wrote: >> I have spent a couple of hours looking at that in details, and the >> patch is neat. > > Cool. Doing some more polishing right now. Will be back with an updated >

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-03-07 Thread Andres Freund
Hi, On 2016-03-08 12:01:18 +0900, Michael Paquier wrote: > I have spent a couple of hours looking at that in details, and the > patch is neat. Cool. Doing some more polishing right now. Will be back with an updated version soonish. Did you do some testing? > + * This routine ensures that,

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-03-07 Thread Michael Paquier
On Mon, Mar 7, 2016 at 3:38 PM, Andres Freund wrote: > On 2016-03-05 19:54:05 -0800, Andres Freund wrote: >> I started working on this; delayed by taking longer than planned on the >> logical decoding stuff (quite a bit complicated by >>

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-03-06 Thread Andres Freund
Hi, On 2016-03-05 19:54:05 -0800, Andres Freund wrote: > I started working on this; delayed by taking longer than planned on the > logical decoding stuff (quite a bit complicated by > e1a11d93111ff3fba7a91f3f2ac0b0aca16909a8). I'm not very happy with the > error handling as it is right now. For

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-03-05 Thread Andres Freund
On 2016-03-05 22:25:36 +0900, Michael Paquier wrote: > OK, I hacked a v7: > - Move the link()/rename() group with HAVE_WORKING_LINK into a single > routine, making the previous link_safe renamed to replace_safe. This > is sharing a lot of things with rename_safe. I am not sure it is worth >

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-03-05 Thread Michael Paquier
On Sat, Mar 5, 2016 at 7:47 AM, Andres Freund wrote: > On 2016-03-05 07:43:00 +0900, Michael Paquier wrote: >> On Sat, Mar 5, 2016 at 7:35 AM, Andres Freund wrote: >> > On 2016-03-04 14:51:50 +0900, Michael Paquier wrote: >> >> On Fri, Mar 4, 2016 at 4:06

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-03-04 Thread Andres Freund
On 2016-03-05 07:43:00 +0900, Michael Paquier wrote: > On Sat, Mar 5, 2016 at 7:35 AM, Andres Freund wrote: > > On 2016-03-04 14:51:50 +0900, Michael Paquier wrote: > >> On Fri, Mar 4, 2016 at 4:06 AM, Andres Freund wrote: > >> Hm. OK. I don't see any

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-03-04 Thread Michael Paquier
On Sat, Mar 5, 2016 at 7:37 AM, Andres Freund wrote: > On 2016-03-05 07:29:35 +0900, Michael Paquier wrote: >> OK. I could produce that by tonight my time, not before unfortunately. > > I'm switching to this patch, after pushing the pending logical decoding > fixes. Probably

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-03-04 Thread Michael Paquier
On Sat, Mar 5, 2016 at 7:35 AM, Andres Freund wrote: > On 2016-03-04 14:51:50 +0900, Michael Paquier wrote: >> On Fri, Mar 4, 2016 at 4:06 AM, Andres Freund wrote: >> > I don't think we want any stat()s here. I'd much, much rather check open >> > for

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-03-04 Thread Andres Freund
On 2016-03-05 07:29:35 +0900, Michael Paquier wrote: > OK. I could produce that by tonight my time, not before unfortunately. I'm switching to this patch, after pushing the pending logical decoding fixes. Probably not today, but tomorrow PST afternoon should work. > And FWIW, per the comments of

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-03-04 Thread Andres Freund
On 2016-03-04 14:51:50 +0900, Michael Paquier wrote: > On Fri, Mar 4, 2016 at 4:06 AM, Andres Freund wrote: > > Hi, > > Thanks for the review. > > >> +/* > >> + * rename_safe -- rename of a file, making it on-disk persistent > >> + * > >> + * This routine ensures that a

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-03-04 Thread Michael Paquier
On Sat, Mar 5, 2016 at 1:23 AM, Robert Haas wrote: > On Fri, Mar 4, 2016 at 11:09 AM, Tom Lane wrote: >> Alvaro Herrera writes: >>> I would like to have a patch for this finalized today, so that we can >>> apply to master

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-03-04 Thread Robert Haas
On Fri, Mar 4, 2016 at 11:09 AM, Tom Lane wrote: > Alvaro Herrera writes: >> I would like to have a patch for this finalized today, so that we can >> apply to master before or during the weekend; with it in the tree for >> about a week we can be more

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-03-04 Thread Tom Lane
Alvaro Herrera writes: > I would like to have a patch for this finalized today, so that we can > apply to master before or during the weekend; with it in the tree for > about a week we can be more confident and backpatch close to next > weekend, so that we see it in the

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-03-04 Thread Alvaro Herrera
I would like to have a patch for this finalized today, so that we can apply to master before or during the weekend; with it in the tree for about a week we can be more confident and backpatch close to next weekend, so that we see it in the next set of minor releases. Does that sound good? --

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-03-03 Thread Michael Paquier
On Fri, Mar 4, 2016 at 4:06 AM, Andres Freund wrote: > Hi, Thanks for the review. >> +/* >> + * rename_safe -- rename of a file, making it on-disk persistent >> + * >> + * This routine ensures that a rename file persists in case of a crash by >> using >> + * fsync on the

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-03-03 Thread Andres Freund
Hi, > +/* > + * rename_safe -- rename of a file, making it on-disk persistent > + * > + * This routine ensures that a rename file persists in case of a crash by > using > + * fsync on the old and new files before and after performing the rename so > as > + * this categorizes as an

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-02-23 Thread Michael Paquier
On Wed, Feb 24, 2016 at 7:26 AM, Tomas Vondra wrote: > 1) I'm not quite sure why the patch adds missing_ok to fsync_fname()? The > only place where we use missing_ok=true is in rename_safe, where right at > the beginning we do this: > > fsync_fname(newfile, false, true); > > I.e. we're

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-02-23 Thread Tomas Vondra
Hi, On 02/05/2016 10:40 AM, Michael Paquier wrote: On Thu, Feb 4, 2016 at 2:34 PM, Michael Paquier wrote: On Thu, Feb 4, 2016 at 12:02 PM, Michael Paquier wrote: On Tue, Feb 2, 2016 at 4:20 PM, Michael Paquier wrote: ... So, attached

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-02-06 Thread Andres Freund
On 2016-02-06 17:43:48 +0100, Tomas Vondra wrote: > >Still the data is here... But well. I won't insist. > > Huh? This thread started by an example how to cause loss of committed > transactions. That fits my definition of "data loss" quite well. Agreed, that view doesn't seem to make much sense.

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-02-06 Thread Tomas Vondra
Hi, On 02/06/2016 01:16 PM, Michael Paquier wrote: On Sat, Feb 6, 2016 at 2:11 AM, Tomas Vondra wrote: On 02/04/2016 09:59 AM, Michael Paquier wrote: On Tue, Feb 2, 2016 at 9:59 AM, Andres Freund wrote: On 2016-02-02 09:56:40 +0900,

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-02-06 Thread Michael Paquier
On Sat, Feb 6, 2016 at 2:11 AM, Tomas Vondra wrote: > On 02/04/2016 09:59 AM, Michael Paquier wrote: >> >> On Tue, Feb 2, 2016 at 9:59 AM, Andres Freund wrote: >>> >>> On 2016-02-02 09:56:40 +0900, Michael Paquier wrote: And there is no

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-02-05 Thread Tomas Vondra
On 02/04/2016 09:59 AM, Michael Paquier wrote: On Tue, Feb 2, 2016 at 9:59 AM, Andres Freund wrote: On 2016-02-02 09:56:40 +0900, Michael Paquier wrote: And there is no actual risk of data loss Huh? More precise: what I mean here is that should an OS crash or a power

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-02-05 Thread Michael Paquier
On Thu, Feb 4, 2016 at 2:34 PM, Michael Paquier wrote: > On Thu, Feb 4, 2016 at 12:02 PM, Michael Paquier > wrote: >> On Tue, Feb 2, 2016 at 4:20 PM, Michael Paquier wrote: >>> Not wrong, and this leads to the following: >>> void

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-02-04 Thread Michael Paquier
On Thu, Feb 4, 2016 at 12:02 PM, Michael Paquier wrote: > On Tue, Feb 2, 2016 at 4:20 PM, Michael Paquier wrote: >> Not wrong, and this leads to the following: >> void rename_safe(const char *old, const char *new, bool isdir, int elevel); >> Controlling elevel is

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-02-04 Thread Michael Paquier
On Tue, Feb 2, 2016 at 9:59 AM, Andres Freund wrote: > On 2016-02-02 09:56:40 +0900, Michael Paquier wrote: >> And there is no actual risk of data loss > > Huh? More precise: what I mean here is that should an OS crash or a power failure happen, we would fall back to recovery

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-02-04 Thread Michael Paquier
On Tue, Feb 2, 2016 at 4:20 PM, Michael Paquier wrote: > Not wrong, and this leads to the following: > void rename_safe(const char *old, const char *new, bool isdir, int elevel); > Controlling elevel is necessary per the multiple code paths that would > use it. Some use ERROR, most of them FATAL,

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-02-01 Thread Michael Paquier
On Tue, Feb 2, 2016 at 1:08 AM, Andres Freund wrote: > On 2016-02-01 16:49:46 +0100, Alvaro Herrera wrote: >> Yeah. On 9.4 there are already some conflicts, and I'm sure there will >> be more in almost each branch. Does anyone want to volunteer for >> producing per-branch

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-02-01 Thread Andres Freund
On 2016-02-02 09:56:40 +0900, Michael Paquier wrote: > And there is no actual risk of data loss Huh? - Andres

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-02-01 Thread Michael Paquier
On Tue, Feb 2, 2016 at 12:49 AM, Alvaro Herrera wrote: > Michael Paquier wrote: >> On Mon, Jan 25, 2016 at 6:50 PM, Tomas Vondra >> wrote: >> > Seems OK to me. Thanks for the time and improvements! >> >> Thanks. Perhaps a committer could

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-02-01 Thread Andres Freund
On 2016-01-25 16:30:47 +0900, Michael Paquier wrote: > diff --git a/src/backend/access/transam/xlog.c > b/src/backend/access/transam/xlog.c > index a2846c4..b124f90 100644 > --- a/src/backend/access/transam/xlog.c > +++ b/src/backend/access/transam/xlog.c > @@ -3278,6 +3278,14 @@

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-02-01 Thread Andres Freund
On 2016-02-01 16:49:46 +0100, Alvaro Herrera wrote: > Yeah. On 9.4 there are already some conflicts, and I'm sure there will > be more in almost each branch. Does anyone want to volunteer for > producing per-branch versions? > The next minor release is to be tagged next week and it'd be good to

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-02-01 Thread Michael Paquier
On Tue, Feb 2, 2016 at 1:07 AM, Andres Freund wrote: > On 2016-01-25 16:30:47 +0900, Michael Paquier wrote: >> diff --git a/src/backend/access/transam/xlog.c >> b/src/backend/access/transam/xlog.c >> index a2846c4..b124f90 100644 >> --- a/src/backend/access/transam/xlog.c >>

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-02-01 Thread Alvaro Herrera
Michael Paquier wrote: > On Mon, Jan 25, 2016 at 6:50 PM, Tomas Vondra > wrote: > > Seems OK to me. Thanks for the time and improvements! > > Thanks. Perhaps a committer could have a look then? I have switched > the patch as such in the CF app. Seeing the

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-01-25 Thread Michael Paquier
On Mon, Jan 25, 2016 at 6:50 PM, Tomas Vondra wrote: > Seems OK to me. Thanks for the time and improvements! Thanks. Perhaps a committer could have a look then? I have switched the patch as such in the CF app. Seeing the accumulated feedback upthread that's

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-01-25 Thread Tomas Vondra
On 01/25/2016 08:30 AM, Michael Paquier wrote: On Fri, Jan 22, 2016 at 9:32 PM, Michael Paquier wrote: ,,, My first line of thoughts after looking at the patch is that I am not against adding those fsync calls on HEAD as there is roughly an advantage to not go

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-01-24 Thread Michael Paquier
On Fri, Jan 22, 2016 at 9:32 PM, Michael Paquier wrote: > On Fri, Jan 22, 2016 at 5:26 PM, Tomas Vondra > wrote: >> On 01/22/2016 06:45 AM, Michael Paquier wrote: >>> Here are some comments about your patch after a look at the code. >>>

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-01-23 Thread Michael Paquier
On Sat, Jan 23, 2016 at 11:39 AM, Tomas Vondra wrote: > On 01/23/2016 02:35 AM, Michael Paquier wrote: >> >> On Fri, Jan 22, 2016 at 9:41 PM, Greg Stark wrote: >>> On Fri, Jan 22, 2016 at 8:26 AM, Tomas Vondra >>> wrote:

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-01-22 Thread Tomas Vondra
Hi, On 01/22/2016 06:45 AM, Michael Paquier wrote: So, I have been playing with a Linux VM with VMware Fusion and on ext4 with data=ordered the renames are getting lost if the root folder is not fsync. By killing-9 the VM I am able to reproduce that really easily. Yep. Same experience here

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-01-22 Thread Magnus Hagander
On Fri, Jan 22, 2016 at 9:26 AM, Tomas Vondra wrote: > Hi, > > On 01/22/2016 06:45 AM, Michael Paquier wrote: > > So, I have been playing with a Linux VM with VMware Fusion and on >> ext4 with data=ordered the renames are getting lost if the root >> folder is not

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-01-22 Thread Greg Stark
On Fri, Jan 22, 2016 at 8:26 AM, Tomas Vondra wrote: > On 01/22/2016 06:45 AM, Michael Paquier wrote: > >> So, I have been playing with a Linux VM with VMware Fusion and on >> ext4 with data=ordered the renames are getting lost if the root >> folder is not fsync. By

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-01-22 Thread Andres Freund
On 2016-01-22 21:32:29 +0900, Michael Paquier wrote: > Group shot with 3), 4) and 5). Well, there is no data loss here, > putting me in the direction of considering this addition of an fsync > as an optimization and not a bug. I think this is an extremely weak argument. The reasoning when exactly

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-01-22 Thread Michael Paquier
On Fri, Jan 22, 2016 at 5:26 PM, Tomas Vondra wrote: > On 01/22/2016 06:45 AM, Michael Paquier wrote: >> Here are some comments about your patch after a look at the code. >> >> Regarding the additions in fsync_fname() in xlog.c: >> 1) In InstallXLogFileSegment,

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-01-22 Thread Tomas Vondra
On 01/23/2016 02:35 AM, Michael Paquier wrote: On Fri, Jan 22, 2016 at 9:41 PM, Greg Stark wrote: On Fri, Jan 22, 2016 at 8:26 AM, Tomas Vondra wrote: On 01/22/2016 06:45 AM, Michael Paquier wrote: So, I have been playing with a Linux VM with

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-01-22 Thread Michael Paquier
On Fri, Jan 22, 2016 at 9:41 PM, Greg Stark wrote: > On Fri, Jan 22, 2016 at 8:26 AM, Tomas Vondra > wrote: >> On 01/22/2016 06:45 AM, Michael Paquier wrote: >> >>> So, I have been playing with a Linux VM with VMware Fusion and on >>> ext4 with

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-01-21 Thread Michael Paquier
On Tue, Jan 19, 2016 at 4:20 PM, Tomas Vondra wrote: > > > On 01/19/2016 08:03 AM, Michael Paquier wrote: >> >> On Tue, Jan 19, 2016 at 3:58 PM, Tomas Vondra >> wrote: >>> >>> > ... Tomas, I am planning to have a look at that,

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-01-18 Thread Michael Paquier
On Wed, Dec 2, 2015 at 3:24 PM, Michael Paquier wrote: > On Wed, Dec 2, 2015 at 3:23 PM, Michael Paquier > wrote: >> On Wed, Dec 2, 2015 at 7:05 AM, Tomas Vondra >> wrote: >>> Attached is v2 of the patch, that

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-01-18 Thread Tomas Vondra
On 01/19/2016 07:44 AM, Michael Paquier wrote: On Wed, Dec 2, 2015 at 3:24 PM, Michael Paquier wrote: On Wed, Dec 2, 2015 at 3:23 PM, Michael Paquier wrote: On Wed, Dec 2, 2015 at 7:05 AM, Tomas Vondra

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-01-18 Thread Michael Paquier
On Tue, Jan 19, 2016 at 3:58 PM, Tomas Vondra wrote: > > > On 01/19/2016 07:44 AM, Michael Paquier wrote: >> >> On Wed, Dec 2, 2015 at 3:24 PM, Michael Paquier >> wrote: >>> >>> On Wed, Dec 2, 2015 at 3:23 PM, Michael Paquier >>>

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-01-18 Thread Tomas Vondra
On 01/19/2016 08:03 AM, Michael Paquier wrote: On Tue, Jan 19, 2016 at 3:58 PM, Tomas Vondra wrote: ... Tomas, I am planning to have a look at that, because it seems to be important. In case it becomes lost on my radar, do you mind if I add it to the 2016-03

Re: [HACKERS] silent data loss with ext4 / all current versions

2015-12-01 Thread Peter Eisentraut
On 11/27/15 8:18 AM, Michael Paquier wrote: > On Fri, Nov 27, 2015 at 8:17 PM, Tomas Vondra > wrote: >> > So, what's going on? The problem is that while the rename() is atomic, it's >> > not guaranteed to be durable without an explicit fsync on the parent >> >

Re: [HACKERS] silent data loss with ext4 / all current versions

2015-12-01 Thread Tomas Vondra
Attached is v2 of the patch, that (a) adds explicit fsync on the parent directory after all the rename() calls in timeline.c, xlog.c, xlogarchive.c and pgarch.c (b) adds START/END_CRIT_SECTION around the new fsync_fname calls (except for those in timeline.c, as the

Re: [HACKERS] silent data loss with ext4 / all current versions

2015-12-01 Thread Tomas Vondra
On 12/01/2015 10:44 PM, Peter Eisentraut wrote: On 11/27/15 8:18 AM, Michael Paquier wrote: On Fri, Nov 27, 2015 at 8:17 PM, Tomas Vondra wrote: So, what's going on? The problem is that while the rename() is atomic, it's not guaranteed to be durable without an

Re: [HACKERS] silent data loss with ext4 / all current versions

2015-12-01 Thread Michael Paquier
On Wed, Dec 2, 2015 at 3:23 PM, Michael Paquier wrote: > On Wed, Dec 2, 2015 at 7:05 AM, Tomas Vondra > wrote: >> Attached is v2 of the patch, that >> >> (a) adds explicit fsync on the parent directory after all the rename() >> calls

Re: [HACKERS] silent data loss with ext4 / all current versions

2015-12-01 Thread Michael Paquier
On Wed, Dec 2, 2015 at 7:05 AM, Tomas Vondra wrote: > Attached is v2 of the patch, that > > (a) adds explicit fsync on the parent directory after all the rename() > calls in timeline.c, xlog.c, xlogarchive.c and pgarch.c > > (b) adds START/END_CRIT_SECTION around

Re: [HACKERS] silent data loss with ext4 / all current versions

2015-11-29 Thread Tomas Vondra
On 11/29/2015 03:33 PM, Tomas Vondra wrote: Hi, On 11/29/2015 02:38 PM, Craig Ringer wrote: I've had a few tries at implementing a qemu-based crashtester where it hard kills the qemu instance at a random point then starts it back up. I've tried to reproduce the issue by killing a qemu VM,

Re: [HACKERS] silent data loss with ext4 / all current versions

2015-11-29 Thread Tomas Vondra
Hi, On 11/29/2015 02:38 PM, Craig Ringer wrote: On 27 November 2015 at 21:28, Greg Stark > wrote: On Fri, Nov 27, 2015 at 11:17 AM, Tomas Vondra > wrote: > I plan to do more

Re: [HACKERS] silent data loss with ext4 / all current versions

2015-11-29 Thread Tomas Vondra
On 11/29/2015 02:41 PM, Craig Ringer wrote: On 27 November 2015 at 19:17, Tomas Vondra > wrote: It's also possible to mitigate this by setting wal_sync_method=fsync Are you sure? https://lwn.net/Articles/322823/ tends

Re: [HACKERS] silent data loss with ext4 / all current versions

2015-11-29 Thread Michael Paquier
On Sat, Nov 28, 2015 at 3:01 AM, Tomas Vondra wrote: > > > On 11/27/2015 02:18 PM, Michael Paquier wrote: >> >> On Fri, Nov 27, 2015 at 8:17 PM, Tomas Vondra >> wrote: >>> >>> So, what's going on? The problem is that while the rename()

Re: [HACKERS] silent data loss with ext4 / all current versions

2015-11-29 Thread Craig Ringer
On 27 November 2015 at 21:28, Greg Stark wrote: > On Fri, Nov 27, 2015 at 11:17 AM, Tomas Vondra > wrote: > > I plan to do more power failure testing soon, with more complex test > > scenarios. I suspect there might be other similar issues (e.g. when

Re: [HACKERS] silent data loss with ext4 / all current versions

2015-11-29 Thread Craig Ringer
On 27 November 2015 at 19:17, Tomas Vondra wrote: > It's also possible to mitigate this by setting wal_sync_method=fsync Are you sure? https://lwn.net/Articles/322823/ tends to suggest that fsync() on the file is insufficient to ensure rename() is persistent,

Re: [HACKERS] silent data loss with ext4 / all current versions

2015-11-27 Thread Tomas Vondra
On 11/27/2015 02:18 PM, Michael Paquier wrote: On Fri, Nov 27, 2015 at 8:17 PM, Tomas Vondra wrote: So, what's going on? The problem is that while the rename() is atomic, it's not guaranteed to be durable without an explicit fsync on the parent directory. And by

Re: [HACKERS] silent data loss with ext4 / all current versions

2015-11-27 Thread Teodor Sigaev
What happens is that when we recycle WAL segments, we rename them and then sync them using fdatasync (which is the default on Linux). However fdatasync does not force fsync on the parent directory, so in case of power failure the rename may get lost. The recovery won't realize those segments
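
The failure mode described here (a rename() that is atomic but not durable until the parent directory is fsynced) can be illustrated with a small standalone C sketch. This is not the patch discussed in the thread. The names fsync_path and rename_durably_sketch are hypothetical, and the sketch assumes a Linux-like system where a directory can be opened read-only and passed to fsync(), and where the old and new paths live in the same directory.

```c
/* Request POSIX.1-2008 declarations (strdup, fsync) under strict -std modes. */
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <errno.h>
#include <fcntl.h>
#include <libgen.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: open a path read-only and fsync it.
 * Assumption: the platform allows fsync() on a read-only descriptor,
 * including one that refers to a directory (true on Linux).
 */
static int
fsync_path(const char *path)
{
    int fd = open(path, O_RDONLY);
    int rc;
    int saved_errno;

    if (fd < 0)
        return -1;
    rc = fsync(fd);
    saved_errno = errno;
    close(fd);
    errno = saved_errno;        /* report the fsync error, not close's */
    return rc;
}

/*
 * Hypothetical sketch of a rename that survives power loss.
 * Returns 0 on success, -1 on failure with errno set.
 * Simplification: only the parent directory of newpath is synced,
 * so oldpath is assumed to be in that same directory.
 */
int
rename_durably_sketch(const char *oldpath, const char *newpath)
{
    char *copy;
    int rc;

    /* 1. Make the file contents durable before the name changes. */
    if (fsync_path(oldpath) != 0)
        return -1;

    /* 2. The rename itself: atomic, but not yet guaranteed on disk. */
    if (rename(oldpath, newpath) != 0)
        return -1;

    /* 3. Sync the file again under its new name. */
    if (fsync_path(newpath) != 0)
        return -1;

    /* 4. Sync the parent directory so the new directory entry persists. */
    copy = strdup(newpath);     /* dirname() may modify its argument */
    if (copy == NULL)
        return -1;
    rc = fsync_path(dirname(copy));
    free(copy);
    return rc;
}
```

Step 4 is the one that a plain fdatasync on the renamed segment does not cover: without it, the data blocks can reach disk while the directory entry still carries the old name after a crash.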

[HACKERS] silent data loss with ext4 / all current versions

2015-11-27 Thread Tomas Vondra
Hi, I've been doing some power failure tests (i.e. unexpectedly interrupting power) a few days ago, and I've discovered a fairly serious case of silent data loss on ext3/ext4. Initially I thought it was a filesystem bug, but after further investigation I'm pretty sure it's our fault. What

Re: [HACKERS] silent data loss with ext4 / all current versions

2015-11-27 Thread Tomas Vondra
Hi, On 11/27/2015 02:28 PM, Greg Stark wrote: On Fri, Nov 27, 2015 at 11:17 AM, Tomas Vondra wrote: I plan to do more power failure testing soon, with more complex test scenarios. I suspect there might be other similar issues (e.g. when we rename a file before a

Re: [HACKERS] silent data loss with ext4 / all current versions

2015-11-27 Thread Michael Paquier
On Fri, Nov 27, 2015 at 8:17 PM, Tomas Vondra wrote: > So, what's going on? The problem is that while the rename() is atomic, it's > not guaranteed to be durable without an explicit fsync on the parent > directory. And by default we only do fdatasync on the recycled

Re: [HACKERS] silent data loss with ext4 / all current versions

2015-11-27 Thread Greg Stark
On Fri, Nov 27, 2015 at 11:17 AM, Tomas Vondra wrote: > I plan to do more power failure testing soon, with more complex test > scenarios. I suspect there might be other similar issues (e.g. when we > rename a file before a checkpoint and don't fsync the directory -