Re: [HACKERS] initdb and fsync

2012-07-13 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes: On Tuesday, June 19, 2012 07:22:02 PM Jeff Davis wrote: Right now I'm inclined to leave the patch as-is. Fine with that, I wanted to bring it up and see it documented. I have marked it with ready for committer. That committer needs to decide on -

Re: [HACKERS] initdb and fsync

2012-07-13 Thread Tom Lane
Jeff Davis pg...@j-davis.com writes: On Mon, 2012-06-18 at 20:57 +0200, Andres Freund wrote: Ok. Sensible reasons. I dislike that we know have two files using different logic (copydir.c only using fadvise, initdb using sync_file_range if available). Maybe we should just move the fadvise and

Re: [HACKERS] initdb and fsync

2012-07-13 Thread Tom Lane
I wrote: I'm picking up this patch now. What I'm inclined to do about the -N business is to commit without that, so that we get a round of testing in the buildfarm and find out about any portability issues, but then change to use -N after a week or so. I agree that in the long run we don't

Re: [HACKERS] initdb and fsync

2012-07-13 Thread Jeff Davis
On Fri, 2012-07-13 at 15:21 -0400, Tom Lane wrote: No, that's incorrect: the fadvise is there, inside pg_flush_data() which is done during the writing phase. Yeah, Andres pointed that out, also. It seems to me that if we think sync_file_range is a win, we ought to be using it in

Re: [HACKERS] initdb and fsync

2012-07-13 Thread Tom Lane
Jeff Davis pg...@j-davis.com writes: For the case of initdb, the data needing to be fsync'd is effectively constant, and it's a lot of small files. If the requests don't make it to the io scheduler queue before fsync, the kernel doesn't have an opportunity to schedule them properly. But for

Re: [HACKERS] initdb and fsync

2012-07-13 Thread Jeff Davis
On Fri, 2012-07-13 at 17:35 -0400, Tom Lane wrote: I wrote: I'm picking up this patch now. What I'm inclined to do about the -N business is to commit without that, so that we get a round of testing in the buildfarm and find out about any portability issues, but then change to use -N

Re: [HACKERS] initdb and fsync

2012-07-13 Thread Tom Lane
Jeff Davis pg...@j-davis.com writes: One point about the commit message: fadvise does not block to go into the request queue, sync_file_range does. The problem with fadvise is that, when the request queue is small, it fills up so fast that most of the requests never make it in, and fadvise is

Re: [HACKERS] initdb and fsync

2012-06-23 Thread Jeff Davis
On Fri, 2012-06-22 at 10:04 -0400, Robert Haas wrote: This may be a stupid question, by why is it initdb's job to fsync the files the server creates, rather than the server's job? Normally we rely on the server to make its own writes persistent. That was my first reaction as well:

Re: [HACKERS] initdb and fsync

2012-06-23 Thread Peter Eisentraut
On mån, 2012-06-18 at 20:57 +0200, Andres Freund wrote: I don't think the difference in initdb cost is relevant when running the regression tests. Should it prove to be we can re-add -N after a week or two in the buildfarm machines. Keep in mind that the regression tests are not only run on

Re: [HACKERS] initdb and fsync

2012-06-22 Thread Robert Haas
On Sat, Feb 4, 2012 at 8:18 PM, Noah Misch n...@leadboat.com wrote: On Sat, Feb 04, 2012 at 03:41:27PM -0800, Jeff Davis wrote: On Sat, 2012-01-28 at 13:18 -0500, Tom Lane wrote: Yeah.  Personally I would be sad if initdb got noticeably slower, and I've never seen or heard of a failure that

Re: [HACKERS] initdb and fsync

2012-06-22 Thread Noah Misch
On Fri, Jun 22, 2012 at 10:04:23AM -0400, Robert Haas wrote: On Sat, Feb 4, 2012 at 8:18 PM, Noah Misch n...@leadboat.com wrote: If we add fsync calls to the initdb process, they should cover the entire data directory tree. ?This patch syncs files that initdb.c writes, but we ought to

Re: [HACKERS] initdb and fsync

2012-06-19 Thread David Fetter
On Mon, Jun 18, 2012 at 09:34:30PM +0300, Peter Eisentraut wrote: On mån, 2012-06-18 at 18:05 +0200, Andres Freund wrote: - defaulting to initdb -N in the regression suite is not a good imo, because that way the buildfarm won't catch problems in that area... The regression test suite also

Re: [HACKERS] initdb and fsync

2012-06-19 Thread Jeff Davis
On Mon, 2012-06-18 at 21:41 +0200, Andres Freund wrote: It calls pg_flush_data inside of copy_file which does the posix_fadvise... So maybe just put the sync_file_range in pg_flush_data? The functions in fd.c aren't linked to initdb, so it's a challenge to share that code (I remember now:

Re: [HACKERS] initdb and fsync

2012-06-19 Thread Andres Freund
On Tuesday, June 19, 2012 07:22:02 PM Jeff Davis wrote: On Mon, 2012-06-18 at 21:41 +0200, Andres Freund wrote: It calls pg_flush_data inside of copy_file which does the posix_fadvise... So maybe just put the sync_file_range in pg_flush_data? The functions in fd.c aren't linked to initdb,

Re: [HACKERS] initdb and fsync

2012-06-18 Thread Andres Freund
On Wednesday, June 13, 2012 06:53:17 PM Jeff Davis wrote: On Wed, 2012-06-13 at 13:53 +0300, Peter Eisentraut wrote: The --help output for the -N option was copy-and-pasted wrongly. The message issued when using -N is also a bit content-free. Maybe something like Running in nosync

Re: [HACKERS] initdb and fsync

2012-06-18 Thread Peter Eisentraut
On mån, 2012-06-18 at 18:05 +0200, Andres Freund wrote: - defaulting to initdb -N in the regression suite is not a good imo, because that way the buildfarm won't catch problems in that area... The regression test suite also starts postgres with fsync off. This is good, and I don't want to

Re: [HACKERS] initdb and fsync

2012-06-18 Thread Jeff Davis
On Mon, 2012-06-18 at 18:05 +0200, Andres Freund wrote: Quick review: - defaulting to initdb -N in the regression suite is not a good imo, because that way the buildfarm won't catch problems in that area... I removed the -N as you suggest. How much does performance matter on the buildfarm?

Re: [HACKERS] initdb and fsync

2012-06-18 Thread Jeff Davis
On Mon, 2012-06-18 at 21:34 +0300, Peter Eisentraut wrote: On mån, 2012-06-18 at 18:05 +0200, Andres Freund wrote: - defaulting to initdb -N in the regression suite is not a good imo, because that way the buildfarm won't catch problems in that area... The regression test suite also starts

Re: [HACKERS] initdb and fsync

2012-06-18 Thread Andres Freund
On Monday, June 18, 2012 08:39:47 PM Jeff Davis wrote: On Mon, 2012-06-18 at 18:05 +0200, Andres Freund wrote: Quick review: - defaulting to initdb -N in the regression suite is not a good imo, because that way the buildfarm won't catch problems in that area... I removed the -N as you

Re: [HACKERS] initdb and fsync

2012-06-18 Thread Jeff Davis
On Mon, 2012-06-18 at 20:57 +0200, Andres Freund wrote: - defaulting to initdb -N in the regression suite is not a good imo, because that way the buildfarm won't catch problems in that area... I removed the -N as you suggest. How much does performance matter on the buildfarm? I don't

Re: [HACKERS] initdb and fsync

2012-06-18 Thread Andres Freund
On Monday, June 18, 2012 09:32:25 PM Jeff Davis wrote: - could the copydir.c and initdb.c versions of walkdir/sync_fname et al be unified? There's a lot of backend-specific code in the copydir versions, like using ereport() and CHECK_FOR_INTERRUPTS(). I gave a brief attempt at

Re: [HACKERS] initdb and fsync

2012-06-18 Thread Alvaro Herrera
Excerpts from Jeff Davis's message of lun jun 18 15:32:25 -0400 2012: On Mon, 2012-06-18 at 20:57 +0200, Andres Freund wrote: Btw, I just want to have said this, although I don't think its particularly relevant as it doesn't affect correctness: Its possible to have a system where

Re: [HACKERS] initdb and fsync

2012-06-18 Thread Jeff Davis
On Mon, 2012-06-18 at 21:41 +0200, Andres Freund wrote: It calls pg_flush_data inside of copy_file which does the posix_fadvise... So maybe just put the sync_file_range in pg_flush_data? Oh, I didn't notice that, thank you. In that case, it may be good to combine them if possible. I will look

Re: [HACKERS] initdb and fsync

2012-06-13 Thread Peter Eisentraut
On tis, 2012-06-12 at 21:09 -0700, Jeff Davis wrote: On Sun, 2012-03-25 at 19:59 -0700, Jeff Davis wrote: On Sat, 2012-03-17 at 17:48 +0100, Cédric Villemain wrote: I agree with Andres. I believe we should use sync_file_range (_before?) with linux. And we can use

Re: [HACKERS] initdb and fsync

2012-06-13 Thread Jeff Davis
On Wed, 2012-06-13 at 13:53 +0300, Peter Eisentraut wrote: The --help output for the -N option was copy-and-pasted wrongly. The message issued when using -N is also a bit content-free. Maybe something like Running in nosync mode. The data directory might become corrupt if the operating

Re: [HACKERS] initdb and fsync

2012-06-12 Thread Jeff Davis
On Sun, 2012-03-25 at 19:59 -0700, Jeff Davis wrote: On Sat, 2012-03-17 at 17:48 +0100, Cédric Villemain wrote: I agree with Andres. I believe we should use sync_file_range (_before?) with linux. And we can use posix_fadvise_dontneed on other kernels. OK, updated patch

Re: [HACKERS] initdb and fsync

2012-03-25 Thread Jeff Davis
On Sat, 2012-03-17 at 17:48 +0100, Cédric Villemain wrote: I agree with Andres. I believe we should use sync_file_range (_before?) with linux. And we can use posix_fadvise_dontneed on other kernels. OK, updated patch attached. sync_file_range() is preferred, posix_fadvise() is a

Re: [HACKERS] initdb and fsync

2012-03-17 Thread Cédric Villemain
Le vendredi 16 mars 2012 16:51:04, Andres Freund a écrit : On Friday, March 16, 2012 04:47:06 PM Robert Haas wrote: On Fri, Mar 16, 2012 at 6:25 AM, Andres Freund and...@anarazel.de wrote: How are the results with sync_file_range(fd, 0, 0, SYNC_FILE_RANGE_WRITE)? That is much

Re: [HACKERS] initdb and fsync

2012-03-16 Thread Andres Freund
On Thursday, March 15, 2012 07:38:38 AM Jeff Davis wrote: On Wed, 2012-03-14 at 10:26 +0100, Andres Freund wrote: On Wednesday, March 14, 2012 05:23:03 AM Jeff Davis wrote: On Tue, 2012-03-13 at 09:42 +0100, Andres Freund wrote: for recursively everything in dir: posix_fadvise(fd,

Re: [HACKERS] initdb and fsync

2012-03-16 Thread Robert Haas
On Fri, Mar 16, 2012 at 6:25 AM, Andres Freund and...@anarazel.de wrote: How are the results with sync_file_range(fd, 0, 0, SYNC_FILE_RANGE_WRITE)? That is much faster than using fadvise. It goes down to ~2s. Unfortunately, that's non-portable. Any other ideas? 6.5s a little on the

Re: [HACKERS] initdb and fsync

2012-03-16 Thread Andres Freund
On Friday, March 16, 2012 04:47:06 PM Robert Haas wrote: On Fri, Mar 16, 2012 at 6:25 AM, Andres Freund and...@anarazel.de wrote: How are the results with sync_file_range(fd, 0, 0, SYNC_FILE_RANGE_WRITE)? That is much faster than using fadvise. It goes down to ~2s. Unfortunately,

Re: [HACKERS] initdb and fsync

2012-03-16 Thread Jeff Davis
On Fri, 2012-03-16 at 11:25 +0100, Andres Freund wrote: I take that back. There was something wrong with my test -- fadvise helps, but it only takes it from ~10s to ~6.5s. Not quite as good as I hoped. Thats surprising. I wouldn't expect such a big difference between fadvise +

Re: [HACKERS] initdb and fsync

2012-03-15 Thread Jeff Davis
On Wed, 2012-03-14 at 10:26 +0100, Andres Freund wrote: On Wednesday, March 14, 2012 05:23:03 AM Jeff Davis wrote: On Tue, 2012-03-13 at 09:42 +0100, Andres Freund wrote: for recursively everything in dir: posix_fadvise(fd, POSIX_FADV_DONTNEED); for recursively everything in

Re: [HACKERS] initdb and fsync

2012-03-14 Thread Andres Freund
On Wednesday, March 14, 2012 05:23:03 AM Jeff Davis wrote: On Tue, 2012-03-13 at 09:42 +0100, Andres Freund wrote: for recursively everything in dir: posix_fadvise(fd, POSIX_FADV_DONTNEED); for recursively everything in dir: fsync(fd); Wow, that made a huge difference! no

Re: [HACKERS] initdb and fsync

2012-03-13 Thread Andres Freund
On Tuesday, March 13, 2012 04:49:40 AM Jeff Davis wrote: On Sun, 2012-02-05 at 17:56 -0500, Noah Misch wrote: I meant primarily to illustrate the need to be comprehensive, not comment on which executable should fsync a particular file. Bootstrap-mode backends do not sync anything during an

Re: [HACKERS] initdb and fsync

2012-03-13 Thread Jeff Davis
On Tue, 2012-03-13 at 09:42 +0100, Andres Freund wrote: for recursively everything in dir: posix_fadvise(fd, POSIX_FADV_DONTNEED); for recursively everything in dir: fsync(fd); Wow, that made a huge difference! no sync: ~ 1.0s sync: ~10.0s fadvise+sync: ~ 1.3s

Re: [HACKERS] initdb and fsync

2012-03-12 Thread Jeff Davis
On Sun, 2012-02-05 at 17:56 -0500, Noah Misch wrote: I meant primarily to illustrate the need to be comprehensive, not comment on which executable should fsync a particular file. Bootstrap-mode backends do not sync anything during an initdb run on my system. With your patch, we'll fsync a

Re: [HACKERS] initdb and fsync

2012-02-13 Thread Robert Haas
On Fri, Feb 10, 2012 at 3:57 PM, Peter Eisentraut pete...@gmx.net wrote: On sön, 2012-02-05 at 10:53 -0800, Jeff Davis wrote: initdb should do these syncs by default and offer an option to disable them. For test frameworks that run initdb often, that makes sense. But for developers, it

Re: [HACKERS] initdb and fsync

2012-02-10 Thread Peter Eisentraut
On sön, 2012-02-05 at 10:53 -0800, Jeff Davis wrote: initdb should do these syncs by default and offer an option to disable them. For test frameworks that run initdb often, that makes sense. But for developers, it doesn't make sense to spend 0.5s typing an option that saves you 0.3s.

Re: [HACKERS] initdb and fsync

2012-02-05 Thread Jeff Davis
On Sat, 2012-02-04 at 20:18 -0500, Noah Misch wrote: If we add fsync calls to the initdb process, they should cover the entire data directory tree. This patch syncs files that initdb.c writes, but we ought to also sync files that bootstrap-mode backends had written. It doesn't make sense for

Re: [HACKERS] initdb and fsync

2012-02-05 Thread Tom Lane
Jeff Davis pg...@j-davis.com writes: On Sat, 2012-02-04 at 20:18 -0500, Noah Misch wrote: If we add fsync calls to the initdb process, they should cover the entire data directory tree. This patch syncs files that initdb.c writes, but we ought to also sync files that bootstrap-mode backends

Re: [HACKERS] initdb and fsync

2012-02-05 Thread Noah Misch
On Sun, Feb 05, 2012 at 10:53:20AM -0800, Jeff Davis wrote: On Sat, 2012-02-04 at 20:18 -0500, Noah Misch wrote: If we add fsync calls to the initdb process, they should cover the entire data directory tree. This patch syncs files that initdb.c writes, but we ought to also sync files

Re: [HACKERS] initdb and fsync

2012-02-04 Thread Florian Weimer
* Tom Lane: I wonder whether it wouldn't be sufficient to call sync(2) at the end, anyway, rather than cluttering the entire initdb codebase with fsync calls. We tried to do this in the Debian package mananger. It works as expected on Linux systems, but it can cause a lot of data to hit the

Re: [HACKERS] initdb and fsync

2012-02-04 Thread Jeff Davis
On Sat, 2012-01-28 at 13:18 -0500, Tom Lane wrote: Yeah. Personally I would be sad if initdb got noticeably slower, and I've never seen or heard of a failure that this would fix. I worked up a patch, and it looks like it does about 6 file fsync's and a 7th for the PGDATA directory. That

Re: [HACKERS] initdb and fsync

2012-02-04 Thread Noah Misch
On Sat, Feb 04, 2012 at 03:41:27PM -0800, Jeff Davis wrote: On Sat, 2012-01-28 at 13:18 -0500, Tom Lane wrote: Yeah. Personally I would be sad if initdb got noticeably slower, and I've never seen or heard of a failure that this would fix. I worked up a patch, and it looks like it does

Re: [HACKERS] initdb and fsync

2012-01-28 Thread Andrew Dunstan
On 01/27/2012 11:52 PM, Noah Misch wrote: Is a platform-independent fsync be available at initdb time? Not sure. It's a macro on Windows that calls _commit(fd), so it should be portable enough. I'm curious what problem we're actually solving here, though. I've run the buildfarm

Re: [HACKERS] initdb and fsync

2012-01-28 Thread Jeff Davis
On Sat, 2012-01-28 at 10:31 -0500, Andrew Dunstan wrote: I'm curious what problem we're actually solving here, though. I've run the buildfarm countless thousands of times on different VMs, and five of my seven current animals run in VMs, and I don't think I've ever seen a failure ascribable

Re: [HACKERS] initdb and fsync

2012-01-28 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes: I'm curious what problem we're actually solving here, though. I've run the buildfarm countless thousands of times on different VMs, and five of my seven current animals run in VMs, and I don't think I've ever seen a failure ascribable to

Re: [HACKERS] initdb and fsync

2012-01-28 Thread Jeff Janes
On Sat, Jan 28, 2012 at 7:31 AM, Andrew Dunstan and...@dunslane.net wrote: On 01/27/2012 11:52 PM, Noah Misch wrote: Is a platform-independent fsync be available at initdb time? Not sure. It's a macro on Windows that calls _commit(fd), so it should be portable enough. I'm curious what

Re: [HACKERS] initdb and fsync

2012-01-28 Thread Jeff Davis
On Sat, 2012-01-28 at 13:18 -0500, Tom Lane wrote: Andrew Dunstan and...@dunslane.net writes: I'm curious what problem we're actually solving here, though. I've run the buildfarm countless thousands of times on different VMs, and five of my seven current animals run in VMs, and I don't

Re: [HACKERS] initdb and fsync

2012-01-28 Thread Jeff Janes
On Sat, Jan 28, 2012 at 10:18 AM, Tom Lane t...@sss.pgh.pa.us wrote: Andrew Dunstan and...@dunslane.net writes: I'm curious what problem we're actually solving here, though. I've run the buildfarm countless thousands of times on different VMs, and five of my seven current animals run in VMs,

Re: [HACKERS] initdb and fsync

2012-01-28 Thread Andrew Dunstan
On 01/28/2012 01:46 PM, Jeff Davis wrote: On Sat, 2012-01-28 at 13:18 -0500, Tom Lane wrote: Andrew Dunstanand...@dunslane.net writes: I'm curious what problem we're actually solving here, though. I've run the buildfarm countless thousands of times on different VMs, and five of my seven

[HACKERS] initdb and fsync

2012-01-27 Thread Jeff Davis
It looks like initdb doesn't fsync all the files it creates, e.g. the PG_VERSION file. While it's unlikely that it would cause any real data loss, it can be inconvenient in some testing scenarios involving VMs. Thoughts? Would a patch to add a few fsync calls to initdb be accepted? Is a

Re: [HACKERS] initdb and fsync

2012-01-27 Thread Noah Misch
On Fri, Jan 27, 2012 at 04:19:41PM -0800, Jeff Davis wrote: It looks like initdb doesn't fsync all the files it creates, e.g. the PG_VERSION file. While it's unlikely that it would cause any real data loss, it can be inconvenient in some testing scenarios involving VMs. Thoughts? Would a