Re: [HACKERS] initdb and fsync

2012-07-13 Thread Tom Lane
Jeff Davis writes: > One point about the commit message: fadvise does not block to go into > the request queue, sync_file_range does. The problem with fadvise is > that, when the request queue is small, it fills up so fast that most of > the requests never make it in, and fadvise is essentially a

Re: [HACKERS] initdb and fsync

2012-07-13 Thread Jeff Davis
On Fri, 2012-07-13 at 17:35 -0400, Tom Lane wrote: > I wrote: > > I'm picking up this patch now. What I'm inclined to do about the -N > > business is to commit without that, so that we get a round of testing > > in the buildfarm and find out about any portability issues, but then > > change to use

Re: [HACKERS] initdb and fsync

2012-07-13 Thread Tom Lane
Jeff Davis writes: > For the case of initdb, the data needing to be fsync'd is effectively > constant, and it's a lot of small files. If the requests don't make it > to the io scheduler queue before fsync, the kernel doesn't have an > opportunity to schedule them properly. > But for larger amount

Re: [HACKERS] initdb and fsync

2012-07-13 Thread Jeff Davis
On Fri, 2012-07-13 at 15:21 -0400, Tom Lane wrote: > No, that's incorrect: the fadvise is there, inside pg_flush_data() which > is done during the writing phase. Yeah, Andres pointed that out, also. > It seems to me that if we think > sync_file_range is a win, we ought to be using it in pg_flus

Re: [HACKERS] initdb and fsync

2012-07-13 Thread Tom Lane
I wrote: > I'm picking up this patch now. What I'm inclined to do about the -N > business is to commit without that, so that we get a round of testing > in the buildfarm and find out about any portability issues, but then > change to use -N after a week or so. I agree that in the long run > we do

Re: [HACKERS] initdb and fsync

2012-07-13 Thread Tom Lane
Jeff Davis writes: > On Mon, 2012-06-18 at 20:57 +0200, Andres Freund wrote: >> Ok. Sensible reasons. I dislike that we know have two files using different >> logic (copydir.c only using fadvise, initdb using sync_file_range if >> available). Maybe we should just move the fadvise and sync_file_r

Re: [HACKERS] initdb and fsync

2012-07-13 Thread Tom Lane
Andres Freund writes: > On Tuesday, June 19, 2012 07:22:02 PM Jeff Davis wrote: >> Right now I'm inclined to leave the patch as-is. > Fine with that, I wanted to bring it up and see it documented. > I have marked it with ready for committer. That committer needs to decide on - > N in the regress

Re: [HACKERS] initdb and fsync

2012-06-23 Thread Peter Eisentraut
On mån, 2012-06-18 at 20:57 +0200, Andres Freund wrote: > I don't think the difference in initdb cost is relevant when running > the regression tests. Should it prove to be we can re-add -N after a > week or two in the buildfarm machines. Keep in mind that the regression tests are not only run on

Re: [HACKERS] initdb and fsync

2012-06-23 Thread Jeff Davis
On Fri, 2012-06-22 at 10:04 -0400, Robert Haas wrote: > This may be a stupid question, by why is it initdb's job to fsync the > files the server creates, rather than the server's job? Normally we > rely on the server to make its own writes persistent. That was my first reaction as well: http://a

Re: [HACKERS] initdb and fsync

2012-06-22 Thread Noah Misch
On Fri, Jun 22, 2012 at 10:04:23AM -0400, Robert Haas wrote: > On Sat, Feb 4, 2012 at 8:18 PM, Noah Misch wrote: > > If we add fsync calls to the initdb process, they should cover the entire > > data > > directory tree. ?This patch syncs files that initdb.c writes, but we ought > > to > > also s

Re: [HACKERS] initdb and fsync

2012-06-22 Thread Robert Haas
On Sat, Feb 4, 2012 at 8:18 PM, Noah Misch wrote: > On Sat, Feb 04, 2012 at 03:41:27PM -0800, Jeff Davis wrote: >> On Sat, 2012-01-28 at 13:18 -0500, Tom Lane wrote: >> > Yeah.  Personally I would be sad if initdb got noticeably slower, and >> > I've never seen or heard of a failure that this woul

Re: [HACKERS] initdb and fsync

2012-06-19 Thread Andres Freund
On Tuesday, June 19, 2012 07:22:02 PM Jeff Davis wrote: > On Mon, 2012-06-18 at 21:41 +0200, Andres Freund wrote: > > It calls pg_flush_data inside of copy_file which does the > > posix_fadvise... So maybe just put the sync_file_range in pg_flush_data? > > The functions in fd.c aren't linked to in

Re: [HACKERS] initdb and fsync

2012-06-19 Thread Jeff Davis
On Mon, 2012-06-18 at 21:41 +0200, Andres Freund wrote: > It calls pg_flush_data inside of copy_file which does the posix_fadvise... So > maybe just put the sync_file_range in pg_flush_data? The functions in fd.c aren't linked to initdb, so it's a challenge to share that code (I remember now: tha

Re: [HACKERS] initdb and fsync

2012-06-19 Thread David Fetter
On Mon, Jun 18, 2012 at 09:34:30PM +0300, Peter Eisentraut wrote: > On mån, 2012-06-18 at 18:05 +0200, Andres Freund wrote: > > - defaulting to initdb -N in the regression suite is not a good imo, > > because that way the buildfarm won't catch problems in that area... > > > The regression test sui

Re: [HACKERS] initdb and fsync

2012-06-18 Thread Jeff Davis
On Mon, 2012-06-18 at 21:41 +0200, Andres Freund wrote: > It calls pg_flush_data inside of copy_file which does the posix_fadvise... So > maybe just put the sync_file_range in pg_flush_data? Oh, I didn't notice that, thank you. In that case, it may be good to combine them if possible. I will loo

Re: [HACKERS] initdb and fsync

2012-06-18 Thread Alvaro Herrera
Excerpts from Jeff Davis's message of lun jun 18 15:32:25 -0400 2012: > On Mon, 2012-06-18 at 20:57 +0200, Andres Freund wrote: > > Btw, I just want to have said this, although I don't think its particularly > > relevant as it doesn't affect correctness: Its possible to have a system > > where

Re: [HACKERS] initdb and fsync

2012-06-18 Thread Andres Freund
On Monday, June 18, 2012 09:32:25 PM Jeff Davis wrote: > > > > - could the copydir.c and initdb.c versions of walkdir/sync_fname et > > > > al be unified? > > > > > > There's a lot of backend-specific code in the copydir versions, like > > > using ereport() and CHECK_FOR_INTERRUPTS(). I gave a bri

Re: [HACKERS] initdb and fsync

2012-06-18 Thread Jeff Davis
On Mon, 2012-06-18 at 20:57 +0200, Andres Freund wrote: > > > - defaulting to initdb -N in the regression suite is not a good imo, > > > because that way the buildfarm won't catch problems in that area... > > I removed the -N as you suggest. How much does performance matter on the > > buildfarm? >

Re: [HACKERS] initdb and fsync

2012-06-18 Thread Andres Freund
On Monday, June 18, 2012 08:39:47 PM Jeff Davis wrote: > On Mon, 2012-06-18 at 18:05 +0200, Andres Freund wrote: > > Quick review: > > - defaulting to initdb -N in the regression suite is not a good imo, > > because that way the buildfarm won't catch problems in that area... > I removed the -N as y

Re: [HACKERS] initdb and fsync

2012-06-18 Thread Jeff Davis
On Mon, 2012-06-18 at 21:34 +0300, Peter Eisentraut wrote: > On mån, 2012-06-18 at 18:05 +0200, Andres Freund wrote: > > - defaulting to initdb -N in the regression suite is not a good imo, > > because that way the buildfarm won't catch problems in that area... > > > The regression test suite also

Re: [HACKERS] initdb and fsync

2012-06-18 Thread Jeff Davis
On Mon, 2012-06-18 at 18:05 +0200, Andres Freund wrote: > Quick review: > - defaulting to initdb -N in the regression suite is not a good imo, because > that way the buildfarm won't catch problems in that area... I removed the -N as you suggest. How much does performance matter on the buildfarm?

Re: [HACKERS] initdb and fsync

2012-06-18 Thread Peter Eisentraut
On mån, 2012-06-18 at 18:05 +0200, Andres Freund wrote: > - defaulting to initdb -N in the regression suite is not a good imo, > because that way the buildfarm won't catch problems in that area... > The regression test suite also starts postgres with fsync off. This is good, and I don't want to h

Re: [HACKERS] initdb and fsync

2012-06-18 Thread Andres Freund
On Wednesday, June 13, 2012 06:53:17 PM Jeff Davis wrote: > On Wed, 2012-06-13 at 13:53 +0300, Peter Eisentraut wrote: > > The --help output for the -N option was copy-and-pasted wrongly. > > > > The message issued when using -N is also a bit content-free. Maybe > > something like > > > > "Runni

Re: [HACKERS] initdb and fsync

2012-06-13 Thread Jeff Davis
On Wed, 2012-06-13 at 13:53 +0300, Peter Eisentraut wrote: > The --help output for the -N option was copy-and-pasted wrongly. > > The message issued when using -N is also a bit content-free. Maybe > something like > > "Running in nosync mode. The data directory might become corrupt if the > ope

Re: [HACKERS] initdb and fsync

2012-06-13 Thread Peter Eisentraut
On tis, 2012-06-12 at 21:09 -0700, Jeff Davis wrote: > On Sun, 2012-03-25 at 19:59 -0700, Jeff Davis wrote: > > On Sat, 2012-03-17 at 17:48 +0100, Cédric Villemain wrote: > > > I agree with Andres. > > > > > > > > > I believe we should use sync_file_range (_before?) with linux. > > > > > > And w

Re: [HACKERS] initdb and fsync

2012-06-12 Thread Jeff Davis
On Sun, 2012-03-25 at 19:59 -0700, Jeff Davis wrote: > On Sat, 2012-03-17 at 17:48 +0100, Cédric Villemain wrote: > > I agree with Andres. > > > > > > I believe we should use sync_file_range (_before?) with linux. > > > > And we can use posix_fadvise_dontneed on other kernels. > > > OK, updated

Re: [HACKERS] initdb and fsync

2012-03-25 Thread Jeff Davis
On Sat, 2012-03-17 at 17:48 +0100, Cédric Villemain wrote: > I agree with Andres. > > > I believe we should use sync_file_range (_before?) with linux. > > And we can use posix_fadvise_dontneed on other kernels. > OK, updated patch attached. sync_file_range() is preferred, posix_fadvise() is a f

Re: [HACKERS] initdb and fsync

2012-03-17 Thread Cédric Villemain
Le vendredi 16 mars 2012 16:51:04, Andres Freund a écrit : > On Friday, March 16, 2012 04:47:06 PM Robert Haas wrote: > > On Fri, Mar 16, 2012 at 6:25 AM, Andres Freund wrote: > > >> > How are the results with sync_file_range(fd, 0, 0, > > >> > SYNC_FILE_RANGE_WRITE)? > > >> > > >> That is much f

Re: [HACKERS] initdb and fsync

2012-03-16 Thread Jeff Davis
On Fri, 2012-03-16 at 11:25 +0100, Andres Freund wrote: > > I take that back. There was something wrong with my test -- fadvise > > helps, but it only takes it from ~10s to ~6.5s. Not quite as good as I > > hoped. > Thats surprising. I wouldn't expect such a big difference between fadvise + > sync

Re: [HACKERS] initdb and fsync

2012-03-16 Thread Andres Freund
On Friday, March 16, 2012 04:47:06 PM Robert Haas wrote: > On Fri, Mar 16, 2012 at 6:25 AM, Andres Freund wrote: > >> > How are the results with sync_file_range(fd, 0, 0, > >> > SYNC_FILE_RANGE_WRITE)? > >> > >> That is much faster than using fadvise. It goes down to ~2s. > >> > >> Unfortunately

Re: [HACKERS] initdb and fsync

2012-03-16 Thread Robert Haas
On Fri, Mar 16, 2012 at 6:25 AM, Andres Freund wrote: >> > How are the results with sync_file_range(fd, 0, 0, >> > SYNC_FILE_RANGE_WRITE)? >> That is much faster than using fadvise. It goes down to ~2s. > >> Unfortunately, that's non-portable. Any other ideas? 6.5s a little on >> the annoying side

Re: [HACKERS] initdb and fsync

2012-03-16 Thread Andres Freund
On Thursday, March 15, 2012 07:38:38 AM Jeff Davis wrote: > On Wed, 2012-03-14 at 10:26 +0100, Andres Freund wrote: > > On Wednesday, March 14, 2012 05:23:03 AM Jeff Davis wrote: > > > On Tue, 2012-03-13 at 09:42 +0100, Andres Freund wrote: > > > > for recursively everything in dir: > > > >posi

Re: [HACKERS] initdb and fsync

2012-03-14 Thread Jeff Davis
On Wed, 2012-03-14 at 10:26 +0100, Andres Freund wrote: > On Wednesday, March 14, 2012 05:23:03 AM Jeff Davis wrote: > > On Tue, 2012-03-13 at 09:42 +0100, Andres Freund wrote: > > > for recursively everything in dir: > > >posix_fadvise(fd, POSIX_FADV_DONTNEED); > > > > > > for recursively eve

Re: [HACKERS] initdb and fsync

2012-03-14 Thread Andres Freund
On Wednesday, March 14, 2012 05:23:03 AM Jeff Davis wrote: > On Tue, 2012-03-13 at 09:42 +0100, Andres Freund wrote: > > for recursively everything in dir: > >posix_fadvise(fd, POSIX_FADV_DONTNEED); > > > > for recursively everything in dir: > >fsync(fd); > > Wow, that made a huge differe

Re: [HACKERS] initdb and fsync

2012-03-13 Thread Jeff Davis
On Tue, 2012-03-13 at 09:42 +0100, Andres Freund wrote: > for recursively everything in dir: >posix_fadvise(fd, POSIX_FADV_DONTNEED); > > for recursively everything in dir: >fsync(fd); Wow, that made a huge difference! no sync: ~ 1.0s sync: ~10.0s fadvise+sync: ~ 1.3s

Re: [HACKERS] initdb and fsync

2012-03-13 Thread Andres Freund
On Tuesday, March 13, 2012 04:49:40 AM Jeff Davis wrote: > On Sun, 2012-02-05 at 17:56 -0500, Noah Misch wrote: > > I meant primarily to illustrate the need to be comprehensive, not comment > > on which executable should fsync a particular file. Bootstrap-mode > > backends do not sync anything dur

Re: [HACKERS] initdb and fsync

2012-03-12 Thread Jeff Davis
On Sun, 2012-02-05 at 17:56 -0500, Noah Misch wrote: > I meant primarily to illustrate the need to be comprehensive, not comment on > which executable should fsync a particular file. Bootstrap-mode backends do > not sync anything during an initdb run on my system. With your patch, we'll > fsync a

Re: [HACKERS] initdb and fsync

2012-02-13 Thread Robert Haas
On Fri, Feb 10, 2012 at 3:57 PM, Peter Eisentraut wrote: > On sön, 2012-02-05 at 10:53 -0800, Jeff Davis wrote: >> > initdb should do these syncs by default and offer an option to >> disable them. >> >> For test frameworks that run initdb often, that makes sense. >> >> But for developers, it doesn

Re: [HACKERS] initdb and fsync

2012-02-10 Thread Peter Eisentraut
On sön, 2012-02-05 at 10:53 -0800, Jeff Davis wrote: > > initdb should do these syncs by default and offer an option to > disable them. > > For test frameworks that run initdb often, that makes sense. > > But for developers, it doesn't make sense to spend 0.5s typing an > option > that saves you

Re: [HACKERS] initdb and fsync

2012-02-05 Thread Noah Misch
On Sun, Feb 05, 2012 at 10:53:20AM -0800, Jeff Davis wrote: > On Sat, 2012-02-04 at 20:18 -0500, Noah Misch wrote: > > If we add fsync calls to the initdb process, they should cover the entire > > data > > directory tree. This patch syncs files that initdb.c writes, but we ought > > to > > also

Re: [HACKERS] initdb and fsync

2012-02-05 Thread Tom Lane
Jeff Davis writes: > On Sat, 2012-02-04 at 20:18 -0500, Noah Misch wrote: >> If we add fsync calls to the initdb process, they should cover the entire >> data >> directory tree. This patch syncs files that initdb.c writes, but we ought to >> also sync files that bootstrap-mode backends had writt

Re: [HACKERS] initdb and fsync

2012-02-05 Thread Jeff Davis
On Sat, 2012-02-04 at 20:18 -0500, Noah Misch wrote: > If we add fsync calls to the initdb process, they should cover the entire data > directory tree. This patch syncs files that initdb.c writes, but we ought to > also sync files that bootstrap-mode backends had written. It doesn't make sense fo

Re: [HACKERS] initdb and fsync

2012-02-04 Thread Noah Misch
On Sat, Feb 04, 2012 at 03:41:27PM -0800, Jeff Davis wrote: > On Sat, 2012-01-28 at 13:18 -0500, Tom Lane wrote: > > Yeah. Personally I would be sad if initdb got noticeably slower, and > > I've never seen or heard of a failure that this would fix. > > I worked up a patch, and it looks like it do

Re: [HACKERS] initdb and fsync

2012-02-04 Thread Jeff Davis
On Sat, 2012-01-28 at 13:18 -0500, Tom Lane wrote: > Yeah. Personally I would be sad if initdb got noticeably slower, and > I've never seen or heard of a failure that this would fix. I worked up a patch, and it looks like it does about 6 file fsync's and a 7th for the PGDATA directory. That degra

Re: [HACKERS] initdb and fsync

2012-02-04 Thread Florian Weimer
* Tom Lane: > I wonder whether it wouldn't be sufficient to call sync(2) at the end, > anyway, rather than cluttering the entire initdb codebase with fsync > calls. We tried to do this in the Debian package mananger. It works as expected on Linux systems, but it can cause a lot of data to hit th

Re: [HACKERS] initdb and fsync

2012-01-28 Thread Andrew Dunstan
On 01/28/2012 01:46 PM, Jeff Davis wrote: On Sat, 2012-01-28 at 13:18 -0500, Tom Lane wrote: Andrew Dunstan writes: I'm curious what problem we're actually solving here, though. I've run the buildfarm countless thousands of times on different VMs, and five of my seven current animals run in

Re: [HACKERS] initdb and fsync

2012-01-28 Thread Jeff Janes
On Sat, Jan 28, 2012 at 10:18 AM, Tom Lane wrote: > Andrew Dunstan writes: >> I'm curious what problem we're actually solving here, though. I've run >> the buildfarm countless thousands of times on different VMs, and five of >> my seven current animals run in VMs, and I don't think I've ever seen

Re: [HACKERS] initdb and fsync

2012-01-28 Thread Jeff Davis
On Sat, 2012-01-28 at 13:18 -0500, Tom Lane wrote: > Andrew Dunstan writes: > > I'm curious what problem we're actually solving here, though. I've run > > the buildfarm countless thousands of times on different VMs, and five of > > my seven current animals run in VMs, and I don't think I've ever

Re: [HACKERS] initdb and fsync

2012-01-28 Thread Jeff Janes
On Sat, Jan 28, 2012 at 7:31 AM, Andrew Dunstan wrote: > > > On 01/27/2012 11:52 PM, Noah Misch wrote: >>> >>> Is a platform-independent fsync be available at initdb time? >> >> Not sure. >> > > It's a macro on Windows that calls _commit(fd), so it should be portable > enough. > > I'm curious what

Re: [HACKERS] initdb and fsync

2012-01-28 Thread Tom Lane
Andrew Dunstan writes: > I'm curious what problem we're actually solving here, though. I've run > the buildfarm countless thousands of times on different VMs, and five of > my seven current animals run in VMs, and I don't think I've ever seen a > failure ascribable to inadequately synced files

Re: [HACKERS] initdb and fsync

2012-01-28 Thread Jeff Davis
On Sat, 2012-01-28 at 10:31 -0500, Andrew Dunstan wrote: > I'm curious what problem we're actually solving here, though. I've run > the buildfarm countless thousands of times on different VMs, and five of > my seven current animals run in VMs, and I don't think I've ever seen a > failure ascriba

Re: [HACKERS] initdb and fsync

2012-01-28 Thread Andrew Dunstan
On 01/27/2012 11:52 PM, Noah Misch wrote: Is a platform-independent fsync be available at initdb time? Not sure. It's a macro on Windows that calls _commit(fd), so it should be portable enough. I'm curious what problem we're actually solving here, though. I've run the buildfarm countles

Re: [HACKERS] initdb and fsync

2012-01-27 Thread Noah Misch
On Fri, Jan 27, 2012 at 04:19:41PM -0800, Jeff Davis wrote: > It looks like initdb doesn't fsync all the files it creates, e.g. the > PG_VERSION file. > > While it's unlikely that it would cause any real data loss, it can be > inconvenient in some testing scenarios involving VMs. > > Thoughts? Wo