On Mon, May 13, 2013 at 8:32 AM, Andres Freund <and...@2ndquadrant.com> wrote:
> On 2013-05-12 19:41:26 -0500, Jon Nelson wrote:
>> On Sun, May 12, 2013 at 3:46 PM, Jim Nasby <j...@nasby.net> wrote:
>> > On 5/10/13 1:06 PM, Jeff Janes wrote:
>> >>
>> >> Of course the paranoid DBA could turn off restart_after_crash and do a
>> >> manual investigation on every crash, but in that case the database would
>> >> refuse to restart even in the case where it perfectly clear that all the
>> >> following WAL belongs to the recycled file and not the current file.
>> >
>> > Perhaps we should also allow for zeroing out WAL files before reuse (or
>> > just disable reuse). I know there's a performance hit there, but the reuse
>> > idea happened before we had bgWriter. Theoretically the overhead creating
>> > a new file would always fall to bgWriter and therefore not be a big deal.
>>
>> For filesystems like btrfs, re-using a WAL file is suboptimal to
>> simply creating a new one and removing the old one when it's no longer
>> required. Using fallocate (or posix_fallocate) (I have a patch for
>> that!) to create a new one is - by my tests - 28 times faster than the
>> currently-used method.
>
> I don't think the comparison between just fallocate()ing and what we
> currently do is fair. fallocate() doesn't guarantee that the file is the
> same size after a crash, so you would still need an fsync() or we
> couldn't use fdatasync() anymore. And I'd guess the benefits aren't all
> that big anymore in that case?

fallocate (16MB) + fsync is still almost certainly faster than
write+write+write... + fsync.
The test I performed at the time did exactly that .. posix_fallocate + pg_fsync.

> That said, using posix_fallocate seems like a good idea in lots of
> places inside pg, its just not all that easy to do in some of the
> places.

I should not derail this thread any further. Perhaps, if interested
parties would like to discuss the use of fallocate/posix_fallocate, a
new thread might be more appropriate?

--
Jon

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
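
For anyone who wants to try the comparison discussed above (fallocate + fsync versus write+write+write... + fsync), here is a minimal standalone sketch of the two approaches. It is not the patch mentioned in this thread and not PostgreSQL's own code: the function names are hypothetical, the 8192-byte block size is an assumption, and plain fsync() stands in for pg_fsync. Only the 16MB size comes from the thread.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define SEGMENT_SIZE (16 * 1024 * 1024) /* 16MB, as mentioned in the thread */
#define BLOCK_SIZE   8192               /* assumed write granularity */

/* Close the descriptor and remove the partial file after a failure. */
static int
fail_cleanup(int fd, const char *path)
{
    close(fd);
    unlink(path);
    return -1;
}

/*
 * Hypothetical helper: create a segment by writing zeroed blocks and then
 * fsync()ing, i.e. "write+write+write... + fsync". Returns 0 on success.
 */
int
create_segment_zero_fill(const char *path)
{
    char    block[BLOCK_SIZE];
    off_t   written;
    int     fd;

    assert(SEGMENT_SIZE % BLOCK_SIZE == 0);
    memset(block, 0, sizeof(block));

    fd = open(path, O_RDWR | O_CREAT | O_EXCL, 0600);
    if (fd < 0)
        return -1;

    for (written = 0; written < SEGMENT_SIZE; written += BLOCK_SIZE)
    {
        /* For simplicity a short write is treated as a failure. */
        if (write(fd, block, BLOCK_SIZE) != BLOCK_SIZE)
            return fail_cleanup(fd, path);
    }

    if (fsync(fd) != 0)
        return fail_cleanup(fd, path);

    return close(fd);
}

/*
 * Hypothetical helper: create a segment with posix_fallocate() followed by
 * fsync(), so the allocated size is flushed as well. Returns 0 on success.
 */
int
create_segment_fallocate(const char *path)
{
    int     fd;

    fd = open(path, O_RDWR | O_CREAT | O_EXCL, 0600);
    if (fd < 0)
        return -1;

    /* posix_fallocate() returns an error number directly, not -1/errno. */
    if (posix_fallocate(fd, 0, SEGMENT_SIZE) != 0)
        return fail_cleanup(fd, path);

    if (fsync(fd) != 0)
        return fail_cleanup(fd, path);

    return close(fd);
}
```

Timing each function over a number of fresh files on the filesystem of interest should give a rough idea of the difference. The result will depend heavily on the filesystem and on whether posix_fallocate is backed by a native preallocation call or emulated by the C library.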