Tom Lane wrote:
> Bruce Momjian <[EMAIL PROTECTED]> writes:
> > Tom Lane wrote:
> >> As I recall, that was based on testing on some different platforms.
> > But why prefer O_DSYNC over fdatasync if you don't prefer O_SYNC over
> > fsync?
> It's what tested out as the best bet. I think we were using pgbench
> as the test platform, which as you know I have doubts about, but at
> least it is testing one actual write/sync pattern Postgres can generate.
> The choice between the open flags and fdatasync/fsync depends a whole
> lot on your writing patterns (how much data you tend to write between
> fsync points), so I don't have a lot of faith in randomly-chosen test
> programs as a guide to what to use for Postgres.
I assume pgbench has so much variance that trying to see fsync changes
in there would be hopeless.
> >> What does that mean? You can't fsync a closed file.
> > You reopen and fsync.
> Um. I just looked at that test program, and I think it needs a whole
> lot of work yet.
> * Some of the test cases count open()/close() overhead, some don't.
> This is bad, especially on platforms like Solaris where open() is
> notoriously expensive.
The only one I saw that had an extra open() was the fsync-after-close
test. I added a do-nothing open/close to the previous test so they are
comparable.
> * You really cannot put any faith in measuring a single write,
> especially on a machine that's not *completely* idle otherwise.
> I'd feel somewhat comfortable if you wrote, say, 1000 8K blocks and
> measured the time for that. (And you have to think about how far
> apart the fsyncs are in that sequence; you probably want to repeat the
> measurement with several different fsync spacings.) It would also be
> a good idea to compare writing 1000 successive blocks with rewriting
> the same block 1000 times --- if the latter does not happen roughly
> at the disk RPM rate, then we know the drive is lying and all the
> numbers should be discarded as meaningless.

OK, it now measures a loop of 1000.
> * The program is claimed to test whether you can write from one process
> and fsync from another, but it does no such thing AFAICS.
It really just shows whether the fsync after the close has similar
timing to the one before the close. That was the best way I could think
of to test it.
> BTW, rather than hard-wiring the test file name, why don't you let it be
> specified on the command line? That would make it lots easier for
> people to compare the performance of several disk drives, if they have
I have updated the test program in CVS.
New BSD/OS results:

Simple write timing:

Compare fsync times on write() and non-write() descriptor:
(If the times are similar, fsync() can sync data written
on a different descriptor.)
        write, fsync, close      0.868831
        write, close, fsync      0.717281

Compare one o_sync write to two:
        one 16k o_sync write    10.121422
        two 8k o_sync writes     4.405151

Compare file sync methods with one 8k write:
        open o_sync, write       1.542213
        write, fsync             1.703689

Compare file sync methods with 2 8k writes:
(The fastest should be used for wal_sync_method)
        open o_sync, write       4.498607
        write, fsync             2.473842
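Since the two-write case is the WAL pattern the test is meant to mimic, numbers like the above would point at a postgresql.conf setting along these lines (shown only as an illustration of how the result feeds into the config; the fastest method will differ per platform):

```
# postgresql.conf -- on this BSD/OS box, fsync beat open_sync
# for the two-8k-write case, so:
wal_sync_method = fsync
```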
Bruce Momjian | http://candle.pha.pa.us
[EMAIL PROTECTED] | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073