Re: [HACKERS] fsync vs open_sync (more info)

2004-08-10 Thread pgsql
Some more information:

I started to perform the tests on one of the machines in my lab, and guess
what, almost no difference between fsync and open_sync. Either on jfs or
ext2.

The difference? Linux 2.6.3. My original tests were on Linux 2.4.25.

The good part is that open_sync wasn't worse.

Just a conceptual question: what is the right thing to do? I started to
think about this. To me, the O_SYNC flag is there to ensure that what you
write is on the disk by the time the write returns. In SQL terms it is
like auto commit. Calling fsync or fdatasync lets one batch write calls
and flush them out to disk in one shot; conceptually, it is like a
transaction.

Does it make sense, then, to say that the WAL sync method should be O_SYNC?
If there are no reasons not to, doesn't it make sense to make this the
default? It will give a boost to any 2.4 Linux machines and doesn't seem to
hurt anyone else.
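
To make the analogy concrete, here is a minimal C sketch of the two styles,
assuming a Linux/POSIX system. The function names (write_each_synced,
write_batch_then_flush) and the file handling are hypothetical, made up for
illustration; this is not PostgreSQL's WAL code.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* "Auto commit" style: the file is opened with O_SYNC, so every write()
 * returns only after the kernel reports the data as written out.
 * Returns 0 on success, -1 on error. */
int write_each_synced(const char *path, const char *buf, size_t len, int count)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC | O_SYNC, 0600);
    if (fd < 0)
        return -1;
    for (int i = 0; i < count; i++) {
        if (write(fd, buf, len) != (ssize_t) len) {
            close(fd);
            return -1;
        }
    }
    return close(fd);
}

/* "Transaction" style: several plain writes, then one fdatasync() that
 * flushes them all in one shot. Returns 0 on success, -1 on error. */
int write_batch_then_flush(const char *path, const char *buf, size_t len, int count)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    for (int i = 0; i < count; i++) {
        if (write(fd, buf, len) != (ssize_t) len) {
            close(fd);
            return -1;
        }
    }
    if (fdatasync(fd) != 0) {
        close(fd);
        return -1;
    }
    return close(fd);
}
```

Both functions leave the same bytes in the file. What differs is when the
process waits for the disk: on every write() in the first, once at the end
in the second.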


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] fsync vs open_sync (more info)

2004-08-10 Thread pgsql


> In particular, you need to offer some evidence for that completely
> undocumented assertion that it won't hurt anyone else.

It should be easy enough to prove whether or not O_SYNC hurts anyone.

OK, let me ask a few questions:

(1) What is a good sample set on which to run? Linux, FreeBSD, Macintosh?
(2) What sort of tests would be definitive? Auto commit and some
transactional load?


After delving into this a little, it seems to me that if you are going to
do this:

write(file, buffer, size);
f[data]sync(file);

Opening with O_SYNC seems to be an optimization specifically for this
methodology. At the very least, it will save one user/kernel transition.
If we can prove beyond a reasonable doubt that using O_SYNC does not hurt
any platform, then what reason would there be not to make it the
default?

Again, conceptually, O_SYNC does what you want it to do, and should be
able to do it more efficiently than fdatasync().



Re: [HACKERS] fsync vs open_sync (more info)

2004-08-10 Thread Tom Lane
[EMAIL PROTECTED] writes:
> Does it make sense, then, to say that the WAL sync method should be O_SYNC?
> If there are no reasons not to, doesn't it make sense to make this the
> default? It will give a boost to any 2.4 Linux machines and doesn't seem to
> hurt anyone else.

You have got the terms of debate backwards here.  These decisions were
already made once, on the basis of more testing than you have done
(okay, it wasn't months worth of work, but we at least exercised a
number of scenarios on a number of platforms).  The question is not "why
shouldn't we make this the default" but "why should we make this the
default, and what are we likely to break if we do so?"  Showing that one
release series of one platform wins in one particular set of tests is
not sufficient grounds for changing the default.

In particular, you need to offer some evidence for that completely
undocumented assertion that it won't hurt anyone else.

regards, tom lane



Re: [HACKERS] fsync vs open_sync (more info)

2004-08-10 Thread pgsql
> On Tue, 2004-08-10 at 07:48, [EMAIL PROTECTED] wrote:
> > Some more information:
> >
> > I started to perform the tests on one of the machines in my lab, and
> > guess what, almost no difference between fsync and open_sync. Either
> > on jfs or ext2.
> >
> > The difference? Linux 2.6.3. My original tests were on Linux 2.4.25.
>
> Very hazy memory recalls something about O_SYNC not really doing
> anything in early kernel versions.
>
> > The good part is that open_sync wasn't worse.

In early Linux kernels, O_SYNC was implemented using fsync(), and there
was a fair amount of debate about whether people using O_SYNC should see
performance degradation.

> > Just a conceptual question: what is the right thing to do? I started
> > to think about this. To me, the O_SYNC flag is there to ensure that
> > what you write is on the disk by the time the write returns. In SQL
> > terms it is like auto commit. Calling fsync or fdatasync lets one
> > batch write calls and flush them out to disk in one shot;
> > conceptually, it is like a transaction.
>
> With the caveat that the kernel can start flushing your data to disk
> in the background, not just when you call fdatasync/fsync.

I was speaking in conceptual terms, not exact ones. Just a general analogy.

In theory, theory and practice are the same thing; in practice, they are not.



Re: [HACKERS] fsync vs open_sync

2004-08-10 Thread Manfred Spraul
[EMAIL PROTECTED] wrote:
> I have been considering a full sweep in my test lab off client time later on.
> ext2, ext3, jfs, xfs, and ReiserFS, fsync on with fdatasync or open_sync,
> and fsync off.

Before you start, double-check that the disks are not lying. At least the
SuSE 2.4 kernel sends cache flush commands to IDE disks on fsync(), but
not with O_SYNC:

http://marc.theaimsgroup.com/?l=linux-kernel&m=107964507113585
--
   Manfred


Re: [HACKERS] fsync vs open_sync (more info)

2004-08-10 Thread Tom Lane
[EMAIL PROTECTED] writes:
> After delving into this a little, it seems to me that if you are going to
> do this:
>
> write(file, buffer, size);
> f[data]sync(file);
>
> Opening with O_SYNC seems to be an optimization specifically for this
> methodology.

What you are missing is that we don't necessarily do that.  Writes and
flushes of xlog don't always occur together: we may write out a buffer
to make room in shared memory even though we do not yet need it flushed
to disk.  In this situation it is better *not* to have O_SYNC on because
we don't need to force (and wait for) a write just then.  With a little
luck the kernel will write the buffer before we actually need a flush
to occur, and so there will be no actual delaying for it at all.

In particular this scenario applies for bulk-update transactions that
create vast amounts of WAL traffic but don't need an fsync till the very
end.
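
A rough C sketch of that decoupling, under stated assumptions: demo_log and
its functions are hypothetical names invented here, not PostgreSQL's xlog
structures, and the file is deliberately opened without O_SYNC so that a
write made only to free buffer space does not wait for the disk.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical log handle: tracks how far we have written and how far
 * the data is known to have been flushed. */
typedef struct {
    int   fd;
    off_t written;     /* bytes handed to the kernel via write() */
    off_t flushed;     /* bytes covered by the last fdatasync() */
    int   sync_calls;  /* number of fdatasync() calls issued */
} demo_log;

int demo_log_open(demo_log *dlog, const char *path)
{
    dlog->fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600); /* no O_SYNC */
    dlog->written = 0;
    dlog->flushed = 0;
    dlog->sync_calls = 0;
    return dlog->fd < 0 ? -1 : 0;
}

/* Write a buffer out (for example to make room in memory) without
 * forcing it to disk or waiting for it. */
int demo_log_write(demo_log *dlog, const char *buf, size_t len)
{
    if (write(dlog->fd, buf, len) != (ssize_t) len)
        return -1;
    dlog->written += (off_t) len;
    return 0;
}

/* Flush only when durability up to position `upto` is required,
 * for example at commit. */
int demo_log_flush(demo_log *dlog, off_t upto)
{
    if (upto <= dlog->flushed)
        return 0;              /* already flushed, nothing to wait for */
    if (fdatasync(dlog->fd) != 0)
        return -1;
    dlog->sync_calls++;
    dlog->flushed = dlog->written;
    return 0;
}

int demo_log_close(demo_log *dlog)
{
    return close(dlog->fd);
}
```

A bulk load would call demo_log_write() many times and demo_log_flush() once
at the end, so only one wait for the disk is paid. With O_SYNC on the
descriptor, every one of those writes would have waited.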

regards, tom lane



[HACKERS] fsync vs open_sync

2004-08-09 Thread pgsql

I did a little test on the various options of fsync.

I'm not sure my tests are scientific enough for general publication or
evaluation; all I am doing is performing a loop that inserts a value into
a table 1 million times.
create table testndx (value integer, name varchar);
create index testndx_val on testndx (value);

for(int i=0; i < 100; i++)
{
   printf_query("insert into testndx (value, name) values ('%d', 'test')",
                random());

   // report here
}


Anyway, with fsync enabled using standard fsync(), I get roughly 300-400
inserts per second. With fsync disabled, I get about 7000 inserts per
second. When I re-enable fsync but use the open_sync option, I can get
about 2500 inserts per second.

(This is on Linux 2.4 kernel, ext2 file system)

(1) Is there any drawback to using open_sync, as it appears to be a happy
medium compared to turning fsync off?
(2) Does anyone know if the open_sync option performs this well across
most platforms or only Linux?
(3) If open_sync works well across many platforms, and there are no
drawbacks, shouldn't it be the default wal sync method? The performance
boost is incredible.



Re: [HACKERS] fsync vs open_sync

2004-08-09 Thread Tom Lane
[EMAIL PROTECTED] writes:
> I did a little test on the various options of fsync.

There were considerably more extensive tests back when we created the
different WAL options, and the conclusions seemed to be that the best
choice is platform-dependent and also usage-dependent.  (In particular,
it makes a huge difference whether WAL has its own drive or not.)

I don't really recall why open_sync didn't end up among the set of
choices considered for the default setting.  It may be that we need to
reconsider based on the behavior of newer Linux versions ...

In any case, comparing open_sync to fsync is irrelevant, seeing that
the current default choice on Linux is fdatasync.  What you ought to
be telling us about is the performance relative to that.

regards, tom lane



Re: [HACKERS] fsync vs open_sync

2004-08-09 Thread pgsql
> [EMAIL PROTECTED] writes:
> > I did a little test on the various options of fsync.
>
> There were considerably more extensive tests back when we created the
> different WAL options, and the conclusions seemed to be that the best
> choice is platform-dependent and also usage-dependent.  (In particular,
> it makes a huge difference whether WAL has its own drive or not.)
>
> I don't really recall why open_sync didn't end up among the set of
> choices considered for the default setting.  It may be that we need to
> reconsider based on the behavior of newer Linux versions ...
>
> In any case, comparing open_sync to fsync is irrelevant, seeing that
> the current default choice on Linux is fdatasync.  What you ought to
> be telling us about is the performance relative to that.

I can tell you, and I'll send all the results if you like, but fsync and
fdatasync are, as far as I can tell, identical. In fact, I can't find any
documentation that fdatasync is no longer implemented on Linux as fsync.

I tested fsync and fdatasync first, and in my tests their performance was
the same. I never went beyond these, as it looked like the fsync options
were all basically the same. I hadn't read anywhere that open_sync could
make such a difference. It is only because of some idle chatter (over a
few years) I read on a couple of Linux kernel mailing lists about O_SYNC
being improved that I thought I'd try it.

The improvements were REALLY astounding, and I would like to know if other
Linux users see this performance increase. I mean, it is almost 8~10 times
faster than using fsync.

Furthermore, it seems to also have the added benefit of reducing the I/O
storm at checkpoints over a system running with fsync off.

I'm really serious about this, changing this one parameter had dramatic
results on performance. We should have a general call to users to test
this setting with their OS of choice. If not that, if we can be sure that
there are no cases where using O_SYNC is worse than fsync() or
fdatasync(), it should be considered as the default.





Re: [HACKERS] fsync vs open_sync

2004-08-09 Thread Bruce Momjian
[EMAIL PROTECTED] wrote:
> Furthermore, it seems to also have the added benefit of reducing the I/O
> storm at checkpoints over a system running with fsync off.
>
> I'm really serious about this, changing this one parameter had dramatic
> results on performance. We should have a general call to users to test
> this setting with their OS of choice. If not that, if we can be sure that
> there are no cases where using O_SYNC is worse than fsync() or
> fdatasync(), it should be considered as the default.

Agreed.  Have you looked at src/tools/fsync?
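
For anyone who wants a quick feel for the raw difference outside the
database, a comparison along these lines can be sketched in C. This is not
the src/tools/fsync program, only a minimal illustration for Linux/POSIX;
demo_time_sync_method, DEMO_BLOCK and the 8 kB block size are assumptions
chosen here.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

enum { DEMO_BLOCK = 8192 };  /* assumed block size for the test writes */

/* Time `count` block writes to `path`.
 * use_o_sync != 0: open with O_SYNC and rely on write() alone.
 * use_o_sync == 0: plain open, fdatasync() after every write().
 * Returns elapsed seconds, or -1.0 on error. */
double demo_time_sync_method(const char *path, int use_o_sync, int count)
{
    static char block[DEMO_BLOCK];
    struct timespec t0, t1;
    int flags = O_WRONLY | O_CREAT | O_TRUNC | (use_o_sync ? O_SYNC : 0);
    int fd = open(path, flags, 0600);

    if (fd < 0)
        return -1.0;
    memset(block, 'x', sizeof block);

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < count; i++) {
        if (write(fd, block, sizeof block) != (ssize_t) sizeof block ||
            (!use_o_sync && fdatasync(fd) != 0)) {
            close(fd);
            return -1.0;
        }
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);
    close(fd);

    return (double) (t1.tv_sec - t0.tv_sec)
         + (double) (t1.tv_nsec - t0.tv_nsec) / 1e9;
}
```

Dividing count by the returned time gives writes per second for each
method. The numbers only mean something if the test file sits on the same
file system and drive the WAL would use.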

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073



Re: [HACKERS] fsync vs open_sync

2004-08-09 Thread Mark Kirkwood
Just out of interest, what happens to the difference if you use *ext3*
(perhaps with data=writeback)?

regards
Mark
[EMAIL PROTECTED] wrote:
> I did a little test on the various options of fsync.
> ...
> create table testndx (value integer, name varchar);
> create index testndx_val on testndx (value);
> for(int i=0; i < 100; i++)
> {
>    printf_query("insert into testndx (value, name) values ('%d', 'test')",
>                 random());
>    // report here
> }
> Anyway, with fsync enabled using standard fsync(), I get roughly 300-400
> inserts per second. With fsync disabled, I get about 7000 inserts per
> second. When I re-enable fsync but use the open_sync option, I can get
> about 2500 inserts per second.
> (This is on Linux 2.4 kernel, ext2 file system)
 



Re: [HACKERS] fsync vs open_sync

2004-08-09 Thread pgsql
> Just out of interest, what happens to the difference if you use *ext3*
> (perhaps with data=writeback)?

Actually, I was working for a client, so it wasn't a general exploration,
but I can say that early on we discovered that ext3 was about the worst
file system for PostgreSQL. We gave up on it and decided to use ext2.

I have been considering a full sweep in my test lab off client time later on.

ext2, ext3, jfs, xfs, and ReiserFS, fsync on with fdatasync or open_sync,
and fsync off.

One million inserts with auto commit.






Re: [HACKERS] fsync vs open_sync

2004-08-09 Thread Doug McNaught
[EMAIL PROTECTED] writes:

> > Just out of interest, what happens to the difference if you use *ext3*
> > (perhaps with data=writeback)?
>
> Actually, I was working for a client, so it wasn't a general exploration,
> but I can say that early on we discovered that ext3 was about the worst
> file system for PostgreSQL. We gave up on it and decided to use ext2.

I'd be interested in which ext3 mount options you used--I can see how
anything other than 'data=writeback' could be a performance killer.
I've been meaning to run a few tests myself, but haven't had the
time...

-Doug
-- 
Let us cross over the river, and rest under the shade of the trees.
   --T. J. Jackson, 1863



Re: [HACKERS] fsync vs open_sync

2004-08-09 Thread Manfred Spraul
Tom Lane wrote:
> [EMAIL PROTECTED] writes:
> > The improvements were REALLY astounding, and I would like to know if other
> > Linux users see this performance increase. I mean, it is almost 8~10 times
> > faster than using fsync.
> > Furthermore, it seems to also have the added benefit of reducing the I/O
> > storm at checkpoints over a system running with fsync off.
>
> What size transactions are you using in your tests?
> For a system with small transactions (not much more than 1 page worth of
> WAL traffic per transaction) I'd be pretty surprised if there was any
> real difference at all.  There certainly should not be any difference in
> terms of the number of physical writes.  We have seen some platforms
> where fsync() is inefficiently implemented and requires more kernel
> overhead than is reasonable --- not for I/O, but just to look through
> the kernel buffers and confirm that none of them need flushing.  But I
> didn't think Linux was one of these.

IDE or SCSI? If IDE: write cache on or off? Which 2.4 kernel?
The numbers are very high; it could be a side effect of write caching
by the disks. I think some SuSE 2.4 kernels have partial support for
reliable fsync even if the write cache is on (i.e. fsync issues a cache
flush command to the disk), but not all code paths are handled. Perhaps
fsync is handled and O_SYNC is not.
I could try to find the details.

--
   Manfred