Re: [pgsql-hackers-win32] [HACKERS] win32 performance - fsync question

2005-03-17 Thread Magnus Hagander
    * Win32, with fsync, write-cache disabled: no data corruption
    * Win32, with fsync, write-cache enabled: no data corruption
    * Win32, with osync, write cache disabled: no data corruption
    * Win32, with osync, write cache enabled: no data corruption. Once I
    got:
    2005-02-24 12:19:54 LOG:  could not open file C:/Program
    Files/PostgreSQL/8.0/data/pg_xlog/00010010 (log file 0,
    segment 16): No such file or directory
    but the data in the database was consistent.

   It disturbs me that you couldn't produce data corruption in the
   cases where it theoretically should occur.  Seems like this is an
   indication that your test was insufficiently severe, or that there
   is something going on we don't understand.

  The Windows driver knows about the write cache, and at least fsync()
  pushes through the write cache even if it's there. This seems to
  indicate that O_SYNC at least partially does this as well. This is
  why there is no performance difference at all on fsync() with write
  cache on or off.

  I don't know if this is true for all IDE disks. Could be that my disk
  is particularly well-behaved.

 This indicated to me that open_sync did not require any
 additional changes than our current fsync.

fsync and open_sync both write through the write cache in the operating
system. Only fsync=off turns this off.

fsync also writes through the hardware write cache; o_sync does not.
That write-through is what causes the large fsync slowdown with write
cache enabled, *including* on most battery-backed write cache systems
(pretty much making the write cache a waste of money). This may be a
good thing on IDE systems (for admins who don't know how to clear the
"enable write caching on the disk" checkbox that MS provides, which
*explicitly* warns that you may lose data if you enable it), but it's a
very bad thing for anything higher end.

fsync also syncs the directory metadata; o_sync only cares about the
file's contents. (This is what causes the large slowdown with write cache
*disabled*, because it requires multiple writes to multiple disk
locations for each fsync.)


Basically, fsync hurts people who configure their box correctly, or who
use things like SCSI disks. o_sync hurts people who configure their
machine in an unsafe way.
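
As a rough illustration of the two code paths being compared, here is a minimal Python sketch (not PostgreSQL code). One function writes and then calls fsync explicitly; the other opens the file with a synchronous flag so that each write waits on its own. The function names are made up for this example, and the O_DSYNC/O_SYNC flags are the POSIX analogues rather than the Win32 FILE_FLAG_WRITE_THROUGH flag discussed in this thread. Whether either path gets past the drive's hardware cache depends on the OS, driver, and disk, and cannot be seen from this code.

```python
import os
import tempfile

# O_BINARY only exists on Windows; it keeps the C runtime from rewriting "\n".
_BINARY = getattr(os, "O_BINARY", 0)


def write_with_fsync(path, payload):
    """Plain write, then an explicit flush request (the "fsync" style)."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC | _BINARY, 0o600)
    try:
        os.write(fd, payload)
        os.fsync(fd)  # on Windows, CPython implements this with _commit()
    finally:
        os.close(fd)


def write_with_open_sync(path, payload):
    """Open with a synchronous flag so write() itself waits (the "o_sync" style)."""
    # O_DSYNC/O_SYNC are not defined on every platform; fall back to no flag.
    sync_flag = getattr(os, "O_DSYNC", 0) or getattr(os, "O_SYNC", 0)
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | _BINARY | sync_flag
    fd = os.open(path, flags, 0o600)
    try:
        os.write(fd, payload)  # no separate fsync call on this path
    finally:
        os.close(fd)


def demo():
    """Write the same record both ways and return what each file contains."""
    record = b"commit record\n"
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        for name, writer in (("fsync", write_with_fsync),
                             ("open_sync", write_with_open_sync)):
            path = os.path.join(tmp, name + ".dat")
            writer(path, record)
            with open(path, "rb") as fh:
                results[name] = fh.read()
    return results
```

Both paths leave the same bytes in the file; the difference under discussion is only when, and how far down the cache stack, the data is forced out.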

//Magnus

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [pgsql-hackers-win32] [HACKERS] win32 performance - fsync question

2005-03-17 Thread Bruce Momjian
Magnus Hagander wrote:
  This indicated to me that open_sync did not require any 
  additional changes than our current fsync.
 
 fsync and open_sync both write through the write cache in the operating
 system. Only fsync=off turns this off.
 
 fsync also writes through the hardware write cache. o_sync does not.
 This is what causes the large slowdown with write cache enabled,
 *including* most battery backed write cache systems (pretty much making
 the write-cache a waste of money). This may be a good thing on IDE
 systems (for admins that don't know how to remove the little check in
 the box for enable write caching on the disk that MS provides, which
 *explicitly* warns that you may lose data if you enabled it), but it's a
 very bad thing for anything higher end.

I found the checkbox on XP looking at Properties for the drive, then
choosing Hardware, the drive, Properties, and Policies.

 fsync also syncs the directory metadata. o_sync only cares about the
 files contents. (This is what causes the large slowdown with write cache
 *disabled*, because it requires multiple writes on multiple disk
 locations for each fsync). 
 
 Basically, fsync hurts people who configure their box correctly, or who
 use things like SCSI disks. o_sync hurts people who configure their
 machine in an unsafe way.

So, it seems that Win32 open_sync is exactly the same as our
wal_sync_method = open_datasync on Unix (it needs to be renamed), and
wal_sync_method = fsync on Win32 is something we don't have on Unix:
it writes through the disk write cache even if it is enabled.

I have developed the following patch which renames our wal_sync_method
Win32 support from open_sync to open_datasync:

ftp://candle.pha.pa.us/pub/postgresql/mypatches

One issue with this patch is that if applied it would make open_datasync
the default sync method on Win32 because we prefer open_datasync over
all other sync methods.  If we don't want to do that, I think we should
still do the rename for accuracy and add a !WIN32 test to prevent
open_datasync from being the default.
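
To make the two options concrete, here is a small Python sketch of the default selection, under stated assumptions. Only the fact that open_datasync is preferred over all other methods comes from this message; the rest of the preference order, the function name, and the `exclude_on_win32` parameter are hypothetical and are not taken from the PostgreSQL source.

```python
# Hypothetical preference order, for illustration only. The message above
# only states that open_datasync is preferred over everything else.
PREFERENCE = ["open_datasync", "fdatasync", "fsync", "open_sync"]


def default_sync_method(available, is_win32, exclude_on_win32=("open_datasync",)):
    """Return the first preferred method that the platform offers.

    Passing exclude_on_win32=() models applying the rename as is;
    the default value models the "!WIN32 test" alternative.
    """
    for method in PREFERENCE:
        if method not in available:
            continue
        if is_win32 and method in exclude_on_win32:
            continue  # skip it when picking the Win32 default
        return method
    raise ValueError("no usable sync method available")
```

With the rename applied as is, a Win32 build offering open_datasync and fsync would default to open_datasync; with the exclusion in place it would fall back to fsync.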

However, I would prefer to apply this patch as is and let Win32 have
the same write cache issues as Unix, for consistency.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073


Re: [pgsql-hackers-win32] [HACKERS] win32 performance - fsync question

2005-03-17 Thread Tom Lane
Bruce Momjian pgman@candle.pha.pa.us writes:
 However, I do prefer this patch and let Win32 have the same write cache
 issues as Unix, for consistency.

I agree that the open flag is more nearly O_DSYNC than O_SYNC.

ISTM Windows' idea of fsync is quite different from Unix's and therefore
we should name the wal_sync_method that invokes it something different
than fsync.  write_through or some such?  We already have precedent
that not all wal_sync_method values are available on all platforms.

I'm not taking a position on which the default should be ...

regards, tom lane


Re: [pgsql-hackers-win32] [HACKERS] win32 performance - fsync question

2005-03-17 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian pgman@candle.pha.pa.us writes:
  However, I do prefer this patch and let Win32 have the same write cache
  issues as Unix, for consistency.
 
 I agree that the open flag is more nearly O_DSYNC than O_SYNC.
 
 ISTM Windows' idea of fsync is quite different from Unix's and therefore
 we should name the wal_sync_method that invokes it something different
 than fsync.  write_through or some such?  We already have precedent
 that not all wal_sync_method values are available on all platforms.
 
 I'm not taking a position on which the default should be ...

Yes, I am thinking that too.  I hesitated because it adds yet another
sync method, and we have to document it works only on Win32, but I see
no better solution.

I am going to let the Win32 users mostly vote on what the default should
be.


Re: [pgsql-hackers-win32] [HACKERS] win32 performance - fsync question

2005-03-17 Thread Tom Lane
Bruce Momjian pgman@candle.pha.pa.us writes:
 Tom Lane wrote:
 we should name the wal_sync_method that invokes it something different
 than fsync.  write_through or some such?  We already have precedent
 that not all wal_sync_method values are available on all platforms.

 Yes, I am thinking that too.  I hesitated because it adds yet another
 sync method, and we have to document it works only on Win32, but I see
 no better solution.

It occurs to me that it'd probably be a good idea if the error message
for an unsupported wal_sync_method value explicitly listed the allowed
values for the platform.  If there's no objection I'll try to make
that happen.  (I'm not sure if it's trivial or not: I think the GUC
framework is a bit restrictive about custom error messages from GUC
assign hooks...)
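
A minimal Python sketch of the idea, assuming a per-platform set of supported values is available to the check. The function name and the message wording are hypothetical; this does not show the GUC assign-hook interface.

```python
def check_wal_sync_method(value, supported):
    """Accept value if supported here, else raise an error listing the choices."""
    if value in supported:
        return value
    allowed = ", ".join(sorted(supported))
    raise ValueError(
        f'invalid value for wal_sync_method: "{value}" '
        f"(allowed on this platform: {allowed})"
    )
```

Sorting the supported names keeps the message stable regardless of how the platform's list was built.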

regards, tom lane


Re: [pgsql-hackers-win32] [HACKERS] win32 performance - fsync question

2005-03-17 Thread Dann Corbit
The default should clearly be the safest method.

Personally, I would disable anything but the safest method for all
database files that are not read-only.

IMO-YMMV.
-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Bruce Momjian
Sent: Thursday, March 17, 2005 10:53 AM
To: Tom Lane
Cc: Magnus Hagander; Michael Paesold; pgsql-hackers@postgresql.org;
[EMAIL PROTECTED]; Merlin Moncure
Subject: Re: [pgsql-hackers-win32] [HACKERS] win32 performance - fsync
question

Tom Lane wrote:
 Bruce Momjian pgman@candle.pha.pa.us writes:
  However, I do prefer this patch and let Win32 have the same write cache
  issues as Unix, for consistency.

 I agree that the open flag is more nearly O_DSYNC than O_SYNC.

 ISTM Windows' idea of fsync is quite different from Unix's and therefore
 we should name the wal_sync_method that invokes it something different
 than fsync.  write_through or some such?  We already have precedent
 that not all wal_sync_method values are available on all platforms.

 I'm not taking a position on which the default should be ...

Yes, I am thinking that too.  I hesitated because it adds yet another
sync method, and we have to document it works only on Win32, but I see
no better solution.

I am going to let the Win32 users mostly vote on what the default should
be.


Re: [pgsql-hackers-win32] [HACKERS] win32 performance - fsync question

2005-03-17 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian pgman@candle.pha.pa.us writes:
  Tom Lane wrote:
  ISTM Windows' idea of fsync is quite different from Unix's and therefore
  we should name the wal_sync_method that invokes it something different
  than fsync.  write_through or some such?
 
  Ah, I remember now.  On Win32 our fsync is:
  #define fsync(a) _commit(a)
  I am wondering if we should call the new mode open_commit or
  open_writethrough.  Our typical rule is to tie it to the API call, which
  should suggest open_commit.
 
 fsync_writethrough, perhaps.  I don't see any open about it.

Sorry, yeah, got confused.


Re: [pgsql-hackers-win32] [HACKERS] win32 performance - fsync question

2005-03-17 Thread Tom Lane
Bruce Momjian pgman@candle.pha.pa.us writes:
 Tom Lane wrote:
 ISTM Windows' idea of fsync is quite different from Unix's and therefore
 we should name the wal_sync_method that invokes it something different
 than fsync.  write_through or some such?

 Ah, I remember now.  On Win32 our fsync is:
   #define fsync(a) _commit(a)
 I am wondering if we should call the new mode open_commit or
 open_writethrough.  Our typical rule is to tie it to the API call, which
 should suggest open_commit.

fsync_writethrough, perhaps.  I don't see any open about it.

regards, tom lane


Re: [pgsql-hackers-win32] [HACKERS] win32 performance - fsync question

2005-03-16 Thread Bruce Momjian
Michael Paesold wrote:
 Magnus Hagander wrote:
 
 
  Magnus Hagander wrote:
   Magnus prepared a trivial patch which added the O_SYNC flag
   for windows and mapped it to FILE_FLAG_WRITE_THROUGH in
   win32_open.c.
 [snip]
 
  Michael Paesold wrote:
 The original patch did not have any documentation. Have you
 added some? Since this has to be configured in GUC (wal_sync_method),
 the implications should be documented somewhere, no?
 
 The patch just implements behaviour that was already documented (for
 unix) on a new platform (win32). The documentation in general appears to 
 have very little information on what to pick there, though ;-)
 
 Reading your mails about the pull-the-plug tests, I see that at least with 
 write caching enabled, fsync is more secure on win32 than open_sync. I.e. 
 one should disable write caching for use with open_sync. Also open_sync 
 seems to perform much better. All that information would be nice to have in 
 the docs.

Michael, I am not sure why you come to the conclusion that open_sync
requires turning off the disk write cache.  I saw nothing to indicate
that in the thread:

http://archives.postgresql.org/pgsql-hackers-win32/2005-02/msg00035.php

I read the following:

   * Win32, with fsync, write-cache disabled: no data corruption
   * Win32, with fsync, write-cache enabled: no data corruption
   * Win32, with osync, write cache disabled: no data corruption
   * Win32, with osync, write cache enabled: no data corruption. Once I
   got:
   2005-02-24 12:19:54 LOG:  could not open file C:/Program
   Files/PostgreSQL/8.0/data/pg_xlog/00010010 (log file 0,
   segment 16): No such file or directory
   but the data in the database was consistent.

  It disturbs me that you couldn't produce data corruption in
  the cases where it theoretically should occur.  Seems like
  this is an indication that your test was insufficiently
  severe, or that there is something going on we don't understand.

 The Windows driver knows about the write cache, and at least fsync()
 pushes through the write cache even if it's there. This seems to
 indicate that O_SYNC at least partially does this as well. This is why
 there is no performance difference at all on fsync() with write cache on
 or off.

 I don't know if this is true for all IDE disks. Could be that my disk is
 particularly well-behaved.

This indicated to me that open_sync did not require any additional
changes than our current fsync.


Re: [pgsql-hackers-win32] [HACKERS] win32 performance - fsync question

2005-03-16 Thread Michael Paesold
Bruce Momjian wrote:
 Michael Paesold wrote:
  Magnus Hagander wrote:
 [snip]
 Michael, I am not sure why you come to the conclusion that open_sync
 requires turning off the disk write cache.  I saw nothing to indicate
 that in the thread:

I was just seeing his error message below...

 http://archives.postgresql.org/pgsql-hackers-win32/2005-02/msg00035.php

 I read the following:

    * Win32, with fsync, write-cache disabled: no data corruption
    * Win32, with fsync, write-cache enabled: no data corruption
    * Win32, with osync, write cache disabled: no data corruption
    * Win32, with osync, write cache enabled: no data corruption. Once I
    got:
    2005-02-24 12:19:54 LOG:  could not open file C:/Program
    Files/PostgreSQL/8.0/data/pg_xlog/00010010 (log file 0,
    segment 16): No such file or directory
    but the data in the database was consistent.

A missing xlog file does not strike me as very safe. Perhaps someone can
explain what happened, but I would not feel good about this. Again, this note
(from Tom Lane), in combination with the above error, would tell me that we
don't fully understand the risk here.

   It disturbs me that you couldn't produce data corruption in
   the cases where it theoretically should occur.  Seems like
   this is an indication that your test was insufficiently
   severe, or that there is something going on we don't understand.

  The Windows driver knows about the write cache, and at least fsync()
  pushes through the write cache even if it's there. This seems to
  indicate that O_SYNC at least partially does this as well. This is why
  there is no performance difference at all on fsync() with write cache on
  or off.

  I don't know if this is true for all IDE disks. Could be that my disk is
  particularly well-behaved.

 This indicated to me that open_sync did not require any additional
 changes than our current fsync.

We both based our understanding on the same evidence. It seems we just have
a different level of paranoia. ;-)

Best Regards,
Michael Paesold


Re: [pgsql-hackers-win32] [HACKERS] win32 performance - fsync question

2005-02-28 Thread Dave Page



-----Original Message-----
From: [EMAIL PROTECTED] on behalf of Bruce Momjian
Sent: Sun 2/27/2005 12:54 AM
To: Magnus Hagander
Cc: Tom Lane; pgsql-hackers@postgresql.org; [EMAIL PROTECTED]; Merlin Moncure
Subject: Re: [pgsql-hackers-win32] [HACKERS] win32 performance - fsync question
 

 Patch applied.  Thanks.
 
 I assume this is not appropriate for 8.0.X.

I think it would be good to backpatch it given proper testing - the changes are 
relatively minor, and they do give a significant performance boost.

Regards, Dave


Re: [pgsql-hackers-win32] [HACKERS] win32 performance - fsync question

2005-02-27 Thread Magnus Hagander
I'd like to see this one also considered for 8.0.x, though I'd certainly
like to see some more testing as well. Perhaps it's suitable for the
8.0.x release with extended testing that is planned for the ARC
replacement code?

It does make a huge difference on win32. While we definitely don't want
to risk data, a 60% speedup in write-intensive apps is a *lot*.

//Magnus


-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Bruce Momjian
Sent: 27 February 2005 01:54
To: Magnus Hagander
Cc: Tom Lane; pgsql-hackers@postgresql.org;
[EMAIL PROTECTED]; Merlin Moncure
Subject: Re: [pgsql-hackers-win32] [HACKERS] win32 performance - fsync
question

Patch applied.  Thanks.

I assume this is not appropriate for 8.0.X.

---



Magnus Hagander wrote:
  Magnus prepared a trivial patch which added the O_SYNC flag 
  for windows and mapped it to FILE_FLAG_WRITE_THROUGH in 
  win32_open.c. 
 
 Attached is this trivial patch. As Merlin says, it needs some more
 reliability testing. But the numbers are at least reasonable - it
 *seems* like it's doing the right thing (as long as you turn off write
 cache). And it's certainly a significant performance increase - it
 brings the speed almost up to the same as linux.
 
 
 //Magnus


Re: [pgsql-hackers-win32] [HACKERS] win32 performance - fsync question

2005-02-25 Thread Zeugswetter Andreas DAZ SD

  Are you verifying that all the data that was committed was actually stored?
  Or just verifying that the database works properly after rebooting?

 I verified the data.

Does pg startup increase the xid by some amount (say 1000 xids) after a crash?
Otherwise I think you would also need to roll back a range of xids after
the crash, to see whether you lose data by reusing and rolling back xids.

The risk is data pages reaching the disk before WAL, because the disk reorders
writes. I think you would not notice such corruption (with pg_dump) unless you
do the range rollback.

Andreas


Re: [pgsql-hackers-win32] [HACKERS] win32 performance - fsync question

2005-02-24 Thread Magnus Hagander
  Magnus prepared a trivial patch which added the O_SYNC flag for 
  windows and mapped it to FILE_FLAG_WRITE_THROUGH in win32_open.c.
 
 Attached is this trivial patch. As Merlin says, it needs some 
 more reliability testing. But the numbers are at least reasonable - it
 *seems* like it's doing the right thing (as long as you turn 
 off write cache). And it's certainly a significant 
 performance increase - it brings the speed almost up to the 
 same as linux.

I have now run a bunch of pull-the-plug testing on this patch (literally
pulling the plug, yes, to the point of some of my co-workers thinking
I'm crazy).

My results are:
First, baseline:
* Linux, with fsync (default), write-cache disabled: no data corruption
* Linux, with fsync (default), write-cache enabled: usually no data
corruption, but two runs which had
* Win32, with fsync, write-cache disabled: no data corruption
* Win32, with fsync, write-cache enabled: no data corruption
* Win32, with osync, write cache disabled: no data corruption
* Win32, with osync, write cache enabled: no data corruption. Once I
got:
2005-02-24 12:19:54 LOG:  could not open file C:/Program
Files/PostgreSQL/8.0/data/pg_xlog/00010010 (log file 0,
segment 16): No such file or directory

  but the data in the database was consistent.

Almost all runs showed a line along the lines of:
2005-02-24 11:22:41 LOG:  record with zero length at 0/A450548


In the final test, the BIOS decided the disk was giving up and
reassigned it as 0 MB. It required two extra cold boots, then it was back
up to 20 GB. Still no data loss.


My test was three clients doing lots of inserts and updates, some in
transactions, some bare. In some tests, I kicked in a manual vacuum while
at it. Then I yanked the power cord, rebooted, manually started pg, and
verified that the data in the db came up with the same values the clients
reported as last committed. I also ran vacuum verbose on all tables
after it was back up to see if there were any warnings.
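
The comparison step of such a test can be sketched in a few lines of Python. This is not the script used for these runs; the function name and the idea of tagging each transaction with an integer id are assumptions made for illustration. It also assumes a commit can reach disk while its acknowledgement is lost with the power, so extra recovered transactions are reported separately rather than treated as failures.

```python
def check_recovery(acknowledged, recovered):
    """Compare what clients saw committed with what the database holds after restart.

    acknowledged: ids of transactions a client received a commit reply for.
    recovered:    ids of transactions found in the database after the reboot.
    """
    acknowledged = set(acknowledged)
    recovered = set(recovered)
    lost = acknowledged - recovered            # acknowledged but missing: a durability failure
    unacknowledged = recovered - acknowledged  # present but never confirmed to a client
    return {
        "lost": sorted(lost),
        "unacknowledged": sorted(unacknowledged),
        "durable": not lost,
    }
```

For example, `check_recovery([1, 2, 3], [1, 2])` reports transaction 3 as lost, which is the outcome the pull-the-plug runs are looking for.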

The test machine is a 1 GHz Celeron with 256 MB RAM and a Maxtor IDE disk.

It'd of course be good if others could also test, but I'm getting the
feeling that this patch at least doesn't make things worse than before
:-) And it's *a lot* faster.

//Magnus


Re: [pgsql-hackers-win32] [HACKERS] win32 performance - fsync question

2005-02-24 Thread Christopher Kings-Lynne
 In the final test, the BIOS decided the disk was giving up and
 reassigned it as 0 MB. It required two extra cold boots, then it was back
 up to 20 GB. Still no data loss.

I think it would be fun to re-run these tests with MySQL...

Chris


Re: [pgsql-hackers-win32] [HACKERS] win32 performance - fsync question

2005-02-24 Thread Christopher Kings-Lynne
 My results are:
 First, baseline:
 * Linux, with fsync (default), write-cache disabled: no data corruption
 * Linux, with fsync (default), write-cache enabled: usually no data
 corruption, but two runs which had
 * Win32, with fsync, write-cache disabled: no data corruption
 * Win32, with fsync, write-cache enabled: no data corruption
 * Win32, with osync, write cache disabled: no data corruption
 * Win32, with osync, write cache enabled: no data corruption. Once I
 got:
 2005-02-24 12:19:54 LOG:  could not open file C:/Program
 Files/PostgreSQL/8.0/data/pg_xlog/00010010 (log file 0,
 segment 16): No such file or directory

In case anyone is wondering, you can turn off write caching on FreeBSD,
for a terrible performance loss...

http://freebsd.active-venture.com/handbook/configtuning-disk.html#AEN8015

Chris


Re: [pgsql-hackers-win32] [HACKERS] win32 performance - fsync question

2005-02-24 Thread Tom Lane
Magnus Hagander [EMAIL PROTECTED] writes:
 My results are:
 First, baseline:
 * Linux, with fsync (default), write-cache disabled: no data corruption
 * Linux, with fsync (default), write-cache enabled: usually no data
 corruption, but two runs which had

That makes sense.

 * Win32, with fsync, write-cache disabled: no data corruption
 * Win32, with fsync, write-cache enabled: no data corruption
 * Win32, with osync, write cache disabled: no data corruption
 * Win32, with osync, write cache enabled: no data corruption. Once I
 got:
 2005-02-24 12:19:54 LOG:  could not open file C:/Program
 Files/PostgreSQL/8.0/data/pg_xlog/00010010 (log file 0,
 segment 16): No such file or directory
   but the data in the database was consistent.

It disturbs me that you couldn't produce data corruption in the cases
where it theoretically should occur.  Seems like this is an indication
that your test was insufficiently severe, or that there is something
going on we don't understand.

regards, tom lane


Re: [pgsql-hackers-win32] [HACKERS] win32 performance - fsync question

2005-02-24 Thread Magnus Hagander
  * Win32, with fsync, write-cache disabled: no data corruption
  * Win32, with fsync, write-cache enabled: no data corruption
  * Win32, with osync, write cache disabled: no data corruption
  * Win32, with osync, write cache enabled: no data corruption. Once I
  got:
  2005-02-24 12:19:54 LOG:  could not open file C:/Program
  Files/PostgreSQL/8.0/data/pg_xlog/00010010 (log file 0,
  segment 16): No such file or directory
  but the data in the database was consistent.

 It disturbs me that you couldn't produce data corruption in
 the cases where it theoretically should occur.  Seems like
 this is an indication that your test was insufficiently
 severe, or that there is something going on we don't understand.

The Windows driver knows about the write cache, and at least fsync()
pushes through the write cache even if it's there. This seems to
indicate that O_SYNC at least partially does this as well. This is why
there is no performance difference at all on fsync() with write cache on
or off.

I don't know if this is true for all IDE disks. Could be that my disk is
particularly well-behaved.

//Magnus


Re: [pgsql-hackers-win32] [HACKERS] win32 performance - fsync question

2005-02-24 Thread Greg Stark

Magnus Hagander [EMAIL PROTECTED] writes:

 * Linux, with fsync (default), write-cache enabled: usually no data
 corruption, but two runs which had

Are you verifying that all the data that was committed was actually stored? Or
just verifying that the database works properly after rebooting?

I'm a bit surprised that the write-cache led to a corrupt database, and not
merely lost transactions. I had the impression that drives still handled the
writes in the order received.

You may find, if you check this case again, that the "usually no data
corruption" is actually "usually lost transactions but no corruption".

-- 
greg



Re: [pgsql-hackers-win32] [HACKERS] win32 performance - fsync question

2005-02-24 Thread Tom Lane
Greg Stark [EMAIL PROTECTED] writes:
 I'm a bit surprised that the write-cache led to a corrupt database, and not
 merely lost transactions. I had the impression that drives still handled the
 writes in the order received.

There'd be little point in having a cache if they did, I should think.
I thought the point of the cache was to allow the disk to schedule I/O
in an order that minimizes seek time (ie, such a disk has got its own
elevator queue or similar).
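
If a drive's cache does reorder writes this way, the effect on ordering can be shown with a toy Python sketch. The block numbers and the simple sweep policy below are made up for illustration; real drive firmware may behave quite differently.

```python
def elevator_order(pending_blocks, head_position):
    """Order pending writes as one upward sweep from the head, then a sweep back down."""
    ahead = sorted(b for b in pending_blocks if b >= head_position)
    behind = sorted((b for b in pending_blocks if b < head_position), reverse=True)
    return ahead + behind


# Hypothetical queue: the write to block 900 was issued first.
arrival_order = [900, 120, 450, 130]
service_order = elevator_order(arrival_order, head_position=100)
# service_order == [120, 130, 450, 900]: the first write issued is the last to hit the platter
```

Under this policy the write issued first can be the last one to reach the disk, which is why a power failure in the middle could leave later writes on disk without the earlier one.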

 You may find that if you check this case again that the "usually no data
 corruption" is actually "usually lost transactions but no corruption".

That's a good point, but it seems difficult to be sure of the last
reportedly-committed transaction in a powerfail situation.  Maybe if
you drive the test from a client on another machine?

regards, tom lane


Re: [pgsql-hackers-win32] [HACKERS] win32 performance - fsync question

2005-02-24 Thread Greg Stark

Tom Lane [EMAIL PROTECTED] writes:

 Greg Stark [EMAIL PROTECTED] writes:
  I'm a bit surprised that the write-cache led to a corrupt database, and not
  merely lost transactions. I had the impression that drives still handled the
  writes in the order received.
 
 There'd be little point in having a cache if they did, I should think.
 I thought the point of the cache was to allow the disk to schedule I/O
 in an order that minimizes seek time (ie, such a disk has got its own
 elevator queue or similar).

If that were the case then SCSI drives that ship with write caching disabled
and using tagged command queuing instead would perform poorly.

I think the main motivation for write caching on IDE drives is that the IDE
protocol forces commands to be issued synchronously. So you can't send a
second command until the first command has completed. Without write caching
that limits the write bandwidth tremendously. Write caching is being used here
as a poor man's TCQ.

-- 
greg



Re: [pgsql-hackers-win32] [HACKERS] win32 performance - fsync question

2005-02-24 Thread Magnus Hagander
  You may find that if you check this case again that the "usually no data
  corruption" is actually "usually lost transactions but no corruption".

 That's a good point, but it seems difficult to be sure of the last
 reportedly-committed transaction in a powerfail situation.  Maybe if
 you drive the test from a client on another machine?

FYI, that's what I did. The test client ran across the network to the
server, so it could output on the console which transaction was last
reported committed.

In a couple of cases, the server came up with a transaction the client
had *not* reported as committed. But I think that can be explained by
the commit message not reaching the client over the network before power
went out.

//Magnus


Re: [pgsql-hackers-win32] [HACKERS] win32 performance - fsync question

2005-02-24 Thread Magnus Hagander
  * Linux, with fsync (default), write-cache enabled: usually no data
  corruption, but two runs which had

 Are you verifying that all the data that was committed was actually
 stored? Or just verifying that the database works properly after rebooting?

I verified the data.

 I'm a bit surprised that the write-cache led to a corrupt database, and
 not merely lost transactions. I had the impression that drives still
 handled the writes in the order received.

In this case, it was lost transactions, not data corruption. I should be
more careful: I had copy/pasted the "no data corruption" text and should
have specified what was lost.

A couple of the latest transactions were gone, but the database came up
in a consistent state, if a bit old.

Since Linux wasn't the stuff I actually was testing, I didn't run very
many tests on it though.

//Magnus


Re: [pgsql-hackers-win32] [HACKERS] win32 performance - fsync question

2005-02-24 Thread Greg Stark

Magnus Hagander [EMAIL PROTECTED] writes:

  I'm a bit surprised that the write-cache led to a corrupt database, and
  not merely lost transactions. I had the impression that drives still
  handled the writes in the order received.
 
 In this case, it was lost transactions, not data corruption. 
 ...
 A couple of the latest transactions were gone, but the database came up
 in a consistent state, if a bit old.

That's interesting. It would be very interesting to know how reliably this is
true. It could potentially vary depending on the drive firmware.

I can't see any painless way to package up this kind of test for people to run
though. Powercycling machines repeatedly really isn't fun and takes a long
time. And testing this on vmware doesn't buy us anything.

-- 
greg



Re: [pgsql-hackers-win32] [HACKERS] win32 performance - fsync question

2005-02-20 Thread Magnus Hagander
 Magnus prepared a trivial patch which added the O_SYNC flag 
 for windows and mapped it to FILE_FLAG_WRITE_THROUGH in 
 win32_open.c. 

Attached is this trivial patch. As Merlin says, it needs some more
reliability testing. But the numbers are at least reasonable - it
*seems* like it's doing the right thing (as long as you turn off write
cache). And it's certainly a significant performance increase - it
brings the speed almost up to the same as linux.

For testing, I have built and uploaded binaries from the 8.0 stable
branch with this patch applied. They are available from
http://www.hagander.net/pgsql/. Install the 8.0.1 version first (from
MSI or manually, your choice), then replace postmaster.exe and
postgres.exe with the ones in the ZIP file. If you're running as a
service, make sure to stop the service first.

To make sure it uses the new code, change wal_sync_method to open_sync
in postgresql.conf and restart the service.
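
For reference, the change amounts to a single line in postgresql.conf. The fragment below is a sketch that assumes the patched 8.0 binaries described above are installed; the rest of the file stays as it is.

```
# postgresql.conf (Win32, patched 8.0 binaries assumed)
wal_sync_method = open_sync
```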

The kind of testing we need help with is pull-the-plug reliability
testing. For this, make sure you have write caching turned off (it's on
the disk's properties page in the Device Manager), run a bunch of
transactions on the db and then pull the plug of the machine in the
middle. It should come up with all acknowledged transactions still
applied, and all others not.


//Magnus
