Re: [PERFORM] further testing on IDE drives

2003-10-15 Thread Ang Chin Han
Bruce Momjian wrote:

 Yes.  If you were doing multiple WAL writes before transaction fsync,
 you would be fsyncing every write, rather than doing two writes and
 fsync'ing them both.  I wonder if larger transactions would find
 open_sync slower?

No hard numbers, but I remember testing fsync vs open_sync some time ago 
on 7.3.x.

open_sync was blazingly fast for pgbench, but when we switched our 
development database over to open_sync, things slowed to a crawl.

This was some months ago, and I might be wrong, so take it with a grain 
of salt. It was on Red Hat 8's Linux kernel 2.4.18, I think. YMMV.

I will test it tonight, if possible.





Re: [PERFORM] further testing on IDE drives

2003-10-14 Thread scott.marlowe
On Tue, 14 Oct 2003, Tom Lane wrote:

 scott.marlowe [EMAIL PROTECTED] writes:
  open_sync was WAY faster at this than the other two methods.
 
 Do you not have open_datasync?  That's the preferred method if
 available.

Nope, when I try to start postgresql with it set to that, I get this error 
message:

FATAL:  invalid value for wal_sync_method: open_datasync

This is on RedHat 9, but I have the same problem on a RH 7.2 box as well.
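
The error above suggests that the set of valid wal_sync_method values depends on what the platform provides. As a rough analogy only, the sketch below probes which sync-related primitives Python's os module exposes on the current machine. The function name sync_primitives and the rule used for open_datasync (O_DSYNC present and different from O_SYNC) are assumptions made for illustration; PostgreSQL decides this when it is built, and a modern system may well report something different from the Red Hat boxes discussed here.

```python
import os


def sync_primitives():
    """Report which sync-related primitives this platform's os module exposes."""
    o_sync = getattr(os, "O_SYNC", None)
    o_dsync = getattr(os, "O_DSYNC", None)
    return {
        "fsync": hasattr(os, "fsync"),
        "fdatasync": hasattr(os, "fdatasync"),
        "open_sync": o_sync is not None,
        # Assumption: open_datasync is only meaningful when O_DSYNC exists
        # and is a different flag from O_SYNC.
        "open_datasync": o_dsync is not None and o_dsync != o_sync,
    }


if __name__ == "__main__":
    for method, available in sync_primitives().items():
        print(f"{method:14s} {'yes' if available else 'no'}")
```

A "no" here only means the Python build does not expose the primitive; it is a hint about the platform, not a statement about any particular PostgreSQL binary.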


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] further testing on IDE drives

2003-10-13 Thread Vivek Khera
 BM == Bruce Momjian [EMAIL PROTECTED] writes:

BM COPY only does fsync on COPY completion, so I am not sure there are
BM enough fsync's there to make a difference.


Perhaps then it is part of the indexing that takes so much time with
the WAL.  When I applied Marc's WAL disabling patch, it shaved nearly
50 minutes off of a 4-hour restore.

I sent Tom the logs from the restores, since he was interested in
figuring out where the time was saved.



Re: [PERFORM] further testing on IDE drives

2003-10-10 Thread scott.marlowe
On Thu, 9 Oct 2003, Bruce Momjian wrote:

 scott.marlowe wrote:
  I was testing to get some idea of how to speed up pgbench 
  with IDE drives and the write caching turned off in Linux (i.e. hdparm -W0 
  /dev/hdx).
  
  The only parameter that seemed to make a noticeable difference was setting 
  wal_sync_method = open_sync.  With it set to either fsync or fdatasync, 
  the speed with pgbench -c 5 -t 1000 ran from 11 to 17 tps.  With open_sync 
  it jumped to the range of 45 to 52 tps.  With write cache on I was getting 
  280 to 320 tps.  So now, instead of being 20 to 30 times slower, I'm only 
  about 5 times slower, much better.
  
  Now I'm off to start a pgbench -c 10 -t 1 and pull the power cord 
  and see if the data gets corrupted with write caching turned on, i.e. do 
  my hard drives have the ability to write at least some of their cache 
  during spin down.
 
 Is this a reason we should switch to open_sync as a default, if it is
 available, rather than fsync?  I think we are doing a single write before
 fsync a lot more often than we are doing multiple writes before fsync.

Sounds reasonable to me.  Are there many / any scenarios where a plain 
fsync would be faster than open_sync?




Re: [PERFORM] further testing on IDE drives

2003-10-10 Thread Bruce Momjian
scott.marlowe wrote:
 On Thu, 9 Oct 2003, Bruce Momjian wrote:
 
  scott.marlowe wrote:
   I was testing to get some idea of how to speed up pgbench 
   with IDE drives and the write caching turned off in Linux (i.e. hdparm -W0 
   /dev/hdx).
   
   The only parameter that seemed to make a noticeable difference was setting 
   wal_sync_method = open_sync.  With it set to either fsync or fdatasync, 
   the speed with pgbench -c 5 -t 1000 ran from 11 to 17 tps.  With open_sync 
   it jumped to the range of 45 to 52 tps.  With write cache on I was getting 
   280 to 320 tps.  So now, instead of being 20 to 30 times slower, I'm only 
   about 5 times slower, much better.
   
   Now I'm off to start a pgbench -c 10 -t 1 and pull the power cord 
   and see if the data gets corrupted with write caching turned on, i.e. do 
   my hard drives have the ability to write at least some of their cache 
   during spin down.
  
  Is this a reason we should switch to open_sync as a default, if it is
  available, rather than fsync?  I think we are doing a single write before
  fsync a lot more often than we are doing multiple writes before fsync.
 
 Sounds reasonable to me.  Are there many / any scenarios where a plain 
 fsync would be faster than open_sync?

Yes.  If you were doing multiple WAL writes before transaction fsync,
you would be fsyncing every write, rather than doing two writes and
fsync'ing them both.  I wonder if larger transactions would find
open_sync slower?
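
To make the trade-off above concrete, here is a minimal sketch that writes the same blocks two ways: through a descriptor opened with O_SYNC (so every write is synced, as with open_sync) and through a plain descriptor followed by one fsync() call (so several writes share a single sync). The function names, the 8 kB block size and the write counts are illustrative assumptions, not anything taken from PostgreSQL, and timings on a modern machine or a drive with write caching enabled will not resemble the IDE numbers in this thread.

```python
import os
import tempfile
import time

BLOCK = b"\0" * 8192  # illustrative block size, not taken from the thread


def write_with_o_sync(path, n_blocks):
    """open_sync style: the file is opened with O_SYNC, so each write() syncs."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC | os.O_SYNC, 0o600)
    start = time.perf_counter()
    try:
        for _ in range(n_blocks):
            os.write(fd, BLOCK)  # returns only after this block is synced
    finally:
        os.close(fd)
    return time.perf_counter() - start


def write_then_fsync(path, n_blocks):
    """fsync style: buffered writes, then one explicit fsync() for all of them."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    start = time.perf_counter()
    try:
        for _ in range(n_blocks):
            os.write(fd, BLOCK)  # lands in the OS cache only
        os.fsync(fd)             # a single sync covers every block above
    finally:
        os.close(fd)
    return time.perf_counter() - start


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for n in (1, 2, 64):
            t_sync = write_with_o_sync(os.path.join(tmp, "o_sync"), n)
            t_fsync = write_then_fsync(os.path.join(tmp, "fsync"), n)
            print(f"{n:3d} writes: O_SYNC {t_sync:.4f}s, write+fsync {t_fsync:.4f}s")
```

With one write per sync the two approaches do the same amount of syncing; the question raised above is what happens as the number of writes per sync grows.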

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  [EMAIL PROTECTED]                    |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073



Re: [PERFORM] further testing on IDE drives

2003-10-10 Thread scott.marlowe
On Fri, 10 Oct 2003, Josh Berkus wrote:

 Bruce,
 
  Yes.  If you were doing multiple WAL writes before transaction fsync,
  you would be fsyncing every write, rather than doing two writes and
  fsync'ing them both.  I wonder if larger transactions would find
  open_sync slower?
 
 Want me to test?   I've got an ide-based test machine here, and the TPCC 
 databases.

Just make sure the drive's write cache is disabled.




Re: [PERFORM] further testing on IDE drives

2003-10-10 Thread Bruce Momjian
Josh Berkus wrote:
 Bruce,
 
  Yes.  If you were doing multiple WAL writes before transaction fsync,
  you would be fsyncing every write, rather than doing two writes and
  fsync'ing them both.  I wonder if larger transactions would find
  open_sync slower?
 
 Want me to test?   I've got an ide-based test machine here, and the TPCC 
 databases.

I would be interested to see if wal_sync_method = fsync is slower than
wal_sync_method = open_sync.  How often are we doing more than one write
before an fsync anyway?




Re: [PERFORM] further testing on IDE drives

2003-10-10 Thread Josh Berkus
Bruce,

 I would be interested to see if wal_sync_method = fsync is slower than
 wal_sync_method = open_sync.  How often are we doing more than one write
 before an fsync anyway?

OK.   I'll see if I can get to it around my other stuff I have to do this 
weekend.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco



Re: [PERFORM] further testing on IDE drives

2003-10-10 Thread Vivek Khera
 BM == Bruce Momjian [EMAIL PROTECTED] writes:

 Sounds reasonable to me.  Are there many / any scenarios where a plain 
 fsync would be faster than open_sync?

BM Yes.  If you were doing multiple WAL writes before transaction fsync,
BM you would be fsyncing every write, rather than doing two writes and
BM fsync'ing them both.  I wonder if larger transactions would find
BM open_sync slower?

Consider loading a large database from a backup dump: one big
transaction during the COPY.  I don't know what the implications are
for this scenario, though.

-- 
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Vivek Khera, Ph.D.                Khera Communications, Inc.
Internet: [EMAIL PROTECTED]       Rockville, MD   +1-240-453-8497
AIM: vivekkhera Y!: vivek_khera   http://www.khera.org/~vivek/



Re: [PERFORM] further testing on IDE drives

2003-10-10 Thread Bruce Momjian
Vivek Khera wrote:
  BM == Bruce Momjian [EMAIL PROTECTED] writes:
 
  Sounds reasonable to me.  Are there many / any scenarios where a plain 
  fsync would be faster than open_sync?
 
 BM Yes.  If you were doing multiple WAL writes before transaction fsync,
 BM you would be fsyncing every write, rather than doing two writes and
 BM fsync'ing them both.  I wonder if larger transactions would find
 BM open_sync slower?
 
 Consider loading a large database from a backup dump: one big
 transaction during the COPY.  I don't know what the implications are
 for this scenario, though.

COPY only does fsync on COPY completion, so I am not sure there are
enough fsync's there to make a difference.




Re: [PERFORM] further testing on IDE drives

2003-10-10 Thread scott.marlowe
On Fri, 10 Oct 2003, Josh Berkus wrote:

 Bruce,
 
  Yes.  If you were doing multiple WAL writes before transaction fsync,
  you would be fsyncing every write, rather than doing two writes and
  fsync'ing them both.  I wonder if larger transactions would find
  open_sync slower?
 
 Want me to test?   I've got an ide-based test machine here, and the TPCC 
 databases.

OK, I decided to do a quick and dirty test of big transactions in each 
sync mode my kernel supports.  I did this:

createdb dbname
time pg_dump -O -h otherserver dbname|psql dbname

then I would drop the db, edit postgresql.conf, and restart the server.
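
For anyone repeating this, the "edit postgresql.conf" step can be scripted. The helper below is a hypothetical sketch (set_wal_sync_method is not part of any PostgreSQL tool): it takes the text of a config file and returns it with wal_sync_method set, replacing an existing or commented-out entry if there is one and appending a new line otherwise. Any trailing comment on the replaced line is dropped, and the server still has to be restarted afterwards, as described above.

```python
import re

# Matches an active or commented-out wal_sync_method line (hypothetical helper).
SETTING = re.compile(r"^[ \t]*#?[ \t]*wal_sync_method[ \t]*=.*$", re.MULTILINE)


def set_wal_sync_method(conf_text, method):
    """Return conf_text with wal_sync_method set to `method`."""
    line = f"wal_sync_method = {method}"
    if SETTING.search(conf_text):
        # Replace the first existing (possibly commented-out) entry.
        return SETTING.sub(lambda _match: line, conf_text, count=1)
    # No entry yet: append one, keeping the file newline-terminated.
    sep = "" if conf_text.endswith("\n") or not conf_text else "\n"
    return f"{conf_text}{sep}{line}\n"


if __name__ == "__main__":
    sample = "#fsync = true\n#wal_sync_method = fsync\n"
    print(set_wal_sync_method(sample, "open_sync"), end="")
```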

open_sync was WAY faster at this than the other two methods.

open_sync:

1st run:

real    11m27.107s
user    0m26.570s
sys     0m1.150s

2nd run:

real    6m5.712s
user    0m26.700s
sys     0m1.700s

fsync:

1st run:

real    15m8.127s
user    0m26.710s
sys     0m0.990s

2nd run:

real    15m8.396s
user    0m26.990s
sys     0m1.870s

fdatasync:

1st run:

real    15m47.878s
user    0m26.570s
sys     0m1.480s

2nd run:

real    15m9.402s
user    0m27.000s
sys     0m1.660s

I did the first runs in order, then started over, i.e. open_sync run 1, 
fsync run 1, fdatasync run 1, open_sync run 2, etc.
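
To compare the runs at a glance, the "real" times above can be converted to seconds and expressed relative to fsync. This is just arithmetic on the figures already posted; the helper name to_seconds and the table layout are illustrative.

```python
import re


def to_seconds(stamp):
    """Convert a `time` value such as '11m27.107s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if match is None:
        raise ValueError(f"unrecognised time value: {stamp!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


# "real" times from the runs above, in run order.
REAL = {
    "open_sync": ["11m27.107s", "6m5.712s"],
    "fsync": ["15m8.127s", "15m8.396s"],
    "fdatasync": ["15m47.878s", "15m9.402s"],
}

baseline = [to_seconds(s) for s in REAL["fsync"]]
for method, stamps in REAL.items():
    secs = [to_seconds(s) for s in stamps]
    # Ratio > 1 means this method finished faster than fsync on that run.
    ratios = ", ".join(f"{b / s:.2f}x" for s, b in zip(secs, baseline))
    print(f"{method:10s} {secs[0]:7.1f}s {secs[1]:7.1f}s  vs fsync: {ratios}")
```

On these figures open_sync finishes roughly 1.3 times faster than fsync on the first run and roughly 2.5 times faster on the second, while fdatasync is about level with fsync.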

The machine I was restoring to was under no other load.  The machine I was 
reading from had little or no load, but is a production server, so it's 
possible the load there could have had a small effect, but probably not 
this big of a one.

The machine this is on is set up so that the data partition is on a drive 
with write cache enabled, but the pg_xlog and pg_clog directories are on a 
drive with write cache disabled.  Same drive models as listed in my 
previous test, Seagate generic 80 gig IDE drives, model ST380023A.




Re: [PERFORM] further testing on IDE drives

2003-10-09 Thread Bruce Momjian

How did this drive come by default?  Write-cache disabled?

---

scott.marlowe wrote:
 On Thu, 2 Oct 2003, scott.marlowe wrote:
 
  I was testing to get some idea of how to speed up pgbench 
  with IDE drives and the write caching turned off in Linux (i.e. hdparm -W0 
  /dev/hdx).
  
  The only parameter that seemed to make a noticeable difference was setting 
  wal_sync_method = open_sync.  With it set to either fsync or fdatasync, 
  the speed with pgbench -c 5 -t 1000 ran from 11 to 17 tps.  With open_sync 
  it jumped to the range of 45 to 52 tps.  With write cache on I was getting 
  280 to 320 tps.  So now, instead of being 20 to 30 times slower, I'm only 
  about 5 times slower, much better.
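
The "times slower" figures can be checked against the tps ranges quoted in this message. The sketch below divides the write-cache-on range by each write-cache-off range; the function name slowdown_range is illustrative, and the inputs are only the numbers given above.

```python
def slowdown_range(slow_tps, fast_tps):
    """Return (smallest, largest) slowdown factor between two tps ranges."""
    slow_lo, slow_hi = slow_tps
    fast_lo, fast_hi = fast_tps
    return fast_lo / slow_hi, fast_hi / slow_lo


CACHE_ON = (280, 320)    # write cache enabled
FSYNC = (11, 17)         # fsync or fdatasync, write cache off
OPEN_SYNC = (45, 52)     # open_sync, write cache off

for name, tps in (("fsync/fdatasync", FSYNC), ("open_sync", OPEN_SYNC)):
    lo, hi = slowdown_range(tps, CACHE_ON)
    print(f"{name}: {lo:.1f}x to {hi:.1f}x slower than with write cache on")
```

This works out to roughly 16x to 29x for fsync/fdatasync and roughly 5x to 7x for open_sync, in line with the rounded figures in the message.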
  
  Now I'm off to start a pgbench -c 10 -t 1 and pull the power cord 
  and see if the data gets corrupted with write caching turned on, i.e. do 
  my hard drives have the ability to write at least some of their cache 
  during spin down.
 
 OK, back from testing.
 
 Information:  Dual PIV system with a pair of 80 gig IDE drives, model 
 number: ST380023A (Seagate).  File system is ext3 and is on a separate 
 drive from the OS.
 
 These drives DO NOT write their cache to the platters when they lose 
 power.  Testing was done by issuing a 'hdparm -W0/1 /dev/hdx' command, 
 where x is the real drive letter and 0 or 1 was chosen in place of 0/1.  
 Then I'd issue a 'pgbench -c 50 -t 1' command, wait for a few minutes, 
 then pull the power cord.
 
 I'm running RH linux 9.0 stock install, kernel: 2.4.20-8smp.
 
 Three times pulling the plug with 'hdparm -W0 /dev/hdx' resulted in a 
 machine that would boot up, recover with journal, and a database that came 
 up within about 30 seconds, with all the accounts still intact.
 
 Switching the caching back on with 'hdparm -W1 /dev/hdx' and doing the 
 same 'pgbench -c 50 -t 1' resulted in a corrupted database each 
 time.
 
 Also, I tried each of the following fsync methods: fsync, fdatasync, and
 open_sync with write caching turned off.  Each survived a power off test 
 with no corruption of the database.  fsync and fdatasync result in 11 to 
 17 tps with 'pgbench -c 5 -t 500' while open_sync resulted in 45 to 55 
 tps, as mentioned in the previous post.
 
 I'd be interested in hearing from other folks which sync method works 
 for them and whether or not there are any IDE drives out there that can 
 write their cache to the platters on power off when caching is enabled.
 
 
 

