Re: [PERFORM] moving pg_xlog -- yeah, it's worth it!

2010-02-23 Thread Kevin Grittner
Greg Smith g...@2ndquadrant.com wrote:
 
 You can easily quantify if the BGW is aggressive enough.  Buffers
 leave the cache three ways, and they each show up as separate
 counts in pg_stat_bgwriter:  buffers_checkpoint, buffers_clean
 (the BGW), and buffers_backend (the queries).  Cranking it up
 further tends to shift writes out of buffers_backend, which are
 the ones you want to avoid, toward buffers_clean instead.  If
 buffers_backend is already low on a percentage basis compared to
 the other two, there's little benefit in trying to make the BGW do
 more.
 
Here are the values from our two largest and busiest systems (where
we found the pg_xlog placement to matter so much).  It looks to me
like a more aggressive bgwriter would help, yes?
 
cir= select * from pg_stat_bgwriter ;
-[ RECORD 1 ]--+
checkpoints_timed  | 125996
checkpoints_req| 16932
buffers_checkpoint | 342972024
buffers_clean  | 343634920
maxwritten_clean   | 9928
buffers_backend| 575589056
buffers_alloc  | 52397855471
 
cir= select * from pg_stat_bgwriter ;
-[ RECORD 1 ]--+
checkpoints_timed  | 125992
checkpoints_req| 16840
buffers_checkpoint | 260358442
buffers_clean  | 474768152
maxwritten_clean   | 9778
buffers_backend| 565837397
buffers_alloc  | 71463873477
 
Current settings:
 
bgwriter_delay = '200ms'
bgwriter_lru_maxpages = 1000
bgwriter_lru_multiplier = 4
 
Any suggestions on how far to push it?
 
-Kevin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] moving pg_xlog -- yeah, it's worth it!

2010-02-23 Thread Ben Chobot
On Feb 23, 2010, at 2:23 PM, Kevin Grittner wrote:

 
 Here are the values from our two largest and busiest systems (where
 we found the pg_xlog placement to matter so much).  It looks to me
 like a more aggressive bgwriter would help, yes?
 
 cir= select * from pg_stat_bgwriter ;
 -[ RECORD 1 ]--+
 checkpoints_timed  | 125996
 checkpoints_req| 16932
 buffers_checkpoint | 342972024
 buffers_clean  | 343634920
 maxwritten_clean   | 9928
 buffers_backend| 575589056
 buffers_alloc  | 52397855471
 
 cir= select * from pg_stat_bgwriter ;
 -[ RECORD 1 ]--+
 checkpoints_timed  | 125992
 checkpoints_req| 16840
 buffers_checkpoint | 260358442
 buffers_clean  | 474768152
 maxwritten_clean   | 9778
 buffers_backend| 565837397
 buffers_alloc  | 71463873477
 
 Current settings:
 
 bgwriter_delay = '200ms'
 bgwriter_lru_maxpages = 1000
 bgwriter_lru_multiplier = 4
 
 Any suggestions on how far to push it?

I don't know how far to push it, but you could start by reducing the delay time 
and observe how that affects performance.

Re: [PERFORM] moving pg_xlog -- yeah, it's worth it!

2010-02-12 Thread Alvaro Herrera
Kevin Grittner wrote:
 Hannu Krosing  wrote:
  
  Can it be, that each request does at least 1 write op (update
  session or log something) ?
  
 Well, the web application connects through a login which only has
 SELECT rights; but, in discussing a previous issue we've pretty well
 established that it's not unusual for a read to force a dirty buffer
 to write to the OS.  Perhaps this is the issue here again.  Nothing
 is logged on the database server for every request.

I don't think it explains it, because dirty buffers are obviously
written to the data area, not pg_xlog.

 I wonder if it might also pay to make the background writer even more
 aggressive than we have, so that SELECT-only queries don't spend so
 much time writing pages.

That's worth trying.

 Anyway, given that these are replication
 targets, and aren't the database of origin for any data of their
 own, I guess there's no reason not to try asynchronous commit. 

Yeah; since the transactions only ever write commit records to WAL, it
wouldn't matter a bit that they are lost on crash.  And you should see
an improvement, because they wouldn't have to flush at all.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] moving pg_xlog -- yeah, it's worth it!

2010-02-12 Thread Alvaro Herrera
Alvaro Herrera wrote:
 Kevin Grittner wrote:

  Anyway, given that these are replication
  targets, and aren't the database of origin for any data of their
  own, I guess there's no reason not to try asynchronous commit. 
 
 Yeah; since the transactions only ever write commit records to WAL, it
 wouldn't matter a bit that they are lost on crash.  And you should see
 an improvement, because they wouldn't have to flush at all.

Actually, a transaction that performed no writes doesn't get a commit
WAL record written, so it shouldn't make any difference at all.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] moving pg_xlog -- yeah, it's worth it!

2010-02-12 Thread Alvaro Herrera
Kevin Grittner wrote:
 Alvaro Herrera alvhe...@commandprompt.com wrote:

  Actually, a transaction that performed no writes doesn't get a
  commit WAL record written, so it shouldn't make any difference at
  all.
  
 Well, concurrent to the web application is the replication.  Would
 asynchronous commit of that potentially alter the pattern of writes
 such that it had less impact on the reads?

Well, certainly async commit would completely change the pattern of
writes: it would give the controller an opportunity to reorder them
according to some scheduler.  Otherwise they are strictly serialized.

 I'm thinking, again, of
 why the placement of the pg_xlog on a separate file system made such
 a dramatic difference to the read-only response time -- might it
 make less difference if the replication was using asynchronous
 commit?

Yeah, I think it would have been less notorious, but this is all
theoretical.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] moving pg_xlog -- yeah, it's worth it!

2010-02-12 Thread Greg Smith

Kevin Grittner wrote:

I wonder if it might also pay to make the background writer even more
aggressive than we have, so that SELECT-only queries don't spend so
much time writing pages.
You can easily quantify if the BGW is aggressive enough.  Buffers leave 
the cache three ways, and they each show up as separate counts in 
pg_stat_bgwriter:  buffers_checkpoint, buffers_clean (the BGW), and 
buffers_backend (the queries).  Cranking it up further tends to shift 
writes out of buffers_backend, which are the ones you want to avoid, 
toward buffers_clean instead.  If buffers_backend is already low on a 
percentage basis compared to the other two, there's little benefit in 
trying to make the BGW do more.


--
Greg Smith2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com  www.2ndQuadrant.com


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] moving pg_xlog -- yeah, it's worth it!

2010-02-11 Thread Bruce Momjian
Kevin Grittner wrote:
 Jesper Krogh jes...@krogh.cc wrote:
  
  Sorry if it is obvious.. but what filesystem/OS are you using and
  do you have BBU-writeback on the main data catalog also?
  
 Sorry for not providing more context.
  
 ATHENA:/var/pgsql/data # uname -a
 Linux ATHENA 2.6.16.60-0.39.3-smp #1 SMP Mon May 11 11:46:34 UTC
 2009 x86_64 x86_64 x86_64 GNU/Linux
 ATHENA:/var/pgsql/data # cat /etc/SuSE-release
 SUSE Linux Enterprise Server 10 (x86_64)
 VERSION = 10
 PATCHLEVEL = 2
  
 File system is xfs noatime,nobarrier for all data; OS is on ext3.  I
 *think* the pg_xlog mirrored pair is hanging off the same
 BBU-writeback controller as the big RAID, but I'd have to track down
 the hardware tech to confirm, and he's out today.  System has 16
 Xeon CPUs and 64 GB RAM.

I would be surprised if the RAID controller had a BBU-writeback cache. 
I don't think having xlog share a BBU-writeback makes things slower, and
if it does, I would love for someone to explain why.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] moving pg_xlog -- yeah, it's worth it!

2010-02-11 Thread Scott Marlowe
On Thu, Feb 11, 2010 at 4:29 AM, Bruce Momjian br...@momjian.us wrote:
 Kevin Grittner wrote:
 Jesper Krogh jes...@krogh.cc wrote:

  Sorry if it is obvious.. but what filesystem/OS are you using and
  do you have BBU-writeback on the main data catalog also?

 Sorry for not providing more context.

 ATHENA:/var/pgsql/data # uname -a
 Linux ATHENA 2.6.16.60-0.39.3-smp #1 SMP Mon May 11 11:46:34 UTC
 2009 x86_64 x86_64 x86_64 GNU/Linux
 ATHENA:/var/pgsql/data # cat /etc/SuSE-release
 SUSE Linux Enterprise Server 10 (x86_64)
 VERSION = 10
 PATCHLEVEL = 2

 File system is xfs noatime,nobarrier for all data; OS is on ext3.  I
 *think* the pg_xlog mirrored pair is hanging off the same
 BBU-writeback controller as the big RAID, but I'd have to track down
 the hardware tech to confirm, and he's out today.  System has 16
 Xeon CPUs and 64 GB RAM.

 I would be surprised if the RAID controller had a BBU-writeback cache.
 I don't think having xlog share a BBU-writeback makes things slower, and
 if it does, I would love for someone to explain why.

I believe in the past when this discussion showed up it was mainly due
to them being on the same file system (and then not with pg_xlog
separate) that made the biggest difference.  I recall there being a
noticeable performance gain from having two file systems on the same
logical RAID device even.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] moving pg_xlog -- yeah, it's worth it!

2010-02-11 Thread Kevin Grittner
Kevin Grittner kevin.gritt...@wicourts.gov wrote:
 Jesper Krogh jes...@krogh.cc wrote:
  
 Sorry if it is obvious.. but what filesystem/OS are you using and
 do you have BBU-writeback on the main data catalog also?
 
 File system is xfs noatime,nobarrier for all data; OS is on ext3. 
 I *think* the pg_xlog mirrored pair is hanging off the same
 BBU-writeback controller as the big RAID, but I'd have to track
 down the hardware tech to confirm
 
Another example of why I shouldn't trust my memory.  Per the
hardware tech:
 
 
OS:  /dev/sda   is RAID1  -  2  x  2.5 15k SAS disk
pg_xlog: /dev/sdb   is RAID1  -  2  x  2.5 15k SAS disk
 
These reside on a ServeRAID-MR10k controller with 256MB BB cache.
 
 
data:/dev/sdc   is RAID5  -  30 x 3.5 15k SAS disk
 
These reside on the DS3200 disk subsystem with 512MB BB cache per
controller and redundant drive loops.
 
 
At least I had the file systems and options right.  ;-)
 
-Kevin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] moving pg_xlog -- yeah, it's worth it!

2010-02-11 Thread Alvaro Herrera
Kevin Grittner wrote:

 Another example of why I shouldn't trust my memory.  Per the
 hardware tech:
  
  
 OS:  /dev/sda   is RAID1  -  2  x  2.5 15k SAS disk
 pg_xlog: /dev/sdb   is RAID1  -  2  x  2.5 15k SAS disk
  
 These reside on a ServeRAID-MR10k controller with 256MB BB cache.
  
  
 data:/dev/sdc   is RAID5  -  30 x 3.5 15k SAS disk
  
 These reside on the DS3200 disk subsystem with 512MB BB cache per
 controller and redundant drive loops.

Hmm, so maybe the performance benefit is not from it being on a separate
array, but from it being RAID1 instead of RAID5?

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] moving pg_xlog -- yeah, it's worth it!

2010-02-11 Thread Aidan Van Dyk
* Alvaro Herrera alvhe...@commandprompt.com [100211 12:58]:
 Hmm, so maybe the performance benefit is not from it being on a separate
 array, but from it being RAID1 instead of RAID5?

Or the cumulative effects of:
1) Dedicated spindles/Raid1
2) More BBU cache available (I can't imagine the OS pair writing much)
3) not being queued behind data writes before getting to controller
3) Not waiting for BBU cache to be available (which is shared with all data
   writes) which requires RAID5 writes to complete...

Really, there's *lots* of variables here.  The basics being that WAL on
the same FS as data, on a RAID5, even with BBU is worse than WAL on a
dedicated set of RAID1 spindles with it's own BBU.

Wow!

;-)


-- 
Aidan Van Dyk Create like a god,
ai...@highrise.ca   command like a king,
http://www.highrise.ca/   work like a slave.


signature.asc
Description: Digital signature


Re: [PERFORM] moving pg_xlog -- yeah, it's worth it!

2010-02-11 Thread Kevin Grittner
Aidan Van Dyk ai...@highrise.ca wrote:
 Alvaro Herrera alvhe...@commandprompt.com wrote:
 Hmm, so maybe the performance benefit is not from it being on a
 separate array, but from it being RAID1 instead of RAID5?
 
 Or the cumulative effects of:
 1) Dedicated spindles/Raid1
 2) More BBU cache available (I can't imagine the OS pair writing
much)
 3) not being queued behind data writes before getting to
controller
 3) Not waiting for BBU cache to be available (which is shared with
all data writes) which requires RAID5 writes to complete...
 
 Really, there's *lots* of variables here.  The basics being that
 WAL on the same FS as data, on a RAID5, even with BBU is worse
 than WAL on a dedicated set of RAID1 spindles with it's own BBU.
 
 Wow!
 
Sure, OK, but what surprised me was that a set of 15 read-only
queries (with pretty random reads) took almost twice as long when
the WAL files were on the same file system.  That's with OS writes
being only about 10% of reads, and *that's* with 128 GB of RAM which
keeps a lot of the reads from having to go to the disk.  I would not
have expected that a read-mostly environment like this would be that
sensitive to the WAL file placement.  (OK, I *did* request the
separate file system for them anyway, but I thought it was going to
be a marginal benefit, not something this big.)
 
-Kevin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] moving pg_xlog -- yeah, it's worth it!

2010-02-09 Thread Kevin Grittner
Due to an error during an update to the system kernel on a database
server backing a web application, we ran for a month (mid-December
to mid-January) with the WAL files in the pg_xlog subdirectory on
the same 40-spindle array as the data -- only the OS was on a
separate mirrored pair of drives.  When we found the problem, we
moved the pg_xlog directory back to its own mirrored pair of drives
and symlinked to it.  It's a pretty dramatic difference.
 
Graph attached, this is response time for a URL request (like what a
user from the Internet would issue) which runs 15 queries and
formats the results.  Response time is 85 ms without putting WAL on
its own RAID; 50 ms with it on its own RAID.  This is a real-world,
production web site with 1.3 TB data, millions of hits per day, and
it's an active replication target 24/7.
 
Frankly, I was quite surprised by this, since some of the benchmarks
people have published on the effects of using a separate RAID for
the WAL files have only shown a one or two percent difference when
using a hardware RAID controller with BBU cache configured for
write-back.
 
No assistance needed; just posting a performance data point.
 
-Kevin
attachment: webresponse.png
-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] moving pg_xlog -- yeah, it's worth it!

2010-02-09 Thread Jesper Krogh

 Frankly, I was quite surprised by this, since some of the benchmarks
 people have published on the effects of using a separate RAID for
 the WAL files have only shown a one or two percent difference when
 using a hardware RAID controller with BBU cache configured for
 write-back.

Hi Kevin.

Nice report, but just a few questions.

Sorry if it is obvious.. but what filesystem/OS are you using and do you
 have BBU-writeback on the main data catalog also?

Jesper


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] moving pg_xlog -- yeah, it's worth it!

2010-02-09 Thread Kevin Grittner
Jesper Krogh jes...@krogh.cc wrote:
 
 Sorry if it is obvious.. but what filesystem/OS are you using and
 do you have BBU-writeback on the main data catalog also?
 
Sorry for not providing more context.
 
ATHENA:/var/pgsql/data # uname -a
Linux ATHENA 2.6.16.60-0.39.3-smp #1 SMP Mon May 11 11:46:34 UTC
2009 x86_64 x86_64 x86_64 GNU/Linux
ATHENA:/var/pgsql/data # cat /etc/SuSE-release
SUSE Linux Enterprise Server 10 (x86_64)
VERSION = 10
PATCHLEVEL = 2
 
File system is xfs noatime,nobarrier for all data; OS is on ext3.  I
*think* the pg_xlog mirrored pair is hanging off the same
BBU-writeback controller as the big RAID, but I'd have to track down
the hardware tech to confirm, and he's out today.  System has 16
Xeon CPUs and 64 GB RAM.
 
-Kevin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] moving pg_xlog -- yeah, it's worth it!

2010-02-09 Thread Amitabh Kant
On Tue, Feb 9, 2010 at 10:03 PM, Kevin Grittner kevin.gritt...@wicourts.gov
 wrote:

 Jesper Krogh jes...@krogh.cc wrote:
 File system is xfs noatime,nobarrier for all data; OS is on ext3.  I
 *think* the pg_xlog mirrored pair is hanging off the same
 BBU-writeback controller as the big RAID, but I'd have to track down
 the hardware tech to confirm, and he's out today.  System has 16
 Xeon CPUs and 64 GB RAM.

 -Kevin


Hi Kevin

Just curious if you have a 16 physical CPU's or 16 cores on 4 CPU/8 cores
over 2 CPU with HT.

With regards

Amitabh Kant


Re: [PERFORM] moving pg_xlog -- yeah, it's worth it!

2010-02-09 Thread Kevin Grittner
Amitabh Kant amitabhk...@gmail.com wrote:
 
 Just curious if you have a 16 physical CPU's or 16 cores on 4
 CPU/8 cores over 2 CPU with HT.
 
Four quad core CPUs:
 
vendor_id   : GenuineIntel
cpu family  : 6
model   : 15
model name  : Intel(R) Xeon(R) CPU   X7350  @ 2.93GHz
stepping: 11
cpu MHz : 2931.978
cache size  : 4096 KB
physical id : 3
siblings: 4
core id : 0
cpu cores   : 4
 
-Kevin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance