Re: [PERFORM] moving pg_xlog -- yeah, it's worth it!
Greg Smith g...@2ndquadrant.com wrote: You can easily quantify if the BGW is aggressive enough. Buffers leave the cache three ways, and they each show up as separate counts in pg_stat_bgwriter: buffers_checkpoint, buffers_clean (the BGW), and buffers_backend (the queries). Cranking it up further tends to shift writes out of buffers_backend, which are the ones you want to avoid, toward buffers_clean instead. If buffers_backend is already low on a percentage basis compared to the other two, there's little benefit in trying to make the BGW do more. Here are the values from our two largest and busiest systems (where we found the pg_xlog placement to matter so much). It looks to me like a more aggressive bgwriter would help, yes? cir= select * from pg_stat_bgwriter ; -[ RECORD 1 ]--+ checkpoints_timed | 125996 checkpoints_req| 16932 buffers_checkpoint | 342972024 buffers_clean | 343634920 maxwritten_clean | 9928 buffers_backend| 575589056 buffers_alloc | 52397855471 cir= select * from pg_stat_bgwriter ; -[ RECORD 1 ]--+ checkpoints_timed | 125992 checkpoints_req| 16840 buffers_checkpoint | 260358442 buffers_clean | 474768152 maxwritten_clean | 9778 buffers_backend| 565837397 buffers_alloc | 71463873477 Current settings: bgwriter_delay = '200ms' bgwriter_lru_maxpages = 1000 bgwriter_lru_multiplier = 4 Any suggestions on how far to push it? -Kevin -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] moving pg_xlog -- yeah, it's worth it!
On Feb 23, 2010, at 2:23 PM, Kevin Grittner wrote: Here are the values from our two largest and busiest systems (where we found the pg_xlog placement to matter so much). It looks to me like a more aggressive bgwriter would help, yes? cir= select * from pg_stat_bgwriter ; -[ RECORD 1 ]--+ checkpoints_timed | 125996 checkpoints_req| 16932 buffers_checkpoint | 342972024 buffers_clean | 343634920 maxwritten_clean | 9928 buffers_backend| 575589056 buffers_alloc | 52397855471 cir= select * from pg_stat_bgwriter ; -[ RECORD 1 ]--+ checkpoints_timed | 125992 checkpoints_req| 16840 buffers_checkpoint | 260358442 buffers_clean | 474768152 maxwritten_clean | 9778 buffers_backend| 565837397 buffers_alloc | 71463873477 Current settings: bgwriter_delay = '200ms' bgwriter_lru_maxpages = 1000 bgwriter_lru_multiplier = 4 Any suggestions on how far to push it? I don't know how far to push it, but you could start by reducing the delay time and observe how that affects performance.
Re: [PERFORM] moving pg_xlog -- yeah, it's worth it!
Kevin Grittner wrote: Hannu Krosing wrote: Can it be, that each request does at least 1 write op (update session or log something) ? Well, the web application connects through a login which only has SELECT rights; but, in discussing a previous issue we've pretty well established that it's not unusual for a read to force a dirty buffer to write to the OS. Perhaps this is the issue here again. Nothing is logged on the database server for every request. I don't think it explains it, because dirty buffers are obviously written to the data area, not pg_xlog. I wonder if it might also pay to make the background writer even more aggressive than we have, so that SELECT-only queries don't spend so much time writing pages. That's worth trying. Anyway, given that these are replication targets, and aren't the database of origin for any data of their own, I guess there's no reason not to try asynchronous commit. Yeah; since the transactions only ever write commit records to WAL, it wouldn't matter a bit that they are lost on crash. And you should see an improvement, because they wouldn't have to flush at all. -- Alvaro Herrerahttp://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] moving pg_xlog -- yeah, it's worth it!
Alvaro Herrera wrote: Kevin Grittner wrote: Anyway, given that these are replication targets, and aren't the database of origin for any data of their own, I guess there's no reason not to try asynchronous commit. Yeah; since the transactions only ever write commit records to WAL, it wouldn't matter a bit that they are lost on crash. And you should see an improvement, because they wouldn't have to flush at all. Actually, a transaction that performed no writes doesn't get a commit WAL record written, so it shouldn't make any difference at all. -- Alvaro Herrerahttp://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] moving pg_xlog -- yeah, it's worth it!
Kevin Grittner wrote: Alvaro Herrera alvhe...@commandprompt.com wrote: Actually, a transaction that performed no writes doesn't get a commit WAL record written, so it shouldn't make any difference at all. Well, concurrent to the web application is the replication. Would asynchronous commit of that potentially alter the pattern of writes such that it had less impact on the reads? Well, certainly async commit would completely change the pattern of writes: it would give the controller an opportunity to reorder them according to some scheduler. Otherwise they are strictly serialized. I'm thinking, again, of why the placement of the pg_xlog on a separate file system made such a dramatic difference to the read-only response time -- might it make less difference if the replication was using asynchronous commit? Yeah, I think it would have been less notorious, but this is all theoretical. -- Alvaro Herrerahttp://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] moving pg_xlog -- yeah, it's worth it!
Kevin Grittner wrote: I wonder if it might also pay to make the background writer even more aggressive than we have, so that SELECT-only queries don't spend so much time writing pages. You can easily quantify if the BGW is aggressive enough. Buffers leave the cache three ways, and they each show up as separate counts in pg_stat_bgwriter: buffers_checkpoint, buffers_clean (the BGW), and buffers_backend (the queries). Cranking it up further tends to shift writes out of buffers_backend, which are the ones you want to avoid, toward buffers_clean instead. If buffers_backend is already low on a percentage basis compared to the other two, there's little benefit in trying to make the BGW do more. -- Greg Smith2ndQuadrant Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.com -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] moving pg_xlog -- yeah, it's worth it!
Kevin Grittner wrote: Jesper Krogh jes...@krogh.cc wrote: Sorry if it is obvious.. but what filesystem/OS are you using and do you have BBU-writeback on the main data catalog also? Sorry for not providing more context. ATHENA:/var/pgsql/data # uname -a Linux ATHENA 2.6.16.60-0.39.3-smp #1 SMP Mon May 11 11:46:34 UTC 2009 x86_64 x86_64 x86_64 GNU/Linux ATHENA:/var/pgsql/data # cat /etc/SuSE-release SUSE Linux Enterprise Server 10 (x86_64) VERSION = 10 PATCHLEVEL = 2 File system is xfs noatime,nobarrier for all data; OS is on ext3. I *think* the pg_xlog mirrored pair is hanging off the same BBU-writeback controller as the big RAID, but I'd have to track down the hardware tech to confirm, and he's out today. System has 16 Xeon CPUs and 64 GB RAM. I would be surprised if the RAID controller had a BBU-writeback cache. I don't think having xlog share a BBU-writeback makes things slower, and if it does, I would love for someone to explain why. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] moving pg_xlog -- yeah, it's worth it!
On Thu, Feb 11, 2010 at 4:29 AM, Bruce Momjian br...@momjian.us wrote: Kevin Grittner wrote: Jesper Krogh jes...@krogh.cc wrote: Sorry if it is obvious.. but what filesystem/OS are you using and do you have BBU-writeback on the main data catalog also? Sorry for not providing more context. ATHENA:/var/pgsql/data # uname -a Linux ATHENA 2.6.16.60-0.39.3-smp #1 SMP Mon May 11 11:46:34 UTC 2009 x86_64 x86_64 x86_64 GNU/Linux ATHENA:/var/pgsql/data # cat /etc/SuSE-release SUSE Linux Enterprise Server 10 (x86_64) VERSION = 10 PATCHLEVEL = 2 File system is xfs noatime,nobarrier for all data; OS is on ext3. I *think* the pg_xlog mirrored pair is hanging off the same BBU-writeback controller as the big RAID, but I'd have to track down the hardware tech to confirm, and he's out today. System has 16 Xeon CPUs and 64 GB RAM. I would be surprised if the RAID controller had a BBU-writeback cache. I don't think having xlog share a BBU-writeback makes things slower, and if it does, I would love for someone to explain why. I believe in the past when this discussion showed up it was mainly due to them being on the same file system (and then not with pg_xlog separate) that made the biggest difference. I recall there being a noticeable performance gain from having two file systems on the same logical RAID device even. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] moving pg_xlog -- yeah, it's worth it!
Kevin Grittner kevin.gritt...@wicourts.gov wrote: Jesper Krogh jes...@krogh.cc wrote: Sorry if it is obvious.. but what filesystem/OS are you using and do you have BBU-writeback on the main data catalog also? File system is xfs noatime,nobarrier for all data; OS is on ext3. I *think* the pg_xlog mirrored pair is hanging off the same BBU-writeback controller as the big RAID, but I'd have to track down the hardware tech to confirm Another example of why I shouldn't trust my memory. Per the hardware tech: OS: /dev/sda is RAID1 - 2 x 2.5 15k SAS disk pg_xlog: /dev/sdb is RAID1 - 2 x 2.5 15k SAS disk These reside on a ServeRAID-MR10k controller with 256MB BB cache. data:/dev/sdc is RAID5 - 30 x 3.5 15k SAS disk These reside on the DS3200 disk subsystem with 512MB BB cache per controller and redundant drive loops. At least I had the file systems and options right. ;-) -Kevin -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] moving pg_xlog -- yeah, it's worth it!
Kevin Grittner wrote: Another example of why I shouldn't trust my memory. Per the hardware tech: OS: /dev/sda is RAID1 - 2 x 2.5 15k SAS disk pg_xlog: /dev/sdb is RAID1 - 2 x 2.5 15k SAS disk These reside on a ServeRAID-MR10k controller with 256MB BB cache. data:/dev/sdc is RAID5 - 30 x 3.5 15k SAS disk These reside on the DS3200 disk subsystem with 512MB BB cache per controller and redundant drive loops. Hmm, so maybe the performance benefit is not from it being on a separate array, but from it being RAID1 instead of RAID5? -- Alvaro Herrerahttp://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] moving pg_xlog -- yeah, it's worth it!
* Alvaro Herrera alvhe...@commandprompt.com [100211 12:58]: Hmm, so maybe the performance benefit is not from it being on a separate array, but from it being RAID1 instead of RAID5? Or the cumulative effects of: 1) Dedicated spindles/Raid1 2) More BBU cache available (I can't imagine the OS pair writing much) 3) not being queued behind data writes before getting to controller 3) Not waiting for BBU cache to be available (which is shared with all data writes) which requires RAID5 writes to complete... Really, there's *lots* of variables here. The basics being that WAL on the same FS as data, on a RAID5, even with BBU is worse than WAL on a dedicated set of RAID1 spindles with it's own BBU. Wow! ;-) -- Aidan Van Dyk Create like a god, ai...@highrise.ca command like a king, http://www.highrise.ca/ work like a slave. signature.asc Description: Digital signature
Re: [PERFORM] moving pg_xlog -- yeah, it's worth it!
Aidan Van Dyk ai...@highrise.ca wrote: Alvaro Herrera alvhe...@commandprompt.com wrote: Hmm, so maybe the performance benefit is not from it being on a separate array, but from it being RAID1 instead of RAID5? Or the cumulative effects of: 1) Dedicated spindles/Raid1 2) More BBU cache available (I can't imagine the OS pair writing much) 3) not being queued behind data writes before getting to controller 3) Not waiting for BBU cache to be available (which is shared with all data writes) which requires RAID5 writes to complete... Really, there's *lots* of variables here. The basics being that WAL on the same FS as data, on a RAID5, even with BBU is worse than WAL on a dedicated set of RAID1 spindles with it's own BBU. Wow! Sure, OK, but what surprised me was that a set of 15 read-only queries (with pretty random reads) took almost twice as long when the WAL files were on the same file system. That's with OS writes being only about 10% of reads, and *that's* with 128 GB of RAM which keeps a lot of the reads from having to go to the disk. I would not have expected that a read-mostly environment like this would be that sensitive to the WAL file placement. (OK, I *did* request the separate file system for them anyway, but I thought it was going to be a marginal benefit, not something this big.) -Kevin -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
[PERFORM] moving pg_xlog -- yeah, it's worth it!
Due to an error during an update to the system kernel on a database server backing a web application, we ran for a month (mid-December to mid-January) with the WAL files in the pg_xlog subdirectory on the same 40-spindle array as the data -- only the OS was on a separate mirrored pair of drives. When we found the problem, we moved the pg_xlog directory back to its own mirrored pair of drives and symlinked to it. It's a pretty dramatic difference. Graph attached, this is response time for a URL request (like what a user from the Internet would issue) which runs 15 queries and formats the results. Response time is 85 ms without putting WAL on its own RAID; 50 ms with it on its own RAID. This is a real-world, production web site with 1.3 TB data, millions of hits per day, and it's an active replication target 24/7. Frankly, I was quite surprised by this, since some of the benchmarks people have published on the effects of using a separate RAID for the WAL files have only shown a one or two percent difference when using a hardware RAID controller with BBU cache configured for write-back. No assistance needed; just posting a performance data point. -Kevin attachment: webresponse.png -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] moving pg_xlog -- yeah, it's worth it!
Frankly, I was quite surprised by this, since some of the benchmarks people have published on the effects of using a separate RAID for the WAL files have only shown a one or two percent difference when using a hardware RAID controller with BBU cache configured for write-back. Hi Kevin. Nice report, but just a few questions. Sorry if it is obvious.. but what filesystem/OS are you using and do you have BBU-writeback on the main data catalog also? Jesper -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] moving pg_xlog -- yeah, it's worth it!
Jesper Krogh jes...@krogh.cc wrote: Sorry if it is obvious.. but what filesystem/OS are you using and do you have BBU-writeback on the main data catalog also? Sorry for not providing more context. ATHENA:/var/pgsql/data # uname -a Linux ATHENA 2.6.16.60-0.39.3-smp #1 SMP Mon May 11 11:46:34 UTC 2009 x86_64 x86_64 x86_64 GNU/Linux ATHENA:/var/pgsql/data # cat /etc/SuSE-release SUSE Linux Enterprise Server 10 (x86_64) VERSION = 10 PATCHLEVEL = 2 File system is xfs noatime,nobarrier for all data; OS is on ext3. I *think* the pg_xlog mirrored pair is hanging off the same BBU-writeback controller as the big RAID, but I'd have to track down the hardware tech to confirm, and he's out today. System has 16 Xeon CPUs and 64 GB RAM. -Kevin -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] moving pg_xlog -- yeah, it's worth it!
On Tue, Feb 9, 2010 at 10:03 PM, Kevin Grittner kevin.gritt...@wicourts.gov wrote: Jesper Krogh jes...@krogh.cc wrote: File system is xfs noatime,nobarrier for all data; OS is on ext3. I *think* the pg_xlog mirrored pair is hanging off the same BBU-writeback controller as the big RAID, but I'd have to track down the hardware tech to confirm, and he's out today. System has 16 Xeon CPUs and 64 GB RAM. -Kevin Hi Kevin Just curious if you have a 16 physical CPU's or 16 cores on 4 CPU/8 cores over 2 CPU with HT. With regards Amitabh Kant
Re: [PERFORM] moving pg_xlog -- yeah, it's worth it!
Amitabh Kant amitabhk...@gmail.com wrote: Just curious if you have a 16 physical CPU's or 16 cores on 4 CPU/8 cores over 2 CPU with HT. Four quad core CPUs: vendor_id : GenuineIntel cpu family : 6 model : 15 model name : Intel(R) Xeon(R) CPU X7350 @ 2.93GHz stepping: 11 cpu MHz : 2931.978 cache size : 4096 KB physical id : 3 siblings: 4 core id : 0 cpu cores : 4 -Kevin -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance