Re: [PERFORM] Really bad diskio
At Fri, 15 Jul 2005 14:39:36 -0600, Ron Wills wrote: I just wanted to thank everyone for their help. I believe we found a solution that will help with this problem, with the hardware configuration and caching the larger tables into smaller data sets. A valuable lesson learned from this ;) Hello all I'm running a postgres 7.4.5, on a dual 2.4Ghz Athlon, 1Gig RAM and an 3Ware SATA raid. Currently the database is only 16G with about 2 tables with 50+ row, one table 20+ row and a few small tables. The larger tables get updated about every two hours. The problem I having with this server (which is in production) is the disk IO. On the larger tables I'm getting disk IO wait averages of ~70-90%. I've been tweaking the linux kernel as specified in the PostgreSQL documentations and switched to the deadline scheduler. Nothing seems to be fixing this. The queries are as optimized as I can get them. fsync is off in an attempt to help preformance still nothing. Are there any setting I should be look at the could improve on this??? Thanks for and help in advance. Ron ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Really bad diskio
Ron Wills wrote: Hello all I'm running a postgres 7.4.5, on a dual 2.4Ghz Athlon, 1Gig RAM and an 3Ware SATA raid. 2 drives? 4 drives? 8 drives? RAID 1? 0? 10? 5? Currently the database is only 16G with about 2 tables with 50+ row, one table 20+ row and a few small tables. The larger tables get updated about every two hours. The problem I having with this server (which is in production) is the disk IO. On the larger tables I'm getting disk IO wait averages of ~70-90%. I've been tweaking the linux kernel as specified in the PostgreSQL documentations and switched to the deadline scheduler. Nothing seems to be fixing this. The queries are as optimized as I can get them. fsync is off in an attempt to help preformance still nothing. Are there any setting I should be look at the could improve on this??? Thanks for and help in advance. Ron ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly -- Your PostgreSQL solutions company - Command Prompt, Inc. 1.800.492.2240 PostgreSQL Replication, Consulting, Custom Programming, 24x7 support Managed Services, Shared and Dedicated Hosting Co-Authors: plPHP, plPerlNG - http://www.commandprompt.com/ ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
[PERFORM] Really bad diskio
Hello all I'm running a postgres 7.4.5, on a dual 2.4Ghz Athlon, 1Gig RAM and an 3Ware SATA raid. Currently the database is only 16G with about 2 tables with 50+ row, one table 20+ row and a few small tables. The larger tables get updated about every two hours. The problem I having with this server (which is in production) is the disk IO. On the larger tables I'm getting disk IO wait averages of ~70-90%. I've been tweaking the linux kernel as specified in the PostgreSQL documentations and switched to the deadline scheduler. Nothing seems to be fixing this. The queries are as optimized as I can get them. fsync is off in an attempt to help preformance still nothing. Are there any setting I should be look at the could improve on this??? Thanks for and help in advance. Ron ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Really bad diskio
At Fri, 15 Jul 2005 13:45:07 -0700, Joshua D. Drake wrote: Ron Wills wrote: Hello all I'm running a postgres 7.4.5, on a dual 2.4Ghz Athlon, 1Gig RAM and an 3Ware SATA raid. 2 drives? 4 drives? 8 drives? 3 drives raid 5. I don't believe it's the raid. I've tested this by moving the database to the mirrors software raid where the root is found and onto the the SATA raid. Neither relieved the IO problems. I was also was thinking this could be from the transactional subsystem getting overloaded? There are several automated processes that use the DB. Most are just selects, but the data updates and one that updates the smaller tables that are the heavier queries. On their own they seem to work ok, (still high IO, but fairly quick). But if even the simplest select is called during the heavier operation, then everything goes out through the roof. Maybe there's something I'm missing here as well? RAID 1? 0? 10? 5? Currently the database is only 16G with about 2 tables with 50+ row, one table 20+ row and a few small tables. The larger tables get updated about every two hours. The problem I having with this server (which is in production) is the disk IO. On the larger tables I'm getting disk IO wait averages of ~70-90%. I've been tweaking the linux kernel as specified in the PostgreSQL documentations and switched to the deadline scheduler. Nothing seems to be fixing this. The queries are as optimized as I can get them. fsync is off in an attempt to help preformance still nothing. Are there any setting I should be look at the could improve on this??? Thanks for and help in advance. Ron ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly -- Your PostgreSQL solutions company - Command Prompt, Inc. 1.800.492.2240 PostgreSQL Replication, Consulting, Custom Programming, 24x7 support Managed Services, Shared and Dedicated Hosting Co-Authors: plPHP, plPerlNG - http://www.commandprompt.com/ ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Really bad diskio
On Jul 15, 2005, at 2:39 PM, Ron Wills wrote: Hello all I'm running a postgres 7.4.5, on a dual 2.4Ghz Athlon, 1Gig RAM and an 3Ware SATA raid. Operating System? Which file system are you using? I was having a similar problem just a few days ago and learned that ext3 was the culprit. -Dan ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Really bad diskio
On Fri, Jul 15, 2005 at 03:04:35PM -0600, Ron Wills wrote: Ron Wills wrote: Hello all I'm running a postgres 7.4.5, on a dual 2.4Ghz Athlon, 1Gig RAM and an 3Ware SATA raid. 3 drives raid 5. I don't believe it's the raid. I've tested this by moving the database to the mirrors software raid where the root is found and onto the the SATA raid. Neither relieved the IO problems. What filesystem is this? -- Alvaro Herrera (alvherre[a]alvh.no-ip.org) Si no sabes adonde vas, es muy probable que acabes en otra parte. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Really bad diskio
On Fri, 2005-07-15 at 15:04 -0600, Ron Wills wrote: At Fri, 15 Jul 2005 13:45:07 -0700, Joshua D. Drake wrote: Ron Wills wrote: Hello all I'm running a postgres 7.4.5, on a dual 2.4Ghz Athlon, 1Gig RAM and an 3Ware SATA raid. 2 drives? 4 drives? 8 drives? 3 drives raid 5. I don't believe it's the raid. I've tested this by moving the database to the mirrors software raid where the root is found and onto the the SATA raid. Neither relieved the IO problems. Hard or soft RAID? Which controller? Many of the 3Ware controllers (85xx and 95xx) have extremely bad RAID 5 performance. Did you take any pgbench or other benchmark figures before you started using the DB? -jwb ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Really bad diskio
At Fri, 15 Jul 2005 14:00:07 -0700, Jeffrey W. Baker wrote: On Fri, 2005-07-15 at 14:39 -0600, Ron Wills wrote: Hello all I'm running a postgres 7.4.5, on a dual 2.4Ghz Athlon, 1Gig RAM and an 3Ware SATA raid. Currently the database is only 16G with about 2 tables with 50+ row, one table 20+ row and a few small tables. The larger tables get updated about every two hours. The problem I having with this server (which is in production) is the disk IO. On the larger tables I'm getting disk IO wait averages of ~70-90%. I've been tweaking the linux kernel as specified in the PostgreSQL documentations and switched to the deadline scheduler. Nothing seems to be fixing this. The queries are as optimized as I can get them. fsync is off in an attempt to help preformance still nothing. Are there any setting I should be look at the could improve on this??? Can you please characterize this a bit better? Send the output of vmstat or iostat over several minutes, or similar diagnostic information. Also please describe your hardware more. Here's a bit of a dump of the system that should be useful. Processors x2: vendor_id : AuthenticAMD cpu family : 6 model : 8 model name : AMD Athlon(tm) MP 2400+ stepping: 1 cpu MHz : 2000.474 cache size : 256 KB MemTotal: 903804 kB Mandrake 10.0 Linux kernel 2.6.3-19mdk The raid controller, which is using the hardware raid configuration: 3ware 9000 Storage Controller device driver for Linux v2.26.02.001. scsi0 : 3ware 9000 Storage Controller 3w-9xxx: scsi0: Found a 3ware 9000 Storage Controller at 0xe802, IRQ: 17. 3w-9xxx: scsi0: Firmware FE9X 2.02.00.011, BIOS BE9X 2.02.01.037, Ports: 4. Vendor: 3ware Model: Logical Disk 00 Rev: 1.00 Type: Direct-Access ANSI SCSI revision: 00 SCSI device sda: 624955392 512-byte hdwr sectors (319977 MB) SCSI device sda: drive cache: write back, no read (daft) This is also on a 3.6 reiser filesystem. Here's the iostat for 10mins every 10secs. I've removed the stats from the idle drives to reduce the size of this email. Linux 2.6.3-19mdksmp (photo_server) 07/15/2005 avg-cpu: %user %nice%sys %iowait %idle 2.851.532.15 39.52 53.95 Device:tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn sda 82.49 4501.73 188.38 1818836580 76110154 avg-cpu: %user %nice%sys %iowait %idle 0.300.001.00 96.302.40 Device:tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn sda 87.80 6159.20 340.00 61592 3400 avg-cpu: %user %nice%sys %iowait %idle 2.500.001.45 94.351.70 Device:tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn sda 89.60 5402.40 320.80 54024 3208 avg-cpu: %user %nice%sys %iowait %idle 1.000.101.35 97.550.00 Device:tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn sda 105.20 5626.40 332.80 56264 3328 avg-cpu: %user %nice%sys %iowait %idle 0.400.001.00 87.40 11.20 Device:tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn sda 92.61 4484.32 515.48 44888 5160 avg-cpu: %user %nice%sys %iowait %idle 0.450.001.00 92.665.89 Device:tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn sda 89.10 4596.00 225.60 45960 2256 avg-cpu: %user %nice%sys %iowait %idle 0.300.000.80 96.302.60 Device:tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn sda 86.49 3877.48 414.01 38736 4136 avg-cpu: %user %nice%sys %iowait %idle 0.500.001.00 98.150.35 Device:tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn sda 97.10 4710.49 405.19 47152 4056 avg-cpu: %user %nice%sys %iowait %idle 0.350.001.00 98.650.00 Device:tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn sda 93.30 5324.80 186.40 53248 1864 avg-cpu: %user %nice%sys %iowait %idle 0.400.001.10 96.701.80 Device:tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn sda 117.88 5481.72 402.80 54872 4032 avg-cpu: %user %nice%sys %iowait %idle 0.500.001.05 98.300.15 Device:tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn sda 124.00 6081.60 403.20 60816 4032 avg-cpu: %user %nice%sys %iowait %idle 8.750.002.55 84.464.25 Device:tps Blk_read/s Blk_wrtn/s
Re: [PERFORM] Really bad diskio
At Fri, 15 Jul 2005 14:17:34 -0700, Jeffrey W. Baker wrote: On Fri, 2005-07-15 at 15:04 -0600, Ron Wills wrote: At Fri, 15 Jul 2005 13:45:07 -0700, Joshua D. Drake wrote: Ron Wills wrote: Hello all I'm running a postgres 7.4.5, on a dual 2.4Ghz Athlon, 1Gig RAM and an 3Ware SATA raid. 2 drives? 4 drives? 8 drives? 3 drives raid 5. I don't believe it's the raid. I've tested this by moving the database to the mirrors software raid where the root is found and onto the the SATA raid. Neither relieved the IO problems. Hard or soft RAID? Which controller? Many of the 3Ware controllers (85xx and 95xx) have extremely bad RAID 5 performance. Did you take any pgbench or other benchmark figures before you started using the DB? No, unfortunatly, I'm more or less just the developer for the automation systems and admin the system to keep everything going. I have very little say in the hardware used and I don't have any physical access to the machine, it's found a province over :P. But, for what the system, this IO seems unreasonable. I run development on a 1.4Ghz Athlon, Gentoo system, with no raid and I can't reproduce this kind of IO :(. -jwb ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Really bad diskio
On Fri, 2005-07-15 at 15:29 -0600, Ron Wills wrote: Here's a bit of a dump of the system that should be useful. Processors x2: vendor_id : AuthenticAMD cpu family : 6 model : 8 model name : AMD Athlon(tm) MP 2400+ stepping: 1 cpu MHz : 2000.474 cache size : 256 KB MemTotal: 903804 kB Mandrake 10.0 Linux kernel 2.6.3-19mdk The raid controller, which is using the hardware raid configuration: 3ware 9000 Storage Controller device driver for Linux v2.26.02.001. scsi0 : 3ware 9000 Storage Controller 3w-9xxx: scsi0: Found a 3ware 9000 Storage Controller at 0xe802, IRQ: 17. 3w-9xxx: scsi0: Firmware FE9X 2.02.00.011, BIOS BE9X 2.02.01.037, Ports: 4. Vendor: 3ware Model: Logical Disk 00 Rev: 1.00 Type: Direct-Access ANSI SCSI revision: 00 SCSI device sda: 624955392 512-byte hdwr sectors (319977 MB) SCSI device sda: drive cache: write back, no read (daft) This is also on a 3.6 reiser filesystem. Here's the iostat for 10mins every 10secs. I've removed the stats from the idle drives to reduce the size of this email. Linux 2.6.3-19mdksmp (photo_server) 07/15/2005 avg-cpu: %user %nice%sys %iowait %idle 2.851.532.15 39.52 53.95 Device:tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn sda 82.49 4501.73 188.38 1818836580 76110154 avg-cpu: %user %nice%sys %iowait %idle 0.300.001.00 96.302.40 Device:tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn sda 87.80 6159.20 340.00 61592 3400 These I/O numbers are not so horrible, really. 100% iowait is not necessarily a symptom of misconfiguration. It just means you are disk limited. With a database 20 times larger than main memory, this is no surprise. If I had to speculate about the best way to improve your performance, I would say: 1a) Get a better RAID controller. The 3ware hardware RAID5 is very bad. 1b) Get more disks. 2) Get a (much) newer kernel. 3) Try XFS or JFS. Reiser3 has never looked good in my pgbench runs By the way, are you experiencing bad application performance, or are you just unhappy with the iostat figures? Regards, jwb ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] Really bad diskio
At Fri, 15 Jul 2005 14:53:26 -0700, Jeffrey W. Baker wrote: On Fri, 2005-07-15 at 15:29 -0600, Ron Wills wrote: Here's a bit of a dump of the system that should be useful. Processors x2: vendor_id : AuthenticAMD cpu family : 6 model : 8 model name : AMD Athlon(tm) MP 2400+ stepping: 1 cpu MHz : 2000.474 cache size : 256 KB MemTotal: 903804 kB Mandrake 10.0 Linux kernel 2.6.3-19mdk The raid controller, which is using the hardware raid configuration: 3ware 9000 Storage Controller device driver for Linux v2.26.02.001. scsi0 : 3ware 9000 Storage Controller 3w-9xxx: scsi0: Found a 3ware 9000 Storage Controller at 0xe802, IRQ: 17. 3w-9xxx: scsi0: Firmware FE9X 2.02.00.011, BIOS BE9X 2.02.01.037, Ports: 4. Vendor: 3ware Model: Logical Disk 00 Rev: 1.00 Type: Direct-Access ANSI SCSI revision: 00 SCSI device sda: 624955392 512-byte hdwr sectors (319977 MB) SCSI device sda: drive cache: write back, no read (daft) This is also on a 3.6 reiser filesystem. Here's the iostat for 10mins every 10secs. I've removed the stats from the idle drives to reduce the size of this email. Linux 2.6.3-19mdksmp (photo_server) 07/15/2005 avg-cpu: %user %nice%sys %iowait %idle 2.851.532.15 39.52 53.95 Device:tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn sda 82.49 4501.73 188.38 1818836580 76110154 avg-cpu: %user %nice%sys %iowait %idle 0.300.001.00 96.302.40 Device:tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn sda 87.80 6159.20 340.00 61592 3400 These I/O numbers are not so horrible, really. 100% iowait is not necessarily a symptom of misconfiguration. It just means you are disk limited. With a database 20 times larger than main memory, this is no surprise. If I had to speculate about the best way to improve your performance, I would say: 1a) Get a better RAID controller. The 3ware hardware RAID5 is very bad. 1b) Get more disks. 2) Get a (much) newer kernel. 3) Try XFS or JFS. Reiser3 has never looked good in my pgbench runs Not good news :(. I can't change the hardware, hopefully a kernel update and XFS of JFS will make an improvement. I was hoping for software raid (always has worked well), but the client didn't feel conforable with it :P. By the way, are you experiencing bad application performance, or are you just unhappy with the iostat figures? It's affecting the whole system. It is sending the load averages through the roof (from 4 to 12) and processes that would take only a few minutes starts going over an hour, until it clears up. Well, I guess I'll have to drum up some more programming magic... and I'm starting to run out of tricks... I love my job some day :$ Regards, jwb ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings