Re: [PERFORM] How to improve Postgres performance
On Tue, 30 Aug 2005, Hemant Pandey wrote: So please tell me how I can improve database performance through configuration parameters. I had tried to change parameters in postgresql.conf, but to no avail. Now I am trying Auto Vacuum, but don't know how to run it.

The most important part is that you need to run VACUUM ANALYZE regularly. Vacuum can be started each night in a cron job, started from pg_autovacuum when it thinks it's needed, or started in some other way. In any case, it has to be run whenever the data in the database have changed enough.

The config parameters that are most important in my experience are effective_cache_size and shared_buffers. This is a text I like (it's for pg 7.4 but still useful): http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html

-- /Dennis Björklund

---(end of broadcast)--- TIP 6: explain analyze is your friend
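If you take the cron route, a crontab fragment along these lines is a reasonable starting point. The schedule, the choice of vacuuming every database, and the quiet flag are illustrative assumptions, not anything prescribed in the thread:

```
# Hypothetical crontab entry for the postgres user: nightly VACUUM ANALYZE
# of all databases at 03:30. Tune the schedule to your update volume; a
# heavily updated database may need vacuuming more than once a day.
30 3 * * * vacuumdb --all --analyze --quiet
```

pg_autovacuum (a contrib daemon in this era) is the alternative when update rates vary too much for a fixed schedule.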
Re: [PERFORM] shared buffers
Carlos Henrique Reimer wrote: I forgot to say that it's a 12GB database... Ok, I'll set shared buffers to 30,000 pages, but even so, shouldn't meminfo and top show some shared pages? I heard something that Red Hat 9 can't handle RAM higher than 2GB very well. Is that right? Thanks in advance!

RH9, like any 32-bit OS, is limited to a 2GB address space without special tricks. However, it can access more than 2GB for the OS disk cache using PAE if you are running the bigmem kernel.

---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
[PERFORM] RAID Configuration Suggestion
Hello, We are about to install a new PostgreSQL server, and despite being a very humble configuration compared to the ones we see on the list, it's the biggest one we've got till now. The server is a Dual Xeon 3.0 with 2 GB RAM and two SCSI disks. Our main doubt is what the best configuration for the disks is. We are thinking about using them in a RAID-0 array. Is this the best option? What do you suggest on partitioning? Separate partitions for the OS, data and pg_xlog?

We'll have some time to work on performance tests, and if someone is interested we can provide our results. Thanks in advance, Alvaro

---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] RAID Configuration Suggestion
On Tue, Aug 30, 2005 at 09:37:17 -0300, Alvaro Nunes Melo [EMAIL PROTECTED] wrote: The server is a Dual Xeon 3.0 with 2 GB RAM and two SCSI disks. Our main doubt is what is the best configuration for the disks. We are thinking about using them in a RAID-0 array. Is this the best option? What do you suggest on partitioning? Separate partitions for the OS, data and pg_xlog?

You don't have a lot of options with just two disks. What are you trying to accomplish with RAID? RAID 0 will possibly give you some speedup, while RAID 1 will give you some fault tolerance and some speedup of reads, but will cost you half your disk space.

---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
[PERFORM] Observation about db response time
Hello Friends, We have a database on pgsql 7.4.2. The database was responding very slowly even after a full vacuum analyze (select count(*) from some_table_having_18000_records was taking 18 sec). We took a backup of that db and restored it back. Now the same db on the same PC is responding fast (the same query is taking 18 ms). But we can't do that as a standing solution for the slow response. Has anybody faced a similar problem? Is this due to some internal problem of pgsql? Is there any way to speed up the database? Regards, akshay --- Akshay Mathur SMTS, Product Verification AirTight Networks, Inc. (www.airtightnetworks.net) O: +91 20 2588 1555 ext 205 F: +91 20 2588 1445
Re: [PERFORM] Observation about db response time
On Tue, 30 Aug 2005 18:35:30 +0530 Akshay Mathur [EMAIL PROTECTED] wrote: Hello Friends, We have a database on pgsql 7.4.2. The database was responding very slowly even after a full vacuum analyze (select count(*) from some_table_having_18000_records was taking 18 sec). We took a backup of that db and restored it back. Now the same db on the same PC is responding fast (the same query is taking 18 ms). But we can't do that as a standing solution for the slow response. Has anybody faced a similar problem? Is this due to some internal problem of pgsql? Is there any way to speed up the database?

This could be because you don't have max_fsm_pages and max_fsm_relations set up correctly, or are not doing full vacuums often enough. If your database deletes a ton of data as a matter of course, then sometimes a full vacuum will not clear up as much space as it could. Try increasing those configuration values and doing vacuums more often. You should also explore upgrading to the latest 8.0, as you will no doubt see noticeable speed improvements.

- Frank Wiles [EMAIL PROTECTED] http://www.wiles.org -

---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
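To make the free-space-map suggestion concrete, here is a hedged postgresql.conf sketch. The numbers are placeholders only; the right values come from the FSM summary printed at the end of a database-wide VACUUM VERBOSE run:

```
# postgresql.conf (7.4/8.0 era) -- illustrative values, not recommendations.
# max_fsm_pages must exceed the total number of pages with reusable space
# reported by VACUUM VERBOSE, or dead space accumulates between vacuums.
# max_fsm_relations must exceed the number of tables and indexes in the
# cluster. Both require a server restart to take effect in this era.
max_fsm_pages = 200000
max_fsm_relations = 1000
```

If VACUUM VERBOSE reports more needed pages than max_fsm_pages, the bloat the poster describes (fixed only by dump/restore) is the expected symptom.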
Re: [PERFORM] RAID Configuration Suggestion
Please keep replies copied to the list so that others may contribute to and learn from the discussion.

On Tue, Aug 30, 2005 at 10:15:13 -0300, Alvaro Nunes Melo [EMAIL PROTECTED] wrote: Hello Bruno, Bruno Wolff III wrote: On Tue, Aug 30, 2005 at 09:37:17 -0300, Alvaro Nunes Melo [EMAIL PROTECTED] wrote: The server is a Dual Xeon 3.0 with 2 GB RAM and two SCSI disks. Our main doubt is what is the best configuration for the disks. We are thinking about using them in a RAID-0 array. Is this the best option? What do you suggest on partitioning? Separate partitions for the OS, data and pg_xlog? Our main goal is performance speedup. Disk space might not be a problem. I've read a lot here about moving pg_xlog to different partitions, and we'll surely make tests to see what configuration might be better.

This isn't a very good mix of hardware for running postgres. Xeons have some context switching issues for which you will probably see some speedup in 8.1. (So if you aren't going into production for several months you might want to be using 8.1beta.) Having only two disk drives is also not a good idea. With what you have, you either want to use RAID 0 and not worry too much about how the disks are partitioned, or use one disk for WAL logging and the other for everything else. There are other people on the list who can probably give you a better idea of which of these options is likely to be better in your case. However, they may need to know more about your RAID controller; in particular, its model and how much battery-backed memory it has.

---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] OSX Performance
On Aug 29, 2005, at 1:57 PM, Tom Lane wrote: You must have CFLAGS set to empty in your build environment, because configure will certainly default to -O2 if not overridden. It works fine for me on OS X. Maybe you want to trace through the configure script and see why it's doing something else?

/me hangs head in shame. Yes. I'd been futzing with various settings and had CFLAGS set to empty instead of cleared out. 8.0.3 and -snapshot (8/29) both seem to now compile with -O2.

Anyway, I tried putting together a nice self-data-producing test case but that didn't cause the bug. So I'm trying to get this dump as small as possible (I'll email you a url later). To tide things over, here's the gprof (and shark) output for my query of doom.

linux box:

 6.36  0.41  0.41   240694  0.00  0.00  _bt_compare
 5.97  0.79  0.38   907242  0.00  0.00  AllocSetAlloc
 4.55  1.07  0.29   135008  0.00  0.00  hash_any
 4.16  1.34  0.27   185684  0.00  0.00  MemoryContextAllocZeroAligned
 3.30  1.55  0.21    39152  0.01  0.01  localsub
 2.98  1.74  0.19  1213172  0.00  0.00  AllocSetFreeIndex
 2.83  1.92  0.18    52695  0.00  0.00  nocachegetattr
 2.75  2.10  0.17   134775  0.00  0.00  hash_search
 2.51  2.25  0.16    47646  0.00  0.01  StrategyBufferLookup
 2.28  2.40  0.14    71990  0.00  0.00  fmgr_isbuiltin
 2.20  2.54  0.14    33209  0.00  0.00  _bt_moveright
 1.88  2.66  0.12    78864  0.00  0.00  comparetup_heap
 1.57  2.76  0.10    63485  0.00  0.00  SearchCatCache
 1.41  2.85  0.09    39152  0.00  0.00  timesub
 1.26  2.93  0.08   325246  0.00  0.00  tas
 1.26  3.01  0.08   305883  0.00  0.00  AllocSetFree
 1.26  3.09  0.08   162622  0.00  0.00  LWLockAcquire

and on osx: (self, total, library, func)

29.0%  29.0%  postmaster         _bt_checkkeys
15.6%  15.6%  postmaster         FunctionCall2
10.4%  10.4%  libSystem.B.dylib  __isnand
 9.5%   9.5%  postmaster         timestamp_cmp_internal
 9.3%   9.3%  postmaster         _bt_step
 5.3%   5.3%  postmaster         timestamp_le
 4.9%   4.9%  postmaster         _bt_next
 3.6%   3.6%  postmaster         dyld_stub___isnand
 3.1%   3.1%  postmaster         timestamp_gt
 1.9%   1.9%  postmaster         int4eq
 1.3%   1.3%  postmaster         BufferGetBlockNumber
 0.6%   0.6%  postmaster         LWLockAcquire
 0.5%   0.5%  postmaster         LWLockRelease
 0.4%   0.4%  postmaster         hash_search

On my failed simulated attempt here's what things looked like (the data should have been relatively similar).

linux:

 5.39  0.28  0.28   852086  0.00  0.00  AllocSetAlloc
 4.90  0.53  0.25   130165  0.00  0.00  hash_any
 4.12  0.73  0.21   214061  0.00  0.00  _bt_compare
 4.12  0.94  0.21    39152  0.01  0.01  localsub
 4.02  1.15  0.20   160487  0.00  0.00  MemoryContextAllocZeroAligned
 3.24  1.31  0.17  1157316  0.00  0.00  AllocSetFreeIndex
 3.14  1.48  0.16    64375  0.00  0.00  fmgr_isbuiltin
 2.55  1.60  0.13    56142  0.00  0.00  SearchCatCache
 2.35  1.73  0.12   130076  0.00  0.00  hash_search
 1.76  1.81  0.09    39152  0.00  0.00  timesub
 1.67  1.90  0.09   221469  0.00  0.00  timestamp_cmp_internal
 1.67  1.99  0.09    56069  0.00  0.00  MemoryContextCreate
 1.57  2.06  0.08   145787  0.00  0.00  LWLockRelease
 1.37  2.13  0.07   289119  0.00  0.00  pfree
 1.37  2.21  0.07     8002  0.01  0.02  ExecMakeFunctionResult
 1.37  2.27  0.07     8000  0.01  0.22  ExecInitIndexScan
 1.18  2.33  0.06   291574  0.00  0.00  tas

and on osx: (which runs very fast, usually a couple hundred ms faster than the linux box)

 5.9%   5.9%  postmaster         LWLockAcquire
 5.2%   5.2%  postmaster         AllocSetAlloc
 4.9%   4.9%  postmaster         LWLockRelease
 3.9%   3.9%  postmaster         hash_any
 3.6%   3.6%  postmaster         _bt_compare
 2.9%   2.9%  postmaster         hash_search
 2.6%   2.6%  postmaster         MemoryContextAllocZeroAligned
 2.6%   2.6%  postmaster         ExecInitExpr
 2.0%   2.0%  mach_kernel        ml_set_interrupts_enabled
 2.0%   2.0%  postmaster         fmgr_info_cxt_security
 2.0%   2.0%  postmaster         AllocSetFree
 1.6%   1.6%  postmaster         MemoryContextAlloc
 1.6%   1.6%  postmaster         FunctionCall2
 1.6%   1.6%  postmaster         AllocSetDelete
 1.6%   1.6%  libSystem.B.dylib  __isnand

which to me anyway, looks like basically the same profile. So there must be something about the exact nature of the data that is kicking it in the nuts. I tried making a
Re: [PERFORM] RAID Configuration Suggestion
At 08:37 AM 8/30/2005, Alvaro Nunes Melo wrote: Hello, We are about to install a new PostgreSQL server, and despite being a very humble configuration compared to the ones we see on the list, it's the biggest one we've got till now. The server is a Dual Xeon 3.0 with 2 GB RAM and two SCSI disks. Our main doubt is what is the best configuration for the disks. We are thinking about using them in a RAID-0 array. Is this the best option? What do you suggest on partitioning? Separate partitions for the OS, data and pg_xlog?

This is _very_ modest HW. Unless your DB and/or DB load is similarly modest, you are not going to be happy with the performance of your DBMS. At a minimum, for safety reasons you want 4 HDs: 2 for a RAID 1 set for the DB, and 2 for a RAID 1 set for the OS + pg_xlog. 2 extra HDs, even SCSI HDs, are cheap, especially when compared to the cost of corrupted or lost data. HDs and RAM are cheap enough that you should be able to upgrade in more ways than this, but do at least that upgrade! Beyond that, the best ways to spend your limited $ are highly dependent on your exact DB and its usage pattern.

Ron Peacetree

---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Observation about db response time
On Aug 30, 2005, at 9:05 AM, Akshay Mathur wrote: We were having a database in pgsql 7.4.2. The database was responding very slowly even after full vacuum analyze (select count(*) from some_table_having_18000_records was taking 18 sec).

On a 7.4.2 db there should probably be no index bloat, but there could be. Does REINDEX on your tables help? If not, then VACUUM FULL followed by REINDEX may help. The latter should result in nearly the same as your dump+restore. And you need to run vacuum often enough to keep your tables from bloating. How often that is depends on your update/delete rate.

Also, updating to 8.0 may help.

Vivek Khera, Ph.D. +1-301-869-4449 x806
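A minimal sketch of that maintenance sequence, using a hypothetical table name; note that both commands take exclusive locks on the table, so they belong in a maintenance window:

```sql
-- Rebuild the indexes first; if this alone restores the timings,
-- the problem was index bloat.
REINDEX TABLE some_table;

-- Otherwise reclaim dead space in the heap as well, then refresh
-- the planner statistics.
VACUUM FULL ANALYZE some_table;
```

Comparing the query's timing after each step shows which kind of bloat (index vs. heap) was actually responsible.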
Re: [PERFORM] Need indexes on empty tables for good performance ?
Lenard, Rohan (Rohan) wrote: I've read that indexes aren't used for COUNT(*) and I've noticed (7.3.x) with EXPLAIN that indexes never seem to be used on empty tables - is there any reason to have indexes on empty tables, or will postgresql never use them?

You could add a row, vacuum analyze, delete the row, etc. Then you are fine until you vacuum analyze again ;-) This is a feature designed to prevent really bad plans when you are loading tables with data. However, you are right: it can create bad plans sometimes. Any chance one can eventually come up with a way to tell the planner that an empty table is expected not to grow? Otherwise, I can see nightmares in a data warehouse environment where you have an empty parent table...

Best Wishes, Chris Travers Metatron Technology Consulting

---(end of broadcast)--- TIP 6: explain analyze is your friend
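The behavior under discussion can be seen with a sketch like the following, against a hypothetical empty parent table. The interesting part is comparing the two plans: the pre-ANALYZE plan rests on default size assumptions, while the post-ANALYZE plan knows the table is empty, which is exactly when a sequential scan beats the index:

```sql
CREATE TABLE parent (id integer);
CREATE INDEX parent_id_idx ON parent (id);

-- Never-analyzed table: the planner falls back on default size guesses
-- rather than trusting that the table has zero rows.
EXPLAIN SELECT * FROM parent WHERE id = 42;

ANALYZE parent;

-- Stats now say the table is empty; scanning zero pages sequentially is
-- the cheapest plan, so the index goes unused.
EXPLAIN SELECT * FROM parent WHERE id = 42;
```

The "add a row, analyze, delete the row" trick in the post works by freezing statistics at the one-row state, until the next analyze overwrites them.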
[PERFORM] High load and iowait but no disk access
We have been trying to pinpoint what originally seemed to be an I/O bottleneck but which now seems to be an issue with either Postgresql or RHES 3. We have the following test environment on which we can reproduce the problem:

1) Test System A: Dell 6650 Quad Xeon Pentium 4, 8 Gig of RAM, OS: RHES 3 update 2, Storage: NetApp FAS270 connected using an FC card, using 10 disks

2) Test System B: Dell Dual Xeon Pentium III, 2 Gig of RAM, OS: RHES 3 update 2, Storage: NetApp FAS920 connected using an FC card, using 28 disks

Our database size is around 30G. The behavior we see is that when running queries that do random reads on disk, IOWAIT goes over 80% and actual disk IO falls to a crawl at a throughput below 3000kB/s (We usually average 4 kB/s to 8 kB/s on sequential read operations on the netapps). The stats of the NetApp do confirm that it is sitting idle. Doing an strace on the Postgresql process shows that it is doing seeks and reads.

So my question is: where is this iowait time spent? Is there a way to pinpoint the problem in more detail? We are able to reproduce this behavior with Postgresql 7.4.8 and 8.0.3. I have included the output of top, vmstat, strace and sysstat from the NetApp from System B while running a single query that generates this behavior.
Rémy

top output:

06:27:28 up 5 days, 16:59, 6 users, load average: 1.04, 1.30, 1.01
72 processes: 71 sleeping, 1 running, 0 zombie, 0 stopped
CPU states:  cpu    user   nice  system  irq   softirq  iowait  idle
             total  2.7%   0.0%  1.0%    0.1%  0.2%     46.0%   49.5%
             cpu00  0.2%   0.0%  0.2%    0.0%  0.2%      2.2%   97.2%
             cpu01  5.3%   0.0%  1.9%    0.3%  0.3%     89.8%    1.9%
Mem:  2061696k av, 2043936k used, 17760k free, 0k shrd, 3916k buff
      1566332k actv, 296648k in_d, 30504k in_c
Swap: 16771584k av, 21552k used, 16750032k free, 1933772k cached

  PID USER      PRI NI  SIZE  RSS SHARE STAT %CPU %MEM TIME CPU COMMAND
30960 postgres   15  0 13424  10M  9908 D     2.7  0.5 2:00   1 postmaster
30538 root       15  0  1080  764   524 S     0.7  0.0 0:43   0 sshd
    1 root       15  0   496  456   436 S     0.0  0.0 0:08   0 init
    2 root       RT  0     0    0     0 SW    0.0  0.0 0:00   0 migration/0
    3 root       RT  0     0    0     0 SW    0.0  0.0 0:00   1 migration/1
    4 root       15  0     0    0     0 SW    0.0  0.0 0:01   0 keventd
    5 root       34 19     0    0     0 SWN   0.0  0.0 0:00   0 ksoftirqd/0
    6 root       34 19     0    0     0 SWN   0.0  0.0 0:00   1 ksoftirqd/1
    9 root       15  0     0    0     0 SW    0.0  0.0 0:24   1 bdflush
    7 root       15  0     0    0     0 SW    0.0  0.0 6:53   1 kswapd
    8 root       15  0     0    0     0 SW    0.0  0.0 8:44   1 kscand
   10 root       15  0     0    0     0 SW    0.0  0.0 0:13   0 kupdated
   11 root       25  0     0    0     0 SW    0.0  0.0 0:00   0 mdrecoveryd
   17 root       15  0     0    0     0 SW    0.0  0.0 0:00   0 ahc_dv_0

vmstat output:

procs          memory                      swap          io      system       cpu
 r b  swpd  free  buff   cache    si so    bi   bo    in  cs  us sy id wa
 0 1 21552 17796  4872 1931928     2  3     3    1    27   6   2  1  7  3
 0 1 21552 18044  4880 1931652     0  0  1652    0   397 512   1  2 50 47
 0 1 21552 17976  4896 1931664     0  0  2468    0   407 552   2  2 50 47
 1 0 21552 17984  4896 1931608     0  0  2124    0   418 538   3  3 48 46
 0 1 21552 18028  4900 1931536     0  0  1592    0   385 509   1  3 50 46
 0 1 21552 18040  4916 1931488     0  0  1620  820   419 581   2  2 50 46
 0 1 21552 17968  4916 1931536     0  4  1708    4   402 554   3  1 50 46
 1 1 21552 18052  4916 1931388     0  0  1772    0   409 531   3  1 49 47
 0 1 21552 17912  4924 1931492     0  0  1772    0   408 565   3  1 48 48
 0 1 21552 17932  4932 1931440     0  4  1356    4   391 545   5  0 49 46
 0 1 21552 18320  4944 1931016     0  4  1500  840   414 571   1  1 48 50
 0 1 21552 17872  4944 1931440     0  0  2116    0   392 496   1  5 46 48
 0 1 21552 18060  4944 1931232     0  0  2232    0
423 597 1 2 48 49
 1 1 21552 17684  4944 1931584     0  0  1752    0   395 537   1  1 50 48
 0 1 21552 18000  4944 1931240     0  0  1576    0   401 549   0  1 50 49

NetApp stats:

CPU NFS CIFS HTTP Total  Net kB/s  Disk kB/s  Tape kB/s  Cache Cache CP  CP Disk  DAFS FCP iSCSI  FCP kB/s
                         in  out   read write read write  age  hit  time ty util              in  out
2%  0   0    0    139    0   0     2788 0     0    0      3    96%  0%   -  15%   0    139  0     3  2277
2%  0   0    0    144    0   0     2504 0     0    0      3    96%  0%   -  18%   0    144  0     3  2150
2%  0   0    0    130    0   0     2212 0     0    0      3    96%  0%   -  13%   0    130  0     3  1879
3%  0   0    0    169    0   0     2937 80    0
Re: [PERFORM] RAID Configuration Suggestion
On Tue, Aug 30, 2005 at 09:37:17 -0300, Alvaro Nunes Melo [EMAIL PROTECTED] wrote: The server is a Dual Xeon 3.0 with 2 GB RAM and two SCSI disks. Our main doubt is what is the best configuration for the disks. We are thinking about using them in a RAID-0 array. Is this the best option? What do you suggest on partitioning? Separate partitions for the OS, data and pg_xlog? Our main goal is performance speedup. Disk space might not be a problem. I've read a lot here about moving pg_xlog to different partitions, and we'll surely make tests to see what configuration might be better.

I've set up several servers with a config like this. It's not ideal, but there's no reason you can't enjoy the benefits of a snappy application. The best results I've had involve dedicating one drive to the OS, swap, logs, tmp and everything else, and dedicating one drive to postgres. If you use *nix you can mount the second drive as /var/lib/pgsql (or wherever postgres lives on your server) with noatime as a mount option.

In retrospect, you might have saved the money on the second CPU and gotten two more hard drives, but if you're running a dual-task server (i.e. LAMP) you may appreciate the second CPU. The beauty of a server like this is that it puts more of the wizardry of creating a fast application into the hands of the app developer, which results in a better db schema, optimized queries and generally *thinking* about the performance of the code. I personally feel that to be a very rewarding aspect of my job. (As a hobby I program microcontrollers that run at 4MHz and have only 256 bytes of RAM, so that could just be me. ;-)

-- Matthew Nuzum www.bearfruit.org

---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
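For reference, the noatime arrangement described above might look like this in /etc/fstab. The device name and filesystem type are assumptions for illustration; the point is only the noatime option:

```
# Second SCSI disk dedicated to the PostgreSQL data directory, mounted
# without access-time updates so that every read does not also cost a
# metadata write.
/dev/sdb1  /var/lib/pgsql  ext3  defaults,noatime  1 2
```

After editing fstab, `mount -o remount /var/lib/pgsql` applies the option without a reboot.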
Re: [PERFORM] Need indexes on empty tables for good performance ?
Actually the indexes on the child table do seem to get used - I just wanted to make sure there was no penalty for not having indexes on the empty parent tables. You are right - the parent is the best way to get at the unknown children ...

From: Thomas F. O'Connell [mailto:[EMAIL PROTECTED] Sent: Tuesday, August 30, 2005 6:15 AM To: Lenard, Rohan (Rohan) Cc: pgsql-performance@postgresql.org Subject: Re: [PERFORM] Need indexes on empty tables for good performance ?

Rohan, You should note that in Postgres, indexes are not inherited by child tables. Also, it seems difficult to select from a child table whose name you don't know unless you access the parent. And if you are accessing the data via the parent, I'm reasonably certain that you will find that indexes aren't used (even if they exist on the children) as a result of the way the children are accessed.

-- Thomas F. O'Connell Co-Founder, Information Architect Sitening, LLC Strategic Open Source: Open Your i http://www.sitening.com/ 110 30th Avenue North, Suite 6 Nashville, TN 37203-6320 615-469-5150 / 615-469-5151 (fax)

On Aug 22, 2005, at 10:41 PM, Lenard, Rohan (Rohan) wrote: I've read that indexes aren't used for COUNT(*) and I've noticed (7.3.x) with EXPLAIN that indexes never seem to be used on empty tables - is there any reason to have indexes on empty tables, or will postgresql never use them? This is not as silly as it sounds - with table inheritance you might have child tables with the data and a parent that is empty. It'd be nice to make sure postgresql knows never to really look at the parent - especially if you don't know the names of all the children ... Thoughts? thx, Rohan
Re: [PERFORM] High load and iowait but no disk access
On 30-Aug-05, at 12:15, Tom Lane wrote: Rémy Beaumont [EMAIL PROTECTED] writes: The stats of the NetApp do confirm that it is sitting idle.

Really?

CPU NFS CIFS HTTP Total  Net kB/s  Disk kB/s  Tape kB/s  Cache Cache CP  CP Disk  DAFS FCP iSCSI  FCP kB/s
                         in  out   read write read write  age  hit  time ty util              in  out
2%  0   0    0    139    0   0     2788 0     0    0      3    96%  0%   -  15%   0    139  0     3  2277
2%  0   0    0    144    0   0     2504 0     0    0      3    96%  0%   -  18%   0    144  0     3  2150
2%  0   0    0    130    0   0     2212 0     0    0      3    96%  0%   -  13%   0    130  0     3  1879
3%  0   0    0    169    0   0     2937 80    0    0      3    96%  0%   -  13%   0    169  0     4  2718
2%  0   0    0    139    0   0     2448 0     0    0      3    96%  0%   -  12%   0    139  0     3  2096

I know zip about NetApps, but doesn't the 8th column indicate pretty steady disk reads?

Yes, but they are very low. At 15% usage, it's pretty much sitting idle if you consider that the OS reports that one of the processors is spending more than 80% of its time in iowait.

Rémy

regards, tom lane

---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] High load and iowait but no disk access
On Mon, Aug 29, 2005 at 09:42:46AM -0400, Rémy Beaumont wrote: We have been trying to pinpoint what originally seemed to be an I/O bottleneck but which now seems to be an issue with either Postgresql or RHES 3.

Nope, it's an IO bottleneck.

The behavior we see is that when running queries that do random reads on disk, IOWAIT goes over 80% and actual disk IO falls to a crawl at a throughput below 3000kB/s

That's the sign of an IO bottleneck.

The stats of the NetApp do confirm that it is sitting idle. Doing an strace on the Postgresql process shows that it is doing seeks and reads. So my question is where is this iowait time spent?

Waiting for the seeks. postgres doesn't do async io, so it requests a block, waits for it to come in, then requests another block, etc. The utilization on the netapp isn't going to be high because it doesn't have a queue of requests and can't do readahead because the IO is random. The only way to improve the situation would be to reduce the latency of the seeks. If I read the numbers right you're only getting about 130 seeks/s, which ain't great. I don't know how much latency the netapp adds in this configuration; have you tried benchmarking direct-attach disks?

Mike Stone

---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] High load and iowait but no disk access
Rémy Beaumont [EMAIL PROTECTED] writes: The stats of the NetApp do confirm that it is sitting idle.

Really?

CPU NFS CIFS HTTP Total  Net kB/s  Disk kB/s  Tape kB/s  Cache Cache CP  CP Disk  DAFS FCP iSCSI  FCP kB/s
                         in  out   read write read write  age  hit  time ty util              in  out
2%  0   0    0    139    0   0     2788 0     0    0      3    96%  0%   -  15%   0    139  0     3  2277
2%  0   0    0    144    0   0     2504 0     0    0      3    96%  0%   -  18%   0    144  0     3  2150
2%  0   0    0    130    0   0     2212 0     0    0      3    96%  0%   -  13%   0    130  0     3  1879
3%  0   0    0    169    0   0     2937 80    0    0      3    96%  0%   -  13%   0    169  0     4  2718
2%  0   0    0    139    0   0     2448 0     0    0      3    96%  0%   -  12%   0    139  0     3  2096

I know zip about NetApps, but doesn't the 8th column indicate pretty steady disk reads?

regards, tom lane

---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] High load and iowait but no disk access
Rémy Beaumont [EMAIL PROTECTED] writes: On 30-Aug-05, at 12:15, Tom Lane wrote: I know zip about NetApps, but doesn't the 8th column indicate pretty steady disk reads? Yes, but they are very low.

Sure, but that's more or less what you'd expect if the thing is randomly seeking all over the disk :-(. Just because it's a NetApp doesn't mean it's got zero seek time. You did not say what sort of query this is, but I gather that it's doing an indexscan on a table that is not at all in index order. Possible solutions involve reverting to a seqscan (have you forced the planner to choose an indexscan here, either directly or by lowering random_page_cost?) or CLUSTERing the table by the index (which would need to be repeated periodically, so it's not a great answer).

regards, tom lane

---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
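The two remedies Tom mentions can be sketched as follows. The index and table names are hypothetical; note that CLUSTER rewrites the table under an exclusive lock and must be re-run periodically as the table is updated:

```sql
-- First check whether the planner has been nudged toward index scans
-- (a value lowered well below the default favors them):
SHOW random_page_cost;

-- Physically re-order the table on disk to match the index the query
-- uses, turning random heap fetches into mostly sequential ones:
CLUSTER some_index ON some_table;
ANALYZE some_table;
```

If the query is fast right after CLUSTER and degrades over time, that confirms the random-order indexscan diagnosis.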
Re: [PERFORM] High load and iowait but no disk access
On 30-Aug-05, at 12:29, Tom Lane wrote: =?ISO-8859-1?Q?R=E9my_Beaumont?= [EMAIL PROTECTED] writes: On 30-Aug-05, at 12:15, Tom Lane wrote: I know zip about NetApps, but doesn't the 8th column indicate pretty steady disk reads? Yes, but they are very low. Sure, but that's more or less what you'd expect if the thing is randomly seeking all over the disk :-(. Just because it's a NetApp doesn't mean it's got zero seek time. Per NetApp, the disk utilization percentage they report does include seek time, not just read/write operations. NetApp has been involved in trying to figure out what is going on and their claim is that the NetApp filer is not IO bound. You did not say what sort of query this is, but I gather that it's doing an indexscan on a table that is not at all in index order. Yes, most of those queries are doing an indexscan. It's a fresh restore of our production database that we have vacuumed/analyzed. Possible solutions involve reverting to a seqscan (have you forced the planner to choose an indexscan here, either directly or by lowering random_page_cost?) No. or CLUSTERing the table by the index (which would need to be repeated periodically, so it's not a great answer). Will try to cluster the tables and see if it changes anything. Still doesn't explain what is going on with those seeks. Thanks, Rémy regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] RAID Configuration Suggestion
Ron wrote: At 08:37 AM 8/30/2005, Alvaro Nunes Melo wrote: Hello, We are about to install a new PostgreSQL server, and despite being a very humble configuration compared to the ones we see on the list, it's the biggest one we've got till now. The server is a Dual Xeon 3.0 with 2 GB RAM and two SCSI disks. Our main doubt is what is the best configuration for the disks. We are thinking about using them in a RAID-0 array. Is this the best option? What do you suggest on partitioning? Separate partitions for the OS, data and pg_xlog? This is _very_ modest HW. Unless your DB and/or DB load is similarly modest, you are not going to be happy with the performance of your DBMS.

Well, that is a pretty blanket statement. I have many customers who happily run on less hardware than what is mentioned above. It all depends on the application itself and how the database is utilized.

At a minimum, for safety reasons you want 4 HDs: 2 for a RAID 1 set for the DB, and 2 for a RAID 1 set for the OS + pg_xlog. 2 extra HDs, even SCSI HDs, is cheap. Especially when compared to the cost of corrupted or lost data.

Your real test is going to be prototyping the performance you need. A single RAID 1 mirror (don't use RAID 0) may be more than enough. However, based on the fact that you spec'd Xeons, my guess is you spent money on CPUs when you should have spent money on hard drives. If you still have the budget, I would suggest considering either what Ron suggested or possibly using a 4-drive RAID 10 instead. If you can't afford to put in a couple more SCSI disks, it may be worthwhile to put a software RAID 1 with ATA disks for the OS and swap, and then use straight SCSI hardware RAID 1 for the DB. That will allow you to push any swap operations off to the OS disks without sacrificing the performance and reliability of the database itself.

Sincerely, Joshua D. Drake

HD's and RAM are cheap enough that you should be able to upgrade in more ways, but do at least that upgrade!
Beyond that, the best ways to spend your limited $ are highly dependent on your exact DB and its usage pattern. Ron Peacetree ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Need indexes on empty tables for good performance ?
Lenard, Rohan (Rohan) wrote: Actually the indexes on the child table do seem to get used - I just wanted to make sure there was no penalty not having indexes on the empty parent tables. You are right - the parent is the best way to get at the unknown children ... Indexes are created in the inheritance process, iirc. However, index entries are not inherited, which means that index-based unique constraints don't properly get inherited. Best Wishes, Chris Travers Metatron Technology Consulting ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] RAID Configuration Suggestion
At 12:56 PM 8/30/2005, Joshua D. Drake wrote: Ron wrote: At 08:37 AM 8/30/2005, Alvaro Nunes Melo wrote: Hello, We are about to install a new PostgreSQL server, and despite being a very humble configuration compared to the ones we see on the list, it's the biggest one we've got till now. The server is a Dual Xeon 3.0 with 2 GB RAM and two SCSI disks. Our main doubt is what is the best configuration for the disks. We are thinking about using them in a RAID-0 array. Is this the best option? What do you suggest on partitioning? Separate partitions for the OS, data and pg_xlog? This is _very_ modest HW. Unless your DB and/or DB load is similarly modest, you are not going to be happy with the performance of your DBMS. Well that is a pretty blanket statement. I have many customers who happily run on less hardware than what is mentioned above. It all depends on the application itself and how the database is utilized.

If your customers run happily on 2 HDs, then IME they have very modest DB storage and/or DB performance needs. For safety reasons, the best thing to do if you only have 2 HDs is to run them as a RAID 1 with everything on them. The slightly better performing but considerably less safe alternative is to put the OS + logs on 1 HD and the DB on the other. Any resemblance to a semi-serious OLTP load will reduce either such system to an HD IO bound one with poor IO rates.

If, as above, your DBMS is bounded by the performance of one HD, then you are AT BEST getting the raw IO rate of such a device: say ~70-80MB/s in average sustained raw sequential IO. File system overhead and any seeking behavior will rapidly reduce that number to considerably less. Consider that the CPU - memory IO subsystem is easily capable of ~3.2GBps. So you are talking about slowing the DB server to at most ~1/40, maybe even as little as ~1/200, of its potential under such circumstances. If your DB can fit completely in RAM and/or does light duty write IO, this may not be a serious issue.
OTOH, once you start using those HDs to any reasonable extent, most of the rest of the investment you've made in server HW is wasted. As I keep saying, the highest priority when purchasing hardware for a DBMS is to make sure you have enough HD IO bandwidth. RAM comes second, and CPU is a distant third. At a minimum, for safety reasons you want 4 HDs: 2 for a RAID 1 set for the DB, and 2 for a RAID 1 set for the OS + pg_xlog. 2 extra HDs, even SCSI HDs, is cheap, especially when compared to the cost of corrupted or lost data. Your real test is going to be prototyping the performance you need. A single RAID 1 mirror (don't use RAID 0) may be more than enough. However, based on the fact that you specced Xeons, my guess is you spent money on CPUs when you should have spent money on hard drives. I agree with Josh on both points. Don't use RAID 0 for persistent data unless you like losing data. Spend more on HDs and RAM and less on CPUs (a fast FSB is far more important than a high clock rate; in general, buy the highest FSB with the slowest clock rate). In fact, if you are that strapped for cash, exchange those 2 SCSI HDs for their $ equivalent in SATA HDs. The extra spindles will be well worth it. If you still have the budget, I would suggest considering either what Ron suggested or possibly using a 4 drive RAID 10 instead. IME, with only 4 HDs, it's usually better to split them into two RAID 1's (one for the db, one for everything else including the logs) than it is to put everything on one RAID 10. YMMV. Ron Peacetree ---(end of broadcast)--- TIP 6: explain analyze is your friend
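Ron's "~1/40 to ~1/200 of potential" figure can be checked with a little arithmetic. All numbers below are his estimates from the message above; the seek-dominated rate is a hypothetical illustration, not a measurement:

```python
# Back-of-the-envelope check of Ron's disk-vs-memory bandwidth ratios.
mem_bw_mb_s = 3200        # ~3.2 GBps CPU-memory subsystem (Ron's estimate)
disk_seq_mb_s = 80        # best-case single-HD sustained sequential rate
disk_seeky_mb_s = 16      # hypothetical rate once seeks dominate the workload

print(mem_bw_mb_s / disk_seq_mb_s)    # 40.0  -> the "~1/40" figure
print(mem_bw_mb_s / disk_seeky_mb_s)  # 200.0 -> the "~1/200" figure
```

So even the best case for a single disk leaves the server at a fortieth of what the memory subsystem could feed it, which is the core of Ron's argument for prioritizing HD IO bandwidth.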
Re: [PERFORM] High load and iowait but no disk access
Remy, The behavior we see is that when running queries that do random reads on disk, IOWAIT goes over 80% and actual disk IO falls to a crawl at a throughput below 3000 kB/s (we usually average 40000 kB/s to 80000 kB/s on sequential read operations on the netapps). This seems pretty low for a NetApp -- you should be able to manage up to 180 MB/s, if not higher. Are you sure it's configured correctly? --Josh Josh Berkus, Aglio Database Solutions, San Francisco
Re: [PERFORM] High load and iowait but no disk access
Have you tried a different kernel? We run with a netapp over NFS without any issues, but we have seen high IO-wait on other Dell boxes (running and not running postgres) and RHES 3. We replaced a Dell PowerEdge 350 running RH 7.3 with a PE750 with more memory running RHES3, and it would be bogged down with IO waits due to syslog messages writing to the disk; the old, slower server could handle it fine. I don't know if it is a Dell thing or a RH kernel, but we try different kernels on our boxes to find one that works better. We have not found one that stands out over another consistently, but we have been moving away from the Update 2 kernel (2.4.21-15.ELsmp) due to server lockup issues. Unfortunately we get the best disk throughput on our few remaining 7.3 boxes.

Woody
IGLASS Networks
www.iglass.net

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Rémy Beaumont
Sent: Monday, August 29, 2005 9:43 AM
To: pgsql-performance@postgresql.org
Subject: [PERFORM] High load and iowait but no disk access

We have been trying to pinpoint what originally seemed to be an I/O bottleneck but which now seems to be an issue with either Postgresql or RHES 3. We have the following test environment on which we can reproduce the problem:

1) Test System A
Dell 6650 Quad Xeon Pentium 4
8 Gig of RAM
OS: RHES 3 update 2
Storage: NetApp FAS270 connected using an FC card using 10 disks

2) Test System B
Dell Dual Xeon Pentium III
2 Gig of RAM
OS: RHES 3 update 2
Storage: NetApp FAS920 connected using an FC card using 28 disks

Our database size is around 30G. The behavior we see is that when running queries that do random reads on disk, IOWAIT goes over 80% and actual disk IO falls to a crawl at a throughput below 3000 kB/s (we usually average 40000 kB/s to 80000 kB/s on sequential read operations on the netapps). The stats of the NetApp do confirm that it is sitting idle. Doing an strace on the Postgresql process shows that it is doing seeks and reads.

So my question is: where is this iowait time spent? Is there a way to pinpoint the problem in more detail? We are able to reproduce this behavior with Postgresql 7.4.8 and 8.0.3. I have included the output of top, vmstat, strace and systat from the NetApp, taken on System B while running a single query that generates this behavior.

Rémy

top output:

06:27:28 up 5 days, 16:59, 6 users, load average: 1.04, 1.30, 1.01
72 processes: 71 sleeping, 1 running, 0 zombie, 0 stopped
CPU states:  cpu   user   nice  system  irq   softirq  iowait  idle
     total   2.7%  0.0%   1.0%   0.1%   0.2%   46.0%   49.5%
     cpu00   0.2%  0.0%   0.2%   0.0%   0.2%    2.2%   97.2%
     cpu01   5.3%  0.0%   1.9%   0.3%   0.3%   89.8%    1.9%
Mem: 2061696k av, 2043936k used, 17760k free, 0k shrd, 3916k buff
     1566332k actv, 296648k in_d, 30504k in_c
Swap: 16771584k av, 21552k used, 16750032k free, 1933772k cached

  PID USER     PRI NI  SIZE  RSS SHARE STAT %CPU %MEM  TIME CPU COMMAND
30960 postgres  15  0 13424  10M  9908 D     2.7  0.5  2:00   1 postmaster
30538 root      15  0  1080  764   524 S     0.7  0.0  0:43   0 sshd
(remaining entries are idle kernel threads: init, migration/0-1, keventd, ksoftirqd/0-1, bdflush, kswapd, kscand, kupdated, mdrecoveryd, ahc_dv_0)

vmstat output:

 r b  swpd  free  buff   cache   si so   bi   bo   in   cs  us sy id wa
 0 1 21552 17796  4872 1931928    2  3    3    1   27    6   2  1  7  3
 0 1 21552 18044  4880 1931652    0  0 1652    0  397  512   1  2 50 47
 0 1 21552 17976  4896 1931664    0  0 2468    0  407  552   2  2 50 47
 1 0 21552 17984  4896 1931608    0  0 2124    0  418  538   3  3 48 46
 0 1 21552 18028  4900 1931536    0  0 1592    0  385  509   1  3 50 46
 0 1 21552 18040  4916 1931488    0  0 1620  820  419  581   2  2 50 46
 0 1 21552 17968  4916 1931536    0  4 1708    4  402  554   3  1 50 46
 1 1 21552 18052  4916 1931388    0  0 1772    0  409  531   3  1 49 47
 0 1 21552 17912  4924 1931492    0  0 1772    0  408  565   3  1 48 48
 0 1 21552 17932  4932 1931440    0  4 1356    4  391  545   5  0 49 46
 0 1 21552 18320  4944 1931016    0  4 1500  840  414  571   1  1 48 50
 0 1 21552 17872  4944 1931440    0  0 2116    0  392  496   1  5 46 48
 0 1 21552 18060  4944 1931232    0  0 2232    0  423  597   1  2 48 49
 1 1 21552 17684  4944 1931584    0  0 1752    0  395  537   1  1 50 48
 0 1 21552 18000  4944 1931240    0  0 1576    0  401  549   0  1 50 49

NetApp stats:

CPU NFS CIFS HTTP Total  Net in/out kB/s  Disk read/write kB/s  Tape r/w  Cache age  Cache hit  CP time  CP ty  Disk util  DAFS  FCP  iSCSI  FCP in/out kB/s
 2%   0    0    0   139      0    0          2788     0            0  0       3         96%       0%      -      15%         0   139    0      3  2277
 2%   0    0    0   144      0    0          2504     0            0  0       3         96%       0%      -      18%         0   144    0      3  2150
 2%   0    0    0   130      0    0          2212     0            0  0       3         96%       0%      -      13%         0   130    0      3  1879
 3%   0    0    0   169      0    0          2937    80            0  0       3         96%       0%      -      13%         0   169    0      4  2718
 2%   0    0    0   139      0    0          2448     0            0  0       3         96%       0%      -      12%         0   139    0      3  2096
 2%   0    0    0   137      0    0          2116     0            0  0       3         96%       0%      -      10%         0
Re: [PERFORM] High load and iowait but no disk access
On 30-Aug-05, at 14:32, Josh Berkus wrote: This seems pretty low for a NetApp -- you should be able to manage up to 180 MB/s, if not higher. Are you sure it's configured correctly? Hi Josh, The config has been reviewed by NetApp. We do get rates higher than 80 MB/s, but on average, that's what we get. Do you have NetApp filers deployed? How many spindles do you have in your volume? On which OS are you running Postgres? Thanks, Rémy
Re: [PERFORM] Observation about db response time
On Sat, 27 Aug 2005 21:28:57 +0530 (IST) [EMAIL PROTECTED] threw this fish to the penguins: Hello Friends, We were having a database in pgsql 7.4. The database was responding very slowly even after a full vacuum (select count(*) from some_table_having_18000_records was taking 18 sec). One comment here: select count(*) may seem like a good benchmark, but generally it's not. If your application really depends on this number, fine. Otherwise, you should measure performance with a real query from your application. select count(*) can be very slow because it does not use indexes. We took a backup of that db and restored it back. Now the same db on the same PC is responding fast (the same query is taking 18 ms). This sounds like some index is getting gooped up. If you do a lot of deleting from tables, your indexes can collect dead space that vacuum cannot reclaim. Try, in SQL, REINDEX TABLE my_slow_table for a suspect table. In the contrib directory of the postgresql distribution there is a script called reindexdb. You can run this to reindex your whole database. I also wonder about file system slowdowns. What hardware/OS/filesystem are you using? -- George -- Are the gods not just? Oh no, child. What would become of us if they were? (CSL)
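George's two suggestions, sketched as SQL (the table and column names are stand-ins for a suspect table in your own schema; note that REINDEX takes an exclusive lock on the table while it runs):

```sql
-- Rebuild one suspect table's indexes to reclaim dead index space:
REINDEX TABLE my_slow_table;

-- Benchmark with a real query from the application rather than count(*),
-- and let EXPLAIN ANALYZE show whether indexes are actually used:
EXPLAIN ANALYZE SELECT * FROM my_slow_table WHERE some_indexed_col = 42;
```

For the whole database, the contrib reindexdb script George mentions wraps the same REINDEX command around every table.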
Re: [PERFORM] High load and iowait but no disk access
On 30-Aug-05, at 14:46, Anjan Dave wrote: I have seen references of changing the kernel io scheduler at boot time... not sure if it applies to RHEL 3.0, or will help, but try setting 'elevator=deadline' during boot time or via grub.conf. That's only for RHEL 4.0. Have you tried running a simple 'dd' on the LUN? We get amazing performance using dd. The drives are in RAID10 configuration, right? NetApp has their own type of raid format (RAID4 aka WAFL). Rémy
Re: [PERFORM] RAID Configuration Suggestion
On 8/30/05, Ron [EMAIL PROTECTED] wrote: If you still have the budget, I would suggest considering either what Ron suggested or possibly using a 4 drive RAID 10 instead. IME, with only 4 HDs, it's usually better to split them into two RAID 1's (one for the db, one for everything else including the logs) than it is to put everything on one RAID 10. YMMV. This corresponds to what I have observed as well. Of course, we all know that workloads vary. Just a note for the OP who has only two drives: there are tools for a variety of OSs that monitor the S.M.A.R.T. features of the drive and give an early warning when they sense impending failure. I've caught two drives before failure with these types of tools. Also note that when reading discussions of this nature you must take into consideration the value of your data. For some people, restoring from a nightly backup is inconvenient, but not life-or-death. Some people even do twice-daily backups so that in case of a failure they can recover with little loss of data. This might be a good way to mitigate the cost of expensive server hardware. If you cannot afford to lose any data, then you need to consider it imperative to use some type of RAID setup (not RAID 0), and to achieve great performance you'll want more than 2 drives. -- Matthew Nuzum www.bearfruit.org
Re: [PERFORM] High load and iowait but no disk access
I have seen references of changing the kernel io scheduler at boot time... not sure if it applies to RHEL 3.0, or will help, but try setting 'elevator=deadline' during boot time or via grub.conf. Have you tried running a simple 'dd' on the LUN? The drives are in RAID10 configuration, right? Thanks, Anjan
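Anjan's 'dd' sanity check might look like the following. This is a sketch: on the real system you would point the read at the LUN device itself (the path here is hypothetical) and add iflag=direct so the Linux page cache doesn't inflate the number; a scratch file keeps the commands runnable anywhere:

```shell
# Create 64MB of scratch data standing in for the LUN. On the real
# system, skip this step and set dev to the actual device, e.g. /dev/sdb.
dev=$(mktemp)
dd if=/dev/zero of="$dev" bs=1M count=64 2>/dev/null

# Timed sequential read; dd reports the byte count and throughput on
# stderr. On a real block device, add iflag=direct to bypass the page
# cache and measure the storage, not RAM.
dd if="$dev" of=/dev/null bs=1M

rm -f "$dev"
```

If this raw sequential number is healthy while the random-read query workload still crawls, that points at seek-bound access patterns rather than a broken link to the filer.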
Re: [PERFORM] RAID Configuration Suggestion
If you still have the budget, I would suggest considering either what Ron suggested or possibly using a 4 drive RAID 10 instead. IME, with only 4 HDs, it's usually better to split them into two RAID 1's (one for the db, one for everything else including the logs) than it is to put everything on one RAID 10. YMMV. Really? That's interesting. My experience is different. I assume SCSI? Software/hardware RAID? Sincerely, Joshua D. Drake Ron Peacetree
Re: [PERFORM] High load and iowait but no disk access
This might be optimal behavior from the hardware. Random reads are hard to optimize for -- except if you have enough physical memory to hold the entire dataset. Cached reads (either in the array controller or the OS buffer cache) should return nearly immediately, but random reads probably aren't cached, and any read-ahead algorithms or other performance enhancements in the hardware or OS go out the window, because the behavior isn't predictable. Each time a drive spindle needs to move to a new track, it requires at least a couple of milliseconds. Sequential reads only require this movement infrequently, but random reads may be forcing it for every IO operation. Since the bottleneck in random reads is the physical hard drives themselves, everything else stands around waiting. Fancy hardware can optimize everything else -- writes with write cache, sequential reads with read-ahead and read cache -- but there's no real solution to a purely random read workload, except perhaps creating different disk groups to help avoid spindle contention. I like this tool: http://www.soliddata.com/products/iotest.html It allows you to select pure workloads (read/write/sequential/random), and it runs against raw devices, so you bypass the OS buffer cache. When I've run it I've always seen sequential activity get much higher throughput than random.
[PERFORM] When to do a vacuum for highly active table
We have a highly active table that has virtually all entries updated every 5 minutes. Typical size of the table is 50,000 entries, and entries have grown fat. We are currently vacuuming hourly, and towards the end of the hour we are seeing degradation when compared to the top of the hour. Vacuum is slowly killing our system, as it is starting to take up to 10 minutes, and load at the time of vacuum is 6+ on a Linux box. During the vacuum, the overall system goes unresponsive, then comes back once vacuum completes. If we run vacuum less frequently, degradation continues to the point that we can't keep up with the throughput, plus vacuum takes longer anyway. Becoming quite a pickle :-) We are thinking of splitting the table in two: the part that updates often and the part that updates infrequently, as we suspect that record size impacts vacuum. Any ideas? Thanks, Mark
Re: [PERFORM] When to do a vacuum for highly active table
Markus Benne [EMAIL PROTECTED] writes: We have a highly active table that has virtually all entries updated every 5 minutes. Typical size of the table is 50,000 entries, and entries have grown fat. We are currently vacuuming hourly, and towards the end of the hour we are seeing degradation, when compared to the top of the hour. On something like this, you really need to be vacuuming more often, not less so; I'd think about how to do it every five or ten minutes rather than backing off. With only 50K rows it should really not take more than a couple of seconds to do the vacuum. When you wait till there are 600K dead rows, it's going to take a while, plus you are suffering across-the-board performance degradation from all the dead rows. If you are using PG 8.0, there are some vacuum cost knobs you can fiddle with to slow down vacuum so it doesn't impose as much I/O load. Ideally you could get it to where you could run vacuum as often as you need to without noticing much impact on foreground processing. If you're not using 8.0 ... maybe it's time to update. Another thing you might want to do is look at vacuum verbose output, which will give you some idea of the time spent in each step. It might be that there are specific aspects that could be improved. We are thinking of splitting the table in two: the part that updates often and the part that updates infrequently, as we suspect that record size impacts vacuum. You just said that virtually all rows update constantly --- where's the infrequent part? regards, tom lane
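A concrete sketch of what Tom describes, assuming PostgreSQL 8.0 (the parameter names are 8.0's cost-based vacuum delay settings; the values, database name and table name are illustrative, not recommendations):

```
# postgresql.conf -- throttle vacuum's I/O so it can run frequently
# without freezing foreground queries
vacuum_cost_delay = 10      # ms vacuum sleeps when the cost limit is hit
                            # (0, the default, disables the delay)
vacuum_cost_limit = 200     # accumulated page-cost that triggers the sleep

# crontab -- vacuum the hot table every five minutes instead of hourly
*/5 * * * * psql -d mydb -c 'VACUUM ANALYZE my_hot_table'
```

With the delay enabled, each vacuum run takes a bit longer but spreads its I/O out, which is exactly the trade Tom suggests for a table this hot.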
Re: [PERFORM] When to do a vacuum for highly active table
On Tue, Aug 30, 2005 at 05:29:17PM -0400, Tom Lane wrote: Markus Benne [EMAIL PROTECTED] writes: We have a highly active table that has virtually all entries updated every 5 minutes. Typical size of the table is 50,000 entries, and entries have grown fat. ... We are thinking of splitting the table in two: the part that updates often and the part that updates infrequently, as we suspect that record size impacts vacuum. You just said that virtually all rows update constantly --- where's the infrequent part? I think he means splitting it vertically instead of horizontally, and it sounds like an excellent idea, if a large enough portion of each record is in fact mostly fixed. Otherwise, PostgreSQL is copying data multiple times, only to have the data expire as part of a dead row. I've already started to notice such issues with PostgreSQL -- but more because I'm using low-end hardware, and I'm projecting the effect for when our database becomes much larger with much higher demand on the database. This is the sort of scenario where a database without transactional integrity would significantly outperform one designed around it. If records are fixed-size and updated in place, these problems would occur far less often. Is it heresy to suggest MySQL in here? :-) I switched from MySQL to PostgreSQL several months ago and haven't looked back -- but they do work differently, and for certain uses, one can destroy the other. Using a MyISAM table would be the way I would go with this sort of problem. Cheers, mark -- [EMAIL PROTECTED] / [EMAIL PROTECTED] / [EMAIL PROTECTED] | Neighbourhood Coder | Ottawa, Ontario, Canada | One ring to rule them all, one ring to find them, one ring to bring them all and in the darkness bind them... | http://mark.mielke.cc/
[PERFORM] 'Real' auto vacuum?
This is a wild and crazy thought which I am sure is invalid for some good reason. But why can't postgres just vacuum itself as it goes along? When a row is orphaned, it's added to a list of possibly available rows. When a new row is needed, the list of possible rows is examined, and the first one with a transaction id less than the lowest running transaction id is chosen to be the new row. These rows can be in a heap so it's really fast to find one. Like magic -- no more vacuuming. No more holes for people to fall into. Is this an oversimplification of the problem? Ralph
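Ralph's proposal can be sketched in a few lines. This is a toy model, not PostgreSQL internals: "rows" are just slot numbers, and the visibility rule is reduced to comparing transaction ids, exactly as his message states it:

```python
import heapq

class FreeListTable:
    """Toy model of the proposal: dead slots go on a min-heap keyed by the
    transaction id that deleted them; a slot becomes reusable once that id
    is below the oldest still-running transaction id."""

    def __init__(self):
        self.free = []          # heap of (deleting_xid, slot)
        self.next_slot = 0      # next brand-new slot if nothing is reusable

    def release(self, slot, deleting_xid):
        heapq.heappush(self.free, (deleting_xid, slot))

    def allocate(self, oldest_running_xid):
        # Reuse a dead slot only if no running transaction can still see it.
        if self.free and self.free[0][0] < oldest_running_xid:
            return heapq.heappop(self.free)[1]
        self.next_slot += 1     # otherwise extend the table
        return self.next_slot - 1

t = FreeListTable()
a = t.allocate(oldest_running_xid=100)   # 0: nothing to reuse yet
t.release(a, deleting_xid=105)
b = t.allocate(oldest_running_xid=103)   # 1: slot 0 may still be visible
c = t.allocate(oldest_running_xid=110)   # 0: now safely reusable
print(a, b, c)                           # prints: 0 1 0
```

Even this toy shows where the hard parts hide: the heap is shared mutable state every insert would contend on, and "lowest running transaction id" must be cheap to maintain, which are exactly the objections raised in the replies below.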
Re: [PERFORM] 'Real' auto vacuum?
On Wed, Aug 31, 2005 at 10:21:20AM +1200, Ralph Mason wrote:

> This is a wild and crazy thought which I am sure is invalid for some good reason. But why can't postgres just vacuum itself as it goes along? When a row is orphaned it's added to a list of possibly available rows. When a new row is needed, the list of possible rows is examined and the first one with a transaction id less than the lowest running transaction id is chosen to be the new row. These rows can be kept in a heap so it's really fast to find one. Like magic - no more vacuuming. No more holes for people to fall into. Yes please. :-) Is this an oversimplification of the problem?

But, yeah. It's probably not that easy, especially with really big databases. Where is this free list stored? How efficient is it to keep track of the lowest running transaction at all times? How does one synchronize access to this free list, to ensure that processes don't block up waiting for access to it? Is the free list journalled to prevent corruption, and the accidental re-use of a still-in-use row? And there would be a cost to scanning this list on every insert or update.

As an outsider (like you?) I see the current model as a design flaw as well. A neat and tidy model on paper; not so nice in real life. The need to vacuum in batch mode, to keep the database from dying, seems intuitively bad.

I think there must be answers to this problem. Even simple optimizations might help, such as defining a table so that any delete or update within it, upon commit, will attempt to vacuum just the rows that should not be considered free for any new transactions. If a row is in use by an active transaction, oh well - it can be picked up by a batch run of vacuum. If it's free, though - let's do it now.

I think any optimizations we come up with will be more happily accepted with a working patch that causes no breakage... :-)

Cheers,
mark

--
[EMAIL PROTECTED] / [EMAIL PROTECTED] / [EMAIL PROTECTED]
Mark Mielke ("Neighbourhood Coder"), Ottawa, Ontario, Canada
http://mark.mielke.cc/

---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
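Ralph's free-list idea can be sketched in a few lines. This is a toy model with invented names (`FreeList`, `claim_slot`, and so on), not how PostgreSQL actually manages tuple visibility - the real system has per-tuple xmin/xmax, shared-memory coordination, and WAL to worry about, which is exactly what the questions above are getting at:

```python
import heapq

class FreeList:
    """Toy model of the proposed in-line free list: dead row slots are
    pushed with the xid under which they died, and a slot becomes
    reusable once that xid is older than the oldest still-running
    transaction (so no live transaction can still see the old row)."""

    def __init__(self):
        self._heap = []  # (xid, slot) pairs; heap root has smallest xid

    def row_orphaned(self, xid, slot):
        heapq.heappush(self._heap, (xid, slot))

    def claim_slot(self, oldest_running_xid):
        # If even the oldest dead row is not yet safely invisible to
        # all running transactions, nothing can be reused.
        if self._heap and self._heap[0][0] < oldest_running_xid:
            return heapq.heappop(self._heap)[1]
        return None  # caller must extend the table file instead

fl = FreeList()
fl.row_orphaned(xid=100, slot=7)
fl.row_orphaned(xid=205, slot=3)

print(fl.claim_slot(oldest_running_xid=150))  # slot 7 is reusable
print(fl.claim_slot(oldest_running_xid=150))  # xid 205 still too new -> None
```

Even this toy version makes the synchronization question concrete: the heap is a single shared structure that every insert and update would contend on.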
Re: [PERFORM] 'Real' auto vacuum?
Ralph,

> When a row is orphaned it's added to a list of possibly available rows. When a new row is needed the list of possible rows is examined and the first one with a transaction id less than the lowest running transaction id is chosen to be the new row? These rows can be in a heap so it's really fast to find one.

This is the long-term plan. However, it's actually a lot harder than it sounds. Patches welcome.

--
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] RAID Configuration Sugestion
At 03:27 PM 8/30/2005, Joshua D. Drake wrote:

> If you still have the budget, I would suggest considering either what Ron suggested or possibly using a 4 drive RAID 10 instead.
> > IME, with only 4 HDs, it's usually better to split them into two RAID 1's (one for the db, one for everything else including the logs) than it is to put everything on one RAID 10. YMMV.
> Really? That's interesting. My experience is different. I assume SCSI? Software/hardware RAID?

The issue exists regardless of the technologies used, although the technology used does affect when things become an irritation or a serious problem.

The issue with everything on the same HD set seems to be that under light loads anything works reasonably well, but as load increases, contention between DB table access, OS access, and xlog writes can cause performance problems. In particular, _everything_ else hangs while logs are being written with everything on the same HD set. That leaves you with the nasty choice between small log writes, which cause more seeking behavior and thus poor overall HD IO performance, and large log writes, which basically freeze the server until they are done.

Having the logs on a different HD, and if possible a different IO bus, reduces this effect to a minimum and seems to be a better choice than the shared-everything approach.

Although this effect seems largest when there are fewest HDs, the general pattern is that one should use as many spindles as one can make use of, and that they should be as dedicated as possible in their purpose(s). That's why the TPC-benchmarked systems tend to have literally 100's of HDs, and they tend to be split into very focused purposes.

Ron Peacetree

---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] When to do a vacuum for highly active table
[EMAIL PROTECTED] (Markus Benne) writes:

> We have a highly active table that has virtually all entries updated every 5 minutes. Typical size of the table is 50,000 entries, and entries have grown fat. We are currently vacuuming hourly, and towards the end of the hour we are seeing degradation when compared to the top of the hour.

You're not vacuuming the table nearly often enough. You should vacuum this table every five minutes, and possibly more often than that. [We have some tables like that, albeit smaller than 50K entries, which we vacuum once per minute in production...]

> We are thinking of splitting the table in two: the part that updates often, and the part that updates infrequently, as we suspect that record size impacts vacuum.

There's *some* merit to that. You might discover that there's a hot spot that needs to be vacuumed once per minute. But it may be simpler to just hit the table with a vacuum once every few minutes, even though some tuples are seldom updated.

--
output = reverse(gro.gultn @ enworbbc)
http://cbbrowne.com/info/spreadsheets.html
Signs of a Klingon Programmer #3: "By filing this TPR you have challenged the honor of my family. Prepare to die!"

---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
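Some back-of-the-envelope arithmetic (mine, not from the thread) shows why the hourly schedule degrades so badly here. Under MVCC, each update leaves the old row version behind as a dead tuple until vacuum reclaims it, so with virtually every row updated every 5 minutes, an hourly vacuum lets roughly 12 table-sizes of dead tuples pile up:

```python
live_rows = 50_000          # typical table size from the post
update_interval_min = 5     # nearly every row is updated each cycle

def dead_rows_before_vacuum(vacuum_interval_min):
    # Each update cycle leaves ~one dead version per live row behind;
    # they accumulate until the next vacuum reclaims them.
    cycles = vacuum_interval_min // update_interval_min
    return live_rows * cycles

print(dead_rows_before_vacuum(60))  # hourly vacuum: 600000 dead rows, ~12x bloat
print(dead_rows_before_vacuum(5))   # vacuum every cycle: at most one table's worth
```

That 12x worst-case bloat at the end of the hour is consistent with the reported degradation, and vacuuming every cycle caps the dead-tuple overhead at roughly one extra copy of the table.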
Re: [PERFORM] When to do a vacuum for highly active table
[EMAIL PROTECTED] writes:

> I think he means splitting it vertically, instead of horizontally, and it sounds like an excellent idea, if a large enough portion of each record is in fact mostly fixed. Otherwise, PostgreSQL is copying data multiple times, only to have the data expire as part of a dead row.

Only up to a point. Fields that are wide enough to get toasted out-of-line (multiple Kb) do not get physically copied if there's a row update that doesn't affect them. We don't really have enough information about his table to guess whether there's any point in manually partitioning the columns, but my leaning would be probably not --- the overhead in joining the resulting two tables would be high enough that you'd need a heck of a big improvement to justify it.

regards, tom lane

---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] 'Real' auto vacuum?
[EMAIL PROTECTED] wrote:

> But, yeah. It's probably not that easy, especially with really big databases. Where is this free list stored? How efficient is it to keep track of the lowest running transaction at all times? How does one synchronize access to this free list, to ensure that processes don't block up waiting for access to the free list? Is the free list journalled to prevent corruption, and the accidental re-use of a still in use row? And, there would be a cost to scanning this list on every insert or update.

I suspect the free list could be stored as an index, and handily postgres supports those out of the box. There would be a cost, yes, but then what is the cost of adding pages to the file all the time? I guess as with all things there is no one size fits all, so perhaps you could turn it off - although I expect for 99.9% of cases 'on' would be the better choice. If it gets broken, there is already the reindex code that can fix it. A coherency / fixing / recovery of a table command would probably be a useful tool anyway.

> As an outsider (like you?) I see the current model as a design flaw as well. A neat and tidy model on paper. Not so nice in real life. The need to vacuum in batch mode, to keep the database from dying, seems intuitively bad.

We have a script that vacuums the database every 5 minutes - excessive, yes, but it turns out that any less is no good really. I think this is sub-optimal: the DB work keeps running, but the vacuum can slow down other tasks. It also probably flushes data that we need out of the page cache, so it can look at data that isn't used often as the vacuum runs. Not the most optimal data access pattern I could imagine.

> I think there must be answers to this problem. Even simple optimizations, such as defining a table such that any delete or update within a table, upon commit, will attempt to vacuum just the rows that should not be considered free for any new transactions. If it's in use by an active transaction, oh well. It can be picked up by a batch run of vacuum. If it's free though - let's do it now.

Anything would be good - I think it's the achilles heel of postgres. Perhaps something simple like that could fix 95% of the problem.

> I think any optimizations we come up with, will be more happily accepted with a working patch that causes no breakage... :-)

I am sure they would.

Cheers
Ralph

---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] RAID Configuration Sugestion
On Tue, Aug 30, 2005 at 07:02:28PM -0400, Ron wrote:

> purpose(s). That's why the TPC-benchmarked systems tend to have literally 100's of HD's and they tend to be split into very focused purposes.

Of course, TPC benchmark systems are constructed such that cost and storage capacity are irrelevant--in the real world things tend to be more complicated.

Mike Stone

---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] RAID Configuration Sugestion
At 08:04 PM 8/30/2005, Michael Stone wrote:

> On Tue, Aug 30, 2005 at 07:02:28PM -0400, Ron wrote:
> > purpose(s). That's why the TPC-benchmarked systems tend to have literally 100's of HD's and they tend to be split into very focused purposes.
>
> Of course, TPC benchmark systems are constructed such that cost and storage capacity are irrelevant--in the real world things tend to be more complicated.

The scary thing is that I've worked on RW production systems that bore a striking resemblance to a TPC benchmark system. As you can imagine, they uniformly belonged to BIG organizations (read: lot's 'o $$$) who were using the systems for mission critical stuff where either it was company-existence-threatening for the system to be down, or they would lose much $$$ per minute of down time, or both.

Financial institutions, insurance companies, central data mines for Fortune 2000 companies, etc. _all_ build systems that push the state of the art in how much storage can be managed and how many HDs, CPUs, RAM DIMMs, etc. are usable. Historically, this has been the sole province of Oracle and DB2 on the SW side, and equally outrageously priced custom HW. Clearly, I'd like to see PostgreSQL change that ;-)

Ron Peacetree

---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] RAID Configuration Sugestion
On Tue, Aug 30, 2005 at 08:41:40PM -0400, Ron wrote:

> The scary thing is that I've worked on RW production systems that bore a striking resemblance to a TPC benchmark system. As you can imagine, they uniformly belonged to BIG organizations (read: lot's 'o $$$) who were using the systems for mission critical stuff where either it was company-existence-threatening for the system to be down, or they would lose much $$$ per min of down time, or both.

Yeah, and that market is relevant to someone with one dell server and 2 hard disks how?

Mike Stone

---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] RAID Configuration Sugestion
At 08:43 PM 8/30/2005, Michael Stone wrote:

> On Tue, Aug 30, 2005 at 08:41:40PM -0400, Ron wrote:
> > The scary thing is that I've worked on RW production systems that bore a striking resemblance to a TPC benchmark system. As you can imagine, they uniformly belonged to BIG organizations (read: lot's 'o $$$) who were using the systems for mission critical stuff where either it was company-existence-threatening for the system to be down, or they would lose much $$$ per min of down time, or both.
>
> Yeah, and that market is relevant to someone with one dell server and 2 hard disks how?

Because successful small companies that _start_ with one small server and 2 HDs grow to _become_ companies that need far more HW; ...and in a perfect world their SW scales to their increased needs _without_ exponentially increasing their costs or overhead (as Oracle and DB2 currently do). THIS is the real long-term promise of OS DBMS.

Ron Peacetree

---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] 'Real' auto vacuum?
> > When a row is orphaned it's added to a list of possibly available rows. When a new row is needed the list of possible rows is examined and the first one with a transaction id less than the lowest running transaction id is chosen to be the new row? These rows can be in a heap so it's really fast to find one.
>
> This is the long-term plan. However, it's actually a lot harder than it sounds. Patches welcome.

Some ETA? Since that would be the most welcome addition for us. We have a few very heavily updated databases where table bloat and constant vacuuming are killing performance.

Mindaugas

---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly