Re: [PERFORM] select count(*) very slow on an already vacuumed table.
The problem is that I want to know if I need a hardware upgrade at the moment.

Eg I have another table rfis which contains ~ 0.6 million records.

SELECT count(*) from rfis where sender_uid > 0;
+--------+
| count  |
+--------+
| 564870 |
+--------+
Time: 117560.635 ms

Which is approximately 4804 records per second. Is that acceptable performance on the hardware below:

RAM: 2 GB
DISKS: Ultra160, 10K RPM, 18 GB
Processor: 2x 2.0 GHz Xeon

What kind of upgrades should be put on the server for it to become reasonably fast?

Regds
mallah.

Richard Huxton wrote:
> On Wednesday 14 April 2004 18:53, Rajesh Kumar Mallah wrote:
>> Hi
>> I have .5 million rows in a table. My problem is select count(*) takes
>> ages. VACUUM FULL does not help. Can anyone please tell me how I can
>> enhance the performance of the setup.
>> SELECT count(*) from eyp_rfi;
>
> If this is the actual query you're running, and you need a guaranteed
> accurate result, then you only have one option: write a trigger function
> to update a table_count table with every insert/delete to eyp_rfi. There
> is loads of info on this (and why it isn't as simple as you might think)
> in the archives.
>
> First though:
> 1. Is this the actual query, or just a representation?
> 2. Do you need an accurate figure or just something "near enough"?

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
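[For reference, the trigger-maintained count table Richard mentions can be sketched roughly as below. This is only an illustration, not from the thread: the table and function names are invented, and it uses the pre-8.0 quote-doubling style for the PL/pgSQL body since the list is discussing 7.4. The caveat Richard alludes to is visible here too: every writer updates the same single row, so concurrent inserts/deletes serialize on it.]

```sql
-- One-row table holding the running count (hypothetical names):
CREATE TABLE eyp_rfi_count (n bigint NOT NULL);
INSERT INTO eyp_rfi_count SELECT count(*) FROM eyp_rfi;

CREATE FUNCTION eyp_rfi_count_trig() RETURNS trigger AS '
BEGIN
    IF TG_OP = ''INSERT'' THEN
        UPDATE eyp_rfi_count SET n = n + 1;
    ELSIF TG_OP = ''DELETE'' THEN
        UPDATE eyp_rfi_count SET n = n - 1;
    END IF;
    RETURN NULL;   -- AFTER trigger; return value is ignored
END;
' LANGUAGE plpgsql;

CREATE TRIGGER eyp_rfi_count_trig AFTER INSERT OR DELETE ON eyp_rfi
    FOR EACH ROW EXECUTE PROCEDURE eyp_rfi_count_trig();

-- SELECT n FROM eyp_rfi_count;   -- fast, but serializes writers
```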
[PERFORM] linux distro for better pg performance
Hi,

I have been using pg for 3 years and generally do not have big problems with it. I am searching for the best distro to run pg (7.4.1). At the moment I am using RedHat AS 3.0, but I think it has some performance problems (I am not sure).

My configuration: P4 2.8 GHz, 1 GB RAM, 120 GB IDE 7200 RPM disk. Kernel version 2.4.21-4.EL (the installation version for RH 3.0).

My problem: if I run some query with many reads, I see a massive disk transfer in vmstat [the vmstat 1 columns were mangled in the archive; the figures showed the machine going from idle to heavy block-in traffic and 47-92% wait once the query started]. And if I run top, I see a big iowait % (sometimes 70-80) and a very low user % (10-15). I have read many docs about this problem, but have not found any solution.

My questions: if someone is using RH 3.0, please post some results or suggestions for its performance with pg. What is the best Linux distro for pg? Can I get better performance by using a 15K SCSI disk? Or would it be better to have more RAM (2 or 3 GB)?

regards,
ivan.
Re: [PERFORM] select count(*) very slow on an already vacuumed table.
The relation size for this table is 1.7 GB:

tradein_clients=# SELECT public.relation_size ('general.rfis');
+---------------+
| relation_size |
+---------------+
| 1,762,639,872 |
+---------------+
(1 row)

Regds
mallah.

Rajesh Kumar Mallah wrote:
> The problem is that i want to know if i need a Hardware upgrade
> at the moment.
>
> Eg i have another table rfis which contains ~ .6 million records.
> SELECT count(*) from rfis where sender_uid > 0;
> Time: 117560.635 ms
> [...]
Re: [PERFORM] select count(*) very slow on an already vacuumed table.
On Thursday 15 April 2004 08:10, Rajesh Kumar Mallah wrote:
> The problem is that i want to know if i need a Hardware upgrade
> at the moment.
>
> Eg i have another table rfis which contains ~ .6 million records.
> SELECT count(*) from rfis where sender_uid > 0;
> Time: 117560.635 ms
>
> Which is approximate 4804 records per second. Is it an acceptable
> performance on the hardware below:
>
> RAM: 2 GB
> DISKS: ultra160 , 10 K , 18 GB
> Processor: 2* 2.0 Ghz Xeon

Hmm - doesn't seem good, does it? If you run it again, is it much faster (since the data should be cached then)? What does "vmstat 10" show while you're running the query?

One thing you should have done is read the performance tuning guide at:
http://www.varlena.com/varlena/GeneralBits/Tidbits/index.php
The default values are very conservative, and you will need to change them.

> What kind of upgrades shoud be put on the server for it to become
> reasonable fast.

If you've only got one disk, then a second disk for OS/logging. Difficult to say more without knowing numbers of users/activity etc.

--
Richard Huxton
Archonet Ltd
Re: [PERFORM] [ SOLVED ] select count(*) very slow on an already
Hi,

The problem was solved by reloading the table. The query now takes only 3 seconds. But that is not a solution.

The problem is that such phenomena obscure the judgement we use in optimising queries and the database. If a query runs slow, we really can't tell whether the problem is the query itself, the hardware, or dead rows.

I already did vacuum full on the table but it still did not have that effect on performance. In fact the last figures were after doing a vacuum full.

Can there be any more elegant solution to this problem?

Regds
Mallah.

Richard Huxton wrote:
> Hmm - doesn't seem good, does it? If you run it again, is it much faster
> (since the data should be cached then)? What does "vmstat 10" show while
> you're running the query?
>
> One thing you should have done is read the performance tuning guide at:
> http://www.varlena.com/varlena/GeneralBits/Tidbits/index.php
> The default values are very conservative, and you will need to change them.
>
> If you've only got one disk, then a second disk for OS/logging. Difficult
> to say more without knowing numbers of users/activity etc.
Re: [PERFORM] select count(*) very slow on an already vacuumed table.
Richard Huxton wrote:
> Hmm - doesn't seem good, does it? If you run it again, is it much faster
> (since the data should be cached then)? What does "vmstat 10" show while
> you're running the query?
>
> One thing you should have done is read the performance tuning guide at:
> http://www.varlena.com/varlena/GeneralBits/Tidbits/index.php
> The default values are very conservative, and you will need to change them.

Hi,

Thanks for the interest. My config settings are not the default ones. I was running iostat while running the query. It looks like one of the disks doesn't go past a read performance of 20,000 KB/sec while the other disk goes as high as 40,000. What I am currently doing is loading the table on both disks and comparing the table scan speeds. The performance is definitely better with the newly loaded table on the other disk. The load on the server is 13 because I am simultaneously reloading the data into the other table.

rt2=# SELECT count(*) from rfis where sender_uid > 0;
+--------+
| count  |
+--------+
| 564870 |
+--------+
(1 row)

Time: 10288.359 ms

I shall post the comparative details under normal load soon.

regds
mallah.
[PERFORM] Toooo many context switches (maybe SLES8?)
Hi,

we have a complex modperl database application using postgresql 7.4.1 on a new Dual Xeon MP machine with SLES8 which seems to generate too many context switches (way more than 100,000) on higher load (meaning system load > 2). System response times slow down significantly then. We have tuned parameters for weeks now but could not come up with better results. It seems that we had better performance on an older Dual Xeon DP machine running RedHat 7.3.

Here is the config:

database machine on SuSE SLES 8:
F-S Primergy RX600
2x XEON MP 2.5GHz
8GB RAM
Hardware Raid 1+0 140GB
Kernel 2.4.21-169-smp
Postgresql 7.4.1 (self compiled) with
max_connections = 170
shared_buffers = 4
effective_cache_size = 80
sort_mem = 3
vacuum_mem = 42
max_fsm_relations = 2000
max_fsm_pages = 20
random_page_cost = 4
checkpoint_segments = 24
wal_buffers = 32

modperl application machine on RH 7.3:
F-S Primergy RX200
2x XEON DP 2.4 GHz
4 GB RAM
Kernel 2.4.18-10smp, RedHat 7.3
Apache 1.3.27 setup:
MinSpareServers 15
MaxSpareServers 30
StartServers 15
MaxClients 80
MaxRequestsPerChild 100

vmstat 1 excerpt [the column alignment was lost in the archive; the "cs" (context switches) column ran between roughly 52,000 and 167,000 per second under load, with user+system CPU around 45-55% and the rest idle]
Re: [PERFORM] [ SOLVED ] select count(*) very slow on an already
Hi,

I am not sure, but I remember the same problem. It was on a 7.3.x version and I needed to REINDEX the table. I think from 7.4 on, vacuum also works correctly with indexes. But I am not sure.

regards,
ivan.

Rajesh Kumar Mallah wrote:
> Hi,
>
> The problem was solved by reloading the Table.
> the query now takes only 3 seconds. But that is
> not a solution.
>
> The problem is that such phenomenon obscures our
> judgement used in optimising queries and database.
>
> If a query runs slow we really cant tell if its a problem
> with query itself , hardware or dead rows.
>
> I already did vacumm full on the table but it still did not
> have that effect on performance.
> In fact the last figures were after doing a vacuum full.
>
> Can there be any more elegent solution to this problem.
> [...]
Re: [PERFORM] [ SOLVED ] select count(*) very slow on an already
Rajesh Kumar Mallah wrote:
> Hi,
>
> The problem was solved by reloading the Table.
> the query now takes only 3 seconds. But that is
> not a solution.

If dropping/recreating the table improves things, then we can reasonably assume that the table is pretty active with updates/inserts. Correct?

> The problem is that such phenomenon obscures our
> judgement used in optimising queries and database.

Lots of phenomena obscure that ...

> If a query runs slow we really cant tell if its a problem
> with query itself , hardware or dead rows.
>
> I already did vacumm full on the table but it still did not
> have that effect on performance.
> In fact the last figures were after doing a vacuum full.

If the data gets too fragmented, a vacuum may not be enough. Also, read up on the recommendations _against_ vacuum full (recommending only using plain vacuum on databases). With full, vacuum condenses the database, which may actually hurt performance. A regular vacuum just fixes things up, and may leave unused space lying around; however, this should achieve a balance between usage and vacuuming. See the docs, they are much better at describing this than I am.

> Can there be any more elegent solution to this problem.

As a guess, look into CLUSTER (a Postgres SQL command). CLUSTER will basically recreate the table while ordering rows based on an index. (This might benefit you in other ways as well.) Don't forget to analyze after clustering. If the problem is caused by frequent updates/inserts, you may find that re-clustering the table on a certain schedule is worthwhile.

Be warned, this suggestion is based on an educated guess; I make no guarantees that it will help your problem. Read the docs on CLUSTER and come to your own conclusions.

Richard Huxton wrote:
> [earlier thread quoted]

--
Bill Moran
Potential Technologies
http://www.potentialtech.com
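[A minimal sketch of the CLUSTER-plus-ANALYZE sequence Bill suggests, in PostgreSQL 7.4 syntax. The index name rfis_sender_uid_idx is invented here for illustration; substitute whichever index the table actually has.]

```sql
-- 7.4 syntax is CLUSTER indexname ON tablename. This rewrites the
-- table in index order, dropping dead row versions in the process.
-- It takes an exclusive lock and needs roughly double the table's
-- disk space while it runs.
CLUSTER rfis_sender_uid_idx ON rfis;

-- CLUSTER leaves the planner statistics stale, so re-analyze:
ANALYZE rfis;
```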
Re: [PERFORM] linux distro for better pg performance
> I am searching for best pg distro to run pg (7.4.1).

This is generally based upon opinion. Honestly though, your kernel version is more important for performance than the distro. Personally I use Gentoo, love Gentoo, and would recommend very few other distros (Slackware) for servers. RedHat and others seem to include kitchensinkd when it's not needed.

> At the moment I am using RedHat AS 3.0, but I think it have some
> performance problems (I am not sure). My configuration: P4 2.8 GHz
> 1 GB RAM 120 GB IDE 7200 disk.

Your IDE drive is the biggest hardware bottleneck here. RPMs and bus transfers are slower than SCSI or SATA.

> Kernel version 2.4.21-4.EL (it is the installation vesrion for rh 3.0).

Jump to 2.6, it's much better for performance-related issues, in my experience.

> My problems: If I run some query with many reads, I see a massive disk
> transfer [vmstat output trimmed]. And if I run top, I see a big iowait %
> (some times 70-80) and very low user % (10-15).

Again, this is your hard drive, and the kernel can play into that.

> What is the best linux distro for pg?

There's no best, just personal preference.

> Can I get better performance by using 15K SCSI disk ?

Absolutely.

> Or it will be better to have more RAM (2 or 3 GB) ?

Better to have a fast drive, but more RAM can be helpful.

HTH,
Gavin
Re: [PERFORM] Toooo many context switches (maybe SLES8?)
Dirk Lutzebäck wrote:
> postgresql 7.4.1
> a new Dual Xeon MP
> too much context switches (way more than 100.000) on higher load
> (meaning system load > 2).

I believe this was fixed in 7.4.2, although I can't seem to find it in the release notes.

Joe
Re: [PERFORM] [ SOLVED ] select count(*) very slow on an already
Bill Moran wrote:
> Rajesh Kumar Mallah wrote:
>> The problem was solved by reloading the Table.
>> the query now takes only 3 seconds. But that is
>> not a solution.
>
> If dropping/recreating the table improves things, then we can reasonably
> assume that the table is pretty active with updates/inserts. Correct?

Yes, the table results from an import process and undergoes lots of inserts and updates, but that's before the vacuum full operation; the table is not accessed during vacuum. What I want to know is: is there any way to automate the dumping and reloading of a table individually? Will the below be safe and effective:

begin work;
create table new_tab AS select * from tab;
truncate table tab;
insert into tab select * from new_tab;
drop table new_tab;
commit;
analyze tab;

I have not tried it but plan to do so, though I feel the insert would take ages to update the indexes, if any.

BTW, is there any way to disable checks and triggers on a table temporarily while loading data (is updating reltriggers in pg_class safe?)

> Lots of phenomenon obscure that ...

True, but there should not be too many.

> If the data gets too fragmented, a vacuum may not be enough. Also, read
> up on the recommendations _against_ vacuum full (recommending only using
> vacuum on databases) [...]

I understand that simultaneous vacuum and usage mostly deteriorates performance, but this case is different.

> As a guess, look into CLUSTER (a Postgres SQL command). [...]

I could consider that option also.

> Be warned, this suggestion is based on an educated guess, I make no
> guarantees that it will help your problem.

Thanks.

Regds
mallah.
Re: [PERFORM] Toooo many context switches (maybe SLES8?)
Joe, do you know where I should look in the 7.4.2 code to find this out?

Dirk

Joe Conway wrote:
> I believe this was fixed in 7.4.2, although I can't seem to find it in
> the release notes.
Re: [PERFORM] Toooo many context switches (maybe SLES8?)
Dirk Lutzebäck wrote:
> Joe, do you know where I should look in the 7.4.2 code to find this out?

I think I was wrong. I just looked in CVS and found the commit I was thinking about:

http://developer.postgresql.org/cvsweb.cgi/pgsql-server/src/backend/storage/lmgr/s_lock.c.diff?r1=1.22&r2=1.23
http://developer.postgresql.org/cvsweb.cgi/pgsql-server/src/include/storage/s_lock.h.diff?r1=1.123&r2=1.124

=
Revision 1.23, Sat Dec 27 20:58:58 2003 UTC (3 months, 2 weeks ago) by tgl
Changes since 1.22: +5 -1 lines

Improve spinlock code for recent x86 processors: insert a PAUSE
instruction in the s_lock() wait loop, and use test before test-and-set
in TAS() macro to avoid unnecessary bus traffic. Patch from Manfred
Spraul, reworked a bit by Tom.
=

I thought this had been committed to the 7.4 stable branch as well, but it appears not.

Joe
Re: [PERFORM] PostgreSQL and Linux 2.6 kernel.
On Wed, Apr 14, 2004 at 21:12:18 +0100, Simon Riggs <[EMAIL PROTECTED]> wrote:
> I guess what I'm saying is it's not how many people you've got working
> on the optimizer, its how many accurate field reports of less-than
> perfect optimization reach them. In that case, PostgreSQL is likely in a
> better position than Microsoft, since the accessibility of the pg
> discussion lists makes such cases much more likely to get aired.
>
> Any thoughts?

I have seen exactly this happen a number of times over the last several years. However, there is still only one Tom Lane implementing the improvements.
Re: [PERFORM] Toooo many context switches (maybe SLES8?)
Joe,

> I believe this was fixed in 7.4.2, although I can't seem to find it in
> the release notes.

Depends on the cause of the issue. If it's the same issue that I'm currently struggling with, it's not fixed.

--
-Josh Berkus
Aglio Database Solutions
San Francisco
Re: [PERFORM] [ SOLVED ] select count(*) very slow on an already
On Thursday 15 April 2004 17:19, Rajesh Kumar Mallah wrote:
> Yes the table results from an import process and under goes lots
> of inserts and updates , but thats before the vacuum full operation.
> the table is not accessed during vacuum. What i want to know is
> is there any wat to automate the dumping and reload of a table
> individually. will the below be safe and effective:

Shouldn't be necessary assuming you vacuum (not full) regularly. However, looking back at your original posting, the vacuum output doesn't seem to show any rows that need removing:

# VACUUM full verbose eyp_rfi;
INFO: vacuuming "public.eyp_rfi"
INFO: "eyp_rfi": found 0 removable, 505960 nonremovable row versions in 71987 pages
DETAIL: 0 dead row versions cannot be removed yet.

Since your select count(*) showed 505960 rows, I can't see how dropping/replacing could make a difference on a sequential scan. Since we're not using any indexes, I don't see how it could be related to that.

> begin work;
> create table new_tab AS select * from tab;
> truncate table tab;
> insert into tab select * from new_tab;
> drop table new_tab;
> commit;
> analyze tab;
>
> i havenot tried it but plan to do so.
> but i feel insert would take ages to update
> the indexes if any.

It will have to update them, which will take time.

> BTW
>
> is there any way to disable checks and triggers on
> a table temporarily while loading data (is updating
> reltriggers in pg_class safe?)

You can take a look at pg_restore and copy how it does it.

--
Richard Huxton
Archonet Ltd
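[The trick Richard alludes to: pg_dump/pg_restore's --disable-triggers mode of that era emits direct updates to pg_class.reltriggers. A sketch, with "tab" standing in for the real table name; this needs superuser rights, bypasses foreign-key enforcement while the load runs, and is exactly as unsafe as that sounds, so treat it as a last resort.]

```sql
-- Disable all triggers (including FK enforcement triggers) on tab:
UPDATE pg_catalog.pg_class SET reltriggers = 0
WHERE oid = 'tab'::pg_catalog.regclass;

-- ... bulk load here ...

-- Re-enable by restoring the true trigger count from pg_trigger:
UPDATE pg_catalog.pg_class SET reltriggers = (
        SELECT count(*) FROM pg_catalog.pg_trigger
        WHERE tgrelid = 'tab'::pg_catalog.regclass)
WHERE oid = 'tab'::pg_catalog.regclass;
```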
Re: [PERFORM] [ SOLVED ] select count(*) very slow on an already
On Apr 15, 2004, at 12:44 PM, Richard Huxton wrote:
> On Thursday 15 April 2004 17:19, Rajesh Kumar Mallah wrote:
>> is there any way to disable checks and triggers on
>> a table temporarily while loading data (is updating
>> reltriggers in pg_class safe?)
>
> You can take a look at pg_restore and copy how it does it.

Does SET CONSTRAINTS take care of checks within the transaction? Triggers would be a different matter...

Mark
Re: [PERFORM] linux distro for better pg performance
On Thu, 2004-04-15 at 06:39, Gavin M. Roy wrote:
> Your IDE drive is the biggest hardward bottleneck here. RPM's and bus
> transfers are slower than SCSI or SATA.

Individual disk throughput generally has very little bearing on database performance compared to other factors. In fact, IDE bandwidth performance is perfectly adequate for databases, and for database purposes indistinguishable from SATA.

I would say that average access and read/write completion times, especially under load, are by far the most limiting factors, and disk RPM is only one component of this. In fact, disk RPM is a very expensive way to get marginally better throughput in this regard, and I would suggest 10k rather than 15k drives for the money.

There are really only two features worth buying in your disk subsystem which many people ignore: TCQ and independently managed I/O with a large battery-backed write-back cache. Currently, the only place to really get this is with SCSI RAID. You can get 10k SATA drives, so when you are buying SCSI you are really buying these features.

Do these features make a difference? Far more than you would imagine. On one postgres server I just upgraded, we went from a 3Ware 8x 7200-RPM RAID-10 configuration to an LSI 320-2 SCSI 3x 10k RAID-5 with 256M cache, and got a 3-5x performance improvement in the disk subsystem under full database load. SCSI RAID can service a lot of I/O requests far more efficiently than current IDE/SATA RAID controllers, and it shows in the stats. Under these types of loads, the actual bandwidth utilized by the disks doesn't come anywhere close to even their rated performance, never mind the theoretical performance of the bus. Service times for IDE/SATA RAID increase dramatically under load, whereas SCSI tends not to under the same load.

Considering that very good SCSI RAID controllers (e.g. the LSI 320-2 that I mention above) are only marginally more expensive than nominally equivalent IDE/SATA controller solutions, using SCSI RAID with 10k drives is pretty much the price-performance sweet spot if you use your disk system hard (like we do). For databases with low disk I/O intensity, stay with IDE/SATA and save a little money. For databases that have high disk I/O intensity, use SCSI. The price premium for SCSI is about 50%, but the performance difference is an integer factor under load.

j. andrew rogers
Re: [PERFORM] [ SOLVED ] select count(*) very slow on an already
Rajesh Kumar Mallah wrote:
> Bill Moran wrote:
>> Rajesh Kumar Mallah wrote:
>>> Hi,
>>> The problem was solved by reloading the Table. the query now takes
>>> only 3 seconds. But that is not a solution.
>> If dropping/recreating the table improves things, then we can reasonably
>> assume that the table is pretty active with updates/inserts. Correct?
> Yes the table results from an import process and undergoes lots of
> inserts and updates, but that's before the vacuum full operation. the
> table is not accessed during vacuum. What i want to know is, is there
> any way to automate the dumping and reload of a table individually.
> will the below be safe and effective:

The CLUSTER command I described is one way of doing this. It essentially automates the task of copying the table, dropping the old one, and recreating it.

>> If the data gets too fragmented, a vacuum may not be enough. Also, read
>> up on the recommendations _against_ vacuum full (recommending only using
>> vacuum on databases). With full, vacuum condenses the database, which may
>> actually hurt performance. A regular vacuum just fixes things up, and
>> may leave unused space lying around. However, this should apparently
>> achieve a balance between usage and vacuum. See the docs, they are much
>> better at describing this than I am.
> i understand simultaneous vacuum and usage deteriorates performance
> mostly. but this case is different.

Just want to make sure we're on the same page here. I'm not talking about vacuuming simultaneous with anything. I'm simply saying that "vacuum full" isn't always the best choice. You should probably only be doing "vacuum". The reason and details for this are in the admin docs.

-- Bill Moran
Potential Technologies
http://www.potentialtech.com
Re: [PERFORM] [ SOLVED ] select count(*) very slow on an already
Bill, if you had alot of updates and deletions and wanted to optimize your table, can you just issue the cluster command. Will the cluster command rewrite the table without the obsolete data that a vacuum flags or do you need to issue a vacuum first?

Dan.

-Original Message-
From: Bill Moran [mailto:[EMAIL PROTECTED]]
Sent: Thursday, April 15, 2004 2:49 PM
To: Rajesh Kumar Mallah
Cc: Postgres Performance
Subject: Re: [PERFORM] [ SOLVED ] select count(*) very slow on an already

[...]
Re: [PERFORM] Toooo many context switches (maybe SLES8?)
Joe Conway <[EMAIL PROTECTED]> writes:
>> Improve spinlock code for recent x86 processors: insert a PAUSE
>> instruction in the s_lock() wait loop, and use test before test-and-set
>> in TAS() macro to avoid unnecessary bus traffic. Patch from Manfred
>> Spraul, reworked a bit by Tom.
> I thought this had been committed to the 7.4 stable branch as well, but
> it appears not.

I am currently chasing what seems to be the same issue: massive context swapping on a dual Xeon system. I tried back-patching the above-mentioned patch ... it helps a little but by no means solves the problem ...

regards, tom lane
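[For anyone curious what the quoted commit message means, here is a minimal sketch of "test before test-and-set" with a PAUSE in the wait loop, written in today's C11 atomics rather than the inline assembler the actual s_lock()/TAS() patch uses; the function names are mine, not PostgreSQL's:

```c
#include <stdatomic.h>

static inline void cpu_pause(void)
{
#if defined(__i386__) || defined(__x86_64__)
    __builtin_ia32_pause();   /* the x86 PAUSE instruction (GCC builtin) */
#endif
}

void spin_lock(atomic_int *lock)
{
    for (;;) {
        /* "test": a plain shared read generates no locked bus traffic */
        while (atomic_load_explicit(lock, memory_order_relaxed) != 0)
            cpu_pause();
        /* "test-and-set": the expensive atomic exchange, attempted only
         * once the lock looked free */
        if (atomic_exchange_explicit(lock, 1, memory_order_acquire) == 0)
            return;
    }
}

void spin_unlock(atomic_int *lock)
{
    atomic_store_explicit(lock, 0, memory_order_release);
}
```

The read-only inner loop keeps the contended cache line in shared state so waiters don't ping-pong it between CPUs, and PAUSE hints to the processor that this is a spin-wait, which matters particularly on hyperthreaded Xeons of that era.]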
Re: [PERFORM] PostgreSQL and Linux 2.6 kernel.
> Bruno Wolff
> Simon Riggs <[EMAIL PROTECTED]> wrote:
> > I guess what I'm saying is it's not how many people you've got working
> > on the optimizer, it's how many accurate field reports of less-than-
> > perfect optimization reach them. In that case, PostgreSQL is likely in a
> > better position than Microsoft, since the accessibility of the pg
> > discussion lists makes such cases much more likely to get aired.
> >
> > Any thoughts?
>
> I have seen exactly this happen a number of times over the last several
> years. However there is still only one Tom Lane implementing the
> improvements.

...and very few Mr. Microsofts too.

[I'm uncomfortable with, and it was not my intent, to discuss such an issue with direct reference to particular individuals. There is no intent to criticise or malign anybody named]

Regards, Simon
Re: [PERFORM] [ SOLVED ] select count(*) very slow on an already
Shea,Dan [CIS] wrote:
> Bill, if you had alot of updates and deletions and wanted to optimize your
> table, can you just issue the cluster command. Will the cluster command
> rewrite the table without the obsolete data that a vacuum flags or do you
> need to issue a vacuum first?

From the reference docs:

"During the cluster operation, a temporary copy of the table is created that contains the table data in the index order. Temporary copies of each index on the table are created as well. Therefore, you need free space on disk at least equal to the sum of the table size and the index sizes.

"CLUSTER preserves GRANT, inheritance, index, foreign key, and other ancillary information about the table.

"Because the optimizer records statistics about the ordering of tables, it is advisable to run ANALYZE on the newly clustered table. Otherwise, the optimizer may make poor choices of query plans."

The primary reason CLUSTER exists is to allow you to physically reorder a table based on a key. This should provide a performance improvement if data with the same key is accessed all at once. (i.e. if you do "SELECT * FROM table WHERE key=5" and it returns 100 rows, those 100 rows are guaranteed to be all on the same part of the disk after CLUSTER, thus a performance improvement should result.) Updates and inserts will add data in the next available space in a table with no regard for any keys, and _may_ require running all over the disk to retrieve the data in the previous example query.

I doubt if CLUSTER is an end-all optimization tool. The specific reason I suggested it was because the original poster was asking for an easier way to drop/recreate a table (as prior experimentation had shown this to improve performance). I can't think of anything easier than "CLUSTER ON ".

Since CLUSTER recreates the table, it implicitly removes the dead tuples. However, it's going to be a LOT slower than vacuum, so if dead tuples are the main problem, vacuum is still the way to go.

-- Bill Moran
Potential Technologies
http://www.potentialtech.com
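[In 7.4 syntax the whole copy/drop/recreate cycle, plus the ANALYZE the reference docs recommend, comes down to two statements; the index and table names here are hypothetical:

```sql
-- rewrite eyp_rfi in eyp_rfi_pkey order; dead tuples are not copied
CLUSTER eyp_rfi_pkey ON eyp_rfi;

-- refresh the planner's ordering statistics afterwards
ANALYZE eyp_rfi;
```
]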
Re: [PERFORM] Toooo many context switches (maybe SLES8?)
Folks,

> I am currently chasing what seems to be the same issue: massive context
> swapping on a dual Xeon system. I tried back-patching the above-mentioned
> patch ... it helps a little but by no means solves the problem ...

BTW, I'm currently pursuing the possibility that this has something to do with the ServerWorks chipset on those motherboards. If anyone knows a high-end hardware+linux kernel geek I can corner, I'd appreciate it. Maybe I should contact OSDL ...

-- Josh Berkus
Aglio Database Solutions
San Francisco
Re: [PERFORM] PostgreSQL and Linux 2.6 kernel.
Simon,

> Is the problem "a person interested" or is there another issue there?

IMHO, it's "a person interested".

> Treating the optimizer as a black box is something I'm very used to from
> other RDBMS. My question is, how can you explicitly re-write a query now
> to "improve" it? If there's no way of manipulating queries without
> actually re-writing the optimizer, we're now in a position where we
> aren't able to diagnose when the optimizer isn't working effectively.

Well, there is ... all of the various query cost parameters.

> For my mind, all the people on this list are potential "optimizer
> developers" in the sense that we can all look at queries and see whether
> there is a problem with particular join plans. Providing good cases of
> poor optimization is just what's needed to assist those few that do
> understand the internals to continue improving things.

... which is what this list is for. But, ultimately, improvements on the planner are still bottlenecked by having only one developer actually hacking the changes.

-- Josh Berkus
Aglio Database Solutions
San Francisco
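[The cost parameters alluded to above are ordinary settable GUCs; a sketch of the usual suspects, using 7.4 names — the values shown are illustrative only:

```sql
SET random_page_cost = 2.0;        -- default 4.0; lower if data is mostly cached
SET effective_cache_size = 65536;  -- in 8 KB pages; the default is only 1000
SET cpu_tuple_cost = 0.01;         -- per-row CPU cost estimate
-- and, for diagnosis rather than production, the enable_* switches, e.g.
SET enable_seqscan = off;          -- see what plan the optimizer picks instead
```
]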
Re: [PERFORM] PostgreSQL and Linux 2.6 kernel.
Bruno Wolff III <[EMAIL PROTECTED]> writes:
> I have seen exactly this happen a number of times over the last several
> years. However there is still only one Tom Lane implementing the
> improvements.

Ob: Well clearly the problem is we need more Tom Lanes.

-- greg
Re: [PERFORM] PostgreSQL and Linux 2.6 kernel.
Greg Stark <[EMAIL PROTECTED]> writes:
> Ob: Well clearly the problem is we need more Tom Lanes.

ObHHGReference: "Haven't you heard? I come in six-packs!"

regards, tom lane
Re: [PERFORM] index v. seqscan for certain values
On Tue, 13 Apr 2004 13:55:49 -0400, Tom Lane <[EMAIL PROTECTED]> wrote:
> Possibly the nonuniform clumping of CID has something to do with the
> poor results.

It shouldn't. The sampling algorithm is designed to give each tuple the same chance of ending up in the sample, and tuples are selected independently. (IOW each one of the {N \choose n} possible samples has the same probability.) There are known problems with nonuniform distribution of dead vs. live and large vs. small tuples, but AFAICS the order of values does not matter.

Servus
Manfred
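[The parenthetical claim can be checked by brute force on a toy table: if every one of the {N choose n} samples is equally likely, then every tuple must appear in exactly C(N-1, n-1) of them, i.e. with identical inclusion probability n/N. A plain-Python illustration, unrelated to the actual ANALYZE code:

```python
from itertools import combinations
from math import comb

N, n = 8, 3   # toy relation: 8 tuples, sample size 3

samples = list(combinations(range(N), n))
assert len(samples) == comb(N, n)   # all {N choose n} possible samples

# Equal-probability samples => each tuple appears in C(N-1, n-1) of them,
# so every tuple has the same inclusion probability n/N.
appearances = [sum(t in s for s in samples) for t in range(N)]
assert appearances == [comb(N - 1, n - 1)] * N
print(appearances[0] / len(samples))   # 21/56 = 0.375 = n/N
```
]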
Re: [PERFORM] PostgreSQL and Linux 2.6 kernel.
Greg Stark wrote:
> Bruno Wolff III <[EMAIL PROTECTED]> writes:
>> I have seen exactly this happen a number of times over the last several
>> years. However there is still only one Tom Lane implementing the
>> improvements.
>
> Ob: Well clearly the problem is we need more Tom Lanes.

my $pgGuru = "Tom Lane";
my @morepgGurus;
my $howmany = 10;

while($howmany--) { push @morepgGurus, $pgGuru; }

-- Until later, Geoffrey
Registered Linux User #108567
Building secure systems in spite of Microsoft
Re: [PERFORM] query slows down with more accurate stats
[Just a quick note here; a more thorough discussion of my test results will be posted to -hackers]

On Tue, 13 Apr 2004 15:18:42 -0400, Tom Lane <[EMAIL PROTECTED]> wrote:
> Well, the first problem is why is ANALYZE's estimate of the total row
> count so bad :-( ? I suspect you are running into the situation where
> the initial pages of the table are thinly populated and ANALYZE
> mistakenly assumes the rest are too. Manfred is working on a revised
> sampling method for ANALYZE that should fix this problem

The new method looks very promising with respect to row count estimation: I got estimation errors of +/- 1% where the old method was off by up to 60%. (My test methods might be a bit biased though :-))

My biggest concern at the moment is that the new sampling method violates the contract of returning each possible sample with the same probability: getting several tuples from the same page is more likely than with the old method.

Servus
Manfred
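[The failure mode Tom describes is easy to reproduce with a toy page layout; all the numbers below are hypothetical, and this only illustrates the bias, not the actual ANALYZE algorithm:

```python
# Hypothetical heap: the first 100 pages are thinly populated (2 live
# tuples each, e.g. after heavy updates near the head of the table),
# while the remaining 900 pages hold 20 tuples each.
pages = [2] * 100 + [20] * 900
true_rows = sum(pages)                                   # 18200

# Head-only estimate: extrapolate tuples/page from the first pages only,
# which is roughly what goes wrong in the situation Tom describes.
head = pages[:30]
head_estimate = sum(head) / len(head) * len(pages)       # 2.0/page -> 2000

# Spreading the sampled pages across the whole heap removes the bias.
spread = pages[::33]
spread_estimate = sum(spread) / len(spread) * len(pages) # ~17700

print(true_rows, head_estimate, round(spread_estimate))
```
]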
Re: [PERFORM] query slows down with more accurate stats
Manfred Koizar <[EMAIL PROTECTED]> writes:
> My biggest concern at the moment is that the new sampling method
> violates the contract of returning each possible sample with the same
> probability: getting several tuples from the same page is more likely
> than with the old method.

Hm, are you sure? I recall objecting to your original proposal because I thought that would happen, but after further thought it seemed not. Also, I'm not at all sure that the old method satisfies that constraint completely in the presence of nonuniform numbers of tuples per page, so we'd not necessarily be going backwards anyhow ...

regards, tom lane
Re: [PERFORM] PostgreSQL and Linux 2.6 kernel.
> my $pgGuru = "Tom Lane";
> my @morepgGurus;
> my $howmany = 10;
>
> while($howmany--) { push @morepgGurus, $pgGuru; }

This is just wrong...

-- Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC and S/JDBC
Postgresql support, programming shared hosting and dedicated hosting.
+1-503-667-4564 - [EMAIL PROTECTED] - http://www.commandprompt.com
PostgreSQL Replicator -- production quality replication for PostgreSQL
Re: [PERFORM] PostgreSQL and Linux 2.6 kernel.
"Joshua D. Drake" <[EMAIL PROTECTED]> writes:
>> while($howmany--) { push @morepgGurus, $pgGuru; }
>
> This is just wrong...

yeah, it would have been much clearer written as:

    push @morepgGurus, ($pgGuru) x $howmany;

Or at least the perlish:

    for (1..$howmany)

instead of C style while syntax. Ok. I stop now.

-- greg