Re: [PERFORM] vacuum analyze slows sql query
Given that the plan doesn't change after an ANALYZE, my guess would be that the first query is hitting cached data; then you vacuum, and it chews through the cache with its own data, pushing the good data out so it has to be re-fetched from disk. If you run the SELECT a second time after the vacuum, what is the time? Not sure what your pkk_offer_has_pending_purch function does; that might be something to look at as well. I could be wrong, but that's the only thing that makes sense to me. ARC is supposed to help with that type of behavior in 8.0.

patrick ~ wrote:

> Greetings pgsql-performance :) Yesterday I posted to the pgsql-sql list about an issue with VACUUM while trying to track down a performance problem with a SQL SELECT statement involving a stored function. It was suggested that I bring the discussion over to -performance. Instead of reposting the message, here is a link to the original message followed by a brief summary:
>
> http://marc.theaimsgroup.com/?l=postgresql-sql&m=109945118928530&w=2
>
> Summary: Our customer complains about web/php-based UI sluggishness accessing the data in the db. I created a stripped-down version of the tables in question to be able to post to the pgsql-sql list asking for hints as to how I can improve the SQL query. While doing this I noticed that if I 'createdb' and populate it with the sanitized data, the query in question is quite fast: 618 rows returned in 864.522 ms. This was puzzling. Next I noticed that after a VACUUM the very same query would slow down to a crawl: 618 rows returned in 1080688.921 ms. This was reproduced on PostgreSQL 7.4.2 running on an Intel PIII 700MHz, 512MB. This system is my /personal/ test system/sandbox, i.e. it isn't being stressed by any other processes. Thanks for reading,
>
> --patrick

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster
Re: [PERFORM] Why isn't this index being used?
Hi, I ran into a similar problem using bigints... See:
http://www.postgresql.org/docs/7.3/static/datatype.html#DATATYPE-INT

smallint and bigint constants have to be cast when used in queries (before 8.0 the planner would not match an untyped integer literal against a smallint index column). Try:

explain select * from db where type=90::smallint and subtype=70::smallint and date='7/1/2004';

or:

explain select * from db where type='90' and subtype='70' and date='7/1/2004';

Knutsen, Mark wrote:

> The following is from a database of several hundred million rows of real data that has been VACUUM ANALYZEd. Why isn't the index being used for a query that seems tailor-made for it? The results (6,300 rows) take about ten minutes to retrieve with a sequential scan.
>
> A copy of this database with integer in place of smallint, a primary key in column order (date, time, type, subtype) and a secondary index in the required order (type, subtype, date, time) correctly uses the secondary index to return results in under a second. (Actually, the integer version is the first one I made, and the smallint is the copy, but that shouldn't matter.)
>
> Postgres is version postgresql-server-7.3.4-3.rhl9 from Red Hat Linux 9.
>
> testdb2=# \d db
>         Table "public.db"
>  Column  |          Type          | Modifiers
> ---------+------------------------+-----------
>  date    | date                   | not null
>  time    | time without time zone | not null
>  type    | smallint               | not null
>  subtype | smallint               | not null
>  value   | integer                |
> Indexes: db_pkey primary key btree (type, subtype, date, time)
>
> testdb2=# set enable_seqscan to off;
> SET
> testdb2=# explain select * from db where type=90 and subtype=70 and date='7/1/2004';
>                                 QUERY PLAN
> --------------------------------------------------------------------------
>  Seq Scan on db  (cost=100000000.00..107455603.76 rows=178 width=20)
>    Filter: ((type = 90) AND (subtype = 70) AND (date = '2004-07-01'::date))
> (2 rows)
[PERFORM] Tuning shared_buffers with ipcs ?
Hello, I've seen a couple of references to using ipcs to help properly size shared_buffers. I don't claim to be an SA guru, so could someone help explain how to interpret the output of ipcs and how that relates to shared_buffers? How does one determine the size of the segment arrays? I see the total size using ipcs -m, which is roughly shared_buffers * 8k. I tried all of the dash commands in the ipcs man page, and the only one that might give a clue is ipcs -t, which shows the time the semaphores were last used. If you look at the example I give below, it appears as if I'm only using 4 of the 17 semaphores (PG was started on Oct 8). Am I correct in assuming that if the arrays are all the same size then I should only need about 1/4 of my currently allocated shared_buffers?

------ Shared Memory Operation/Change Times --------
shmid      owner     last-op                   last-changed
847183872  postgres  Fri Oct  8 11:03:31 2004  Fri Oct  8 11:03:31 2004
847216641  postgres  Fri Oct  8 11:03:31 2004  Fri Oct  8 11:03:31 2004
847249410  postgres  Fri Oct  8 11:03:31 2004  Fri Oct  8 11:03:31 2004
847282179  postgres  Fri Oct  8 11:03:31 2004  Fri Oct  8 11:03:31 2004
847314948  postgres  Fri Oct  8 11:03:31 2004  Fri Oct  8 11:03:31 2004
847347717  postgres  Fri Oct  8 11:03:31 2004  Fri Oct  8 11:03:31 2004
847380486  postgres  Fri Oct  8 11:03:31 2004  Fri Oct  8 11:03:31 2004
847413255  postgres  Fri Oct  8 11:03:31 2004  Fri Oct  8 11:03:31 2004
847446024  postgres  Fri Oct  8 11:03:31 2004  Fri Oct  8 11:03:31 2004
847478793  postgres  Fri Oct  8 11:03:31 2004  Fri Oct  8 11:03:31 2004
847511562  postgres  Fri Oct  8 11:03:31 2004  Fri Oct  8 11:03:31 2004
847544331  postgres  Fri Oct  8 11:03:31 2004  Fri Oct  8 11:03:31 2004
847577100  postgres  Fri Oct  8 11:03:31 2004  Fri Oct  8 11:03:31 2004
847609869  postgres  Fri Oct 15 11:34:28 2004  Fri Oct 15 11:34:29 2004
847642638  postgres  Fri Oct 15 11:33:35 2004  Fri Oct 15 11:33:35 2004
847675407  postgres  Fri Oct 15 11:34:28 2004  Fri Oct 15 11:34:29 2004
847708176  postgres  Fri Oct 15 11:27:17 2004  Fri Oct 15 11:32:20 2004

Also, isn't the shared memory supposed to show up in free? It's always showing as 0:

# free
             total     used     free  shared  buffers   cached
Mem:       3896928  3868424    28504       0    59788  3605548
-/+ buffers/cache:   203088  3693840
Swap:      1052216       16  1052200

Thanks!
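The arithmetic connecting an ipcs -m segment size back to shared_buffers is simple division. A minimal sketch (assuming the default 8 kB block size; the segment also carries bookkeeping overhead, which varies by version, so the estimate is an upper bound on the configured value):

```python
# Relate an "ipcs -m" segment size back to shared_buffers.
# PostgreSQL allocates one shared segment of roughly
# shared_buffers * 8 kB plus some bookkeeping overhead.

BLOCK_SIZE = 8192  # PostgreSQL block size (8 kB default)

def estimated_shared_buffers(segment_bytes: int) -> int:
    """Upper-bound estimate of shared_buffers from a segment size."""
    return segment_bytes // BLOCK_SIZE

# The ~2 GB segment reported by ipcs in a later thread in this archive:
print(estimated_shared_buffers(2088370176))  # 254928 pages
```

The gap between the estimate and the actual setting is the overhead (lock tables, FSM, and so on), which is why the result comes out a few thousand pages above a round configured value.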
Re: [PERFORM] Tuning shared_buffers with ipcs ?
Tom Lane wrote:
> Doug Y [EMAIL PROTECTED] writes:
>> I've seen a couple references to using ipcs to help properly size shared_buffers.
>
> I have not seen any such claim, and I do not see any way offhand that ipcs could help.

Directly from http://www.varlena.com/varlena/GeneralBits/Tidbits/annotated_conf_e.html: "As a rule of thumb, observe shared memory usage of PostgreSQL with tools like ipcs and determine the setting." I've seen similar references in the admin docs as well.

>> I tried all of the dash commands in the ipcs man page, and the only one that might give a clue is ipcs -t, which shows the time the semaphores were last used. If you look at the example I give below, it appears as if I'm only using 4 of the 17 semaphores (PG was started on Oct 8).
>
> This might tell you something about how many concurrent backends you've used, but nothing about how many shared buffers you need.

That's strange; I know I've had more than 4 concurrent connections on that box (I just checked and there were at least a dozen). A mirror DB with the same config also has the same basic output from ipcs, except that it has times for 11 of the 17 array slots, and most of them are from the time when we do our backup dump (which makes sense, since it would require more memory at that time).

> regards, tom lane

I'm not saying you're wrong, because I don't know how the nitty-gritty stuff works; I'm just trying to find something to work with, since presently there isn't anything other than anecdotal evidence. From what I've inferred, there seems to be some circumstantial evidence supporting my theory. Thanks.
[PERFORM] The never ending quest for clarity on shared_buffers
Hello, We recently upgraded the OS from RH 7.2 (2.4 kernel) to SuSE 9.1 (2.6 kernel), and PostgreSQL from 7.3.4 to 7.4.2. One of the quirks I've noticed is how queries don't always have the same explain plans on the new version... but that's a different email, I think.

My main question: I'm trying to convince the powers that be to let me use persistent DB connections (from Apache 2 / PHP), and my research has yielded conflicting documentation about the shared_buffers setting... real shocker there :) For idle persistent connections, does each of them allocate the memory specified by this setting (shared_buffers * 8k), or is it one pool used by all the connections (which seems the logical conclusion based on the name SHARED_buffers)? Personally I'm more inclined toward the latter, but I've seen references that alluded to both cases, never a definitive answer.

For what it's worth, shared_buffers is currently set to 50000 (on a 4G system). Also, effective_cache_size is 125000. max_connections is 256, so I don't want to end up with a possible 100G (50k * 8k * 256) of memory tied up... not that it would be possible, but you never know.

I typically never see more than a dozen or so concurrent connections to the db (serving 3 web servers), so I'm thinking of actually using something like pgpool to keep about 10 per web server, rather than use traditional persistent connections of 1 per Apache child, which would probably average about 50 per web server. Thanks.
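The two interpretations in the question differ by a factor of max_connections, so simple arithmetic settles which one is plausible (a sketch using the figures from this message; shared_buffers is in fact a single pool shared by all backends):

```python
# shared_buffers is one pool shared by every backend, not a
# per-connection allocation. Compare the two readings numerically.

PAGE = 8192
shared_buffers = 50000   # pages, from this message
max_connections = 256

one_pool = shared_buffers * PAGE                     # what actually happens
per_conn = shared_buffers * PAGE * max_connections   # the feared reading

print(one_pool // 2**20)   # 390 (MB): the real, single shared pool
print(per_conn // 2**30)   # 97 (GB): the "100G" scare figure
```

An idle persistent backend still costs its own process memory, and sort memory when it runs queries, but it does not duplicate the buffer pool.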
[PERFORM] Interpreting vmstat
Hello, (note: best viewed in a fixed-width font) I'm still trying to find where my performance bottleneck is... I have 4G RAM, PG 7.3.4, shared_buffers = 75000, effective_cache_size = 75000. Run a query I've been having trouble with and watch the output of vmstat (Linux):

$ vmstat 1
   procs                      memory    swap          io     system         cpu
 r  b  w   swpd   free   buff   cache  si  so    bi    bo   in    cs  us  sy  id
 0  0  0    148   8732 193652 2786668   0   0     0     0  292   151   0   2  98
 2  0  2    148   7040 193652 2786668   0   0     0   208  459   697  45  10  45
 0  0  0    148   9028 193652 2786684   0   0    16   644  318   613  25   4  71
 1  0  0    148   5092 193676 2780196   0   0    12   184  441   491  37   5  58
 0  1  0    148   5212 193684 2772512   0   0   112  9740  682  1063  45  12  43
 1  0  0    148   5444 193684 2771584   0   0   120  4216  464  1303  44   3  52
 1  0  0    148  12232 193660 2771620   0   0   244   628  340   681  43  20  38
 1  0  0    148  12168 193664 2771832   0   0   196   552  332   956  42   2  56
 1  0  0    148  12080 193664 2772248   0   0   272   204  371   201  40   1  59
 1  1  0    148  12024 193664 2772624   0   0   368     0  259   127  42   3  55

That's the first 10 lines or so... the query takes 60 seconds to run. I'm confused about the bi/bo columns of the io section. The man page here says "bi: Blocks sent to a block device (blocks/s). bo: Blocks received from a block device (blocks/s)", yet the behavior seems to be the opposite of that (the definitions in that man page appear to be swapped; bi is actually blocks read in from disk, and bo is blocks written out): bi only increases when doing a largish query, while bo also goes up, and I typically see periodic bo numbers in the low 100's, which I'd guess are log writes.

I would think that my entire DB should end up cached, since a raw pg_dump file is about 1G in size, yet my performance doesn't indicate that that is the case... running the same query a few minutes later, I'm not seeing a significant performance improvement.
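Since vmstat reports bi/bo in 1 KiB blocks per second, a sample can be classified mechanically (a small sketch; the field positions assume the layout shown above, and the helper name is mine, not a vmstat feature):

```python
# Classify one vmstat sample. bi = blocks read in from a block device,
# bo = blocks written out; one block is 1 KiB, so the numbers are KB/s.

def classify(bi: int, bo: int) -> str:
    if bi == 0 and bo == 0:
        return "idle"
    return "read-heavy" if bi > bo else "write-heavy"

# The busiest sample above (bi=112, bo=9740):
print(classify(112, 9740))  # write-heavy
```

Applied to the trace above, nearly every busy sample comes out write-heavy, which fits the guess that the bursts of bo are log or sort-spill writes rather than table reads.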
Here's a sample from iostat while the query is running:

$ iostat -x -d 1
Device:  rrqm/s  wrqm/s   r/s    w/s  rsec/s  wsec/s avgrq-sz    avgqu-sz  await svctm  %util
sda        0.00    0.00  0.00   0.00    0.00    0.00     0.00 42949552.96   0.00  0.00 100.00
sda1       0.00    0.00  0.00   0.00    0.00    0.00     0.00 42949662.96   0.00  0.00 100.00
sda2       0.00    0.00  0.00   0.00    0.00    0.00     0.00 42949642.96   0.00  0.00 100.00
sdb        0.00  428.00  0.00 116.00    0.00 4368.00    37.66     2844.40 296.55 86.21 100.00
sdb1       0.00  428.00  0.00 116.00    0.00 4368.00    37.66     6874.40 296.55 86.21 100.00

Device:  rrqm/s  wrqm/s   r/s    w/s  rsec/s  wsec/s avgrq-sz    avgqu-sz  await  svctm  %util
sda        0.00    0.00  0.00   0.00    0.00    0.00     0.00 42949552.96   0.00   0.00 100.00
sda1       0.00    0.00  0.00   0.00    0.00    0.00     0.00 42949662.96   0.00   0.00 100.00
sda2       0.00    0.00  0.00   0.00    0.00    0.00     0.00 42949642.96   0.00   0.00 100.00
sdb        4.00  182.00  6.00  77.00   80.00 2072.00    25.93     2814.50  54.22 120.48 100.00
sdb1       4.00  182.00  6.00  77.00   80.00 2072.00    25.93     6844.50  54.22 120.48 100.00

Device:  rrqm/s  wrqm/s   r/s    w/s  rsec/s  wsec/s avgrq-sz    avgqu-sz  await  svctm  %util
sda        0.00    0.00  0.00   0.00    0.00    0.00     0.00 42949552.96   0.00   0.00 100.00
sda1       0.00    0.00  0.00   0.00    0.00    0.00     0.00 42949662.96   0.00   0.00 100.00
sda2       0.00    0.00  0.00   0.00    0.00    0.00     0.00 42949642.96   0.00   0.00 100.00
sdb        0.00   43.00  0.00  11.00    0.00  432.00    39.27     2810.40  36.36 909.09 100.00
sdb1       0.00   43.00  0.00  11.00    0.00  432.00    39.27     6840.40  36.36 909.09 100.00

Device:  rrqm/s  wrqm/s   r/s    w/s  rsec/s  wsec/s avgrq-sz    avgqu-sz  await  svctm  %util
sda        0.00   15.84  0.00  17.82    0.00  269.31    15.11 42524309.47  44.44 561.11 100.00
sda1       0.00   15.84  0.00  17.82    0.00  269.31    15.11 42524419.47  44.44 561.11 100.00
sda2       0.00    0.00  0.00   0.00    0.00    0.00     0.00 42524398.67   0.00   0.00 100.00
sdb        0.99  222.77  0.99 114.85   15.84 2700.99    23.45     2814.16  35.90  86.32 100.00
sdb1       0.99  222.77  0.99 114.85   15.84 2700.99    23.45     6844.16  35.90  86.32 100.00

Device:  rrqm/s  wrqm/s   r/s    w/s  rsec/s  wsec/s avgrq-sz    avgqu-sz  await  svctm  %util
sda        0.00    0.00  0.00   0.00    0.00    0.00     0.00 42949551.76   0.00   0.00 101.00
sda1       0.00    0.00  0.00   0.00    0.00    0.00     0.00 42949662.86   0.00   0.00 101.00
sda2       0.00    0.00  0.00   0.00    0.00    0.00     0.00 42949642.66   0.00   0.00 101.00
sdb        1.00   91.00  1.00  28.00   16.00  960.00    33.66     2838.40  10.34 348.28 101.00
sdb1       1.00   91.00  1.00  28.00   16.00  960.00    33.66     6908.70  10.34 348.28 101.00

The DB files and logs are on sdb1. Can someone point me in the direction of some documentation on how to interpret these numbers? Also, I've tried to figure out
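For what it's worth, one iostat -x line can be checked for saturation mechanically (a sketch; the thresholds are rules of thumb, not sysstat definitions, and the enormous sda avgqu-sz values above look like a counter-wrap artifact in that iostat version rather than real queue depths):

```python
# Flag a device as disk-bound from iostat -x fields.

def saturated(util_pct: float, await_ms: float,
              util_thresh: float = 90.0, await_thresh: float = 50.0) -> bool:
    """Sustained high %util together with high await suggests the
    device, not the CPU, is the bottleneck (thresholds are heuristic)."""
    return util_pct >= util_thresh and await_ms >= await_thresh

# sdb1 in the first sample above: await=296.55 ms, %util=100.00
print(saturated(100.00, 296.55))  # True
# sdb in the third sample: await=36.36 ms
print(saturated(100.00, 36.36))   # False
```

By this reading, sdb (the DB disk) is pegged during the write bursts, which is consistent with the query being I/O-bound on writes rather than reads.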
Re: [PERFORM] Clarification on some settings
(Sorry if this ends up being a duplicate post; I sent a reply yesterday, but it doesn't appear to have gone through... I think I typo'd the address but never got a bounce.)

Hi, Thanks for your initial help. I have some more questions below.

At 05:02 AM 5/12/2004, Shridhar Daithankar wrote:
> Doug Y wrote:
>> Hello, I've been having some performance issues with a DB I use. I'm trying to come up with some performance recommendations to send to the administrator. Ok, for what I'm uncertain of... shared_buffers: According to http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html it's more of a staging area and more isn't necessarily better; PostgreSQL relies on the OS to cache data for later use. But according to http://www.ca.postgresql.org/docs/momjian/hw_performance/node3.html it's where PostgreSQL caches previous data for queries, because the OS cache is slower, and it should be as big as possible without causing swap. Those seem to be conflicting statements. In our case, the administrator kept increasing this until performance seemed to increase, which means it's now 250000 (x 8k is 2G). Is this just a staging area for data waiting to move to the OS cache, or is this really the area where the data is cached?
>
> It is the area where postgresql works. It updates data in this area and pushes it to the OS cache for disk writes later. By experience, larger does not mean better for this parameter. For multi-gig RAM machines, the best (on average, for a wide variety of loads) value has been found to be around 10000-15000, maybe even lower. It is a well-known fact that raising this parameter unnecessarily decreases performance. You indicate that best performance occurred at 250000. This is very, very large compared to other people's experience.

Ok. I think I understand a bit better now.

>> effective_cache_size: Again, according to the Varlena guide this tells PostgreSQL how much system memory is available for it to do its work in. Until recently, this was set at the default value of 1000. It was just recently increased to 180000 (1.5G); according to http://www.varlena.com/varlena/GeneralBits/Tidbits/annotated_conf_e.html it should be about 25% of memory?
>
> No rule of thumb. It is the amount of memory the OS will dedicate to postgresql data buffers. Depending upon what else you run on the machine, it could be a straightforward or noodly value to calculate. For a 4GB machine, 1.5GB is quite good, but coupled with 2G of shared buffers it could push the machine into a swap storm. And swapping shared buffers is a big performance hit.

We don't seem to be swapping much:

# top
 2:21pm up 236 days, 19:12, 1 user, load average: 1.45, 1.09, 1.00
53 processes: 51 sleeping, 2 running, 0 zombie, 0 stopped
CPU0 states: 30.3% user, 9.1% system, 0.0% nice, 60.0% idle
CPU1 states: 32.0% user, 9.3% system, 0.0% nice, 58.1% idle
Mem:  3863468K av, 3845844K used,   17624K free, 2035472K shrd, 198340K buff
Swap: 1052248K av,    1092K used, 1051156K free             1465112K cached

Looks like at some point it did swap a little, but from running vmstat I can't seem to catch it actively swapping.

>> Finally sort_mem: was until recently left at the default of 1000. Is now 16000.
>
> Sort memory is per sort, not per query or per connection. So depending upon how many concurrent connections you entertain, it could take quite a chunk of RAM.

Right, I understand that. How does one calculate the size of a sort? Rows * width from an explain?

>> Increasing the effective cache and sort mem didn't seem to make much of a difference. I'm guessing the eff cache was probably raised a bit too much, and shared_buffers is way too high.
>
> I agree. For shared buffers start with 5000 and increase in batches of 1000. Or set it to a high value and check with ipcs for maximum shared memory usage. If shared memory usage peaks at 100MB, you don't need more than, say, 120MB of buffers.

My results from ipcs seem confusing... it says it's using the full 2G of shared cache:

# ipcs
------ Shared Memory Segments --------
key        shmid     owner     perms  bytes       nattch  status
0x0052e2c1 6389760   postgres  600    2088370176  4

------ Semaphore Arrays --------
key        semid      owner     perms  nsems  status
0x0052e2c1 424378368  postgres  600    17
0x0052e2c2 424411137  postgres  600    17
0x0052e2c3 424443906  postgres  600    17
0x0052e2c4 424476675  postgres  600    17
0x0052e2c5 424509444  postgres  600    17
0x0052e2c6 424542213  postgres  600    17
0x0052e2c7 424574982  postgres  600    17
0x0052e2c8 424607751  postgres  600    17
0x0052e2c9 424640520  postgres  600    17
0x0052e2ca 424673289  postgres  600    17
0x0052e2cb 424706058  postgres  600    17
0x0052e2cc 424738827  postgres  600    17
0x0052e2cd 424771596  postgres  600    17
0x0052e2ce 424804365  postgres  600    17
0x0052e2cf 424837134  postgres  600    17
0x0052e2d0 424869903  postgres  600    17
0x0052e2d1
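On the "rows * width" question above: that product gives a lower bound on the bytes a sort touches, which can be compared against sort_mem (a per-sort limit, expressed in kilobytes). A sketch; the plan numbers below are hypothetical, and real usage runs higher because of per-tuple overhead:

```python
# Lower-bound check of whether a sort fits in sort_mem (kB, per sort).

def sort_fits(rows: int, width_bytes: int, sort_mem_kb: int) -> bool:
    """rows * width underestimates real usage (per-tuple overhead is
    ignored), so a result near the limit will likely still spill."""
    return rows * width_bytes <= sort_mem_kb * 1024

# sort_mem = 16000 as in this thread; plan numbers are hypothetical:
print(sort_fits(rows=618, width_bytes=120, sort_mem_kb=16000))       # True
print(sort_fits(rows=2_000_000, width_bytes=64, sort_mem_kb=16000))  # False
```

The worst-case total is sort_mem multiplied by the number of sorts running at once, which is why a per-sort setting of 16000 kB can still add up across many concurrent connections.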
[PERFORM] Clarification on some settings
Hello, I've been having some performance issues with a DB I use. I'm trying to come up with some performance recommendations to send to the administrator.

Hardware:
CPU0: Pentium III (Coppermine) 1000MHz (256k cache)
CPU1: Pentium III (Coppermine) 1000MHz (256k cache)
Memory: 3863468 kB (4 GB)
OS: Red Hat Linux release 7.2 (Enigma)
Kernel: 2.4.9-31smp
I/O: I believe a 3-disk RAID 5.
/proc/sys/kernel/shmmax and /proc/sys/kernel/shmall are set to 2G.
Postgres version: 7.3.4

I know it's a bit dated, and upgrades are planned, but several months out. Load average seems to hover between 1.0 and 5.0-ish during peak hours. CPU seems to be the limiting factor, but I'm not positive (CPU utilization seems to be 40-50%). We have 2 of those set up as the back end to 3 web servers each... supposedly load-balanced, but one of the 2 DBs consistently has higher load. We have a home-grown replication system that keeps them in sync with each other, peer to peer (master/master).

The DB schema is, well, to put it nicely... not exactly normalized. No constraints to speak of except for the requisite not-nulls on the primary keys (many of which are compound). Keys are mostly varchar(256) fields.

Ok, for what I'm uncertain of... shared_buffers: According to http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html it's more of a staging area and more isn't necessarily better; PostgreSQL relies on the OS to cache data for later use. But according to http://www.ca.postgresql.org/docs/momjian/hw_performance/node3.html it's where PostgreSQL caches previous data for queries, because the OS cache is slower, and it should be as big as possible without causing swap. Those seem to be conflicting statements. In our case, the administrator kept increasing this until performance seemed to increase, which means it's now 250000 (x 8k is 2G). Is this just a staging area for data waiting to move to the OS cache, or is this really the area where the data is cached?
effective_cache_size: Again, according to the Varlena guide this tells PostgreSQL how much system memory is available for it to do its work in. Until recently, this was set at the default value of 1000. It was just recently increased to 180000 (1.5G); according to http://www.varlena.com/varlena/GeneralBits/Tidbits/annotated_conf_e.html it should be about 25% of memory?

Finally sort_mem: was until recently left at the default of 1000. Is now 16000. Increasing the effective cache and sort mem didn't seem to make much of a difference. I'm guessing the eff cache was probably raised a bit too much, and shared_buffers is way too high.

What can I do to help determine what the proper settings should be, and/or to look at other possible choke points? What should I look for in iostat, mpstat, or vmstat as red flags that we're cpu-, memory-, or i/o-bound?

DB maintenance-wise, I don't believe they were running vacuum full until I told them a few months ago that regular vacuum analyze alone wasn't reclaiming the space from dead tuples. Now a normal vacuum is run daily, vacuum full weekly (supposedly). How can I tell from the output of vacuum whether the vacuum fulls aren't being done, or aren't done often enough? Or what can I read from the system tables? Is there anywhere else I can look for possible clues? I have access to the DB super-user, but not system root. Thank you for your time. Please let me know any help or suggestions you may have. Unfortunately, upgrading postgres, OS, or kernel, or re-writing the schema, is most likely not an option.
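The memory arithmetic behind these settings, collected in one place (a sketch; the 15000-page figure is a commonly quoted 7.x-era rule of thumb, not a hard rule):

```python
# How much of the 4 GB machine these settings commit.
GB = 2**30
ram = 4 * GB
shared_buffers_bytes = 2 * GB          # "x 8k is 2G" above
effective_cache_bytes = int(1.5 * GB)  # the 1.5G effective_cache_size

print(round(shared_buffers_bytes / ram, 2))  # 0.5: half of RAM pinned
# versus a commonly quoted 7.x-era ceiling of ~15000 pages:
print(15000 * 8192 // 2**20)                 # 117 (MB)
```

Pinning half of RAM in shared_buffers while also telling the planner 1.5 GB of OS cache exists is exactly the double-counting the thread above warns can end in a swap storm.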