Re: [PERFORM] Fwd: FreeBSD, PostgreSQL, semwait and sbwait!
Josh Berkus wrote:
> Forget hyperthreading. Look at their postgresql.conf settings. 8MB shared mem, 16MB sort mem per connection for 512 connections, default effective_cache_size.

Umm... it's 64MB of shared_buffers, isn't it? However, I agree completely with the general thrust of the message, particularly about the 16MB of sort mem per connection - a very bad idea unless you are running a data warehouse box for only a few users (not 512 of them...)

regards

Mark

---(end of broadcast)---
TIP 8: explain analyze is your friend
Re: [PERFORM] Fwd: FreeBSD, PostgreSQL, semwait and sbwait!
Darcy Buskermolen wrote:
> -- Forwarded Message --
> Subject: FreeBSD, PostgreSQL, semwait and sbwait!
> Date: March 23, 2004 12:02 pm
> From: Jason Coene [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]
>
> Hello all,
>
> We're having a substantial problem with our FreeBSD 5.2 database server running PostgreSQL - it's getting a lot of traffic (figure about 3,000 queries per second), but queries are slow, and it's seemingly waiting on things other than CPU time.

Could this be a 5.2 performance issue? In spite of certain areas where the 5.x series is known to perform much better than 4.x (e.g. networking), this may not be manifested in practice for a complete application. (E.g. I am still running 4.9, as it vastly outperformed 5.1 for ~100 database sessions running queries - note that I have not tried out 5.2, so I am happy to be corrected on this.)

regards

Mark
Re: [PERFORM] Benchmarking postgres on Solaris/Linux
Josh Berkus wrote:
> Mark,
>
>> It might be worth considering Apple if you want a 64-bit chip that has a clock speed comparable to Intel's - the Xserv is similarly priced to the Sun V210 (both dual-CPU 1Us).
>
> Personally I'd stay *far* away from the Xservs until Apple learns to build some real server hardware. The current Xservs have internal parts more appropriate to a Dell desktop (Promise controller, low-speed commodity IDE drives) than to a server. If Apple has priced these 1U desktop machines similar to Sun, then I sense doom ahead for the Apple Server Division.

(thinks...) Point taken - the Xserv is pretty entry level... However, having recently benchmarked a 280R vs. a PIII Dell using a Promise IDE RAID controller - and finding the Dell comparable (with write cache *disabled*) - I suspect that the Xserv has a pretty good chance of outperforming a V210 (it would certainly be interesting to try out).

What I think has happened is that over the last few years the cheap / slow IDE stuff has gotten pretty fast - even when you make "write" mean write.

cheers

Mark
[PERFORM] slow vacuum performance
Hi,

I am running pg 7.4.1 on a linux box. I have a middle-sized DB with many updates, and afterwards I try to run vacuum full analyze. It takes about 2 h. If I dump and reload the DB, it takes 20 min. How can I improve the vacuum full analyze time?

My configuration:

shared_buffers = 15000       # min 16, at least max_connections*2, 8KB each
sort_mem = 1                 # min 64, size in KB
vacuum_mem = 32000           # min 1024, size in KB
effective_cache_size = 4     # typically 8KB each
#max_fsm_pages = 2           # min max_fsm_relations*16, 6 bytes each
#max_fsm_relations = 1000    # min 100, ~50 bytes each

regards,
ivan.
Re: [PERFORM] [ADMIN] Benchmarking postgres on Solaris/Linux
>> Now if these vendors could somehow eliminate downtime due to human error we'd be talking *serious* reliability.
>
> You mean making the OS smart enough to know when clearing the arp cache is a bonehead operation, or just making the hardware smart enough to realise that the keyswitch really shouldn't be turned while 40 people are logged in? (Either way, I agree this'd be an improvement. It'd sure make colocation a lot less painful.)

Well, I was joking really, but those are two very good examples! Yes, machines should require extra confirmation for operations like those. Hell, even a simple 'init 0' would be well served by a prompt that says "There are currently 400 network sockets open, 50 remote users logged in, and 25 disk IOs per second. What's more, there's nobody logged in at the console to boot me up again afterwards - are you _sure_ you want to shut the machine down?". It's also crazy that there's no prompt after an 'rm -rf' (we could have 'rm -rf --iacceptfullresponsibility' for an unprompted version). Stuff like that would have saved me from a few embarrassments in the past, for sure ;-)

It drives me absolutely nuts every time I see a $staggeringly_expensive clustered server whose sysadmins are scared to do a failover test in case something goes wrong! Or one which has worse uptime than my desktop PC because the cluster software is poorly set up or administered. Or which has both machines on the same circuit breaker. I could go on, but it's depressing me.

Favourite anecdote: a project manager friend of mine had a new 'lights out' datacenter to set up. The engineers, admins and operators swore blind that everything had been tested in every possible way, and that incredible uptime was guaranteed. "So if I just pull this disk out everything will keep working?" he asked, and then pulled the disk out without waiting for an answer...

Ever since he told me that story I've done exactly that with every piece of so-called 'redundant' hardware a vendor tries to flog me.
Ask them to set it up, then just do nasty things to it without asking for permission. Less than half the gear makes it through that filter, and you can almost tell from the look on the technical sales rep's face, as you reach for the drive/cable/card/whatever, whether it will or won't.

M
Re: [PERFORM] Help with query plan inconsistencies
I currently have it set up to vacuum/analyze every 2 hours. Moreover, QUERY PLAN #1 and #2 in my example were run immediately after a vacuum/analyze.

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Joseph Shraibman
Sent: Tuesday, March 23, 2004 2:17 PM
To: [EMAIL PROTECTED]
Subject: Re: [PERFORM] Help with query plan inconsistencies

I'm going to ask because someone else surely will: do you regularly vacuum/analyze the database?

Woody Woodring wrote:
> Hello,
>
> I am using postgres 7.4.2 as a backend for geocode data for a mapping application. My question is: why can't I get consistent use of my indexes during a query? I tend to get a lot of seq scan results.
>
> I use a standard query:
>
> SELECT lat, long, mac, status FROM (
>     SELECT text(mac) AS mac, lat, long,
>            CASE status WHEN 0 THEN 0 WHEN 1 THEN 1 ELSE -1 END AS status
>     FROM cable_billing LEFT OUTER JOIN davic USING (mac)
>     WHERE boxtype = 'd'
> ) AS foo
> WHERE (long >= X1) AND (long <= X2) AND (lat >= Y1) AND (lat <= Y2)
>
> where X1, X2, Y1, Y2 are the coordinates for the rectangle of the map viewing area.
>
> QUERY PLAN #1 and #2 are from when I get a view from 10 miles out; sometimes it uses the index (#1) and most of the time not (#2). I also run into plans that seq scan both sides of the join. QUERY PLAN #3 is from when I view from 5 miles out, and I have a much greater chance of getting index scans (about 90% of the time).
>
> I have listed information about the database below. Cable_billing: ~500,000 rows, updated once per day. Davic: ~500,000 rows, about 100 rows updated per minute.
>
> Any info or suggestions would be appreciated.
> Woody
>
> twc-ral-overview=# \d cable_billing;
>            Table "public.cable_billing"
>      Column      |          Type          | Modifiers
> -----------------+------------------------+-----------
>  cable_billingid | integer                | not null
>  mac             | macaddr                | not null
>  account         | integer                |
>  number          | character varying(10)  |
>  address         | character varying(200) |
>  region          | character varying(30)  |
>  division        | integer                |
>  franchise       | integer                |
>  node            | character varying(10)  |
>  lat             | numeric                |
>  long            | numeric                |
>  trunk           | character varying(5)   |
>  ps              | character varying(5)   |
>  fd              | character varying(5)   |
>  le              | character varying(5)   |
>  update          | integer                |
>  boxtype         | character(1)           |
> Indexes:
>     "cable_billing_pkey" primary key, btree (mac)
>     "cable_billing_account_index" btree (account)
>     "cable_billing_lat_long_idx" btree (lat, long)
>     "cable_billing_node_index" btree (node)
>     "cable_billing_region_index" btree (region)
>
> twc-ral-overview=# \d davic
>        Table "public.davic"
>  Column  |         Type          | Modifiers
> ---------+-----------------------+-----------
>  davicid | integer               | not null
>  mac     | macaddr               | not null
>  source  | character varying(20) |
>  status  | smallint              |
>  updtime | integer               |
>  type    | character varying(10) |
>  avail1  | integer               |
> Indexes:
>     "davic_pkey" primary key, btree (mac)
>
> twc-ral-overview=# vacuum analyze;
> VACUUM
> twc-ral-overview=# explain analyze SELECT lat, long, mac, status FROM (SELECT text(mac) AS mac, lat, long, CASE status WHEN 0 THEN 0 WHEN 1 THEN 1 ELSE -1 END AS status FROM cable_billing LEFT OUTER JOIN davic USING (mac) WHERE boxtype='d') AS foo WHERE (long >= -78.70723462816063) AND (long <= -78.53096764204116) AND (lat >= 35.5741118787) AND (lat <= 35.66366331376857);
>
> QUERY PLAN #1
> ----------------------------------------------------------------------
>  Nested Loop Left Join (cost=0.00..23433.18 rows=1871 width=34) (actual time=0.555..5095.434 rows=3224 loops=1)
>    -> Index Scan using cable_billing_lat_long_idx on cable_billing (cost=0.00..12145.85 rows=1871 width=32) (actual time=0.431..249.931 rows=3224 loops=1)
>          Index Cond: ((lat >= 35.5741118787) AND (lat <= 35.66366331376857) AND (long >= -78.70723462816063) AND (long <= -78.53096764204116))
>          Filter: (boxtype = 'd'::bpchar)
>    -> Index Scan using davic_pkey on davic (cost=0.00..6.01 rows=1 width=8) (actual time=1.476..1.480 rows=1 loops=3224)
>          Index Cond: ("outer".mac = davic.mac)
>  Total runtime: 5100.028 ms
> (7 rows)
>
> twc-ral-overview=# vacuum analyze;
> VACUUM
> twc-ral-overview=# explain analyze SELECT lat, long, mac, status FROM (SELECT text(mac) AS mac, lat, long, CASE status WHEN 0 THEN 0 WHEN 1 THEN 1 ELSE -1 END AS status FROM cable_billing LEFT OUTER JOIN davic USING (mac) WHERE boxtype='d') AS foo
Re: [PERFORM] slow vacuum performance
pginfo wrote:
> Hi,
>
> I am running pg 7.4.1 on a linux box. I have a middle-sized DB with many updates, and afterwards I try to run vacuum full analyze. It takes about 2 h. If I dump and reload the DB, it takes 20 min. How can I improve the vacuum full analyze time?

How often are you vacuuming? If you've gone a LONG time since the last vacuum, it can take quite a while, to the point where a dump/restore is faster.

A recent realization that I'd had some misconceptions about vacuuming led me to re-read section 8.2 of the admin guide (on vacuuming)... I highly suggest a review of those 3 pages of the admin manual, as they contain an excellent description of why databases need vacuuming, which one can use to determine how often vacuuming is necessary.

--
Bill Moran
Potential Technologies
http://www.potentialtech.com
Re: [PERFORM] slow vacuum performance
Hi Bill,

I am vacuuming every 24 h; I have a cron script for it. But if I make a massive update (for example, one affecting 1 M rows) and then start vacuum, it takes those 2 h. I will also note that this massive update runs in one transaction (I cannot update 100K rows and start vacuum after it).

regards,
ivan.

Bill Moran wrote:
> pginfo wrote:
>> Hi, I am running pg 7.4.1 on a linux box. I have a middle-sized DB with many updates, and afterwards I try to run vacuum full analyze. It takes about 2 h. If I dump and reload the DB, it takes 20 min. How can I improve the vacuum full analyze time?
>
> How often are you vacuuming? If you've gone a LONG time since the last vacuum, it can take quite a while, to the point where a dump/restore is faster. A recent realization that I'd had some misconceptions about vacuuming led me to re-read section 8.2 of the admin guide (on vacuuming)... I highly suggest a review of those 3 pages of the admin manual, as they contain an excellent description of why databases need vacuuming, which one can use to determine how often vacuuming is necessary.
>
> --
> Bill Moran
Re: [PERFORM] slow vacuum performance
On Wed, 24 Mar 2004, pginfo wrote:
> Hi, I am running pg 7.4.1 on a linux box. I have a middle-sized DB with many updates, and afterwards I try to run vacuum full analyze.

Is there a reason not to use just regular vacuum / analyze (i.e. NOT full)?

> It takes about 2 h.

Full vacuums, by their nature, tend to be a bit slow. It's better to let the database achieve a kind of steady state with regard to the number of dead tuples, and use regular vacuums to reclaim said space, rather than a full vacuum.

> How can I improve the vacuum full analyze time? My configuration:
> shared_buffers = 15000    # min 16, at least max_connections*2, 8KB each
> sort_mem = 1              # min 64, size in KB

You might want to look at dropping sort_mem. It would appear you've been going through the postgresql.conf file and bumping up numbers to see what works and what doesn't. While most of the settings aren't too dangerous to crank up a little high, sort_mem is quite dangerous to crank up high, should you have a lot of people connected who are all sorting. Note that sort_mem is a limit PER SORT, not per backend, or per database, or per user, or even per table, but per sort. If a query needs to run three or four sorts, it can use 3 or 4x sort_mem. If a hundred users do this at once, they can then use 300 or 400x sort_mem. You can see where I'm heading.

Note that for individual sorts in batch jobs, like import processes, you can bump up sort_mem with the SET command, so you don't have to have a large setting in postgresql.conf to use a lot of sort mem when you need to - you can just grab it during that one session.

> vacuum_mem = 32000        # min 1024, size in KB

If you've got lots of memory, crank up vacuum_mem to the 200 to 500 meg range and see what happens.

For a good tuning guide, go here: http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html
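[Editor's note: the per-sort arithmetic above is easy to underestimate, so here is a minimal back-of-the-envelope sketch in plain Python. The 16MB figure echoes the sort_mem value criticized earlier in this digest; the 4-sorts/100-users scenario is purely illustrative.]

```python
# Worst-case sort memory: sort_mem is allocated PER SORT, so the bound is
# sort_mem * sorts_per_query * concurrent_connections, not just sort_mem.
def worst_case_sort_mem_kb(sort_mem_kb, sorts_per_query, connections):
    return sort_mem_kb * sorts_per_query * connections

# 16 MB sort_mem, 4 sorts per query, 100 concurrent users:
kb = worst_case_sort_mem_kb(16 * 1024, 4, 100)
print(kb // 1024, "MB")  # 6400 MB - far beyond any 2004-era server's RAM
```

Which is exactly why a big sort_mem belongs in a `SET sort_mem = ...` for the one batch session that needs it, not in postgresql.conf.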
Re: [PERFORM] atrocious update performance
Greg Spiegelberg wrote:
> Will advise.
>
> After creating 100, 1K, 10K, 100K and 1M-row subsets of account.cust and the corresponding rows/tables with foreign key constraints referring to the table, I'm unable to reproduce the behavior at issue. explain analyze looks like the following, showing the query run with the join column indexed and not, respectively:
>
> # explain analyze update test.cust100 set prodid = tempprod.prodid, subprodid = tempprod.subprodid where origid = tempprod.debtid;
>
> -- with index
> QUERY PLAN
> ----------------------------------------------------------------------
>  Merge Join (cost=0.00..25.64 rows=500 width=220) (actual time=0.241..13.091 rows=100 loops=1)
>    Merge Cond: (("outer".origid)::text = ("inner".debtid)::text)
>    -> Index Scan using ix_origid_cust100 on cust100 (cost=0.00..11.50 rows=500 width=204) (actual time=0.125..6.465 rows=100 loops=1)
>    -> Index Scan using ix_debtid on tempprod (cost=0.00..66916.71 rows=4731410 width=26) (actual time=0.057..1.497 rows=101 loops=1)
>  Total runtime: 34.067 ms
> (5 rows)
>
> -- without index
> QUERY PLAN
> ----------------------------------------------------------------------
>  Merge Join (cost=7.32..16.71 rows=100 width=220) (actual time=4.415..10.918 rows=100 loops=1)
>    Merge Cond: (("outer".debtid)::text = "inner".?column22?)
>    -> Index Scan using ix_debtid on tempprod (cost=0.00..66916.71 rows=4731410 width=26) (actual time=0.051..1.291 rows=101 loops=1)
>    -> Sort (cost=7.32..7.57 rows=100 width=204) (actual time=4.311..4.450 rows=100 loops=1)
>          Sort Key: (cust100.origid)::text
>          -> Seq Scan on cust100 (cost=0.00..4.00 rows=100 width=204) (actual time=0.235..2.615 rows=100 loops=1)
>  Total runtime: 25.031 ms
> (7 rows)
>
> With the join column indexed, it takes roughly .32 ms/row on the first four tests (100..100K), and about .48 ms/row on 1M rows. Without the index, it runs 100 rows @ .25 ms/row, 1K @ .26, 10K @ .27, 100K @ .48, and 1M @ .5. In no case does the query plan reflect foreign key validation.
>
> Failing any other suggestions for diagnosis arriving soon, I'm going to nuke the PostgreSQL install, scour it from the machine and start from scratch.
> Failing that, I'm going to come in some weekend and re-do the machine. Problem is, when I recreate the indexes and add the constraints back on ORIG, I end up with the same long-running process. The original UPDATE runs for about 30 minutes on a table of 400,000 rows, with the WHERE matching about 70% of the rows. The above runs for about 2 minutes without adding the constraints or indexes; however, adding the constraints and re-creating the dropped indexes negates any gain.

Is this a frequently-run update? In my experience, with my seemingly mutant install, dropping indices and constraints to shave 14/15ths off the update time would be worth the effort. Just script dropping, updating and recreating into one large transaction. It's a symptom-level fix, but re-creating the fifteen indices on one of our 5M-row tables doesn't take 28 minutes, and your hardware looks to be rather less IO- and CPU-bound than ours.

I'd also second Tom's suggestion of moving to 7.4.

/rls
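[Editor's note: the trade-off being argued here is plain arithmetic - drop time plus bare update time plus index/constraint re-creation time, versus the in-place update. A sketch in plain Python, with hypothetical timings in the same ballpark as those quoted in the thread:]

```python
# Total wall time for the two strategies for a large UPDATE.
def total_minutes(update_min, drop_min=0.0, recreate_min=0.0):
    return drop_min + update_min + recreate_min

in_place = total_minutes(update_min=30)  # UPDATE with indexes/constraints kept
# hypothetical: 1 min to drop, 2 min bare UPDATE, 28 min to recreate indexes
rebuilt = total_minutes(update_min=2, drop_min=1, recreate_min=28)
# The drop/recreate path only wins when index re-creation is cheap -
# which is exactly the complaint above: on this box it isn't.
print(in_place, rebuilt)
```

On /rls's hardware, where recreating fifteen indices is fast, the same arithmetic comes out heavily in favour of the drop/update/recreate transaction.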
Re: [PERFORM] Help with query plan inconsistencies
On Tuesday 23 March 2004 18:49, Woody Woodring wrote:
> Hello, I am using postgres 7.4.2 as a backend for geocode data for a mapping application. My question is: why can't I get consistent use of my indexes during a query? I tend to get a lot of seq scan results.

I'm not sure it wants to be using the indexes all of the time.

> Nested Loop Left Join (cost=0.00..23433.18 rows=1871 width=34) (actual time=0.555..5095.434 rows=3224 loops=1)
> Total runtime: 5100.028 ms

> Nested Loop Left Join (cost=0.00..76468.90 rows=9223 width=34) (actual time=0.559..17387.427 rows=19997 loops=1)
> Total runtime: 17416.501 ms

> Nested Loop Left Join (cost=0.00..29160.02 rows=2327 width=34) (actual time=0.279..510.773 rows=5935 loops=1)
> Total runtime: 516.782 ms

#1 = 630 rows/sec (with index on cable_billing)
#2 = 1,148 rows/sec (without index)
#3 = 11,501 rows/sec (with index)

The third case is so much faster that I suspect the data wasn't cached at the beginning of this run. In any case, #2 is faster than #1 in rows/sec terms. If the planner is getting things wrong, you're not showing it here.

--
Richard Huxton
Archonet Ltd
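[Editor's note: Richard's rows/sec figures can be recomputed directly from the row counts and total runtimes in the EXPLAIN ANALYZE output; a quick sketch in plain Python - the small differences from the figures in the post are just rounding.]

```python
# Throughput from EXPLAIN ANALYZE output: actual rows / total runtime.
def rows_per_sec(rows, runtime_ms):
    return rows / (runtime_ms / 1000.0)

print(round(rows_per_sec(3224, 5100.028)))    # plan #1, index scan: ~632
print(round(rows_per_sec(19997, 17416.501)))  # plan #2, seq scan:   ~1148
print(round(rows_per_sec(5935, 516.782)))     # plan #3, index scan: ~11485
```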
Re: [PERFORM] [ADMIN] Benchmarking postgres on Solaris/Linux
"SS" == Stalin Subbiah writes:

SS> We are looking into a Sun V210 (2 x 1 GHz CPU, 2 gig RAM, Solaris 5.8 OS) vs. a Dell 1750 (2 x 2.4 GHz Xeon, 2 gig RAM, RH 3.0). The database will mostly be write intensive and disks will be on RAID 10. Wondering if 64-bit 1 GHz vs. 32-bit 2.4 GHz makes a big difference here.

Spend all your money speeding up your disk system. If you're mostly writing (like my main app) then that's your bottleneck. I use a Dell 2650 with external RAID 5 on 14 spindles. I didn't need that much disk space, but went for maxing out the number of spindles. RAID 5 was faster than RAID 10 or RAID 50 with this configuration for me.

--
Vivek Khera, Ph.D.              Khera Communications, Inc.
Internet: [EMAIL PROTECTED]     Rockville, MD  +1-301-869-4449 x806
AIM: vivekkhera Y!: vivek_khera http://www.khera.org/~vivek/
Re: [PERFORM] slow vacuum performance
Hi,

scott.marlowe wrote:
> On Wed, 24 Mar 2004, pginfo wrote:
>> Hi, I am running pg 7.4.1 on a linux box. I have a middle-sized DB with many updates, and afterwards I try to run vacuum full analyze.
>
> Is there a reason not to use just regular vacuum / analyze (i.e. NOT full)?

Yes: in the case where I make massive updates (only in my case, of course), for example 2 M rows, I do not expect to have 2 M new rows in the next 180 days. That is the reason for running vacuum full. My idea was to free the unneeded space and so have a faster system. It is possible that I am wrong.

> [snip - explanation of why regular vacuums and a steady state of dead tuples are preferable to vacuum full]
>
> You might want to look at dropping sort_mem. [snip - explanation that sort_mem is a per-sort limit, and that it can be raised per session with SET]

I know.
In my case we are using many IDs declared as varchar/name (I know it is a bad idea, but we are migrating this system from Oracle), and pg has very bad performance with varchar/name indexes. The only solution I found was to increase the sort mem. But I will try to decrease this one and see the result.

> vacuum_mem = 32000 # min 1024, size in KB
>
> If you've got lots of memory, crank up vacuum_mem to the 200 to 500 meg range and see what happens.

I will try it today. It is a good idea and I hope it will help.

> For a good tuning guide, go here: http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html

I know it. It is the best I have found, as is the site. Thanks for the help.

ivan.
Re: [PERFORM] slow vacuum performance
On Wed, 24 Mar 2004, pginfo wrote:
> scott.marlowe wrote:
>> Is there a reason not to use just regular vacuum / analyze (i.e. NOT full)?
>
> Yes: in the case where I make massive updates (only in my case, of course), for example 2 M rows, I do not expect to have 2 M new rows in the next 180 days. That is the reason for running vacuum full. My idea was to free the unneeded space and so have a faster system. It is possible that I am wrong.

It's all about percentages. If you've got an average of 5% dead tuples with regular vacuuming, then full vacuums won't gain you much, if anything. If you've got 20 dead tuples for each live one, then a full vacuum is pretty much a necessity. The generally accepted best performance comes with 5 to 50% or so dead tuples. Keep in mind, having a few dead tuples is actually a good thing, as your database won't grow then shrink the file all the time, but will keep a steady-state size.
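[Editor's note: Scott's percentages lend themselves to a simple rule of thumb. A hedged sketch in plain Python - the 50% cutoff is taken from his 5-50% steady-state range above, not from any official guidance.]

```python
# Decide between plain and full vacuum from the dead-tuple ratio.
# Up to ~5-50% dead tuples is a healthy steady state; far beyond
# that, VACUUM FULL starts to pay for itself.
def vacuum_strategy(live_tuples, dead_tuples):
    ratio = dead_tuples / max(live_tuples, 1)
    if ratio <= 0.5:
        return "plain vacuum"   # steady state: reclaim space in place
    return "vacuum full"        # e.g. 20 dead tuples per live one

print(vacuum_strategy(1_000_000, 50_000))   # 5% dead -> plain vacuum
print(vacuum_strategy(100_000, 2_000_000))  # 20:1 dead:live -> vacuum full
```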
Re: [PERFORM] slow vacuum performance
scott.marlowe wrote:
> On Wed, 24 Mar 2004, pginfo wrote:
> [snip]
>
> It's all about percentages. If you've got an average of 5% dead tuples with regular vacuuming, then full vacuums won't gain you much, if anything. If you've got 20 dead tuples for each live one, then a full vacuum is pretty much a necessity. The generally accepted best performance comes with 5 to 50% or so dead tuples. Keep in mind, having a few dead tuples is actually a good thing, as your database won't grow then shrink the file all the time, but will keep a steady-state size.

Thanks for the good analysis,
ivan.