[PERFORM] Optimisation of INTERSECT expressions
Dear PostgreSQL Experts,

I am trying to get to the bottom of some efficiency problems and hope that you can help. The difficulty seems to be with INTERSECT expressions. I have a query of the form

    select A from T where C1 intersect select A from T where C2;

It runs in about 100 ms. But it is equivalent to this query

    select A from T where C1 and C2;

which runs in less than 10 ms. Looking at the output of explain analyse on the first query, it seems that PostgreSQL always computes the two sub-expressions and then computes an explicit intersection on the results. I had hoped that it would notice that both subexpressions are scanning the same input table T and convert the expression to the second form. Is there a reason why it can't do this transformation?

(Of course, I could just re-write my code to use the second form, but my application is generating these bits of SQL programmatically, and it is not trivial, as in some cases the two tables are not the same, so an intersection really is needed; if PostgreSQL can do it for me, that would be much better. I don't want to write an SQL parser!)

While studying the same code I found another case where my INTERSECT expressions don't seem to be optimised as much as I'd like. In this case, one of the subexpressions being intersected is empty much of the time. But even when it is empty, PostgreSQL computes the other (expensive) subexpression and does an intersect. Could PostgreSQL do something like this:

- guess which subexpression is likely to produce fewest rows
- compute this subexpression
- if empty, return now with an empty result
- compute other subexpression
- compute intersection
- return intersection

Alternatively, it could be defined that the left subexpression is always computed first and the second not computed if it is empty, like the behaviour of logical AND and OR operators in C.

Thanks in advance for any suggestions.

--Phil.

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings
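The short-circuiting Phil proposes (evaluate the likely-smaller subexpression first and stop if it is empty) can at least be emulated on the client side while the planner doesn't do it. A minimal Python sketch, with an invented `run_query` callable standing in for a real database round-trip:

```python
def intersect_short_circuit(run_query, cheap_sql, expensive_sql):
    """Evaluate an INTERSECT client-side, skipping the expensive
    subexpression when the (guessed) cheaper one returns no rows.
    run_query is any callable taking SQL and returning a list of rows."""
    cheap = set(run_query(cheap_sql))
    if not cheap:
        # Empty intersection is already decided: never run the other query.
        return set()
    expensive = set(run_query(expensive_sql))
    return cheap & expensive

# Toy stand-in for a database: map SQL strings to canned results,
# and record which queries actually get executed.
fake_db = {
    "SELECT a FROM t WHERE c_rare": [],           # often empty in practice
    "SELECT a FROM t WHERE c_expensive": [(1,), (2,)],
}
calls = []

def run_query(sql):
    calls.append(sql)
    return fake_db[sql]

result = intersect_short_circuit(run_query,
                                 "SELECT a FROM t WHERE c_rare",
                                 "SELECT a FROM t WHERE c_expensive")
print(result)      # set() -- and the expensive query was never issued
print(len(calls))  # 1
```

Note the caveat raised later in this thread: a client-side set intersection like this silently assumes A is unique (or that duplicate handling doesn't matter), which is exactly the condition under which the INTERSECT and AND forms coincide.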
Re: [PERFORM] Optimisation of INTERSECT expressions
On Tue, 23 Mar 2004, Phil Endecott wrote:
> I had hoped that it would notice that both subexpressions are scanning
> the same input table T and convert the expression to the second form.
> Is there a reason why it can't do this transformation?

Probably because no one's bothered to try to prove under what conditions it's the same. For example, given a non-unique A, the two queries can give different answers (if, say, the same two A values match both C1 and C2 in different rows, how many output rows does each give? *). Also, given a non-stable A (for example random), the two queries are not necessarily equivalent.
Re: [PERFORM] Optimisation of INTERSECT expressions
On Tue, 23 Mar 2004, Stephan Szabo wrote:
> Probably because no one's bothered to try to prove under what conditions
> it's the same. For example, given a non-unique A, the two queries can
> give different answers (*).

Ugh, the example got trimmed out for the *. Given a non-unique A, C1 as B > 5, C2 as C > 5, and the data:

    A | B | C
    1 | 6 | 1
    1 | 1 | 6

The intersect gives 1 row, the and query gives 0 AFAICS.
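Stephan's counterexample is easy to check mechanically. A sketch using Python's built-in sqlite3 (any SQL engine with INTERSECT behaves the same way here), with C1 as B > 5 and C2 as C > 5:

```python
import sqlite3

# Table t(a, b, c) with a non-unique a, populated with Stephan's two rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER, b INTEGER, c INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", [(1, 6, 1), (1, 1, 6)])

# The INTERSECT form: each condition matches a *different* row,
# but both rows carry the same a, so the intersection is non-empty.
intersect_rows = conn.execute(
    "SELECT a FROM t WHERE b > 5 INTERSECT SELECT a FROM t WHERE c > 5"
).fetchall()

# The AND form: no single row satisfies both conditions.
and_rows = conn.execute("SELECT a FROM t WHERE b > 5 AND c > 5").fetchall()

print(intersect_rows)  # [(1,)] -- one row
print(and_rows)        # []     -- no rows: the two forms are not equivalent
```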
Re: [PERFORM] atrocious update performance
Greg Spiegelberg wrote:
> I've been following this thread closely as I have the same problem with
> an UPDATE. Everything is identical here right down to the strace output.
> Has anyone found a workaround or resolved the problem? If not, I have
> test systems here which I can use to help test and explore.

I'm still gathering data. The explain analyze I'd expected to finish Thursday afternoon hasn't yet. I'm going to kill it and try a few smaller runs, increasing in size, until the behavior manifests. Will advise.

/rls
Re: [PERFORM] Optimisation of INTERSECT expressions
I asked:
> select A from T where C1 intersect select A from T where C2;
> select A from T where C1 and C2;
> [why isn't the first optimised into the second?]

Stephan Szabo answered:
> Given a non-unique A, C1 as B > 5, C2 as C > 5, and the data:
>     A | B | C
>     1 | 6 | 1
>     1 | 1 | 6
> The intersect gives 1 row, the and query gives 0 AFAICS.

Tom Lane answered:
> Another way that the queries are not equivalent is that INTERSECT is
> defined to remove duplicate output rows (much like DISTINCT) whereas the
> AND form of course won't do that.

Thanks! In my case the attribute A is unique - it is the primary key - and I hadn't considered the more general case properly.

So I suppose I'll have to find a more sophisticated way to generate my queries. Imagine a user interface for a search facility with various buttons and text entry fields. At the moment, for each part of the search that the user has enabled I create a string of SQL. I then compose them into a single statement using INTERSECT. Each sub-query always returns the same attribute, but to make things complicated they may come from different tables. It now seems that I'll have to merge the queries more thoroughly. Does anyone have any suggestions about how to do this? I'd like a nice general technique that works for all possible subqueries, as my current composition with INTERSECT does.

Any thoughts on my other question about empty intersections?

Thanks again for the feedback.

--Phil.
Re: [PERFORM] Optimisation of INTERSECT expressions
On Tue, Mar 23, 2004 at 11:21:39 -0500, Phil Endecott <[EMAIL PROTECTED]> wrote:
> Does anyone have any suggestions about how to do this? I'd like a nice
> general technique that works for all possible subqueries, as my current
> composition with INTERSECT does.

One adjustment you might make is using INTERSECT ALL if you know there can't be duplicates. Then time won't be wasted trying to remove duplicates.
Re: [PERFORM] Optimisation of INTERSECT expressions
Bruno Wolff III <[EMAIL PROTECTED]> writes:
> One adjustment you might make is using INTERSECT ALL if you know there
> can't be duplicates. Then time won't be wasted trying to remove
> duplicates.

Actually, I don't think that will help. UNION ALL is much faster than UNION, because it doesn't have to match up duplicates, but INTERSECT and EXCEPT still have to match rows from the inputs in order to find out if they should emit a row at all. IIRC there will not be any noticeable speed difference with or without ALL.

AFAICS, what Phil will want to do is

    SELECT a FROM table1 WHERE cond11 AND cond12 AND ...
    INTERSECT
    SELECT a FROM table2 WHERE cond21 AND cond22 AND ...
    INTERSECT
    ...

which is more painful to assemble than his current approach, but it shouldn't be *that* bad --- you just need to tag each condition with the table it applies to, and bring together matching tags when you build the SQL string.

regards, tom lane
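Tom's tagging suggestion might look like this in the query-generating application. A sketch; the function and condition names are illustrative, not from the thread:

```python
from collections import defaultdict

def build_intersect(conditions, column="a"):
    """Assemble the form Tom suggests: tag each condition with the table
    it applies to, AND together the conditions that share a table, and
    INTERSECT across tables. `conditions` is a list of (table, sql_cond)
    pairs, in the order the user enabled them."""
    by_table = defaultdict(list)
    for table, cond in conditions:
        by_table[table].append(cond)
    # dicts preserve insertion order (Python 3.7+), so the subqueries
    # come out in the order the tables were first mentioned.
    selects = [
        "SELECT {} FROM {} WHERE {}".format(column, table, " AND ".join(conds))
        for table, conds in by_table.items()
    ]
    return "\nINTERSECT\n".join(selects)

sql = build_intersect([
    ("table1", "cond11"),
    ("table1", "cond12"),
    ("table2", "cond21"),
])
print(sql)
# SELECT a FROM table1 WHERE cond11 AND cond12
# INTERSECT
# SELECT a FROM table2 WHERE cond21
```

Conditions on the same table collapse into one AND'ed scan, and the INTERSECT only survives when genuinely different tables are involved, which is the case where it is actually needed.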
Re: [PERFORM] Optimisation of INTERSECT expressions
Phil,

> So I suppose I'll have to find a more sophisticated way to generate my
> queries. ... Does anyone have any suggestions about how to do this? I'd
> like a nice general technique that works for all possible subqueries, as
> my current composition with INTERSECT does.

I've done this, but it involves a choice between a lot of infrastructure for fully configurable queries, or limiting user choice. The former option requires that you construct reference tables holding what search fields are available, what kind of values they hold, and what operators to use while querying, as well as a table storing the joins used for the various tables that can be queried. Based on that, you can construct dynamically a query on any field or combo of fields listed in your reference tables.

If search options are more constrained, you can simply take the easier path of hard-coding the query building blocks into a set-returning function. I do this all the time for Web search interfaces, where the user only has about 9 things to search on.

--
Josh Berkus
Aglio Database Solutions
San Francisco
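The reference-table idea Josh describes can be sketched with an in-memory registry (a real deployment would keep this in actual database tables, plus a join table, as he says). All field names, tables, and operators below are invented for illustration:

```python
# Registry of searchable fields: ui_field -> (table, column, operator, cast).
# Adding a new search option means adding a row here, not writing new code.
FIELD_REGISTRY = {
    "name":   ("customers", "name",   "ILIKE", str),
    "region": ("accounts",  "region", "=",     str),
    "spend":  ("accounts",  "spend",  ">=",    float),
}

def build_search(filters):
    """filters: {ui_field: raw_value} for the fields the user enabled.
    Returns (where_clause, params) using %s placeholders in the style of
    parameterised Postgres drivers; join generation is omitted for brevity."""
    clauses, params = [], []
    for field, value in filters.items():
        table, column, op, cast = FIELD_REGISTRY[field]
        clauses.append("{}.{} {} %s".format(table, column, op))
        params.append(cast(value))  # normalise the raw UI value
    where = "WHERE " + " AND ".join(clauses) if clauses else ""
    return where, params

where, params = build_search({"region": "west", "spend": "100"})
print(where)   # WHERE accounts.region = %s AND accounts.spend >= %s
print(params)  # ['west', 100.0]
```

Values travel separately as parameters rather than being spliced into the SQL string, which sidesteps quoting and injection problems that plague string-composed search queries.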
Re: [PERFORM] [ADMIN] Benchmarking postgres on Solaris/Linux
On Mon, Mar 22, 2004 at 04:05:45PM -0800, Subbiah, Stalin wrote:
> being the key performance booster for postgres. what is the preferred OS
> for postgres deployment if given an option between linux and solaris. As

One thing this very much depends on is what you're trying to do. Suns have a reputation for greater reliability. While my own experience with Sun hardware has been rather shy of sterling, I _can_ say that it stands head and shoulders above a lot of the x86 gear you can get.

If you're planning to use Solaris on x86, don't bother. Solaris is a slow, bloated pig compared to Linux, at least when it comes to managing the largish number of processes that Postgres requires.

If pure speed is what you're after, I have found that 2-way, 32 bit Linux on P-IIIs compares very favourably to 4 way 64 bit Ultra SPARC IIs.

A

--
Andrew Sullivan | [EMAIL PROTECTED]
The fact that technology doesn't work is no bar to success in the marketplace. --Philip Greenspun
Re: [PERFORM] [ADMIN] Benchmarking postgres on Solaris/Linux
We are looking into Sun V210 (2 x 1 GHz cpu, 2 gig ram, 5.8Os) vs. Dell 1750 (2 x 2.4 GHz xeon, 2 gig ram, RH3.0). Database will mostly be write intensive and disks will be on raid 10. Wondering if 64bit 1 GHz to 32bit 2.4 GHz make a big difference here.

Thanks!

-----Original Message-----
From: Andrew Sullivan
[quoted text trimmed]
Re: [PERFORM] [ADMIN] Benchmarking postgres on Solaris/Linux
What bus speeds? 533MHz on the 32-bit Intel will give you about 4.2Gbps of IO throughput... I think the Sun will be 150MHz, 64bit is 2.4Gbps of IO. Correct me if I am wrong.

Thanks,
Anjan

-----Original Message-----
From: Subbiah, Stalin
Sent: Tue 3/23/2004 1:40 PM
[quoted text trimmed]
Re: [PERFORM] [ADMIN] Benchmarking postgres on Solaris/Linux
If it's going to be write intensive then the RAID controller will be the most important thing. A dual P3/500 with a write-back cache will smoke either of the boxes you mention using software RAID on write performance.

As for the compute intensive side (complex joins, sorts, etc.), the Dell will most likely beat the Sun by some distance, although what the Sun lacks in CPU power it may make up a bit in memory bandwidth/latency.

Matt

-----Original Message-----
From: Subbiah, Stalin
Sent: 23 March 2004 18:41
[quoted text trimmed]
[PERFORM] Help with query plan inconsistencies
Hello,

I am using postgres 7.4.2 as a backend for geocode data for a mapping application. My question is why can't I get a consistent use of my indexes during a query; I tend to get a lot of seq scan results.

I use a standard query:

    SELECT lat, long, mac, status FROM (
      SELECT text(mac) as mac, lat, long,
             CASE status WHEN 0 THEN 0 WHEN 1 THEN 1 ELSE -1 END as status
      FROM cable_billing LEFT OUTER JOIN davic USING(mac)
      WHERE boxtype='d'
    ) AS FOO
    WHERE (long >= X1) AND (long <= X2) AND (lat >= Y1) AND (lat <= Y2)

where X1, X2, Y1, Y2 are the coordinates for the rectangle of the map viewing area.

QUERY PLAN #1 & #2 are from when I get a view from 10 miles out; sometimes it uses the index (#1) and most of the time not (#2). I do run into plans that seq scan both sides of the join. QUERY PLAN #3 is when I view from 5 miles out, and I have a much greater chance of getting index scans (about 90% of the time).

I have listed information about the database below. Cable_billing: ~500,000 rows, updated once per day. Davic: ~500,000 rows, about 100 rows updated per minute.

Any info or suggestions would be appreciated.
Woody

twc-ral-overview=# \d cable_billing
          Table "public.cable_billing"
     Column      |          Type          | Modifiers
-----------------+------------------------+-----------
 cable_billingid | integer                | not null
 mac             | macaddr                | not null
 account         | integer                |
 number          | character varying(10)  |
 address         | character varying(200) |
 region          | character varying(30)  |
 division        | integer                |
 franchise       | integer                |
 node            | character varying(10)  |
 lat             | numeric                |
 long            | numeric                |
 trunk           | character varying(5)   |
 ps              | character varying(5)   |
 fd              | character varying(5)   |
 le              | character varying(5)   |
 update          | integer                |
 boxtype         | character(1)           |
Indexes: cable_billing_pkey primary key btree (mac),
         cable_billing_account_index btree (account),
         cable_billing_lat_long_idx btree (lat, long),
         cable_billing_node_index btree (node),
         cable_billing_region_index btree (region)

twc-ral-overview=# \d davic
          Table "public.davic"
 Column  |         Type          | Modifiers
---------+-----------------------+-----------
 davicid | integer               | not null
 mac     | macaddr               | not null
 source  | character varying(20) |
 status  | smallint              |
 updtime | integer               |
 type    | character varying(10) |
 avail1  | integer               |
Indexes: davic_pkey primary key btree (mac)

twc-ral-overview=# vacuum analyze;
VACUUM
twc-ral-overview=# explain analyze SELECT lat, long, mac, status FROM (SELECT text(mac) as mac, lat, long, CASE status WHEN 0 THEN 0 WHEN 1 THEN 1 ELSE -1 END as status FROM cable_billing LEFT OUTER JOIN davic USING(mac) WHERE boxtype='d') AS foo WHERE (long >= -78.70723462816063) AND (long <= -78.53096764204116) AND (lat >= 35.5741118787) AND (lat <= 35.66366331376857);

QUERY PLAN #1
 Nested Loop Left Join  (cost=0.00..23433.18 rows=1871 width=34) (actual time=0.555..5095.434 rows=3224 loops=1)
   ->  Index Scan using cable_billing_lat_long_idx on cable_billing  (cost=0.00..12145.85 rows=1871 width=32) (actual time=0.431..249.931 rows=3224 loops=1)
         Index Cond: ((lat >= 35.5741118787) AND (lat <= 35.66366331376857) AND (long >= -78.70723462816063) AND (long <= -78.53096764204116))
         Filter: (boxtype = 'd'::bpchar)
   ->  Index Scan using davic_pkey on davic  (cost=0.00..6.01 rows=1 width=8) (actual time=1.476..1.480 rows=1 loops=3224)
         Index Cond: ("outer".mac = davic.mac)
 Total runtime: 5100.028 ms
(7 rows)

twc-ral-overview=# vacuum analyze;
VACUUM
twc-ral-overview=# explain analyze SELECT lat, long, mac, status FROM (SELECT text(mac) as mac, lat, long, CASE status WHEN 0 THEN 0 WHEN 1 THEN 1 ELSE -1 END as status FROM cable_billing LEFT OUTER JOIN davic USING(mac) WHERE boxtype='d') AS foo WHERE (long >= -78.87878592206046) AND (long <= -78.70220280717479) AND (lat >= 35.71703190638861) AND (lat <= 35.80658335998006);

QUERY PLAN #2
 Nested Loop Left Join  (cost=0.00..76468.90 rows=9223 width=34) (actual time=0.559..17387.427 rows=19997 loops=1)
   ->  Seq Scan on cable_billing  (cost=0.00..20837.76 rows=9223 width=32) (actual time=0.290..7117.799 rows=19997 loops=1)
         Filter: ((boxtype = 'd'::bpchar) AND (long >= -78.87878592206046) AND (long <=
Re: [PERFORM] Help with query plan inconsistencies
I'm going to ask because someone else surely will: Do you regularly vacuum/analyze the database?

Woody Woodring wrote:
> I am using postgres 7.4.2 as a backend for geocode data for a mapping
> application. My question is why can't I get a consistent use of my
> indexes during a query, I tend to get a lot of seq scan results.
[remainder of quoted message trimmed]
Re: [PERFORM] [ADMIN] Benchmarking postgres on Solaris/Linux
Matt, Stalin,

> As for the compute intensive side (complex joins, sorts, etc.), the Dell
> will most likely beat the Sun by some distance, although what the Sun
> lacks in CPU power it may make up a bit in memory bandwidth/latency.

Personally, I've been unimpressed by Dell/Xeon; I think the Sun might do better than you think, comparatively. On all the Dell servers I've used so far, I've not seen performance that comes even close to the hardware specs.

--
Josh Berkus
Aglio Database Solutions
San Francisco
Re: [PERFORM] SLOW query with aggregates
A Palmblad <[EMAIL PROTECTED]> writes:
> GroupAggregate  (cost=0.00..338300.34 rows=884 width=345) (actual time=86943.272..382718.104 rows=3117 loops=1)
>   ->  Merge Join  (cost=0.00..93642.52 rows=1135610 width=345) (actual time=0.148..24006.748 rows=1120974 loops=1)

You do not have a planning problem here, and trying to change the plan is a waste of time. The slowness is in the actual computation of the aggregate functions; ergo, the only way to speed it up is to change what you're computing. What aggregates are you computing exactly, and over what datatypes?

regards, tom lane
Re: [PERFORM] [ADMIN] Benchmarking postgres on Solaris/Linux
On Tue, 23 Mar 2004, Josh Berkus wrote:
> Personally, I've been unimpressed by Dell/Xeon; I think the Sun might do
> better than you think, comparatively. On all the Dell servers I've used
> so far, I've not seen performance that comes even close to the hardware
> specs.

We use a 2600 at work (dual 2.8GHz) with the LSI/Megaraid based battery backed caching controller, and it flies. Truly flies. It's not Dell that's so slow, it's the default Adaptec RAID controller or IDE drives that are slow. Ours has a 533 MHz memory bus, by the way.
[PERFORM] Fwd: FreeBSD, PostgreSQL, semwait and sbwait!
-- Forwarded Message --
Subject: FreeBSD, PostgreSQL, semwait and sbwait!
Date: March 23, 2004 12:02 pm
From: Jason Coene <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]

Hello all,

We're having a substantial problem with our FreeBSD 5.2 database server running PostgreSQL - it's getting a lot of traffic (figure about 3,000 queries per second), but queries are slow, and it's seemingly waiting on other things than CPU time. The database server is a dual P4-2.8 w/ HT enabled (kernel finds 4 processors), 2GB RAM, 4 disk Serial ATA on 3ware RAID, gigabit Ethernet connection to web servers. It's running FreeBSD 5.2 and PostgreSQL 7.4.1.

The server is taking a while to respond to both connections and then queries (between 1-3 seconds, on a query that should execute in 0.05 or less). The CPU usage for the server never goes above 30% (70% idle), and the CPU time that's in use is nearly always split equally between user and system.

Doing a top, this is what we see:

    last pid: 51833;  load averages: 13.72, 11.74, 10.01   up 0+01:55:45  15:00:03
    116 processes: 1 running, 99 sleeping, 16 lock
    CPU states: 14.6% user, 0.0% nice, 23.7% system, 0.2% interrupt, 61.5% idle
    Mem: 91M Active, 1043M Inact, 160M Wired, 52K Cache, 112M Buf, 644M Free
    Swap: 4096M Total, 4096M Free

    20354 pgsql 131 0  80728K  5352K select 0  0:24  1.71%  1.71% postgres
    36415 pgsql   4 0  81656K 67468K sbwait 2  0:00  3.23%  0.59% postgres
    36442 pgsql 128 0  82360K 15868K select 2  0:00  1.75%  0.24% postgres
    36447 pgsql  -4 0  82544K 10616K semwai 0  0:00  2.05%  0.20% postgres
    36461 pgsql  -4 0  81612K  6844K semwai 2  0:00  2.05%  0.20% postgres
    36368 pgsql   4 0  82416K 20780K sbwait 3  0:00  0.50%  0.15% postgres
    36459 pgsql  -4 0  81840K  7816K semwai 0  0:00  1.54%  0.15% postgres
    36469 pgsql  -4 0  81840K  7964K semwai 2  0:00  1.54%  0.15% postgres
    36466 pgsql 129 0  81840K  7976K *Giant 2  0:00  1.54%  0.15% postgres
    36479 pgsql  -4 0  81528K  6648K semwai 0  0:00  3.00%  0.15% postgres
    36457 pgsql  -4 0  81840K  8040K semwai 1  0:00  1.03%  0.10% postgres
    36450 pgsql 129 0  82352K  8188K *Giant 2  0:00  1.03%  0.10% postgres
    36472 pgsql  -4 0  81824K  7416K semwai 2  0:00  1.03%  0.10% postgres
    36478 pgsql 131 0  81840K  7936K select 0  0:00  2.00%  0.10% postgres
    36454 pgsql   4 0  82416K 16300K sbwait 3  0:00  0.51%  0.05% postgres
    36414 pgsql   4 0  82416K 15872K sbwait 2  0:00  0.27%  0.05% postgres

Our kernel is GENERIC plus:

    maxusers 512
    options SYSVSHM
    options SHMMAXPGS=262144
    options SHMSEG=512
    options SHMMNI=512
    options SYSVSEM
    options SEMMNI=512
    options SEMMNS=1024
    options SEMMNU=512
    options SEMMAP=512
    options NMBCLUSTERS=32768

Interesting bits from postgresql.conf:

    max_connections = 512
    shared_buffers = 8192
    sort_mem = 16384
    vacuum_mem = 8192
    fsync = false

It seems that queries are executing fine once they start, but it's taking a while for them to get going, while the postgres process sits in semwait, sbwait or select. This problem doesn't happen when there's little load on the server; it's only when we open it for public consumption that it exhibits these problems.

Anyone have this type of problem before? Am I missing something?

Thanks,
Jason
---------------------------

--
Darcy Buskermolen
Wavefire Technologies Corp.
ph: 250.717.0200
fx: 250.763.1759
http://www.wavefire.com
Re: [PERFORM] [ADMIN] Benchmarking postgres on Solaris/Linux
Has anyone done performance benchmark testing with Solaris sparc / Intel Linux? I once read a post here which had benchmarking test results for different filesystems like xfs, ext3, ext2, ufs, etc. I couldn't find that link anymore and Google is failing me, so does anyone have the link handy? Thanks!

-Original Message-
From: Josh Berkus [mailto:[EMAIL PROTECTED]
Sent: Tuesday, March 23, 2004 12:13 PM
To: Matt Clark; Subbiah, Stalin; 'Andrew Sullivan'; [EMAIL PROTECTED]
Subject: Re: [PERFORM] [ADMIN] Benchmarking postgres on Solaris/Linux

Matt, Stalin, As for the compute intensive side (complex joins sorts etc), the Dell will most likely beat the Sun by some distance, although what the Sun lacks in CPU power it may make up a bit in memory bandwidth/latency. Personally, I've been unimpressed by Dell/Xeon; I think the Sun might do better than you think, comparatively. On all the Dell servers I've used so far, I've not seen performance that comes even close to the hardware specs.

--
-Josh Berkus
Aglio Database Solutions
San Francisco
Re: [PERFORM] [ADMIN] Benchmarking postgres on Solaris/Linux
On Tue, Mar 23, 2004 at 08:53:42PM -, [EMAIL PROTECTED] wrote: is way down the priority list compared with IO throughput, stability, manageability, support, etc etc.

Indeed, if our Suns actually disabled the broken hardware when they died, fell over, and rebooted themselves, I'd certainly praise them to heaven. But I have to say that the really very good reporting of failing memory has saved me some headaches.

environment, I'd take the Sun every day of the week, assuming that those compile option changes have sorted out the oddly slow PG performance at last.

I seem to have hit a bad batch of Dell hardware recently, which makes me second this opinion. I should say, also, that my initial experience of AIX has been extremely good. I can't comment on the fun it might involve in the long haul, of course.

A
--
Andrew Sullivan | [EMAIL PROTECTED]
This work was visionary and imaginative, and goes to show that visionary and imaginative work need not end up well. --Dennis Ritchie
Re: [PERFORM] [ADMIN] Benchmarking postgres on Solaris/Linux
Subbiah, Stalin wrote: Has anyone done performance benchmark testing with Solaris sparc / Intel Linux? I once read a post here which had benchmarking test results for different filesystems like xfs, ext3, ext2, ufs, etc. I couldn't find that link anymore and Google is failing me, so does anyone have the link handy?

If you're talking about the work I did, it's here: http://www.potentialtech.com/wmoran/ (then follow the link) Anyway, that should be easily portable to any platform that will run Postgres, but I don't know how useful it is in comparing two different platforms. See the information in the document. It was intended only to test disk access speed, and attempts to flood the HDD system with database work to do.

--
Bill Moran
Potential Technologies
http://www.potentialtech.com
Re: [PERFORM] Fwd: FreeBSD, PostgreSQL, semwait and sbwait!
Darcy Buskermolen [EMAIL PROTECTED] writes: The database server is a dual P4-2.8 w/ HT enabled (kernel finds 4 processors), 2GB RAM, 4 disk Serial ATA on 3ware RAID, gigabit Ethernet connection to web servers. It's running FreeBSD 5.2 and PostgreSQL 7.4.1.

Hm. What happens if you turn off the hyperthreading? We have seen a number of reports recently that suggest that our spinlocking code behaves inefficiently on hyperthreaded machines. This hasn't got to the point where we have any substantiated evidence, mind you, but maybe you can help provide some. Also it might be interesting to put one of these into the outer loop in s_lock():

__asm__ __volatile__(" rep; nop \n" : : : "memory");

(This suggestion is a quick-and-dirty backport of a change that's already in CVS tip.)

regards, tom lane
Re: [PERFORM] Fwd: FreeBSD, PostgreSQL, semwait and sbwait!
Darcy, I suggest getting this person over here instead. They have a *lot* to learn about tuning PostgreSQL.

--
-Josh Berkus
Aglio Database Solutions
San Francisco
Re: [PERFORM] atrocious update performance
Rosser Schwarz wrote: Greg Spiegelberg wrote: I've been following this thread closely as I have the same problem with an UPDATE. Everything is identical here right down to the strace output. Has anyone found a workaround or resolved the problem? If not, I have test systems here which I can use to help test and explore.

I'm still gathering data. The explain analyze I'd expected to finish Thursday afternoon hasn't yet. I'm going to kill it and try a few smaller runs, increasing in size, until the behavior manifests. Will advise.

I've replaced my atrocious UPDATE with the following:

begin;
-- Drop all constraints
alter table ORIG drop constraint ...;
-- Drop all indexes
drop index ...;
-- Update
update ORIG set column=... where ...;
commit;

Problem is when I recreate the indexes and add the constraints back on ORIG I end up with the same long-running process. The original UPDATE runs for about 30 minutes on a table of 400,000 with the WHERE matching about 70% of the rows. The above runs for about 2 minutes without adding the constraints or indexes, however adding the constraints and creating the dropped indexes negates any gain.

RedHat 7.3 + Kernel 2.4.24 + ext3 + PostgreSQL 7.3.5
Dual PIII 1.3'ish GHz, 2GB Memory
U160 OS drives and a 1Gbps test SAN on a Hitachi 9910

Greg
--
Greg Spiegelberg
Sr. Product Development Engineer
Cranel, Incorporated.
Phone: 614.318.4314
Fax: 614.431.8388
Email: [EMAIL PROTECTED]
Cranel. Technology. Integrity. Focus.
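[For context, Greg's workaround pattern written out with hypothetical table, constraint, and index names — the real names would come from \d on the table being updated. The rebuild statements after the commit are where the cost comes back:]

```sql
BEGIN;
-- drop whatever slows the bulk update down
ALTER TABLE accounts DROP CONSTRAINT accounts_owner_fkey;
DROP INDEX accounts_owner_idx;
-- the bulk update itself (illustrative predicate)
UPDATE accounts SET balance = balance * 1.01 WHERE active;
COMMIT;

-- rebuilding is where the time goes again
CREATE INDEX accounts_owner_idx ON accounts (owner);
ALTER TABLE accounts
  ADD CONSTRAINT accounts_owner_fkey FOREIGN KEY (owner) REFERENCES owners (id);
```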
Re: [PERFORM] [ADMIN] Benchmarking postgres on Solaris/Linux
Are you talking about http://www.potentialtech.com/wmoran/postgresql.php#conclusion

- Original Message -
From: Subbiah, Stalin [EMAIL PROTECTED]
To: [EMAIL PROTECTED]; Matt Clark [EMAIL PROTECTED]; Subbiah, Stalin [EMAIL PROTECTED]; 'Andrew Sullivan' [EMAIL PROTECTED]; [EMAIL PROTECTED]
Sent: Tuesday, March 23, 2004 3:42 PM
Subject: Re: [PERFORM] [ADMIN] Benchmarking postgres on Solaris/Linux
Re: [PERFORM] atrocious update performance
Greg Spiegelberg [EMAIL PROTECTED] writes: RedHat 7.3 + Kernel 2.4.24 + ext3 + PostgreSQL 7.3.5

Please try 7.4.

regards, tom lane
Re: [PERFORM] Fwd: FreeBSD, PostgreSQL, semwait and sbwait!
Tom,

Hm. What happens if you turn off the hyperthreading?

Forget hyperthreading. Look at their postgresql.conf settings. 8mb shared mem, 16mb sort mem per connection for 512 connections, default effective_cache_size.

--
-Josh Berkus
Aglio Database Solutions
San Francisco
Re: [PERFORM] Fwd: FreeBSD, PostgreSQL, semwait and sbwait!
Josh Berkus [EMAIL PROTECTED] writes: Forget hyperthreading. Look at their postgresql.conf settings. 8mb shared mem, 16mb sort mem per connection for 512 connections, default effective_cache_size.

They could well be going into swap hell due to the oversized sort_mem, but that didn't quite seem to explain the reported behavior. I'd want to see vmstat or similar output to confirm whether the disks are busy, though. Amazing how many people forget that a database is normally I/O-bound rather than CPU-bound.

regards, tom lane
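[For context, the direction Josh and Tom are pointing at would look something like this in postgresql.conf for a 2GB machine on 7.4 — every figure here is an illustrative guess to be validated against vmstat and the actual workload, not a prescription:]

```
max_connections = 128            # or keep 512 and pool connections in front
shared_buffers = 8192            # 64MB at 8KB pages; raise cautiously
sort_mem = 4096                  # 4MB per sort; 16MB x 512 backends can exhaust RAM
effective_cache_size = 131072    # ~1GB at 8KB pages; tells the planner about the OS cache
```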
Re: [PERFORM] [ADMIN] Benchmarking postgres on Solaris/Linux
Yep. Thanks Bill.

-Original Message-
From: Bill Moran [mailto:[EMAIL PROTECTED]
Sent: Tuesday, March 23, 2004 2:10 PM
To: Subbiah, Stalin
Cc: [EMAIL PROTECTED]
Subject: Re: [PERFORM] [ADMIN] Benchmarking postgres on Solaris/Linux
Re: [PERFORM] SLOW query with aggregates
A Palmblad [EMAIL PROTECTED] writes:

GroupAggregate (cost=0.00..338300.34 rows=884 width=345) (actual time=86943.272..382718.104 rows=3117 loops=1)
  - Merge Join (cost=0.00..93642.52 rows=1135610 width=345) (actual time=0.148..24006.748 rows=1120974 loops=1)

I think the reason you're getting a GroupAggregate here instead of a HashAggregate is that the Merge Join is already producing the records in the desired order, so the GroupAggregate doesn't require an extra sort, ie, it's effectively free. You might be able to verify this by running the query with enable_indexscan = off and/or enable_mergejoin = off, some combination of which might get the planner to do a seqscan of the large table with a hash join to the small table and then a HashAggregate. If you're reading a lot of the large table the seqscan could be a little faster, not much though. And given the accurate statistics guesses here the planner may well have gotten this one right and the seqscan is slower. Can't hurt to verify it though.

-- greg
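[For context, that experiment can be run inside a transaction so the planner settings revert automatically; the query below is only a placeholder standing in for the real aggregate, which wasn't included with the quoted plan:]

```sql
BEGIN;
SET LOCAL enable_mergejoin = off;
SET LOCAL enable_indexscan = off;
-- placeholder query: substitute the real aggregate over the joined tables
EXPLAIN ANALYZE
SELECT s.id, sum(l.amount)
FROM   large_table l
JOIN   small_table s ON s.id = l.small_id
GROUP  BY s.id;
ROLLBACK;  -- SET LOCAL changes disappear with the transaction
```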
Re: [PERFORM] Benchmarking postgres on Solaris/Linux
Mark,

It might be worth considering Apple if you want a 64-bit chip that has a clock speed comparable to Intel's - the Xserv is similarly priced to Sun V210 (both dual cpu 1U's).

Personally I'd stay *far* away from the XServs until Apple learns to build some real server hardware. The current XServs have internal parts more appropriate to a Dell desktop (promise controller, low-speed commodity IDE drives) than a server. If Apple has priced these 1U desktop machines similar to Sun, then I sense doom ahead for the Apple Server Division.

--
Josh Berkus
Aglio Database Solutions
San Francisco
Re: [PERFORM] [ADMIN] Databases Vs. Schemas
And we also created rules to allow update, delete, and insert on those views so that they looked like tables. The reason we did this is because we ran into issues with too many open files during pg_dump when we had thousands of tables instead of about 1 hundred tables and thousands of views.

Is it because you had a smaller value set for the maximum allowable number of open file descriptors? What was ulimit -a set to?

We, however, did have a need to periodically select data from 2 schemas at a time, and it was simpler logic than if we needed 2 database connections.

Adam Ruth

On Mar 22, 2004, at 2:30 PM, Subbiah, Stalin wrote:

--sorry to repost, just subscribed to the list. hopefully it gets to the list this time --

Hi All, We are evaluating the options for having multiple databases vs. schemas on a single database cluster for a custom grown app that we developed. Each app installs the same set of tables for each service, and the services could easily number in the thousands. So is it better to have 1000 databases or 1000 schemas in a database cluster? What is the performance overhead of having multiple databases vs. schemas (if any)? I'm leaning towards having schemas rather than databases, but I would like to get others' opinions on this. Appreciate your reply. Thanks, Stalin
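[For context, the schema-per-service layout under discussion can be sketched like this — service and table names are hypothetical. The practical win over one-database-per-service is that a single connection can reach any pair of schemas:]

```sql
CREATE SCHEMA svc_0001;
CREATE SCHEMA svc_0002;

-- each service gets the same set of tables, one copy per schema
CREATE TABLE svc_0001.accounts (id integer PRIMARY KEY, name text);
CREATE TABLE svc_0002.accounts (id integer PRIMARY KEY, name text);

-- unqualified names resolve through search_path...
SET search_path TO svc_0001, public;
SELECT count(*) FROM accounts;   -- reads svc_0001.accounts

-- ...and cross-service queries need no second connection
SELECT a.name, b.name
FROM   svc_0001.accounts a
JOIN   svc_0002.accounts b USING (id);
```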