[PERFORM] Why the difference in plans ??
Hi,

I have two tables: customers and salesorders. salesorders has a foreign key to customers. If I run this query:

SELECT salesorders.objectid, salesorders.ordernumber, salesorders.orderdate,
       customers.objectid, customers.customernumber, customers.lastname
FROM prototype.salesorders
INNER JOIN prototype.customers ON ( customers.objectid = salesorders.customer )
WHERE lastname ILIKE 'Boonk'
ORDER BY ordernumber ASC
LIMIT 1

Without the LIMIT 1, this query plan is executed (EXPLAIN ANALYZE):

Sort  (cost=41811.90..41812.78 rows=353 width=103) (actual time=623.855..623.867 rows=7 loops=1)
  Sort Key: salesorders.ordernumber
  ->  Nested Loop  (cost=2.15..41796.96 rows=353 width=103) (actual time=0.166..623.793 rows=7 loops=1)
        ->  Seq Scan on customers  (cost=0.00..21429.44 rows=118 width=55) (actual time=0.037..623.325 rows=5 loops=1)
              Filter: (lastname ~~* 'Boonk'::text)
        ->  Bitmap Heap Scan on salesorders  (cost=2.15..172.06 rows=44 width=88) (actual time=0.075..0.079 rows=1 loops=5)
              Recheck Cond: ("outer".objectid = salesorders.customer)
              ->  Bitmap Index Scan on orders_customer  (cost=0.00..2.15 rows=44 width=0) (actual time=0.066..0.066 rows=1 loops=5)
                    Index Cond: ("outer".objectid = salesorders.customer)
Total runtime: 624.051 ms

With the LIMIT, this query plan is used (EXPLAIN ANALYZE):

Limit  (cost=0.00..18963.24 rows=1 width=103) (actual time=18404.730..18404.732 rows=1 loops=1)
  ->  Nested Loop  (cost=0.00..6694025.41 rows=353 width=103) (actual time=18404.723..18404.723 rows=1 loops=1)
        ->  Index Scan using prototype_orders_ordernumber on salesorders  (cost=0.00..37263.14 rows=1104381 width=88) (actual time=26.715..1862.408 rows=607645 loops=1)
        ->  Index Scan using pk_prototype_customers on customers  (cost=0.00..6.02 rows=1 width=55) (actual time=0.023..0.023 rows=0 loops=607645)
              Index Cond: (customers.objectid = "outer".customer)
              Filter: (lastname ~~* 'Boonk'::text)
Total runtime: 18404.883 ms

Both tables are freshly fully vacuumed and analyzed.
Why the difference, and can I influence the result so that the first query plan (which is the fastest) is actually used in both cases? (I would expect that the LIMIT would be done after the sort.)

TIA

-- 
Groeten,

Joost Kraaijeveld
Askesis B.V.
Molukkenstraat 14
6524NB Nijmegen
tel: 024-3888063 / 06-51855277
fax: 024-3608416
web: www.askesis.nl

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
       subscribe-nomail command to [EMAIL PROTECTED] so that your
       message can get through to the mailing list cleanly
Re: [PERFORM] High CPU Load
Hi, Jérôme,

Jérôme BENOIS wrote:
> max_connections = 512

Do you really have that many concurrent connections? Then you should think about getting a larger machine, probably. You will definitely want to play with the commit_delay and commit_siblings settings in that case, especially if you have write access.

> work_mem = 65536
> effective_cache_size = 131072

Hmm, 131072*8*1024 + 512*65536*1024 = 35433480192 - that's 33 gigs of memory you assume here, not counting OS usage, and the fact that certain queries can use up a multiple of work_mem. Even on a machine that big, I'd be inclined to dedicate more memory to caching, and less to the backends, unless specific needs dictate it.

You could try to use sqlrelay or pgpool to cut down the number of backends you need.

> My Server is Dual Xeon 3.06GHz

For Xeons, there were rumours about context switch storms which kill performance.

> with 2 Go RAM and good SCSI disks.

For 2 gigs of RAM, you should cut down the number of concurrent backends. Does your machine go into swap?

Markus
-- 
Markus Schaber | Logical Tracking & Tracing International AG
Dipl. Inf.     | Software Development GIS

Fight against software patents in Europe! www.ffii.org www.nosoftwarepatents.org
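To make the advice above concrete, a postgresql.conf sketch for a 2 GB dual-Xeon box might look like the following. All values here are illustrative assumptions, not tested recommendations for this particular workload:

```
max_connections = 100          # far fewer backends; pool with pgpool or sqlrelay instead
work_mem = 8192                # 8 MB per sort/hash step; worst case ~800 MB across 100 backends
effective_cache_size = 131072  # ~1 GB: roughly what the OS cache can realistically hold here
commit_delay = 10              # microseconds to wait so concurrent commits can share an fsync
commit_siblings = 5            # only delay when at least 5 other transactions are active
```

The point of the arithmetic in the reply is that work_mem is allocated per sort or hash operation per backend, so max_connections * work_mem must fit in RAM alongside effective_cache_size with room to spare.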
Re: [PERFORM] Vacuums on large busy databases
Hi, Francisco,

Francisco Reyes wrote:
> I am looking to either improve the time of the vacuum or decrease its impact on the loads. Are the variables:
> #vacuum_cost_delay = 0        # 0-1000 milliseconds
> #vacuum_cost_page_hit = 1     # 0-10000 credits
> #vacuum_cost_page_miss = 10   # 0-10000 credits
> #vacuum_cost_page_dirty = 20  # 0-10000 credits
> #vacuum_cost_limit = 200      # 1-10000 credits

Just to avoid a silly mistake: you pasted those settings with the # sign, which means that PostgreSQL treats them as comments and uses the defaults instead. You should make sure that you use real settings in your config.

HTH,
Markus
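For clarity, a version of those settings that would actually take effect looks like this - the leading # removed, and vacuum_cost_delay raised from its default of 0 (the other values shown are just the shipped defaults; the delay of 10 ms is an illustrative starting point, not a tuned value):

```
vacuum_cost_delay = 10       # sleep 10 ms each time the cost limit is reached (0 disables throttling)
vacuum_cost_page_hit = 1
vacuum_cost_page_miss = 10
vacuum_cost_page_dirty = 20
vacuum_cost_limit = 200
```

A nonzero vacuum_cost_delay is what actually makes vacuum yield to concurrent load; with it at 0, the other cost settings have no effect.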
Re: [PERFORM] Why the difference in plans ??
Joost Kraaijeveld [EMAIL PROTECTED] writes:
> Why the difference and can I influence the result so that the first query plan (which is the fastest) is actually used in both cases (I would expect that the limit would be done after the sort?)?

It likes the second plan because 6694025.41/353 < 41812.78. It would probably be right, too, if the number of matching rows were indeed 353, but it seems there are only 7. Try increasing your statistics target and re-analyzing.

regards, tom lane
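Concretely, the statistics target can be raised either per column or globally; a sketch, using the column from the posted query (the target of 100 is an illustrative value):

```
-- Per-column, followed by a fresh ANALYZE:
ALTER TABLE prototype.customers ALTER COLUMN lastname SET STATISTICS 100;
ANALYZE prototype.customers;

-- Or globally, in postgresql.conf:
-- default_statistics_target = 100
```

A larger target makes ANALYZE sample more rows and keep more most-common values per column, which can sharpen selectivity estimates for filters like the ILIKE above.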
[PERFORM] Optimize SQL
Hi,

Is there any way we can optimize this SQL? It is doing a full table scan on the listing and address tables. Postgres version 8.0.2.

Thanks!
Pallav.

explain analyze
select listing0_.listingid as col_0_0_,
       getmaxdate(listing0_.lastupdate, max(addressval2_.createdate)) as col_1_0_
from listing.listing listing0_
left outer join listing.address listingadd1_
  on listing0_.fkbestaddressid=listingadd1_.addressid
left outer join listing.addressvaluation addressval2_
  on listingadd1_.addressid=addressval2_.fkaddressid
where listing0_.lastupdate > '2006-09-15 08:31:26.927'
  and listing0_.lastupdate <= current_timestamp
   or addressval2_.createdate > '2006-09-15 08:31:26.927'
  and addressval2_.createdate <= current_timestamp
group by listing0_.listingid, listing0_.lastupdate
order by getmaxdate(listing0_.lastupdate, max(addressval2_.createdate)) asc
limit 10;

Limit  (cost=2399501.49..2399501.51 rows=10 width=20) (actual time=414298.076..414298.174 rows=10 loops=1)
  ->  Sort  (cost=2399501.49..2410707.32 rows=4482333 width=20) (actual time=414298.068..414298.098 rows=10 loops=1)
        Sort Key: getmaxdate(listing0_.lastupdate, max(addressval2_.createdate))
        ->  GroupAggregate  (cost=1784490.47..1851725.47 rows=4482333 width=20) (actual time=414212.926..414284.927 rows=2559 loops=1)
              ->  Sort  (cost=1784490.47..1795696.31 rows=4482333 width=20) (actual time=414174.678..414183.536 rows=2563 loops=1)
                    Sort Key: listing0_.listingid, listing0_.lastupdate
                    ->  Merge Right Join  (cost=1113947.32..1236714.45 rows=4482333 width=20) (actual time=273257.256..414163.920 rows=2563 loops=1)
                          Merge Cond: ("outer".fkaddressid = "inner".addressid)
                          Filter: ((("inner".lastupdate > '2006-09-15 08:31:26.927'::timestamp without time zone) AND ("inner".lastupdate <= ('now'::text)::timestamp(6) with time zone)) OR (("outer".createdate > '2006-09-15 08:31:26.927'::timestamp without time zone) AND ("outer".createdate <= ('now'::text)::timestamp(6) with time zone)))
                          ->  Index Scan using idx_addressvaluation_fkaddressid on addressvaluation addressval2_  (cost=0.00..79769.55 rows=947056 width=12) (actual time=0.120..108240.633 rows=960834 loops=1)
                          ->  Sort  (cost=1113947.32..1125153.15 rows=4482333 width=16) (actual time=256884.646..275823.217 rows=5669719 loops=1)
                                Sort Key: listingadd1_.addressid
                                ->  Hash Left Join  (cost=228115.38..570557.39 rows=4482333 width=16) (actual time=93874.356..205054.946 rows=4490963 loops=1)
                                      Hash Cond: ("outer".fkbestaddressid = "inner".addressid)
                                      ->  Seq Scan on listing listing0_  (cost=0.00..112111.33 rows=4482333 width=16) (actual time=0.026..25398.685 rows=4490963 loops=1)
                                      ->  Hash  (cost=18.70..18.70 rows=6990270 width=4) (actual time=93873.659..93873.659 rows=0 loops=1)
                                            ->  Seq Scan on address listingadd1_  (cost=0.00..18.70 rows=6990270 width=4) (actual time=13.256..69441.056 rows=6990606 loops=1)
Re: [PERFORM] RAID 0 not as fast as expected
Greg, Josh,

Something I found out while doing this - lvm (and lvm2) slows the block stream down dramatically. At first I was using it for convenience sake to implement partitions on top of the md devices, but I found I was stuck at about 700 MB/s. Removing lvm2 from the picture allowed me to get within chucking distance of 2 GB/s.

When we first started working with Solaris ZFS, we were getting about 400-600 MB/s, and after working with the Solaris engineering team we now get rates approaching 2 GB/s. The updates needed to Solaris are part of Solaris 10 U3, available in October (and already in Solaris Express, aka Solaris 11).

- Luke

On 9/15/06 5:43 AM, Spiegelberg, Greg [EMAIL PROTECTED] wrote:
> That's an all-PCI-X box, which makes sense. There are 6 SATA controllers in that little beastie also. You can always count on Sun to provide over-engineered boxes.
>
> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Joshua D. Drake
> Sent: Friday, September 15, 2006 12:01 AM
> To: Luke Lonergan
> Cc: Craig A. James; pgsql-performance@postgresql.org
> Subject: Re: [PERFORM] RAID 0 not as fast as expected
>
> Luke Lonergan wrote:
>> Josh,
>>
>> On 9/14/06 8:47 PM, Joshua D. Drake [EMAIL PROTECTED] wrote:
>>>> I've obtained 1,950 MB/s using Linux software RAID on SATA drives.
>>> With what? :)
>> Sun X4500 (aka Thumper) running stock RedHat 4.3 (actually CentOS 4.3) with XFS and the Linux md driver without lvm. Here is a summary of the results:
> Good god!
Read Test

RAID Level | Max Readahead (KB) | RAID Chunksize | Max Readahead on Disks (KB) | Max Time (s) | Read Bandwidth (MB/s)
0          |    65536 |   64 |     256 | 16.689 | 1,917.43
0          |     4096 |   64 |     256 | 21.269 | 1,504.54
0          |    65536 |  256 |     256 | 17.967 | 1,781.04
0          |     2816 |  256 |     256 | 18.835 | 1,698.96
0          |    65536 | 1024 |     256 | 18.538 | 1,726.18
0          |    65536 |   64 |     512 | 18.295 | 1,749.11
0          |    65536 |   64 |     256 | 18.931 | 1,690.35
0          |    65536 |   64 |     256 | 18.873 | 1,695.54
0          |    64768 |   64 |     256 | 18.545 | 1,725.53
0          |   131172 |   64 |     256 | 18.548 | 1,725.25
0          |   131172 |   64 |   65536 | 19.046 | 1,680.14
0          |   131172 |   64 |  524288 | 18.125 | 1,765.52
0          |   131172 |   64 | 1048576 | 18.701 | 1,711.14
5          |     2560 |   64 |     256 | 39.933 |   801.34
5          | 16777216 |   64 |     256 | 37.76  |   847.46
5          |   524288 |   64 |     256 | 53.497 |   598.16
5          |    65536 |   32 |     256 | 38.472 |   831.77
5          |    65536 |   32 |     256 | 38.004 |   842.02
5          |    65536 |   32 |     256 | 37.884 |   844.68
5          |     2560 |   16 |     256 | 41.39  |   773.13
5          |    65536 |   16 |     256 | 48.902 |   654.37
10         |    65536 |   64 |     256 | 83.256 |   384.36
1+0        |    65536 |   64 |     256 | 19.394 | 1,649.99
1+0        |    65536 |   64 |     256 | 19.047 | 1,680.05
1+0        |    65536 |   64 |     256 | 19.195 | 1,667.10
1+0        |    65536 |   64 |     256 | 18.806 | 1,701.58
1+0        |    65536 |   64 |     256 | 18.848 | 1,697.79
1+0        |    65536 |   64 |     256 | 18.371 | 1,741.88
1+0        |    65536 |   64 |     256 | 21.446 | 1,492.12
1+0        |    65536 |   64 |     256 | 20.254 | 1,579.93

-- 
=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/
Re: [PERFORM] Optimize SQL
Pallav Kalva [EMAIL PROTECTED] writes:
> select listing0_.listingid as col_0_0_,
>        getmaxdate(listing0_.lastupdate, max(addressval2_.createdate)) as col_1_0_
> from listing.listing listing0_
> left outer join listing.address listingadd1_
>   on listing0_.fkbestaddressid=listingadd1_.addressid
> left outer join listing.addressvaluation addressval2_
>   on listingadd1_.addressid=addressval2_.fkaddressid
> where listing0_.lastupdate > '2006-09-15 08:31:26.927'
>   and listing0_.lastupdate <= current_timestamp
>    or addressval2_.createdate > '2006-09-15 08:31:26.927'
>   and addressval2_.createdate <= current_timestamp
> group by listing0_.listingid, listing0_.lastupdate
> order by getmaxdate(listing0_.lastupdate, max(addressval2_.createdate)) asc
> limit 10;

If that WHERE logic is actually what you need, then getting this query to run quickly seems pretty hopeless. The database must form the full outer join result: it cannot discard any listing0_ rows, even if they have lastupdate outside the given range, because they might join to addressval2_ rows within the given createdate range. And conversely it can't discard any addressval2_ rows early. Is there any chance that you wanted AND not OR there?

One thing that might help a bit is to change the join order:

from listing.listing listing0_
  left outer join listing.addressvaluation addressval2_
    on listing0_.fkbestaddressid=addressval2_.fkaddressid
  left outer join listing.address listingadd1_
    on listing0_.fkbestaddressid=listingadd1_.addressid

so that at least the WHERE clause can be applied before having joined to listingadd1_. The semantics of your ON clauses are probably wrong anyway --- did you think twice about what happens if there's no matching listingadd1_ entry?

regards, tom lane
Re: [PERFORM] Why the difference in plans ??
On Fri, 2006-09-15 at 10:08 -0400, Tom Lane wrote:
> but it seems there are only 7. Try increasing your statistics target and re-analyzing.

Do you mean by "increasing my statistics target" changing the value of default_statistics_target = 10 to a bigger number? If so, changing it to 900 did not make any difference (PostgreSQL restarted, vacuumed, analysed etc.).
Re: [PERFORM] Why the difference in plans ??
Joost Kraaijeveld [EMAIL PROTECTED] writes:
> Do you mean with increasing my statistics target changing the value of default_statistics_target = 10 to a bigger number? If so, changing it to 900 did not make any difference (PostgreSQL restarted, vacuumed analysed etc).

Hm, did the 353 rowcount estimate not change at all?

regards, tom lane
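One way to answer that question is to re-run the estimate in isolation and look at what ANALYZE actually gathered. A sketch, with table and column names taken from the posted query:

```
-- Does the planner still estimate 353 rows for the join, versus the actual 7?
EXPLAIN SELECT 1
FROM prototype.salesorders s
JOIN prototype.customers c ON c.objectid = s.customer
WHERE c.lastname ILIKE 'Boonk';

-- What did ANALYZE collect for the filtered column?
SELECT attname, n_distinct, most_common_vals
FROM pg_stats
WHERE schemaname = 'prototype'
  AND tablename = 'customers'
  AND attname = 'lastname';
```

If the row estimate in the EXPLAIN output doesn't move after raising the statistics target and re-analyzing, the misestimate likely comes from the ILIKE pattern selectivity rather than from the gathered statistics.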
Re: [PERFORM] Optimize SQL
On 15-9-2006 17:53, Tom Lane wrote:
> If that WHERE logic is actually what you need, then getting this query to run quickly seems pretty hopeless. The database must form the full outer join result: it cannot discard any listing0_ rows, even if they have lastupdate outside the given range, because they might join to addressval2_ rows within the given createdate range. And conversely it can't discard any addressval2_ rows early. Is there any chance that you wanted AND not OR there?

Couldn't it also help to do something like this?

SELECT ..., (SELECT MAX(createdate) FROM addressval ...)
FROM listing l
LEFT JOIN address ...
WHERE l.id IN (SELECT id FROM listing WHERE lastupdate ...
               UNION
               SELECT id FROM listing JOIN addressval a ON ... WHERE a.createdate ...)

It's not pretty, but looking at the explain, only a small number of records match both clauses. So this should allow the use of indexes for both the createdate clause and the lastupdate clause.

Best regards,

Arjen
Re: [PERFORM] RAID 0 not as fast as expected
> When we first started working with Solaris ZFS, we were getting about 400-600 MB/s, and after working with the Solaris Engineering team we now get rates approaching 2GB/s. The updates needed to Solaris are part of the Solaris 10 U3 available in October (and already in Solaris Express, aka Solaris 11).

Luke,

What other file systems have you had good success with? Solaris would be nice, but it looks like I'm stuck running on FreeBSD (6.1, amd64), so UFS2 would be the default. Not sure about XFS on BSD, and I'm not sure at the moment that ext2/3 provides enough benefit over UFS to spend much time on.

Also, has anyone had any experience with gmirror (good or bad)? I'm thinking of trying to use it to stripe two hardware-mirrored sets, since HW RAID 10 wasn't doing as well as I had hoped (Dell Perc5/i controller). For a 4-disk RAID 10 (10k rpm SAS/SCSI disks), what would be a good target performance number? Right now, dd shows 224 MB/s.

And lastly, for a more OLAP-style database, would I be correct in assuming that sequential access speed would be more important than is normally the case? (I have a relatively small number of connections, but each running on pretty large data sets.)

Thanks,

Bucky
[PERFORM] Performance of IN (...) vs. = ANY array[...]
Greetings:

I'm running 8.1.4, and have noticed major differences in execution time for plpgsql functions running queries that differ only in the use of an array, such as:

slower_function( vals integer[] )
    [query] WHERE id = ANY (vals);

faster_function( vals integer[] )
    vals_text := array_to_string( vals, ',' );
    EXECUTE '[query] WHERE id IN (' || vals_text || ')';

In general, there are about 10 integers in the lookup set on average, and 50 max. What are the advantages or disadvantages of using arrays in this situation? The = ANY (array) method makes plpgsql development cleaner, but seems to really lack performance in certain cases. What do you recommend as the preferred method?

Thanks for your comments.

-- 
Benjamin Minshall [EMAIL PROTECTED]
Senior Developer -- Intellicon, Inc. http://www.intellicon.biz
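Spelled out, the two variants being compared look roughly like this (the table name `items` and function bodies are illustrative sketches, not from the original post):

```
-- Parameterized form: cleaner, but on 8.1 the = ANY (array) test is not
-- converted into an index condition, so it tends to seqscan.
CREATE FUNCTION slower_lookup(vals integer[]) RETURNS bigint AS $$
BEGIN
    RETURN (SELECT count(*) FROM items WHERE id = ANY (vals));
END;
$$ LANGUAGE plpgsql;

-- Dynamic-SQL form: builds a literal IN (...) list, which the planner
-- can turn into index scans, at the cost of a re-plan on every call.
CREATE FUNCTION faster_lookup(vals integer[]) RETURNS bigint AS $$
DECLARE
    result bigint;
BEGIN
    EXECUTE 'SELECT count(*) FROM items WHERE id IN ('
            || array_to_string(vals, ',') || ')'
    INTO result;
    RETURN result;
END;
$$ LANGUAGE plpgsql;
```

The EXECUTE version also gives up the protection of parameterization, so it should only be used when the array elements are trusted values such as integers.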
Re: [PERFORM] Performance of IN (...) vs. = ANY array[...]
Benjamin Minshall [EMAIL PROTECTED] writes:
> What are the advantages or disadvantages of using arrays in this situation? The = ANY (array) method makes plpgsql development cleaner, but seems to really lack performance in certain cases.

In existing releases, the form with IN (list-of-scalar-constants) can be optimized into indexscan(s), but = ANY (array) isn't. 8.2 will treat them equivalently (in fact, it converts IN (...) to = ANY (ARRAY[...]) !). So depending on your time horizon, you might wish to stick with whichever is cleaner for your calling code.

regards, tom lane
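A quick way to see the 8.2 behavior is to EXPLAIN both forms and note that they produce the same condition (a sketch, assuming some table `t` with an indexed integer column `id`):

```
EXPLAIN SELECT * FROM t WHERE id IN (1, 2, 3);
-- On 8.2 the IN list is rewritten, and the plan shows something like:
--   Index Cond: (id = ANY ('{1,2,3}'::integer[]))
```

Since both forms plan identically from 8.2 on, the cleaner = ANY (array) style in plpgsql no longer carries a performance penalty there.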
Re: [PERFORM] RAID 0 not as fast as expected
Josh,

On 9/14/06 8:47 PM, Joshua D. Drake [EMAIL PROTECTED] wrote:
>> I've obtained 1,950 MB/s using Linux software RAID on SATA drives.
> With what? :)

Sun X4500 (aka Thumper) running stock RedHat 4.3 (actually CentOS 4.3) with XFS and the Linux md driver without lvm. Here is a summary of the results:

Read Test

RAID Level | Max Readahead (KB) | RAID Chunksize | Max Readahead on Disks (KB) | Max Time (s) | Read Bandwidth (MB/s)
0          |    65536 |   64 |     256 | 16.689 | 1,917.43
0          |     4096 |   64 |     256 | 21.269 | 1,504.54
0          |    65536 |  256 |     256 | 17.967 | 1,781.04
0          |     2816 |  256 |     256 | 18.835 | 1,698.96
0          |    65536 | 1024 |     256 | 18.538 | 1,726.18
0          |    65536 |   64 |     512 | 18.295 | 1,749.11
0          |    65536 |   64 |     256 | 18.931 | 1,690.35
0          |    65536 |   64 |     256 | 18.873 | 1,695.54
0          |    64768 |   64 |     256 | 18.545 | 1,725.53
0          |   131172 |   64 |     256 | 18.548 | 1,725.25
0          |   131172 |   64 |   65536 | 19.046 | 1,680.14
0          |   131172 |   64 |  524288 | 18.125 | 1,765.52
0          |   131172 |   64 | 1048576 | 18.701 | 1,711.14
5          |     2560 |   64 |     256 | 39.933 |   801.34
5          | 16777216 |   64 |     256 | 37.76  |   847.46
5          |   524288 |   64 |     256 | 53.497 |   598.16
5          |    65536 |   32 |     256 | 38.472 |   831.77
5          |    65536 |   32 |     256 | 38.004 |   842.02
5          |    65536 |   32 |     256 | 37.884 |   844.68
5          |     2560 |   16 |     256 | 41.39  |   773.13
5          |    65536 |   16 |     256 | 48.902 |   654.37
10         |    65536 |   64 |     256 | 83.256 |   384.36
1+0        |    65536 |   64 |     256 | 19.394 | 1,649.99
1+0        |    65536 |   64 |     256 | 19.047 | 1,680.05
1+0        |    65536 |   64 |     256 | 19.195 | 1,667.10
1+0        |    65536 |   64 |     256 | 18.806 | 1,701.58
1+0        |    65536 |   64 |     256 | 18.848 | 1,697.79
1+0        |    65536 |   64 |     256 | 18.371 | 1,741.88
1+0        |    65536 |   64 |     256 | 21.446 | 1,492.12
1+0        |    65536 |   64 |     256 | 20.254 | 1,579.93