Re: [PERFORM] performance drop on 8.2.4, reverting to 8.1.4
> It would probably be useful to see EXPLAIN ANALYZE output for your queries, not just EXPLAIN.

it took 245 seconds to complete, see below.

> It looks like the planner thinks this is going to be really cheap -- so it's misestimating something somewhere. Have you ANALYZEd recently?

yes, but to be sure I did it again before issuing the request; no improvements...

regards,
Liviu

Nested Loop Left Join  (cost=32.03..2026.70 rows=1 width=125) (actual time=16.686..244822.521 rows=2026 loops=1)
  Join Filter: (n.nodeid = public.nodeattributes.nodeid)
  ->  Nested Loop Left Join  (cost=26.55..1420.57 rows=1 width=115) (actual time=13.833..176136.527 rows=2026 loops=1)
        Join Filter: (n.nodeid = public.nodeattributes.nodeid)
        ->  Nested Loop Left Join  (cost=21.06..810.90 rows=1 width=105) (actual time=10.336..95476.175 rows=2026 loops=1)
              Join Filter: (n.nodeid = public.nodeattributes.nodeid)
              ->  Nested Loop Left Join  (cost=15.55..194.15 rows=1 width=95) (actual time=6.514..11524.892 rows=2026 loops=1)
                    Join Filter: (n.nodeid = rtunodes.nodeid)
                    ->  Nested Loop Left Join  (cost=11.17..107.94 rows=1 width=82) (actual time=0.661..71.751 rows=2026 loops=1)
                          Filter: (templatenodes.nodeid IS NULL)
                          ->  Hash Left Join  (cost=11.17..99.66 rows=1 width=82) (actual time=0.643..36.053 rows=2206 loops=1)
                                Hash Cond: (n.nodeid = templates.nodeid)
                                Filter: (templates.nodeid IS NULL)
                                ->  Hash Left Join  (cost=8.73..88.06 rows=2270 width=82) (actual time=0.502..27.756 rows=2270 loops=1)
                                      Hash Cond: (n.nodeid = rtus.nodeid)
                                      ->  Hash Left Join  (cost=4.34..74.11 rows=2270 width=74) (actual time=0.286..20.179 rows=2270 loops=1)
                                            Hash Cond: (n.nodeid = areas.nodeid)
                                            ->  Hash Left Join  (cost=1.43..61.83 rows=2270 width=66) (actual time=0.114..13.062 rows=2270 loops=1)
                                                  Hash Cond: (n.nodeid = realms.nodeid)
                                                  ->  Seq Scan on nodes n  (cost=0.00..51.70 rows=2270 width=49) (actual time=0.016..4.089 rows=2270 loops=1)
                                                  ->  Hash  (cost=1.19..1.19 rows=19 width=17) (actual time=0.056..0.056 rows=19 loops=1)
                                                        ->  Seq Scan on realms  (cost=0.00..1.19 rows=19 width=17) (actual time=0.006..0.023 rows=19 loops=1)
                                            ->  Hash  (cost=1.85..1.85 rows=85 width=8) (actual time=0.156..0.156 rows=85 loops=1)
                                                  ->  Seq Scan on areas  (cost=0.00..1.85 rows=85 width=8) (actual time=0.007..0.070 rows=85 loops=1)
                                      ->  Hash  (cost=3.06..3.06 rows=106 width=8) (actual time=0.200..0.200 rows=106 loops=1)
                                            ->  Seq Scan on rtus  (cost=0.00..3.06 rows=106 width=8) (actual time=0.010..0.105 rows=106 loops=1)
                                ->  Hash  (cost=1.64..1.64 rows=64 width=4) (actual time=0.119..0.119 rows=64 loops=1)
                                      ->  Seq Scan on templates  (cost=0.00..1.64 rows=64 width=4) (actual time=0.006..0.059 rows=64 loops=1)
                          ->  Index Scan using nodeid_pkey on templatenodes  (cost=0.00..8.27 rows=1 width=4) (actual time=0.009..0.009 rows=0 loops=2206)
                                Index Cond: (n.nodeid = templatenodes.nodeid)
                    ->  Hash Join  (cost=4.38..63.51 rows=1816 width=13) (actual time=0.012..4.417 rows=1816 loops=2026)
                          Hash Cond: (rtunodes.rtuid = r.nodeid)
                          ->  Seq Scan on rtunodes  (cost=0.00..34.16 rows=1816 width=9) (actual time=0.009..1.290 rows=1816 loops=2026)
                          ->  Hash  (cost=3.06..3.06 rows=106 width=8) (actual time=0.194..0.194 rows=106 loops=1)
                                ->  Seq Scan on rtus r  (cost=0.00..3.06 rows=106 width=8) (actual time=0.005..0.091 rows=106 loops=1)
              ->  Hash Join  (cost=5.51..611.90 rows=388 width=14) (actual time=0.033..39.896 rows=2079 loops=2026)
                    Hash Cond: (a.typeid = t.typeid)
                    ->  Hash Join  (cost=4.34..604.41 rows=647 width=18) (actual time=0.031..36.513 rows=2079 loops=2026)
                          Hash Cond: (public.nodeattributes.attributeid = a.attributeid)
                          ->  Seq Scan on nodeattributes  (cost=0.00..505.35 rows=23535 width=18) (actual time=0.008..16.826 rows=23535 loops=2026)
                          ->  Hash  (cost=4.28..4.28 rows=5 width=8) (actual time=0.077..0.077 rows=5 loops=1)
Re: [PERFORM] performance drop on 8.2.4, reverting to 8.1.4
On Fri, May 18, 2007 at 12:02:44PM +0300, Liviu Ionescu wrote:
> the 8.2.4 plan with join_collapse_limit = 1 (with default it was worse, full of nested loops)

It would probably be useful to see EXPLAIN ANALYZE output for your queries, not just EXPLAIN.

> Nested Loop Left Join  (cost=32.01..2012.31 rows=1 width=230)

It looks like the planner thinks this is going to be really cheap -- so it's misestimating something somewhere. Have you ANALYZEd recently?

/* Steinar */
-- Homepage: http://www.sesse.net/

---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Ever Increasing IOWAIT
Ralph Mason wrote:
> We have a database running on a 4 processor machine. As time goes by the IO gets worse and worse, peaking at about 200% as the machine loads up. The weird thing is that if we restart postgres it's fine for hours, but over time it goes bad again. (CPU usage graph here: http://www.flickr.com/photos/[EMAIL PROTECTED]/502596262/ ) You can clearly see where the restart happens in the IO area.

I'm assuming here we're talking about that big block of iowait at about 4-6am? I take it vmstat/iostat show a corresponding increase in disk activity at that time. The question is - what?

Does the number of PG processes increase at that time? If that's not intentional then you might need to see what your applications are up to.

Do you have a vacuum/backup scheduled for that time? Do you have some other process doing a lot of file I/O at that time?

> This is Postgres 8.1.4 64bit.

You'll want to upgrade to the latest patch release - you're missing 5 lots of bug-fixes there.

-- Richard Huxton, Archonet Ltd

---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
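For anyone chasing a similar pattern, the first step is usually to correlate the iowait block with raw disk and process activity; a minimal sketch of the monitoring Richard alludes to (assumes vmstat and iostat are installed, e.g. from the sysstat package):

    # extended disk stats, one sample per 60s; watch %util and await climb
    iostat -x 60
    # si/so columns show swapping, bi/bo show block I/O
    vmstat 60
    # rough count of postgres backends over time
    ps ax | grep postgres | grep -v grep | wc -l

If the backend count and disk activity both jump at 4-6am, a scheduled vacuum/backup or an application batch job is the obvious suspect.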
[PERFORM] performance drop on 8.2.4, reverting to 8.1.4
I recently tried to upgrade to 8.2.4, but major queries I wrote for 8.1.4 are now planned differently on 8.2.4 and are no longer usable. What 8.1.4 planned as a series of 'hash left join's (taking about 2 seconds) is now planned as 'nested loop left joins' and takes forever. Other requests were also affected, increasing the time from milliseconds to hundreds of milliseconds, even seconds. The worst performance hit was on the following query. I know it is a bit extreme, but it worked perfectly on 8.1.4.

Regards,
Liviu

SELECT n.nodeid,
       CASE WHEN n.parentnodeid IS NULL THEN -1 ELSE n.parentnodeid END AS parentnodeid,
       n.nodename,
       av.value AS iconname,
       avt.value AS templatename,
       avs.value AS subclass,
       n.globalnodeid,
       n.isaddupi,
       CASE WHEN realms.nodeid IS NOT NULL THEN 'SERVER'::text
            WHEN areas.nodeid IS NOT NULL THEN 'AREA'::text
            WHEN rtus.nodeid IS NOT NULL THEN 'DEVICE'::text
            WHEN rtunodes.nodeid IS NOT NULL THEN 'TAG'::text
            ELSE NULL::text END AS class,
       realms.name AS realmname,
       CASE WHEN n.nodeclass::text = 'area'::text AND n.nodesubclass IS NOT NULL THEN true ELSE false END AS istemplate,
       CASE WHEN realms.nodeid IS NOT NULL THEN realms.nodeid
            WHEN areas.nodeid IS NOT NULL THEN areas.realmid
            WHEN rtus.nodeid IS NOT NULL THEN rtus.realmid
            WHEN rtunodes.nodeid IS NOT NULL THEN r.realmid
            ELSE NULL::integer END AS realmid,
       rtunodes.rtuid,
       rtunodes.isinvalid,
       n.isvalid
FROM nodes n
LEFT JOIN realms ON n.nodeid = realms.nodeid
LEFT JOIN areas ON n.nodeid = areas.nodeid
LEFT JOIN rtus ON n.nodeid = rtus.nodeid
LEFT JOIN templates ON n.nodeid = templates.nodeid
LEFT JOIN templatenodes ON n.nodeid = templatenodes.nodeid
LEFT JOIN (rtunodes JOIN rtus r ON rtunodes.rtuid = r.nodeid) ON n.nodeid = rtunodes.nodeid
LEFT JOIN ( SELECT attributes_values2_view.nodeid, attributes_values2_view.value
            FROM attributes_values2_view
            WHERE attributes_values2_view.attributename::text = 'iconName'::text) av ON n.nodeid = av.nodeid
LEFT JOIN ( SELECT attributes_values2_view.nodeid, attributes_values2_view.value
            FROM attributes_values2_view
            WHERE attributes_values2_view.attributename::text = 'addUPItemplate'::text) avt ON n.nodeid = avt.nodeid
LEFT JOIN ( SELECT attributes_values2_view.nodeid, attributes_values2_view.value
            FROM attributes_values2_view
            WHERE attributes_values2_view.attributename::text = 'addUPIsubclass'::text) avs ON n.nodeid = avs.nodeid
WHERE templates.nodeid IS NULL AND templatenodes.nodeid IS NULL;

CREATE OR REPLACE VIEW attributes_values2_view AS
SELECT nodeattributes.nodeid, nodeattributes.attributeid, a.name AS attributename,
       t.name AS typename, a.typeid, a.valuesize, a.flags, nodeattributes.value, a.creationdate
FROM nodeattributes
LEFT JOIN attributes a USING (attributeid)
LEFT JOIN types t USING (typeid)
WHERE t.isattributetype;

the 8.2.4 plan with join_collapse_limit = 1 (with default it was worse, full of nested loops)

Nested Loop Left Join  (cost=32.01..2012.31 rows=1 width=230)
  Join Filter: (n.nodeid = public.nodeattributes.nodeid)
  ->  Nested Loop Left Join  (cost=26.47..1411.38 rows=1 width=220)
        Join Filter: (n.nodeid = public.nodeattributes.nodeid)
        ->  Nested Loop Left Join  (cost=20.93..810.45 rows=1 width=210)
              Join Filter: (n.nodeid = public.nodeattributes.nodeid)
              ->  Nested Loop Left Join  (cost=15.39..209.52 rows=1 width=200)
                    Join Filter: (n.nodeid = rtunodes.nodeid)
                    ->  Nested Loop Left Join  (cost=11.14..122.60 rows=1 width=187)
                          Filter: (templatenodes.nodeid IS NULL)
                          ->  Hash Left Join  (cost=11.14..99.52 rows=11 width=187)
                                Hash Cond: (n.nodeid = templates.nodeid)
                                Filter: (templates.nodeid IS NULL)
                                ->  Hash Left Join  (cost=8.70..87.95 rows=2266 width=187)
                                      Hash Cond: (n.nodeid = rtus.nodeid)
                                      ->  Hash Left Join  (cost=4.45..74.20 rows=2266 width=179)
                                            Hash Cond: (n.nodeid = areas.nodeid)
                                            ->  Hash Left Join  (cost=1.45..61.81 rows=2266 width=171)
                                                  Hash Cond: (n.nodeid = realms.nodeid)
                                                  ->  Seq Scan on nodes n  (cost=0.00..51.66 rows=2266 width=49)
                                                  ->  Hash  (cost=1.20..1.20 rows=20 width=122)
                                                        ->  Seq Scan on realms  (cost=0.00..1.20 rows=20 width=122)
Re: [PERFORM] performance drop on 8.2.4, reverting to 8.1.4
On Fri, May 18, 2007 at 01:14:56PM +0300, Liviu Ionescu wrote:
> yes, but to be sure I did it again before issuing the request; no improvements...

Is this with the join collapse limit set to 1, or with default? (Default is generally more interesting.)

/* Steinar */
-- Homepage: http://www.sesse.net/

---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] performance drop on 8.2.4, reverting to 8.1.4
> Is this with the join collapse limit set to 1, or with default? (Default is generally more interesting.)

below is the same query with the default setting.

regards,
Liviu

Nested Loop Left Join  (cost=23.35..1965.46 rows=1 width=125) (actual time=50.408..231926.123 rows=2026 loops=1)
  Join Filter: (n.nodeid = public.nodeattributes.nodeid)
  ->  Nested Loop Left Join  (cost=17.81..1357.58 rows=1 width=115) (actual time=47.103..156521.050 rows=2026 loops=1)
        Join Filter: (n.nodeid = public.nodeattributes.nodeid)
        ->  Nested Loop Left Join  (cost=12.30..752.97 rows=1 width=105) (actual time=43.924..81977.726 rows=2026 loops=1)
              Join Filter: (n.nodeid = public.nodeattributes.nodeid)
              ->  Nested Loop Left Join  (cost=6.83..150.65 rows=1 width=95) (actual time=40.603..12477.227 rows=2026 loops=1)
                    ->  Nested Loop Left Join  (cost=6.83..150.37 rows=1 width=78) (actual time=38.448..12459.918 rows=2026 loops=1)
                          ->  Nested Loop Left Join  (cost=6.83..150.08 rows=1 width=70) (actual time=31.793..12436.536 rows=2026 loops=1)
                                ->  Nested Loop Left Join  (cost=6.83..149.80 rows=1 width=62) (actual time=6.588..12394.366 rows=2026 loops=1)
                                      Filter: (templatenodes.nodeid IS NULL)
                                      ->  Nested Loop Left Join  (cost=6.83..149.51 rows=1 width=62) (actual time=6.525..12362.969 rows=2206 loops=1)
                                            Join Filter: (n.nodeid = rtunodes.nodeid)
                                            ->  Hash Left Join  (cost=2.44..63.29 rows=1 width=49) (actual time=0.361..14.426 rows=2206 loops=1)
                                                  Hash Cond: (n.nodeid = templates.nodeid)
                                                  Filter: (templates.nodeid IS NULL)
                                                  ->  Seq Scan on nodes n  (cost=0.00..51.70 rows=2270 width=49) (actual time=0.071..4.417 rows=2270 loops=1)
                                                  ->  Hash  (cost=1.64..1.64 rows=64 width=4) (actual time=0.152..0.152 rows=64 loops=1)
                                                        ->  Seq Scan on templates  (cost=0.00..1.64 rows=64 width=4) (actual time=0.032..0.082 rows=64 loops=1)
                                            ->  Hash Join  (cost=4.38..63.51 rows=1816 width=13) (actual time=0.011..4.365 rows=1816 loops=2206)
                                                  Hash Cond: (rtunodes.rtuid = r.nodeid)
                                                  ->  Seq Scan on rtunodes  (cost=0.00..34.16 rows=1816 width=9) (actual time=0.008..1.276 rows=1816 loops=2206)
                                                  ->  Hash  (cost=3.06..3.06 rows=106 width=8) (actual time=0.241..0.241 rows=106 loops=1)
                                                        ->  Seq Scan on rtus r  (cost=0.00..3.06 rows=106 width=8) (actual time=0.029..0.136 rows=106 loops=1)
                                      ->  Index Scan using nodeid_pkey on templatenodes  (cost=0.00..0.28 rows=1 width=4) (actual time=0.008..0.008 rows=0 loops=2206)
                                            Index Cond: (n.nodeid = templatenodes.nodeid)
                                ->  Index Scan using rtus_pkey on rtus  (cost=0.00..0.27 rows=1 width=8) (actual time=0.016..0.016 rows=0 loops=2026)
                                      Index Cond: (n.nodeid = rtus.nodeid)
                          ->  Index Scan using areas_pkey on areas  (cost=0.00..0.27 rows=1 width=8) (actual time=0.007..0.007 rows=0 loops=2026)
                                Index Cond: (n.nodeid = areas.nodeid)
                    ->  Index Scan using realms_pkey on realms  (cost=0.00..0.27 rows=1 width=17) (actual time=0.004..0.004 rows=0 loops=2026)
                          Index Cond: (n.nodeid = realms.nodeid)
              ->  Hash Join  (cost=5.48..600.38 rows=155 width=14) (actual time=0.812..34.198 rows=132 loops=2026)
                    Hash Cond: (public.nodeattributes.attributeid = a.attributeid)
                    ->  Seq Scan on nodeattributes  (cost=0.00..505.35 rows=23535 width=18) (actual time=0.009..16.660 rows=23535 loops=2026)
                    ->  Hash  (cost=5.47..5.47 rows=1 width=4) (actual time=0.196..0.196 rows=2 loops=1)
                          ->  Hash Join  (cost=1.18..5.47 rows=1 width=4) (actual time=0.124..0.187 rows=2 loops=1)
                                Hash Cond: (a.typeid = t.typeid)
                                ->  Seq Scan on attributes a  (cost=0.00..4.28 rows=2 width=8) (actual time=0.044..0.103 rows=2 loops=1)
                                      Filter: ((name)::text = 'addUPIsubclass'::text)
                                ->  Hash  (cost=1.10..1.10 rows=6 width=4) (actual time=0.047..0.047 rows=6 loops=1)
                                      ->  Seq Scan on types t
Re: [PERFORM] performance drop on 8.2.4, reverting to 8.1.4
On Fri, May 18, 2007 at 02:05:36PM +0300, Liviu Ionescu wrote:
> ->  Hash Left Join  (cost=2.44..63.29 rows=1 width=49) (actual time=0.361..14.426 rows=2206 loops=1)
>       Hash Cond: (n.nodeid = templates.nodeid)
>       Filter: (templates.nodeid IS NULL)
>       ->  Seq Scan on nodes n  (cost=0.00..51.70 rows=2270 width=49) (actual time=0.071..4.417 rows=2270 loops=1)
>       ->  Hash  (cost=1.64..1.64 rows=64 width=4) (actual time=0.152..0.152 rows=64 loops=1)
>             ->  Seq Scan on templates  (cost=0.00..1.64 rows=64 width=4) (actual time=0.032..0.082 rows=64 loops=1)

This seems to be the source of the misestimation. You might want to try using

  n WHERE n.nodeid NOT IN (SELECT nodeid FROM templates)

instead of

  n LEFT JOIN templates USING (nodeid) WHERE templates.nodeid IS NULL

and see if it helps.

> Total runtime: 231929.656 ms

Note that this is better than the version with collapse_limit set to 1. :-)

/* Steinar */
-- Homepage: http://www.sesse.net/

---(end of broadcast)--- TIP 6: explain analyze is your friend
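Spelled out against the tables from the original post, the two forms Steinar is contrasting look roughly like this (a sketch showing only the templates join, not the full query):

    -- the original anti-join idiom:
    SELECT n.nodeid
    FROM nodes n
    LEFT JOIN templates ON n.nodeid = templates.nodeid
    WHERE templates.nodeid IS NULL;

    -- the suggested rewrite:
    SELECT n.nodeid
    FROM nodes n
    WHERE n.nodeid NOT IN (SELECT nodeid FROM templates);

One caveat: the two are only equivalent when templates.nodeid contains no NULLs; NOT IN returns no rows as soon as the subquery produces a NULL.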
Re: [PERFORM] performance drop on 8.2.4, reverting to 8.1.4
On Fri, May 18, 2007 at 02:51:42PM +0300, Liviu Ionescu wrote:
> it helped, the new version of the query takes 2303 ms on both 8.1.4 and 8.2.4.

And the old one?

> any idea why the 8.2.4 planner is not happy with the initial select? was it just chance that it worked in 8.1.4, or does the 8.2.4 planner have a problem?

I guess it was more or less by chance, especially as 8.1 did not reorder outer joins. Others might know more about the estimation, though.

/* Steinar */
-- Homepage: http://www.sesse.net/

---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] performance drop on 8.2.4, reverting to 8.1.4
> > it helped, the new version of the query takes 2303 ms on both 8.1.4 and 8.2.4.
> And the old one?

slightly shorter, 2204 ms.

as a subjective perception, the entire application is slightly slower on 8.2.4; probably there are many queries that were manually tuned for 7.x/8.1.x and now need rewriting, which is not really what I expected from an upgrade.

Liviu

---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Ever Increasing IOWAIT
You're not swapping, are you? One explanation could be that PG is configured to think it has access to a little more memory than the box can really provide, which forces it to swap once it's been running for long enough to fill up its shared buffers or after a certain number of concurrent connections are opened.

-- Mark Lewis

On Fri, 2007-05-18 at 10:45 +1200, Ralph Mason wrote:
> We have a database running on a 4 processor machine. As time goes by the IO gets worse and worse, peaking at about 200% as the machine loads up. The weird thing is that if we restart postgres it's fine for hours but over time it goes bad again. (CPU usage graph here: http://www.flickr.com/photos/[EMAIL PROTECTED]/502596262/ ) You can clearly see where the restart happens in the IO area.
> This is Postgres 8.1.4 64bit.
> Anyone have any ideas?
> Thanks, Ralph

---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [PERFORM] performance drop on 8.2.4, reverting to 8.1.4
> This seems to be the source of the misestimation. You might want to try using
>   n WHERE n.nodeid NOT IN (SELECT nodeid FROM templates)
> instead of
>   n LEFT JOIN templates USING (nodeid) WHERE templates.nodeid IS NULL
> and see if it helps.

> it helped, the new version of the query takes 2303 ms on both 8.1.4 and 8.2.4.

this is very interesting. on 8.1.x i have also repeatedly had to rewrite joins as their equivalent IN/NOT IN alternatives in order to improve performance, so i feel that at least under some alignments of the planets 8.1 has similar problems.

george

---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [PERFORM] performance drop on 8.2.4, reverting to 8.1.4
> under some alignments of the planets 8.1 has similar problems.

8.1 might have similar problems, but the point here is different: if what was manually tuned to work in 8.1 confuses the 8.2 planner and performance drops so much (from 2303 to 231929 ms in my case), upgrading a production machine to 8.2 is a risky business. I probably have hundreds of sql statements in my application, some of them quite complex, and it is not reasonable to check all of them in order to certify them to be 8.2 compliant.

regards, Liviu

---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] performance drop on 8.2.4, reverting to 8.1.4
Steinar H. Gunderson [EMAIL PROTECTED] writes:
> On Fri, May 18, 2007 at 02:05:36PM +0300, Liviu Ionescu wrote:
>> ->  Hash Left Join  (cost=2.44..63.29 rows=1 width=49) (actual time=0.361..14.426 rows=2206 loops=1)
>>       Hash Cond: (n.nodeid = templates.nodeid)
>>       Filter: (templates.nodeid IS NULL)
> This seems to be the source of the misestimation.

Yeah. 8.2 is estimating that the nodeid IS NULL condition will discard all or nearly all the rows, presumably because there aren't any null nodeid's in the underlying table --- it fails to consider that the LEFT JOIN may inject some nulls.

8.1 was not any brighter; the reason it gets a different estimate is that it doesn't distinguish left-join and WHERE clauses at all, but assumes that the result of the left join can't have fewer rows than its left input, even after applying the filter condition. In this particular scenario that happens to be a better estimate. So even though 8.2 is smarter, and there is no bug here that wasn't in 8.1 too, it's getting a worse estimate leading to a worse plan.

This is a sufficiently common idiom that I think it's a must-fix problem. Not sure about details yet, but it seems somehow the selectivity estimator had better start accounting for outer-join-injected NULLs.

regards, tom lane

---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
[PERFORM] CPU Intensive query
I have an interesting problem. I have the following query that ran ok on Monday and Tuesday, and it has been running ok since I have been at this job. I have seen it to be IO intensive, but since Wednesday it has become CPU intensive. Database-wise, fresh data has been put into the tables, vacuumed and analyzed; no other parameter has been modified. On Wednesday it ran over 24 hours and did not finish, and all this time it pegged a CPU between 95-99%. Yesterday, the same story. I do not understand what could have caused it to behave like this suddenly. I am hoping somebody can point me to do research in the right direction. The query is as follows and its explain plan is also attached:

set enable_nestloop = off;

INSERT INTO linkshare.macys_ls_daily_shipped
SELECT ddw.intr_xref,
       cdm.cdm_get_linkshare_id_safe(ddw.intr_xref, 10),
       to_char(cdm.cdm_utc_convert(cdm.cdm_get_linkshare_timestamp(ddw.intr_xref, 10), -5), 'YYYY-MM-DD/HH24:MI:SS'),
       to_char(cdm.cdm_utc_convert(to_char(sales.order_date, 'YYYY-MM-DD HH24:MI:SS')::timestamp without time zone, -5), 'YYYY-MM-DD/HH24:MI:SS'),
       ddw.item_upc,
       sum(abs(ddw.itm_qty)),
       sum((ddw.tran_itm_total * 100::numeric)::integer),
       'USD', '', '', '',
       ddw.item_desc
FROM cdm.cdm_ddw_tran_item_grouped ddw
JOIN cdm.cdm_sitesales sales ON ddw.intr_xref::text = sales.order_number::text
WHERE ddw.cal_date > (CURRENT_DATE - 7)
  AND ddw.cal_date < CURRENT_DATE
  AND ddw.intr_xref IS NOT NULL
  AND trim(cdm.cdm_get_linkshare_id_safe(ddw.intr_xref, 10)) <> ''
  AND cdm.cdm_utc_convert(cdm.cdm_get_linkshare_timestamp(ddw.intr_xref, 10), -5)::text::date >= (CURRENT_DATE - 52)
  AND sales.order_date >= (CURRENT_DATE - 52)
  AND (tran_typ_id = 'S'::bpchar)
  AND btrim(item_group::text) <> 'EGC'::text
  AND btrim(item_group::text) <> 'VGC'::text
GROUP BY ddw.intr_xref,
         cdm.cdm_get_linkshare_id_safe(ddw.intr_xref, 10),
         to_char(cdm.cdm_utc_convert(cdm.cdm_get_linkshare_timestamp(ddw.intr_xref, 10), -5), 'YYYY-MM-DD/HH24:MI:SS'),
         to_char(cdm.cdm_utc_convert(to_char(sales.order_date, 'YYYY-MM-DD HH24:MI:SS')::timestamp without time zone, -5), 'YYYY-MM-DD/HH24:MI:SS'),
         ddw.item_upc, 8, 9, 10, 11, ddw.item_desc;

HashAggregate  (cost=152555.97..152567.32 rows=267 width=162)
  ->  Hash Join  (cost=139308.18..152547.96 rows=267 width=162)
        Hash Cond: (("outer".intr_xref)::text = ("inner".order_number)::text)
        ->  GroupAggregate  (cost=106793.14..109222.13 rows=4319 width=189)
              ->  Sort  (cost=106793.14..106901.09 rows=43182 width=189)
                    Sort Key: cdm_ddw_tran_item.appl_xref, cdm_ddw_tran_item.intr_xref, cdm_ddw_tran_item.tran_typ_id, cdm_ddw_tran_item.cal_date, cdm_ddw_tran_item.cal_time, cdm_ddw_tran_item.tran_itm_total, cdm_ddw_tran_item.tran_tot_amt, cdm_ddw_tran_item.fill_store_div, cdm_ddw_tran_item.itm_price, cdm_ddw_tran_item.item_id, cdm_ddw_tran_item.item_upc, cdm_ddw_tran_item.item_pid, cdm_ddw_tran_item.item_desc, cdm_ddw_tran_item.nrf_color_name, cdm_ddw_tran_item.nrf_size_name, cdm_ddw_tran_item.dept_id, c
                    ->  Index Scan using cdm_ddw_tranp_item_cal_date on cdm_ddw_tran_item  (cost=0.01..103468.52 rows=43182 width=189)
                          Index Cond: ((cal_date > (('now'::text)::date - 7)) AND (cal_date < ('now'::text)::date))
                          Filter: ((intr_xref IS NOT NULL) AND (btrim(cdm.cdm_get_linkshare_id_safe(intr_xref, 10)) <> ''::text) AND (((cdm.cdm_utc_convert(cdm.cdm_get_linkshare_timestamp(intr_xref, 10), -5))::text)::date >= (('now'::text)::date - 52)) AND (tran_typ_id = 'S'::bpchar) AND (btrim((item_group)::text) <> 'EGC'::text) AND (btrim((item_group)::text) <> 'VGC'::text))
        ->  Hash  (cost=31409.92..31409.92 rows=442050 width=20)
              ->  Index Scan using cdm_sitesales_order_date on cdm_sitesales sales  (cost=0.00..31409.92 rows=442050 width=20)
                    Index Cond: (order_date >= (('now'::text)::date - 52))
Re: [PERFORM] performance drop on 8.2.4, reverting to 8.1.4
> It is arguable that updating the DB software version in an enterprise environment requires exactly that: check all production queries on the new software to identify any issues. In part, this is brought on by the very tuning that you performed against the previous software. Restore the 8.1 DB into 8.2. Then run the queries against both versions to evaluate functioning and timing.

you're right. my previous message was not a complaint, it was a warning for others to avoid the same mistake. I was overconfident and got bitten. in the future I'll check my queries on 8.2/8.3/... on a development configuration before upgrading the production server.

regards, Liviu

---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] CPU Intensive query
On Fri, May 18, 2007 at 09:02:52AM -0700, Abu Mushayeed wrote:
> I have an interesting problem. I have the following query that ran ok on Monday and Tuesday, and it has been running ok since I have been at this job. I have seen it to be IO intensive, but since Wednesday it has become CPU intensive. Database-wise, fresh data has been put into the tables, vacuumed and analyzed; no other parameter has been modified.

What Postgres version is this?

> The query is as follows and its explain plan is also attached:

Normally EXPLAIN ANALYZE data would be much better than EXPLAIN, but if the query indeed does not finish, it's not going to help much.

> set enable_nestloop = off;

What's the rationale for this?

> HashAggregate  (cost=152555.97..152567.32 rows=267 width=162)

152000 disk page fetches is a bit much, but it shouldn't take 24 hours. There's probably misestimation involved at some point here. Does it really return 267 rows, or many more?

/* Steinar */
-- Homepage: http://www.sesse.net/

---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
[PERFORM] reading large BYTEA type is slower than expected
We have recently ported our application to the postgres database. For the most part performance has not been an issue; however there is one situation that is a problem, and that is the initial read of rows containing BYTEA values that have an average size of 2 kilobytes or greater. For BYTEA values postgres requires as much as 3 seconds to read the values from disk into its buffer cache.

After the initial read into buffer cache, performance is comparable to other commercial DBMS that we have ported to. As would be expected, the commercial DBMS are also slower to display data that is not already in the buffer cache, but the magnitude of difference for postgres between this type of data read from disk as opposed to read from buffer cache is much greater.

We have vacuumed the table and played around with the database initialization parameters in the postgresql.conf. Neither helped with this problem. Does anyone have any tips on improving the read-from-disk performance of BYTEA data that is typically 2KB or larger?

Mark
Re: [PERFORM] reading large BYTEA type is slower than expected
Mark Harris [EMAIL PROTECTED] writes:
> We have recently ported our application to the postgres database. For the most part performance has not been an issue; however there is one situation that is a problem, and that is the initial read of rows containing BYTEA values that have an average size of 2 kilobytes or greater. For BYTEA values postgres requires as much as 3 seconds to read the values from disk into its buffer cache.

How large is large? (No, I don't believe it takes 3 sec to fetch a single 2Kb value.)

regards, tom lane

---(end of broadcast)--- TIP 6: explain analyze is your friend
[PERFORM] 121+ million record table perf problems
I need some help on recommendations to solve a perf problem.

I've got a table with ~121 million records in it. Select count on it currently takes ~45 minutes, and an update to the table to set a value on one of the columns I finally killed after it ran 17 hours and had still not completed. Queries into the table are butt slow, and

System: SUSE LINUX 10.0 (X86-64)
Postgresql: PostgreSQL 8.2.1
Index type: btree

A select count took ~48 minutes before I made some changes to the postgresql.conf, going from default values to these:

shared_buffers = 24MB
work_mem = 256MB
maintenance_work_mem = 512MB
random_page_cost = 100
stats_start_collector = off
stats_row_level = off

As a test I am trying to do an update on state using the following queries:

update res set state=5001;
select count(resid) from res;

The update query that started this all I had to kill after 17 hours. It should have updated all 121+ million records. That brought my select count down to 19 minutes, but still a far cry from acceptable.

The system has 2GB of RAM (more is already on order), but doesn't seem to show problems in TOP with running away with RAM. If anything, I don't think it's using enough (I only see about 6 processes using 26-27 MB each), and it is running on a single disk (guess I will likely have to at the minimum go to a RAID1).

Workload will primarily be comprised of queries against the indices (thus why so many of them) and updates to a single record from about 10 clients, where that one record will have md5, state, rval, speed, audit, and date columns updated. Those updates don't seem to be a problem, and are generally processed in bulk of 500 to 5000 at a time.

Here is the schema for the table giving me problems:

CREATE TABLE res (
    res_id integer NOT NULL DEFAULT nextval('result_id_seq'::regclass),
    res_client_id integer NOT NULL,
    time real DEFAULT 0,
    error integer DEFAULT 0,
    md5 character(32) DEFAULT 0,
    res_tc_id integer NOT NULL,
    state smallint DEFAULT 0,
    priority smallint,
    rval integer,
    speed real,
    audit real,
    date timestamp with time zone,
    gold_result_id integer,
    CONSTRAINT result_pkey PRIMARY KEY (res_id),
    CONSTRAINT unique_res UNIQUE (res_client_id, res_tc_id)
) WITHOUT OIDS;
ALTER TABLE res OWNER TO postgres;

CREATE INDEX index_audit ON res USING btree (audit);
CREATE INDEX index_event ON res USING btree (error);
CREATE INDEX index_priority ON res USING btree (priority);
CREATE INDEX index_rval ON res USING btree (rval);
CREATE INDEX index_speed ON res USING btree (speed);
CREATE INDEX index_state ON res USING btree (state);
CREATE INDEX index_tc_id ON res USING btree (res_tc_id);
CREATE INDEX index_time ON res USING btree (time);

---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [PERFORM] reading large BYTEA type is slower than expected
Mark,

I am no expert, but this looks like a file system I/O thing. I set hw.ata.wc=1 for a SATA drive and =0 for a SCSI drive in /boot/loader.conf on my FreeBSD systems. That seems to provide some needed tweaking.

Yudhvir
==

On 5/18/07, Mark Harris [EMAIL PROTECTED] wrote:
> We have recently ported our application to the postgres database. For the most part performance has not been an issue; however there is one situation that is a problem, and that is the initial read of rows containing BYTEA values that have an average size of 2 kilobytes or greater. For BYTEA values postgres requires as much as 3 seconds to read the values from disk into its buffer cache.
> After the initial read into buffer cache, performance is comparable to other commercial DBMS that we have ported to. As would be expected, the commercial DBMS are also slower to display data that is not already in the buffer cache, but the magnitude of difference for postgres for this type of data read from disk as opposed to read from buffer cache is much greater.
> We have vacuumed the table and played around with the database initialization parameters in the postgresql.conf. Neither helped with this problem. Does anyone have any tips on improving the read-from-disk performance of BYTEA data that is typically 2KB or larger?
> Mark

-- Yudhvir Singh Sidhu, 408 375 3134 cell
Re: [PERFORM] 121+ million record table perf problems
On Fri, May 18, 2007 at 12:43:40PM -0500, [EMAIL PROTECTED] wrote:
> I've got a table with ~121 million records in it. Select count on it currently takes ~45 minutes, and an update to the table to set a value on one of the columns I finally killed after it ran 17 hours and had still not completed. Queries into the table are butt slow, and

I don't think you've told us anything like enough to get started on solving your problem. But to start with, you know that in Postgres, an unrestricted count(*) on a table always results in reading the entire table, right?

Standard questions: have you performed any vacuum or analyse? Your update statement is also a case where you have to touch every row. Note that, given that you seem to be setting the state field to the same value for everything, an index on there will do you not one jot of good until there's greater selectivity.

How fast is the disk? Is it fast enough to read and touch every one of those rows on the table inside of 17 hours?

Note also that your approach of updating all 121 million records in one statement is approximately the worst way to do this in Postgres, because it creates 121 million dead tuples on your table. (You've created some number of those by killing the query as well.) All of that said, 17 hours seems kinda long.

> As a test I am trying to do an update on state using the following queries:
> update res set state=5001;
> select count(resid) from res;

What is this testing?

> The update query that started this all I had to kill after 17 hours.

Does that suggest that the update you're trying to make work well is _not_ update res set state = 5001?

> each) and is running on a single disk (guess I will likely have to at the minimum go to a RAID1). Workload will primarily be comprised of queries

I bet that single disk is your problem. Iostat is your friend, I'd say.

A
-- Andrew Sullivan | [EMAIL PROTECTED]
Everything that happens in the world happens at some place. --Jane Jacobs

---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
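A common workaround for the dead-tuple problem Andrew describes is to update in ranges with vacuums in between, so dead space can be reused instead of bloating the table. A sketch against the res table (assumes res_id values are reasonably dense; the batch size is arbitrary):

    UPDATE res SET state = 5001 WHERE res_id >= 0        AND res_id < 10000000;
    VACUUM res;  -- makes the dead row versions reusable before the next batch
    UPDATE res SET state = 5001 WHERE res_id >= 10000000 AND res_id < 20000000;
    VACUUM res;
    -- ... repeat until the whole key range is covered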
Re: [PERFORM] 121+ million record table perf problems
[EMAIL PROTECTED] wrote:
> I need some help on recommendations to solve a perf problem. I've got a table with ~121 million records in it. Select count on it currently takes ~45 minutes, and an update to the table to set a value on one of the columns I finally killed after it ran 17 hours and had still not completed. Queries into the table are butt slow, and

Scanning 121 million rows is going to be slow even on 16 disks.

> System: SUSE LINUX 10.0 (X86-64)
> Postgresql: PostgreSQL 8.2.1
> Index type: btree

You really should be running 8.2.4.

> A select count took ~48 minutes before I made some changes to the postgresql.conf, going from default values to these:
> shared_buffers = 24MB

This could be increased.

> work_mem = 256MB
> maintenance_work_mem = 512MB
> random_page_cost = 100
> stats_start_collector = off
> stats_row_level = off
> As a test I am trying to do an update on state using the following queries:
> update res set state=5001;

You are updating 121 million rows; that takes a lot of time considering you are actually (at a very low level) marking 121 million rows dead and inserting 121 million more.

> The update query that started this all I had to kill after 17 hours. It should have updated all 121+ million records. That brought my select count down to 19 minutes, but still a far cry from acceptable.

Not quite sure what you would consider acceptable based on what you are trying to do.

Sincerely,
Joshua D. Drake

> Here is the schema for the table giving me problems:
> CREATE TABLE res (
>     res_id integer NOT NULL DEFAULT nextval('result_id_seq'::regclass),
>     res_client_id integer NOT NULL,
>     time real DEFAULT 0,
>     error integer DEFAULT 0,
>     md5 character(32) DEFAULT 0,
>     res_tc_id integer NOT NULL,
>     state smallint DEFAULT 0,
>     priority smallint,
>     rval integer,
>     speed real,
>     audit real,
>     date timestamp with time zone,
>     gold_result_id integer,
>     CONSTRAINT result_pkey PRIMARY KEY (res_id),
>     CONSTRAINT unique_res UNIQUE (res_client_id, res_tc_id)
> ) WITHOUT OIDS;
> ALTER TABLE res OWNER TO postgres;
> CREATE INDEX index_audit ON res USING btree (audit);
> CREATE INDEX index_event ON res USING btree (error);
> CREATE INDEX index_priority ON res USING btree (priority);
> CREATE INDEX index_rval ON res USING btree (rval);
> CREATE INDEX index_speed ON res USING btree (speed);
> CREATE INDEX index_state ON res USING btree (state);
> CREATE INDEX index_tc_id ON res USING btree (res_tc_id);
> CREATE INDEX index_time ON res USING btree (time);

-- === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/

---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] reading large BYTEA type is slower than expected
Tom,

No, it is not 3 seconds to read a single value. Multiple records are read, approximately 120 records if the raster dataset is created with our application's default configuration. Please read on to understand why, if you need to.

We are a GIS software company and have two basic types of data, raster and vector. Both types of data are stored in a BYTEA. Vector data are representations of geometry stored as a series of vertices to represent points, lines and polygons. This type of data is typically 30 to 200 bytes, but can be very large (consider how many vertices would be required to represent the Pacific Ocean at a detailed resolution). Vector data does not seem to exhibit the cold fetch issue (fetch from disk as opposed to fetch from buffer cache). It is with raster data that we see the problem.

Raster data is image data stored in the database. When we store a georeferenced image in the database we block it up into tiles. The default tile size is 128 by 128 pixels. We compress the data using either: LZ77, JPEG or JPEG2000. Typically the pixel blocks stored in the BYTEA range in size from 800 bytes to 16000 bytes for 8-bit data stored with the default tile size, depending on the type of compression and the variability of the data.

Our application is capable of mosaicking source images together into huge raster datasets that can grow into terabytes. Consider the entire landsat imagery with a resolution of 15 meters mosaicked into one raster dataset. It requires less than a terabyte to store that data.

For practical reasons, as you can imagine, we construct a reduced resolution pyramid on the raster base level, allowing applications to view a reduced resolution level of the raster dataset as the user zooms out, and a higher resolution level as the user zooms in. The pyramid levels are also stored as pixel blocks in the table. Each pyramid level is reduced in resolution by 1/2 in the X and Y dimension. Therefore pyramid level 1 will be 1/4 of pyramid level 0 (the base).

As the application queries the raster blocks table, which stores the raster data tiles, it will request raster tiles that fall within the spatial extent of the window for a particular pyramid level. Therefore the number of records queried from the raster blocks table containing the BYTEA column of pixel data is fairly constant. For the screen resolution of 1680 by 1050 that I am testing with, about 120 records will be fetched from the raster blocks table each time the user pans or zooms.

Mark

-----Original Message-----
From: Tom Lane [mailto:[EMAIL PROTECTED]]
Sent: Friday, May 18, 2007 10:48 AM
To: Mark Harris
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] reading large BYTEA type is slower than expected

Mark Harris [EMAIL PROTECTED] writes:
> We have recently ported our application to the postgres database. For the most part performance has not been an issue; however there is one situation that is a problem and that is the initial read of rows containing BYTEA values that have an average size of 2 kilobytes or greater. For BYTEA values postgres requires as much as 3 seconds to read the values from disk into its buffer cache.

How large is large? (No, I don't believe it takes 3 sec to fetch a single 2Kb value.)

regards, tom lane

---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] 121+ million record table perf problems
[EMAIL PROTECTED] wrote:
> I need some help on recommendations to solve a perf problem. I've got a table with ~121 million records in it. Select count on it currently takes ~45 minutes, and an update to the table to set a value on one of the columns I finally killed after it ran 17 hours and had still not completed. Queries into the table are butt slow, and

This is way too long. I just did a select count(*) on a table of mine that has 48 million rows and it took only 178 seconds. And this is on a serious POS disk subsystem that's giving me about 1/2 the read speed of a single off-the-shelf SATA disk.

As select count(*) has to read the whole table sequentially, the time it takes is linear with the size of the table (once you get large enough that the whole table doesn't get cached in memory). So I'd be surprised if a 121 million record table took more than 500 or so seconds to read, and would expect it to be less.

So my advice: vacuum. I'll bet you've got a whole boatload of dead tuples kicking around. Then analyze. Then consider firing off a reindex and/or cluster against the table. The other thing I'd consider is dropping the money on some more hardware - a few hundred bucks to get a battery backed raid card and half a dozen SATA drives would probably do wonders for your performance.

> shared_buffers = 24MB

Up your shared buffers. This is a mistake I made originally as well - but this is the total number of shared buffers used by the system. I had originally assumed that the number of shared buffers used was this times the number of backends, but it's not. With 2G of memory, I'd start with shared buffers of 512MB, and consider upping it to 768MB or even 1024MB. This will also really help performance.

> stats_start_collector = off
> stats_row_level = off

I think I'd also recommend turning these on.

Brian

---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
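Restated as a postgresql.conf fragment for the 2GB machine (the shared_buffers and stats lines follow Brian's advice above; the work_mem and random_page_cost comments are additional caveats, not from this reply):

    shared_buffers = 512MB        # was 24MB; ~25% of RAM is a common starting point
    work_mem = 16MB               # note: 256MB is applied per sort/hash, per backend
    maintenance_work_mem = 512MB
    random_page_cost = 4          # the default; 100 strongly biases the planner against indexes
    stats_start_collector = on
    stats_row_level = on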
[PERFORM] Slow queries on big table
I have a two column table with over 160 million rows in it. As the size of the table grows, queries on this table get exponentially slower. I am using version 8.1.5 32-bit on Red Hat Enterprise Linux 3. The hardware is an Intel 3 GHz Xeon with 4GB RAM, and 6 disks in a RAID 5 configuration. For current testing, I am running a single database connection with no other applications running on the machine, and the swap is not being used at all.

Here is the table definition:

mdsdb=# \d backup_location
  Table "public.backup_location"
  Column   |  Type   | Modifiers
-----------+---------+-----------
 record_id | bigint  | not null
 backup_id | integer | not null
Indexes:
    "backup_location_pkey" PRIMARY KEY, btree (record_id, backup_id)
    "backup_location_rid" btree (record_id)
Foreign-key constraints:
    "backup_location_bfk" FOREIGN KEY (backup_id) REFERENCES backups(backup_id) ON DELETE CASCADE

Here is the table size:

mdsdb=# select count(*) from backup_location;
   count
-----------
 162101296
(1 row)

And here is a simple query on this table that takes nearly 20 minutes to return fewer than 3000 rows. I ran an analyze immediately before I ran this query:

mdsdb=# explain analyze select record_id from backup_location where backup_id = 1070;
                             QUERY PLAN
-------------------------------------------------------------
 Index Scan using backup_location_pkey on backup_location  (cost=0.00..1475268.53 rows=412394 width=8) (actual time=3318.057..1196723.915 rows=2752 loops=1)
   Index Cond: (backup_id = 1070)
 Total runtime: 1196725.617 ms
(3 rows)

Obviously at this point the application is not usable. If possible we would like to grow this table to the 3-5 billion row range, but I don't know if that is realistic. Any guidance would be greatly appreciated.

Thanks,
Ed

---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] 121+ million record table perf problems
On Friday 18 May 2007 11:51, Joshua D. Drake [EMAIL PROTECTED] wrote:
> > The update query that started this all I had to kill after 17 hours. It should have updated all 121+ million records. That brought my select count down to 19 minutes, but still a far cry from acceptable.

You're going to want to drop all your indexes before trying to update 121 million records. Updates in PostgreSQL are really quite slow, mostly due to all the index updates. Drop indexes, do the updates, create a primary key, cluster the table on that key to free up the dead space, then recreate the rest of the indexes. That's about as fast as you can get that process.

Of course, doing anything big on one disk is also going to be slow, no matter what you do. I don't think a table scan should take 19 minutes, though, not for 121 million records. You should be able to get at least 60-70MB/sec out of anything modern. I can only assume your disk is thrashing doing something else at the same time as the select.

-- We can no more blame our loss of freedom on Congressmen than we can prostitution on pimps. Both simply provide broker services for their customers. -- Dr. Walter Williams

---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
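In SQL, that recipe looks roughly like this (a sketch using the index and constraint names from the original post; untested against that schema):

    ALTER TABLE res DROP CONSTRAINT unique_res;   -- this constraint is backed by an index too
    ALTER TABLE res DROP CONSTRAINT result_pkey;
    DROP INDEX index_audit;                       -- ... and likewise the other seven indexes
    UPDATE res SET state = 5001;
    ALTER TABLE res ADD CONSTRAINT result_pkey PRIMARY KEY (res_id);
    CLUSTER result_pkey ON res;                   -- 8.2 syntax; rewrites the table, freeing dead space
    ALTER TABLE res ADD CONSTRAINT unique_res UNIQUE (res_client_id, res_tc_id);
    CREATE INDEX index_state ON res USING btree (state);  -- then recreate the rest
    ANALYZE res;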
Re: [PERFORM] Background vacuum
Tom Lane wrote:
> Ron Mayer [EMAIL PROTECTED] writes:
>> Greg Smith wrote:
>>> Count me on the side that agrees adjusting the vacuuming parameters is the more straightforward way to cope with this problem.
>> Agreed for vacuum; but it still seems interesting to me that across databases and workloads high priority transactions tended to get through faster than low priority ones. Is there any reason to believe that the drawbacks of priority inversion outweigh the benefits of setting priorities?
> Well, it's unclear, and anecdotal evidence is unlikely to convince anybody. I had put some stock in the CMU paper, but if it's based on PG 7.3 then you've got to **seriously** question its relevance to the current code.

I was thinking the paper's results might apply more generally to RDBMS-like applications, since they did test 3 of them with different locking behavior and different bottlenecks. But true, I should stop bringing up 7.3 examples.

Anecdotally ;-) I've found renice-ing reports to help; especially in the (probably not too uncommon) case where slow running batch reporting queries hit different tables than interactive reporting queries. I guess that's why I keep defending priorities as a useful technique. It seems even more useful considering the existence of schedulers that have priority inheritance features.

I'll admit there's still the theoretical possibility that it's a foot-gun, so I don't mind people having to write their own stored procedure to enable it - but I'd be surprised if anyone could find a real world case where priorities would do more harm than good. Though, yeah, it'd be easy to construct an artificial case that'd demonstrate priority inversion (i.e. have a low priority process that takes a lock and sits and spins on some CPU-intensive stored procedure without doing any I/O).

---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
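For the record, the renice-ing mentioned above can be done per backend, since each PostgreSQL connection is its own process; a sketch (the PID is hypothetical, and you need shell access on the server):

    -- in the low-priority reporting session:
    SELECT pg_backend_pid();    -- suppose it returns 12345

then, from a shell on the database server:

    renice 10 -p 12345

This only lowers CPU scheduling priority; it does nothing for I/O or lock contention, which is one reason results with it vary.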
Re: [PERFORM] Slow queries on big table
Tyrrill, Ed wrote:
> I have a two column table with over 160 million rows in it. As the size of the table grows, queries on this table get exponentially slower. I am using version 8.1.5 32-bit on Red Hat Enterprise Linux 3. The hardware is an Intel 3 GHz Xeon with 4GB RAM, and 6 disks in a RAID 5 configuration. For current testing, I am running a single database connection with no other applications running on the machine, and the swap is not being used at all.
>
> Here is the table definition:
>
> mdsdb=# \d backup_location
>   Table "public.backup_location"
>   Column   |  Type   | Modifiers
> -----------+---------+-----------
>  record_id | bigint  | not null
>  backup_id | integer | not null
> Indexes:
>     "backup_location_pkey" PRIMARY KEY, btree (record_id, backup_id)
>     "backup_location_rid" btree (record_id)
> Foreign-key constraints:
>     "backup_location_bfk" FOREIGN KEY (backup_id) REFERENCES backups(backup_id) ON DELETE CASCADE
>
> Here is the table size:
>
> mdsdb=# select count(*) from backup_location;
>    count
> -----------
>  162101296
> (1 row)
>
> And here is a simple query on this table that takes nearly 20 minutes to return fewer than 3000 rows. I ran an analyze immediately before I ran this query:
>
> mdsdb=# explain analyze select record_id from backup_location where backup_id = 1070;
>  Index Scan using backup_location_pkey on backup_location  (cost=0.00..1475268.53 rows=412394 width=8) (actual time=3318.057..1196723.915 rows=2752 loops=1)
>    Index Cond: (backup_id = 1070)
>  Total runtime: 1196725.617 ms
> (3 rows)

I've got a few points. Firstly, is your data amenable to partitioning? If so that might be a big winner.

Secondly, it might be more efficient for the planner to choose the backup_location_rid index than the combination primary key index. You can test this theory with this cool pg trick:

begin;
alter table backup_location drop constraint backup_location_pkey;
explain analyze select ...;
rollback;

to see if it's faster.

> Obviously at this point the application is not usable. If possible we would like to grow this table to the 3-5 billion row range, but I don't know if that is realistic. Any guidance would be greatly appreciated.

Without knowing more about your usage patterns, it's hard to say. But partitioning seems like your best choice at the moment.

---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
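For reference, partitioning on 8.1/8.2 means table inheritance plus constraint exclusion; a minimal sketch for this table (the range boundaries are invented for illustration):

    CREATE TABLE backup_location_p1 (
        CHECK (backup_id >= 0 AND backup_id < 1000)
    ) INHERITS (backup_location);
    CREATE INDEX backup_location_p1_bid ON backup_location_p1 (backup_id);
    -- one child per backup_id range; inserts must be routed to the right child
    SET constraint_exclusion = on;  -- lets the planner skip children that cannot match

A query like WHERE backup_id = 1070 would then touch only the child table whose CHECK constraint covers that value.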
Re: [PERFORM] reading large BYTEA type is slower than expected
Tom, Actually the 120 records I quoted is a mistake. Since it is a three band image the number of records should be 360 records or 120 records for each band. Mark -Original Message- From: Tom Lane [mailto:[EMAIL PROTECTED] Sent: Friday, May 18, 2007 10:48 AM To: Mark Harris Cc: pgsql-performance@postgresql.org Subject: Re: [PERFORM] reading large BYTEA type is slower than expected Mark Harris [EMAIL PROTECTED] writes: We have recently ported our application to the postgres database. For the most part performance has not been an issue; however there is one situation that is a problem and that is the initial read of rows containing BYTEA values that have an average size of 2 kilobytes or greater. For BYTEA values postgres requires as much 3 seconds to read the values from disk into its buffer cache. How large is large? (No, I don't believe it takes 3 sec to fetch a single 2Kb value.) regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] Slow queries on big table
Tyrrill, Ed [EMAIL PROTECTED] writes:
>  Index Scan using backup_location_pkey on backup_location  (cost=0.00..1475268.53 rows=412394 width=8) (actual time=3318.057..1196723.915 rows=2752 loops=1)
>    Index Cond: (backup_id = 1070)
>  Total runtime: 1196725.617 ms

If we take that at face value it says the indexscan is requiring 434 msec per actual row fetched. Which is just not very credible; the worst case should be about 1 disk seek per row fetched. So there's something going on that doesn't meet the eye.

What I'm wondering about is whether the table is heavily updated and seldom vacuumed, leading to lots and lots of dead tuples being fetched and then rejected (hence they'd not show in the actual-rows count).

The other thing that seems pretty odd is that it's not using a bitmap scan --- for such a large estimated rowcount I'd have expected a bitmap scan not a plain indexscan. What do you get from EXPLAIN ANALYZE if you force a bitmap scan? (Set enable_indexscan off, and enable_seqscan too if you have to.)

regards, tom lane

---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
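Concretely, forcing the bitmap plan for comparison would look like this (session-local settings; nothing persists past the connection):

    SET enable_indexscan = off;
    -- and, if it falls back to a plain seqscan instead:
    SET enable_seqscan = off;
    EXPLAIN ANALYZE SELECT record_id FROM backup_location WHERE backup_id = 1070;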
Re: [PERFORM] Slow queries on big table
Tyrrill, Ed wrote:
> mdsdb=# \d backup_location
>   Table "public.backup_location"
>   Column   |  Type   | Modifiers
> -----------+---------+-----------
>  record_id | bigint  | not null
>  backup_id | integer | not null
> Indexes:
>     "backup_location_pkey" PRIMARY KEY, btree (record_id, backup_id)
>     "backup_location_rid" btree (record_id)
> Foreign-key constraints:
>     "backup_location_bfk" FOREIGN KEY (backup_id) REFERENCES backups(backup_id) ON DELETE CASCADE

[snip]

> mdsdb=# explain analyze select record_id from backup_location where backup_id = 1070;
>  Index Scan using backup_location_pkey on backup_location  (cost=0.00..1475268.53 rows=412394 width=8) (actual time=3318.057..1196723.915 rows=2752 loops=1)
>    Index Cond: (backup_id = 1070)
>  Total runtime: 1196725.617 ms
> (3 rows)

The backup_location_rid index on your table is not necessary. The primary key index on (record_id, backup_id) can be used by Postgres, even if the query is only constrained by record_id. See http://www.postgresql.org/docs/8.2/interactive/indexes-multicolumn.html for details.

The explain plan indicates that your query is filtered on backup_id, but is using the primary key index on (record_id, backup_id). Based on the table definition, you do not have any good index for filtering on backup_id. The explain plan also seems way off, as I would expect a sequential scan to be used without a good index for backup_id. Did you disable sequential scans before running this query? Have you altered any other configuration or planner parameters?

As your backup_location_rid is not necessary, I would recommend dropping that index and creating a new one on just backup_id. This should be a net wash on space, and the new index should make for a straight index scan for the query you presented. Don't forget to analyze after changing the indexes.

Hope this helps.
Andrew

---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
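In DDL terms, Andrew's suggestion amounts to the following (the new index name is arbitrary, though it matches the one that shows up later in the thread):

    DROP INDEX backup_location_rid;
    CREATE INDEX backup_location_bid ON backup_location (backup_id);
    ANALYZE backup_location;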
Re: [PERFORM] Slow queries on big table
Scott Marlowe [EMAIL PROTECTED] writes: Secondly, it might be more efficient for the planner to choose the backup_location_rid index than the combination primary key index. Oh, I'm an idiot; I didn't notice the way the index was set up. Yeah, that index pretty well sucks for a query on backup_id --- it has to scan the entire index, since there's no constraint on the leading column. So that's where the time is going. This combination of indexes: Indexes: backup_location_pkey PRIMARY KEY, btree (record_id, backup_id) backup_location_rid btree (record_id) is really just silly. You should have the pkey and then an index on backup_id alone. See the discussion of multiple indexes in the fine manual: http://www.postgresql.org/docs/8.2/static/indexes-multicolumn.html http://www.postgresql.org/docs/8.2/static/indexes-bitmap-scans.html regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] choosing fillfactor
Gene Hart wrote:
> I've tried searching the documentation to answer this question but could not find anything. When trying to choose the optimal fillfactor for an index, which is important: the number of times the row is updated, or the number of times the indexed column is updated? In my case each row is updated on average about 5 times, but some of the columns with indexes never change after insertion. thanks for any advice

It's the number of times the row is updated, regardless of which columns are changed.

-- Heikki Linnakangas, EnterpriseDB http://www.enterprisedb.com

---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
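For example, on 8.2 the two fillfactors are set like this (hypothetical table and values, just to show the syntax):

    -- index fillfactor, chosen at creation time:
    CREATE INDEX t_col_idx ON t USING btree (col) WITH (fillfactor = 70);
    -- the heap has its own, separate fillfactor, set on the table:
    ALTER TABLE t SET (fillfactor = 70);

A lower fillfactor leaves free space on each page for the new row versions that updates create, at the cost of a larger index or table.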
Re: [PERFORM] Slow queries on big table
Tom Lane [EMAIL PROTECTED] writes:
> Scott Marlowe [EMAIL PROTECTED] writes:
>> Secondly, it might be more efficient for the planner to choose the backup_location_rid index than the combination primary key index.
> Oh, I'm an idiot; I didn't notice the way the index was set up. Yeah, that index pretty well sucks for a query on backup_id --- it has to scan the entire index, since there's no constraint on the leading column. So that's where the time is going. This combination of indexes:
>   Indexes:
>       "backup_location_pkey" PRIMARY KEY, btree (record_id, backup_id)
>       "backup_location_rid" btree (record_id)
> is really just silly. You should have the pkey and then an index on backup_id alone. See the discussion of multiple indexes in the fine manual:
> http://www.postgresql.org/docs/8.2/static/indexes-multicolumn.html
> http://www.postgresql.org/docs/8.2/static/indexes-bitmap-scans.html
> regards, tom lane

Thanks for the help guys! That was my problem. I actually need the backup_location_rid index for a different query, so I am going to keep it. Here is the result with the new index:

mdsdb=# explain analyze select record_id from backup_location where backup_id = 1070;
                             QUERY PLAN
-------------------------------------------------------------
 Index Scan using backup_location_bid on backup_location  (cost=0.00..9573.07 rows=415897 width=8) (actual time=0.106..3.486 rows=2752 loops=1)
   Index Cond: (backup_id = 1070)
 Total runtime: 4.951 ms
(3 rows)

---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] Slow queries on big table
Tyrrill, Ed [EMAIL PROTECTED] writes: Tom Lane [EMAIL PROTECTED] writes: This combination of indexes:

Indexes:
    "backup_location_pkey" PRIMARY KEY, btree (record_id, backup_id)
    "backup_location_rid" btree (record_id)

is really just silly. You should have the pkey and then an index on backup_id alone.

Thanks for the help guys! That was my problem. I actually need the backup_location_rid index for a different query so I am going to keep it.

Well, you don't really *need* it; the two-column index on (record_id, backup_id) will serve perfectly well for queries on its leading column alone. It'll be physically bigger and hence slightly slower to scan than a single-column index; but unless the table is almost completely read-only, the update overhead of maintaining all three indexes is probably going to cost more than you can save with it. Try that other query with and without backup_location_rid and see how much you're really saving.

 Index Scan using backup_location_bid on backup_location (cost=0.00..9573.07 rows=415897 width=8) (actual time=0.106..3.486 rows=2752 loops=1)
   Index Cond: (backup_id = 1070)
 Total runtime: 4.951 ms

That's more like it ;-)

regards, tom lane
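One hedged way to run that with-and-without comparison without committing to anything, relying on PostgreSQL's transactional DDL (the SELECT is a placeholder standing in for the other query):

    BEGIN;
    DROP INDEX backup_location_rid;   -- visible only inside this transaction;
                                      -- note: this holds an exclusive lock on the table until ROLLBACK
    EXPLAIN ANALYZE
        SELECT backup_id FROM backup_location WHERE record_id = 12345;   -- placeholder query
    ROLLBACK;                         -- the index comes back untouched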
Re: [PERFORM] CPU Intensive query
What Postgres version is this?

8.1.3

set enable_nestloop = off; What's the rationale for this?

To eliminate the nested loop. It does a nested loop between two very large tables (millions of rows).

HashAggregate (cost=152555.97..152567.32 rows=267 width=162) 152000 disk page fetches is a bit much, but it shouldn't take 24 hours. There's probably misestimation involved at some point here. Does it really return 267 rows, or many more?

It finally returns about 19-20 thousand rows.

Steinar H. Gunderson [EMAIL PROTECTED] wrote: On Fri, May 18, 2007 at 09:02:52AM -0700, Abu Mushayeed wrote: I have an interesting problem. I have the following query that ran ok on Monday and Tuesday, and it has been running ok since I have been at this job. I have seen it be IO intensive, but since Wednesday it has become CPU intensive. Database-wise, fresh data has been put into the tables and they were vacuumed and analyzed; no other parameter has been modified.

What Postgres version is this?

The query is as follows and its explain plan is also attached:

Normally EXPLAIN ANALYZE data would be much better than EXPLAIN, but if the query indeed does not finish, it's not going to help much.

set enable_nestloop = off;

What's the rationale for this?

HashAggregate (cost=152555.97..152567.32 rows=267 width=162)

152000 disk page fetches is a bit much, but it shouldn't take 24 hours. There's probably misestimation involved at some point here. Does it really return 267 rows, or many more?

/* Steinar */ -- Homepage: http://www.sesse.net/
Re: [PERFORM] CPU Intensive query
Abu Mushayeed [EMAIL PROTECTED] writes: The query is as follows and its explain plan is also attached:

The query itself seems to be a simple join over not too many rows, so I don't see how it could be taking 24 hours. What I suspect is you're incurring lots and lots of invocations of those user-written functions and one of them has suddenly decided to get very slow.

regards, tom lane
Re: [PERFORM] CPU Intensive query
On Fri, May 18, 2007 at 02:37:27PM -0700, Abu Mushayeed wrote: set enable_nestloop = off; What's the rationale for this? To eliminate the nested loop. It does a nested loop between two very large tables (millions of rows).

If the planner chooses a nested loop, it is because it believes it is the most efficient solution. I'd turn it back on and try to figure out why the planner was wrong. Note that a nested loop with an index scan on one or both sides can easily be as efficient as anything. Did you ANALYZE your tables recently? If the joins are really between millions of rows and the planner thinks it's a couple of thousand, the stats sound rather off...

HashAggregate (cost=152555.97..152567.32 rows=267 width=162) 152000 disk page fetches is a bit much, but it shouldn't take 24 hours. There's probably misestimation involved at some point here. Does it really return 267 rows, or many more? It finally returns about 19-20 thousand rows.

So the planner is off by a factor of at least a hundred. That's a good first-level explanation for why it's slow, at least... If you can, please provide EXPLAIN ANALYZE output for your query (after running ANALYZE on all your tables, if you haven't already); even though it will take some time, it usually makes this kind of performance debugging much easier.

/* Steinar */ -- Homepage: http://www.sesse.net/
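A minimal sketch of the diagnostic sequence Steinar asks for (the table and column names in the SELECT are hypothetical stand-ins for the slow query):

    ANALYZE;                          -- refresh planner statistics for every table in the database
    SET enable_nestloop = on;         -- undo the earlier override, at least for this session
    EXPLAIN ANALYZE
        SELECT a.cust_id, count(*)    -- stand-in for the real 24-hour query
        FROM big_table_a a
        JOIN big_table_b b USING (cust_id)
        GROUP BY a.cust_id;
    -- Compare the estimated "rows=" with the actual rows at each plan node.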
Re: [PERFORM] CPU Intensive query
On Sat, May 19, 2007 at 12:32:33AM +0200, Steinar H. Gunderson wrote: Did you ANALYZE your tables recently? If the joins are really between millions of rows and the planner thinks it's a couple of thousand, the stats sound rather off...

Sorry, I forgot your first e-mail where you said you had both vacuumed and analyzed recently. The estimates are still off, though -- the WHERE clause might be difficult to estimate properly. (I'm not sure how Tom arrived at his conclusion of expensive user-defined functions, but given the usual precision of his guesses, I'd check that too...)

/* Steinar */ -- Homepage: http://www.sesse.net/
Re: [PERFORM] 121+ million record table perf problems
I've got a table with ~121 million records in it. Select count on it currently takes ~45 minutes, and an update to the table to set a value on one of the columns I finally killed after it ran 17 hours and had still not completed. Queries into the table are butt slow. The update query that started this all I had to kill after 17 hours. It should have updated all 121+ million records. That brought my select count down to 19 minutes, but still a far cry from acceptable.

If you have a column that needs to be updated often for all rows, separate it into a different table, and create a view that joins it back to the main table so that your application still sees the old schema (see the sketch below). This will greatly speed up your updates, since in Postgres an update is the same as a delete+insert: by updating that one column, you're rewriting your entire 121 million rows. If you separate it, you're only rewriting that one column. Don't forget to vacuum/analyze and reindex when you're done.

Better yet, if you can stand a short down time, you can drop the indexes on that column, truncate, then do 121 million inserts, and finally reindex. That will be MUCH faster.

Craig
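A hedged sketch of that split-column idea. The table name res and the value 5001 come from this thread, but the key column id is a placeholder and the design assumes res itself no longer carries the state column:

    -- Move the hot column out of the wide table:
    CREATE TABLE res_state (
        id    bigint PRIMARY KEY REFERENCES res (id),
        state integer NOT NULL
    );

    -- The view restores the old shape for the application:
    CREATE VIEW res_view AS
        SELECT r.*, s.state
        FROM res r
        JOIN res_state s USING (id);

    -- A whole-table update now rewrites only the narrow table:
    UPDATE res_state SET state = 5001;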
Re: [PERFORM] Slow queries on big table
Tom Lane [EMAIL PROTECTED] writes: Thanks for the help guys! That was my problem. I actually need the backup_location_rid index for a different query so I am going to keep it. Well, you don't really *need* it; the two-column index on (record_id, backup_id) will serve perfectly well for queries on its leading column alone. It'll be physically bigger and hence slightly slower to scan than a single-column index; but unless the table is almost completely read-only, the update overhead of maintaining all three indexes is probably going to cost more than you can save with it. Try that other query with and without backup_location_rid and see how much you're really saving.

Well, the query that got me to add backup_location_rid took 105 minutes using only the primary key index. After I added backup_location_rid the query was down to about 45 minutes. Still not very good, and I am still fiddling around with it. The query is:

mdsdb=# explain analyze select backupobjects.record_id from backupobjects left outer join backup_location using(record_id) where backup_id is null;
                                  QUERY PLAN
-------------------------------------------------------------------------------
 Merge Left Join (cost=0.00..21408455.06 rows=11790970 width=8) (actual time=2784967.410..2784967.410 rows=0 loops=1)
   Merge Cond: (outer.record_id = inner.record_id)
   Filter: (inner.backup_id IS NULL)
   ->  Index Scan using backupobjects_pkey on backupobjects (cost=0.00..443484.31 rows=11790970 width=8) (actual time=0.073..47865.957 rows=11805996 loops=1)
   ->  Index Scan using backup_location_rid on backup_location (cost=0.00..20411495.21 rows=162435366 width=12) (actual time=0.110..2608485.437 rows=162426837 loops=1)
 Total runtime: 2784991.612 ms
(6 rows)

It is of course the same backup_location, but backupobjects is:

mdsdb=# \d backupobjects
          Table "public.backupobjects"
     Column     |            Type             | Modifiers
----------------+-----------------------------+-----------
 record_id      | bigint                      | not null
 dir_record_id  | integer                     |
 name           | text                        |
 extension      | character varying(64)       |
 hash           | character(40)               |
 mtime          | timestamp without time zone |
 size           | bigint                      |
 user_id        | integer                     |
 group_id       | integer                     |
 meta_data_hash | character(40)               |
Indexes:
    "backupobjects_pkey" PRIMARY KEY, btree (record_id)
    "backupobjects_meta_data_hash_key" UNIQUE, btree (meta_data_hash)
    "backupobjects_extension" btree (extension)
    "backupobjects_hash" btree (hash)
    "backupobjects_mtime" btree (mtime)
    "backupobjects_size" btree (size)

record_id in backupobjects has a many-to-many relationship to record_id in backup_location.

Ed
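For comparison, a hedged alternative phrasing of the same orphan-row query, equivalent here only because backup_location.backup_id is declared NOT NULL; on planners of this era NOT EXISTS is not necessarily handled any better than the outer join, so treat it as something to benchmark, not a fix:

    SELECT b.record_id
    FROM backupobjects b
    WHERE NOT EXISTS (
        SELECT 1
        FROM backup_location l
        WHERE l.record_id = b.record_id
    );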
Re: [PERFORM] 121+ million record table perf problems
Craig James wrote: Better yet, if you can stand a short down time, you can drop indexes on that column, truncate, then do 121 million inserts, and finally reindex. That will be MUCH faster.

Or you can do a CLUSTER, which does all the same things automatically.

-- Alvaro Herrera  http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
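A hedged one-liner for that suggestion, in the CLUSTER syntax of the 8.x releases under discussion (the index name res_pkey is an assumption; any btree index on the table would do):

    CLUSTER res_pkey ON res;   -- rewrites the table in index order and rebuilds all its indexes
    ANALYZE res;               -- CLUSTER does not refresh statistics, so analyze afterwards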
Re: [PERFORM] 121+ million record table perf problems
On Fri, 18 May 2007, [EMAIL PROTECTED] wrote:

shared_buffers = 24MB
work_mem = 256MB
maintenance_work_mem = 512MB

You should take a minute to follow the suggestions at http://www.westnet.com/~gsmith/content/postgresql/pg-5minute.htm and set dramatically higher values for shared_buffers and effective_cache_size for your server. Also, your work_mem figure may be OK for now, but if you ever do have 10 people connect to this database at once and run big queries you could have an issue with it set that high--that's a per-client setting.

After you're done with that, you should also follow the suggestions there to do a VACUUM ANALYZE. That may knock out two other potential issues at once. It will take a while to run, but I think you need it badly to sort out what you've already done.

random_page_cost = 100

I'm not sure what logic prompted this change, but after you correct the above you should return this to its default; if this is helping now it's only because other things are so far off from where they should be.

update res set state=5001; The update query that started this all I had to kill after 17 hours. It should have updated all 121+ million records. That brought my select count down to 19 minutes, but still a far cry from acceptable.

You should work on the select side of this first. If that isn't running in a moderate amount of time, trying to get the much more difficult update to happen quickly is hopeless. Once the select is under control, there are a lot of parameters to adjust that will affect the speed of the updates. The first thing to do is dramatically increase checkpoint_segments; I would set that to at least 30 in your situation.

Also: going to RAID-1 won't make a bit of difference to your update speed; it could even make it worse. Adding more RAM may not help much either. If you don't have one already, the real key to improving performance in a heavy update situation is to get a better disk controller with a cache that helps accelerate writes. Then the next step is to stripe this data across multiple disks in a RAID-0 configuration to split the I/O up.

You have a lot of work ahead of you. Even after you resolve the gross issues here, you have a table that has around 10 indexes on it. Maintaining those is far from free; every time you update a single record in that table, the system has to update each of those indexes on top of the record update itself. So you're really asking your system to do around 1.2 billion disk-related operations when you throw out your simple batch update against every row, and good luck getting that to run in a time frame that's less than days long.

The right way to get a feel for what's going on is to drop all the indexes except for the constraints and see how the bulk update runs after the parameter changes suggested above are in place and the database has been cleaned up with vacuum+analyze. Once you have a feel for that, add some indexes back in and see how it degrades. Then you'll know how adding each one of them impacts your performance. I suspect you're going to have to redesign your indexing scheme before this is over. I don't think your current design is ever going to work the way you expect it to.

-- * Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD
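A hedged postgresql.conf sketch of the direction Greg describes; only checkpoint_segments = 30 comes from the post, and the other values are illustrative placeholders whose right settings depend entirely on the machine's RAM and workload:

    shared_buffers = 512MB        # illustrative: "dramatically higher" than the 24MB shown
    effective_cache_size = 2GB    # illustrative: roughly the size of the OS file cache
    work_mem = 32MB               # illustrative: remember this is allocated per client sort
    checkpoint_segments = 30      # from the post: at least 30 for heavy update work
    random_page_cost = 4          # back to the default, as suggested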
Re: [PERFORM] Background vacuum
On Fri, 18 May 2007, Ron Mayer wrote: Anecdotally ;-) I've found renice-ing reports to help

Let's break this down into individual parts:

1) Is there enough CPU-intensive activity in some database tasks that they can be usefully controlled by tools like nice? Sure.
2) Is it so likely that you'll fall victim to a priority inversion problem that you shouldn't ever consider that technique? No.
3) Does the I/O scheduler in modern OSes deal with a lot more things than just the CPU? You bet.
4) Is vacuuming a challenging I/O demand? Quite.

Add all this up, and the fact that you're satisfied with how nice has worked successfully for you doesn't have to conflict with an opinion that it's not the best approach for controlling vacuuming. I just wouldn't extrapolate your experience too far here.

-- * Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD
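For reference, a hedged sketch of PostgreSQL's cost-based vacuum throttling -- the usual built-in way to limit vacuum's I/O impact, which point 4 above gestures at. The values shown are illustrative; these parameters exist as of 8.0:

    vacuum_cost_delay = 20    # milliseconds to sleep once the cost budget is spent (0 disables throttling)
    vacuum_cost_limit = 200   # accumulated page-cost budget allowed before each sleep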