Re: [PERFORM] performance drop on 8.2.4, reverting to 8.1.4

2007-05-18 Thread Liviu Ionescu
 It would probably be useful to see EXPLAIN ANALYZE of your 
 queries, not just the EXPLAIN.

it took 245 seconds to complete, see below.

 It looks like the planner thinks this is going to be really 
 cheap -- so it's misestimating something somewhere. Have you 
 ANALYZEd recently?

yes, but to be sure I did it again before issuing the request; no 
improvements...

regards,

Liviu

Nested Loop Left Join  (cost=32.03..2026.70 rows=1 width=125) (actual 
time=16.686..244822.521 rows=2026 loops=1)
  Join Filter: (n.nodeid = public.nodeattributes.nodeid)
  -  Nested Loop Left Join  (cost=26.55..1420.57 rows=1 width=115) (actual 
time=13.833..176136.527 rows=2026 loops=1)
Join Filter: (n.nodeid = public.nodeattributes.nodeid)
-  Nested Loop Left Join  (cost=21.06..810.90 rows=1 width=105) 
(actual time=10.336..95476.175 rows=2026 loops=1)
  Join Filter: (n.nodeid = public.nodeattributes.nodeid)
  -  Nested Loop Left Join  (cost=15.55..194.15 rows=1 width=95) 
(actual time=6.514..11524.892 rows=2026 loops=1)
Join Filter: (n.nodeid = rtunodes.nodeid)
-  Nested Loop Left Join  (cost=11.17..107.94 rows=1 
width=82) (actual time=0.661..71.751 rows=2026 loops=1)
  Filter: (templatenodes.nodeid IS NULL)
  -  Hash Left Join  (cost=11.17..99.66 rows=1 
width=82) (actual time=0.643..36.053 rows=2206 loops=1)
Hash Cond: (n.nodeid = templates.nodeid)
Filter: (templates.nodeid IS NULL)
-  Hash Left Join  (cost=8.73..88.06 
rows=2270 width=82) (actual time=0.502..27.756 rows=2270 loops=1)
  Hash Cond: (n.nodeid = rtus.nodeid)
  -  Hash Left Join  (cost=4.34..74.11 
rows=2270 width=74) (actual time=0.286..20.179 rows=2270 loops=1)
Hash Cond: (n.nodeid = 
areas.nodeid)
-  Hash Left Join  
(cost=1.43..61.83 rows=2270 width=66) (actual time=0.114..13.062 rows=2270 
loops=1)
  Hash Cond: (n.nodeid = 
realms.nodeid)
  -  Seq Scan on nodes n  
(cost=0.00..51.70 rows=2270 width=49) (actual time=0.016..4.089 rows=2270 
loops=1)
  -  Hash  (cost=1.19..1.19 
rows=19 width=17) (actual time=0.056..0.056 rows=19 loops=1)
-  Seq Scan on realms 
 (cost=0.00..1.19 rows=19 width=17) (actual time=0.006..0.023 rows=19 loops=1)
-  Hash  (cost=1.85..1.85 rows=85 
width=8) (actual time=0.156..0.156 rows=85 loops=1)
  -  Seq Scan on areas  
(cost=0.00..1.85 rows=85 width=8) (actual time=0.007..0.070 rows=85 loops=1)
  -  Hash  (cost=3.06..3.06 rows=106 
width=8) (actual time=0.200..0.200 rows=106 loops=1)
-  Seq Scan on rtus  
(cost=0.00..3.06 rows=106 width=8) (actual time=0.010..0.105 rows=106 loops=1)
-  Hash  (cost=1.64..1.64 rows=64 width=4) 
(actual time=0.119..0.119 rows=64 loops=1)
  -  Seq Scan on templates  
(cost=0.00..1.64 rows=64 width=4) (actual time=0.006..0.059 rows=64 loops=1)
  -  Index Scan using nodeid_pkey on templatenodes  
(cost=0.00..8.27 rows=1 width=4) (actual time=0.009..0.009 rows=0 loops=2206)
Index Cond: (n.nodeid = templatenodes.nodeid)
-  Hash Join  (cost=4.38..63.51 rows=1816 width=13) 
(actual time=0.012..4.417 rows=1816 loops=2026)
  Hash Cond: (rtunodes.rtuid = r.nodeid)
  -  Seq Scan on rtunodes  (cost=0.00..34.16 
rows=1816 width=9) (actual time=0.009..1.290 rows=1816 loops=2026)
  -  Hash  (cost=3.06..3.06 rows=106 width=8) (actual 
time=0.194..0.194 rows=106 loops=1)
-  Seq Scan on rtus r  (cost=0.00..3.06 
rows=106 width=8) (actual time=0.005..0.091 rows=106 loops=1)
  -  Hash Join  (cost=5.51..611.90 rows=388 width=14) (actual 
time=0.033..39.896 rows=2079 loops=2026)
Hash Cond: (a.typeid = t.typeid)
-  Hash Join  (cost=4.34..604.41 rows=647 width=18) 
(actual time=0.031..36.513 rows=2079 loops=2026)
  Hash Cond: (public.nodeattributes.attributeid = 
a.attributeid)
  -  Seq Scan on nodeattributes  (cost=0.00..505.35 
rows=23535 width=18) (actual time=0.008..16.826 rows=23535 loops=2026)
  -  Hash  (cost=4.28..4.28 rows=5 width=8) (actual 
time=0.077..0.077 rows=5 loops=1)

Re: [PERFORM] performance drop on 8.2.4, reverting to 8.1.4

2007-05-18 Thread Steinar H. Gunderson
On Fri, May 18, 2007 at 12:02:44PM +0300, Liviu Ionescu wrote:
 the 8.2.4 plan with join_collapse_limit = 1 (with default it was worse, full 
 of nested loops)

It would probably be useful to see EXPLAIN ANALYZE of your queries, not just the
EXPLAIN.

 Nested Loop Left Join  (cost=32.01..2012.31 rows=1 width=230)

It looks like the planner thinks this is going to be really cheap -- so it's
misestimating something somewhere. Have you ANALYZEd recently?

/* Steinar */
-- 
Homepage: http://www.sesse.net/

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] Ever Increasing IOWAIT

2007-05-18 Thread Richard Huxton

Ralph Mason wrote:

We have a database running on a 4 processor machine.  As time goes by the IO
gets worse and worse, peaking at about 200% as the machine loads up.

The weird thing is that if we restart postgres it’s fine for hours but over
time it goes bad again.

(CPU usage graph here:
http://www.flickr.com/photos/[EMAIL PROTECTED]/502596262/ )  You can clearly
see where the restart happens in the IO area


I'm assuming here we're talking about that big block of iowait at about 
4-6am?


I take it vmstat/iostat show a corresponding increase in disk activity 
at that time.


The question is - what?
Does the number of PG processes increase at that time? If that's not 
intentional then you might need to see what your applications are up to.


Do you have a vacuum/backup scheduled for that time? Do you have some 
other process doing a lot of file I/O at that time?
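
A quick way to check the backend-count question from within the database (a 
sketch only; it assumes the stats collector is running, and current_query needs 
stats_command_string = on on an 8.1-era setup):

SELECT datname, count(*) AS backends
FROM pg_stat_activity
GROUP BY datname;

SELECT procpid, usename, query_start, current_query
FROM pg_stat_activity
ORDER BY query_start
LIMIT 10;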



This is Postgres  8.1.4 64bit.


You'll want to upgrade to the latest patch release - you're missing 5 
lots of bug-fixes there.


--
  Richard Huxton
  Archonet Ltd

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


[PERFORM] performance drop on 8.2.4, reverting to 8.1.4

2007-05-18 Thread Liviu Ionescu
I recently tried to upgrade to 8.2.4, but major queries I wrote for 8.1.4 are 
now planned differently on 8.2.4 and are no longer usable. What 8.1.4 planned 
as a series of 'hash left join's and took about 2 seconds is now planned as 
'nested loop left joins' and takes forever.

Other requests were also affected, increasing the time from milliseconds to 
hundreds of milliseconds, even seconds.

The worst performance hit was on the following query. I know it is a bit 
extreme, but it worked perfectly on 8.1.4.

Regards,

Liviu


SELECT n.nodeid, 
CASE
WHEN n.parentnodeid IS NULL THEN -1
ELSE n.parentnodeid
END AS parentnodeid, n.nodename, av.value AS iconname, 
avt.value AS templatename, avs.value AS subclass, n.globalnodeid, 
n.isaddupi, 
CASE
WHEN realms.nodeid IS NOT NULL THEN 'SERVER'::text
WHEN areas.nodeid IS NOT NULL THEN 'AREA'::text
WHEN rtus.nodeid IS NOT NULL THEN 'DEVICE'::text
WHEN rtunodes.nodeid IS NOT NULL THEN 'TAG'::text
ELSE NULL::text
END AS class, realms.name AS realmname, 
CASE
WHEN n.nodeclass::text = 'area'::text AND n.nodesubclass IS NOT 
NULL THEN true
ELSE false
END AS istemplate, 
CASE
WHEN realms.nodeid IS NOT NULL THEN realms.nodeid
WHEN areas.nodeid IS NOT NULL THEN areas.realmid
WHEN rtus.nodeid IS NOT NULL THEN rtus.realmid
WHEN rtunodes.nodeid IS NOT NULL THEN r.realmid
ELSE NULL::integer
END AS realmid, rtunodes.rtuid, rtunodes.isinvalid, n.isvalid
   FROM nodes n
   LEFT JOIN realms ON n.nodeid = realms.nodeid
   LEFT JOIN areas ON n.nodeid = areas.nodeid
   LEFT JOIN rtus ON n.nodeid = rtus.nodeid
   LEFT JOIN templates ON n.nodeid = templates.nodeid
   LEFT JOIN templatenodes ON n.nodeid = templatenodes.nodeid
   LEFT JOIN (rtunodes
   JOIN rtus r ON rtunodes.rtuid = r.nodeid) ON n.nodeid = rtunodes.nodeid
   LEFT JOIN ( SELECT attributes_values2_view.nodeid, 
attributes_values2_view.value
   FROM attributes_values2_view
  WHERE attributes_values2_view.attributename::text = 'iconName'::text) av ON 
n.nodeid = av.nodeid
   LEFT JOIN ( SELECT attributes_values2_view.nodeid, 
attributes_values2_view.value
   FROM attributes_values2_view
  WHERE attributes_values2_view.attributename::text = 'addUPItemplate'::text) 
avt ON n.nodeid = avt.nodeid
   LEFT JOIN ( SELECT attributes_values2_view.nodeid, 
attributes_values2_view.value
   FROM attributes_values2_view
  WHERE attributes_values2_view.attributename::text = 'addUPIsubclass'::text) 
avs ON n.nodeid = avs.nodeid
  WHERE templates.nodeid IS NULL AND templatenodes.nodeid IS NULL;


CREATE OR REPLACE VIEW attributes_values2_view AS 
 SELECT nodeattributes.nodeid, nodeattributes.attributeid, a.name AS 
attributename, 
   t.name AS typename, a.typeid, a.valuesize, a.flags, nodeattributes.value, 
a.creationdate
   FROM nodeattributes
   LEFT JOIN attributes a USING (attributeid)
   LEFT JOIN types t USING (typeid)
  WHERE t.isattributetype;



the 8.2.4 plan with join_collapse_limit = 1 (with default it was worse, full of 
nested loops)

Nested Loop Left Join  (cost=32.01..2012.31 rows=1 width=230)
  Join Filter: (n.nodeid = public.nodeattributes.nodeid)
  -  Nested Loop Left Join  (cost=26.47..1411.38 rows=1 width=220)
Join Filter: (n.nodeid = public.nodeattributes.nodeid)
-  Nested Loop Left Join  (cost=20.93..810.45 rows=1 width=210)
  Join Filter: (n.nodeid = public.nodeattributes.nodeid)
  -  Nested Loop Left Join  (cost=15.39..209.52 rows=1 width=200)
Join Filter: (n.nodeid = rtunodes.nodeid)
-  Nested Loop Left Join  (cost=11.14..122.60 rows=1 
width=187)
  Filter: (templatenodes.nodeid IS NULL)
  -  Hash Left Join  (cost=11.14..99.52 rows=11 
width=187)
Hash Cond: (n.nodeid = templates.nodeid)
Filter: (templates.nodeid IS NULL)
-  Hash Left Join  (cost=8.70..87.95 
rows=2266 width=187)
  Hash Cond: (n.nodeid = rtus.nodeid)
  -  Hash Left Join  (cost=4.45..74.20 
rows=2266 width=179)
Hash Cond: (n.nodeid = 
areas.nodeid)
-  Hash Left Join  
(cost=1.45..61.81 rows=2266 width=171)
  Hash Cond: (n.nodeid = 
realms.nodeid)
  -  Seq Scan on nodes n  
(cost=0.00..51.66 rows=2266 width=49)
  -  Hash  (cost=1.20..1.20 
rows=20 width=122)
-  Seq Scan on realms 
 (cost=0.00..1.20 rows=20 width=122)
  

Re: [PERFORM] performance drop on 8.2.4, reverting to 8.1.4

2007-05-18 Thread Steinar H. Gunderson
On Fri, May 18, 2007 at 01:14:56PM +0300, Liviu Ionescu wrote:
 yes, but to be sure I did it again before issuing the request; no 
 improvements...

Is this with the join collapse limit set to 1, or with default? (Default is
generally more interesting.)

/* Steinar */
-- 
Homepage: http://www.sesse.net/

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] performance drop on 8.2.4, reverting to 8.1.4

2007-05-18 Thread Liviu Ionescu
 Is this with the join collapse limit set to 1, or with 
 default? (Default is generally more interesting.)

below is the same query with the default setting.

regards,

Liviu


Nested Loop Left Join  (cost=23.35..1965.46 rows=1 width=125) (actual 
time=50.408..231926.123 rows=2026 loops=1)
  Join Filter: (n.nodeid = public.nodeattributes.nodeid)
  -  Nested Loop Left Join  (cost=17.81..1357.58 rows=1 width=115) (actual 
time=47.103..156521.050 rows=2026 loops=1)
Join Filter: (n.nodeid = public.nodeattributes.nodeid)
-  Nested Loop Left Join  (cost=12.30..752.97 rows=1 width=105) 
(actual time=43.924..81977.726 rows=2026 loops=1)
  Join Filter: (n.nodeid = public.nodeattributes.nodeid)
  -  Nested Loop Left Join  (cost=6.83..150.65 rows=1 width=95) 
(actual time=40.603..12477.227 rows=2026 loops=1)
-  Nested Loop Left Join  (cost=6.83..150.37 rows=1 
width=78) (actual time=38.448..12459.918 rows=2026 loops=1)
  -  Nested Loop Left Join  (cost=6.83..150.08 rows=1 
width=70) (actual time=31.793..12436.536 rows=2026 loops=1)
-  Nested Loop Left Join  (cost=6.83..149.80 
rows=1 width=62) (actual time=6.588..12394.366 rows=2026 loops=1)
  Filter: (templatenodes.nodeid IS NULL)
  -  Nested Loop Left Join  
(cost=6.83..149.51 rows=1 width=62) (actual time=6.525..12362.969 rows=2206 
loops=1)
Join Filter: (n.nodeid = 
rtunodes.nodeid)
-  Hash Left Join  
(cost=2.44..63.29 rows=1 width=49) (actual time=0.361..14.426 rows=2206 
loops=1)
  Hash Cond: (n.nodeid = 
templates.nodeid)
  Filter: (templates.nodeid IS 
NULL)
  -  Seq Scan on nodes n  
(cost=0.00..51.70 rows=2270 width=49) (actual time=0.071..4.417 rows=2270 
loops=1)
  -  Hash  (cost=1.64..1.64 
rows=64 width=4) (actual time=0.152..0.152 rows=64 loops=1)
-  Seq Scan on 
templates  (cost=0.00..1.64 rows=64 width=4) (actual time=0.032..0.082 rows=64 
loops=1)
-  Hash Join  (cost=4.38..63.51 
rows=1816 width=13) (actual time=0.011..4.365 rows=1816 loops=2206)
  Hash Cond: (rtunodes.rtuid = 
r.nodeid)
  -  Seq Scan on rtunodes  
(cost=0.00..34.16 rows=1816 width=9) (actual time=0.008..1.276 rows=1816 
loops=2206)
  -  Hash  (cost=3.06..3.06 
rows=106 width=8) (actual time=0.241..0.241 rows=106 loops=1)
-  Seq Scan on rtus r 
 (cost=0.00..3.06 rows=106 width=8) (actual time=0.029..0.136 rows=106 loops=1)
  -  Index Scan using nodeid_pkey on 
templatenodes  (cost=0.00..0.28 rows=1 width=4) (actual time=0.008..0.008 
rows=0 loops=2206)
Index Cond: (n.nodeid = 
templatenodes.nodeid)
-  Index Scan using rtus_pkey on rtus  
(cost=0.00..0.27 rows=1 width=8) (actual time=0.016..0.016 rows=0 loops=2026)
  Index Cond: (n.nodeid = rtus.nodeid)
  -  Index Scan using areas_pkey on areas  
(cost=0.00..0.27 rows=1 width=8) (actual time=0.007..0.007 rows=0 loops=2026)
Index Cond: (n.nodeid = areas.nodeid)
-  Index Scan using realms_pkey on realms  
(cost=0.00..0.27 rows=1 width=17) (actual time=0.004..0.004 rows=0 loops=2026)
  Index Cond: (n.nodeid = realms.nodeid)
  -  Hash Join  (cost=5.48..600.38 rows=155 width=14) (actual 
time=0.812..34.198 rows=132 loops=2026)
Hash Cond: (public.nodeattributes.attributeid = 
a.attributeid)
-  Seq Scan on nodeattributes  (cost=0.00..505.35 
rows=23535 width=18) (actual time=0.009..16.660 rows=23535 loops=2026)
-  Hash  (cost=5.47..5.47 rows=1 width=4) (actual 
time=0.196..0.196 rows=2 loops=1)
  -  Hash Join  (cost=1.18..5.47 rows=1 width=4) 
(actual time=0.124..0.187 rows=2 loops=1)
Hash Cond: (a.typeid = t.typeid)
-  Seq Scan on attributes a  (cost=0.00..4.28 
rows=2 width=8) (actual time=0.044..0.103 rows=2 loops=1)
  Filter: ((name)::text = 
'addUPIsubclass'::text)
-  Hash  (cost=1.10..1.10 rows=6 width=4) 
(actual time=0.047..0.047 rows=6 loops=1)
  -  Seq Scan on types t  

Re: [PERFORM] performance drop on 8.2.4, reverting to 8.1.4

2007-05-18 Thread Steinar H. Gunderson
On Fri, May 18, 2007 at 02:05:36PM +0300, Liviu Ionescu wrote:
 -  Hash Left Join  
 (cost=2.44..63.29 rows=1 width=49) (actual time=0.361..14.426 rows=2206 
 loops=1)
   Hash Cond: (n.nodeid = 
 templates.nodeid)
   Filter: (templates.nodeid 
 IS NULL)
   -  Seq Scan on nodes n  
 (cost=0.00..51.70 rows=2270 width=49) (actual time=0.071..4.417 rows=2270 
 loops=1)
   -  Hash  (cost=1.64..1.64 
 rows=64 width=4) (actual time=0.152..0.152 rows=64 loops=1)
 -  Seq Scan on 
 templates  (cost=0.00..1.64 rows=64 width=4) (actual time=0.032..0.082 
 rows=64 loops=1)

This seems to be the source of the misestimation. You might want to try using
n WHERE n.nodeid NOT IN (SELECT nodeid FROM templates) instead of n LEFT
JOIN templates USING (nodeid) WHERE templates.nodeid IS NULL and see if it
helps.
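
Applied to the query from this thread, the rewrite would look roughly like this 
(a sketch showing only the relevant part; the templatenodes join could be 
handled the same way):

SELECT n.nodeid, n.nodename
  FROM nodes n
 WHERE n.nodeid NOT IN (SELECT nodeid FROM templates)
   AND n.nodeid NOT IN (SELECT nodeid FROM templatenodes);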

 Total runtime: 231929.656 ms

Note that this is better than the version with collapse_limit set to 1. :-)

/* Steinar */
-- 
Homepage: http://www.sesse.net/

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] performance drop on 8.2.4, reverting to 8.1.4

2007-05-18 Thread Steinar H. Gunderson
On Fri, May 18, 2007 at 02:51:42PM +0300, Liviu Ionescu wrote:
 it helped, the new version of the query takes 2303 ms on both 8.1.4 and 8.2.4.

And the old one?

 any idea why the 8.2.4 planner is not happy with the initial select? was it
 just by chance that it worked in 8.1.4, or does the 8.2.4 planner have a
 problem?

I guess it was more or less by chance, especially as 8.1 did not reorder
outer joins. Others might know more about the estimation, though.

/* Steinar */
-- 
Homepage: http://www.sesse.net/

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] performance drop on 8.2.4, reverting to 8.1.4

2007-05-18 Thread Liviu Ionescu
  it helped, the new version of the query takes 2303 ms on both 8.1.4 
  and 8.2.4.
 
 And the old one?

slightly shorter, 2204 ms.

as a subjective perception, the entire application is slightly slower on 8.2.4; 
probably there are many queries that were manually tuned for 7.x/8.1.x and now 
need rewriting, which is not really what I expected from an upgrade.
 

Liviu


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] Ever Increasing IOWAIT

2007-05-18 Thread Mark Lewis
You're not swapping are you?  One explanation could be that PG is
configured to think it has access to a little more memory than the box
can really provide, which forces it to swap once it's been running for
long enough to fill up its shared buffers or after a certain number of
concurrent connections are opened.
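
A rough sanity check from psql (a sketch; the idea is to compare shared_buffers 
plus max_connections times work_mem against the RAM actually in the box):

SHOW shared_buffers;
SHOW work_mem;         -- per sort/hash step, can be used several times in one query
SHOW max_connections;  -- worst case, each connection can use work_mem on top of shared_buffers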

-- Mark Lewis

On Fri, 2007-05-18 at 10:45 +1200, Ralph Mason wrote:
 We have a database running on a 4 processor machine.  As time goes by
 the IO gets worse and worse, peaking at about 200% as the machine loads
 up.
 
  
 
 The weird thing is that if we restart postgres it’s fine for hours but
 over time it goes bad again.
 
  
 
 (CPU usage graph here
 http://www.flickr.com/photos/[EMAIL PROTECTED]/502596262/ )  You can clearly
 see where the restart happens in the IO area
 
  
 
 This is Postgres  8.1.4 64bit.
 
  
 
 Anyone have any ideas?
 
  
 
 Thanks
 
 Ralph
 
  
 
 
 
 
 

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [PERFORM] performance drop on 8.2.4, reverting to 8.1.4

2007-05-18 Thread George Pavlov
  This seems to be the source of the misestimation. You might 
  want to try using n WHERE n.nodeid NOT IN (SELECT nodeid 
  FROM templates) instead of n LEFT JOIN templates USING 
  (nodeid) WHERE templates.nodeid IS NULL and see if it helps.
 
 it helped, the new version of the query takes 2303 ms on both 
 8.1.4 and 8.2.4.

this is very interesting. on 8.1.x i have also repeatedly had to rewrite
joins as their equivalent IN/NOT IN alternatives in order to improve
performance, so i feel that at least under some alignments of the
planets 8.1 has similar problems.

george

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [PERFORM] performance drop on 8.2.4, reverting to 8.1.4

2007-05-18 Thread Liviu Ionescu
 under some alignments of the planets 8.1 has similar problems.

8.1 might have similar problems, but the point here is different: if what
was manually tuned to work in 8.1 confuses the 8.2 planner and performance
drops so much (from 2303 to 231929 ms in my case) upgrading a production
machine to 8.2 is a risky business. I probably have hundreds of sql
statements in my application, some of them quite complex, and it is not
reasonable to check all of them in order to certify them to be 8.2
compliant.

regards,

Liviu


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] performance drop on 8.2.4, reverting to 8.1.4

2007-05-18 Thread Tom Lane
Steinar H. Gunderson [EMAIL PROTECTED] writes:
 On Fri, May 18, 2007 at 02:05:36PM +0300, Liviu Ionescu wrote:
 -  Hash Left Join  
 (cost=2.44..63.29 rows=1 width=49) (actual time=0.361..14.426 rows=2206 
 loops=1)
   Hash Cond: (n.nodeid = 
 templates.nodeid)
   Filter: (templates.nodeid 
 IS NULL)

 This seems to be the source of the misestimation.

Yeah.  8.2 is estimating that the nodeid IS NULL condition will
discard all or nearly all the rows, presumably because there aren't any
null nodeid's in the underlying table --- it fails to consider that the
LEFT JOIN may inject some nulls.  8.1 was not any brighter; the reason
it gets a different estimate is that it doesn't distinguish left-join
and WHERE clauses at all, but assumes that the result of the left join
can't have fewer rows than its left input, even after applying the
filter condition.  In this particular scenario that happens to be a
better estimate.  So even though 8.2 is smarter, and there is no bug
here that wasn't in 8.1 too, it's getting a worse estimate leading to
a worse plan.

This is a sufficiently common idiom that I think it's a must-fix
problem.  Not sure about details yet, but it seems somehow the
selectivity estimator had better start accounting for
outer-join-injected NULLs.
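
The misestimate can be reproduced on just the problem fragment from this thread
(a sketch):

EXPLAIN
SELECT n.nodeid
  FROM nodes n
  LEFT JOIN templates ON n.nodeid = templates.nodeid
 WHERE templates.nodeid IS NULL;
-- 8.2 estimates only a row or so here, assuming no nodeid is NULL in the
-- underlying data and ignoring the NULLs injected by the left join; the
-- actual result in this thread is roughly 2200 rows.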

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


[PERFORM] CPU Intensive query

2007-05-18 Thread Abu Mushayeed
I have an interesting problem. I have the following query that ran ok on Monday 
and Tuesday and it has been running ok since I have been at this job. I have 
seen it to be IO intensive, but since Wednesday it has become CPU intensive. 
Database-wise, fresh data has been put into the tables, vacuumed & analyzed; no 
other parameter has been modified.
   
  On Wednesday it ran over 24 hours and did not finish, and all this time it 
pegged a CPU between 95-99%. Yesterday, the same story. I do not understand what 
could have caused it to behave like this suddenly. I am hoping somebody can 
point me in the right direction for research. 
   
  The query is as follows, and its explain plan is also attached:
   
  set enable_nestloop = off; 
INSERT INTO linkshare.macys_ls_daily_shipped 
SELECT 
 ddw.intr_xref, 
 cdm.cdm_get_linkshare_id_safe(ddw.intr_xref, 10), 
 to_char(cdm.cdm_utc_convert(cdm.cdm_get_linkshare_timestamp(ddw.intr_xref, 
10), -5), '-MM-DD/HH24:MI:SS'), 
 to_char(cdm.cdm_utc_convert(to_char(sales.order_date, '-MM-DD 
HH24:MI:SS')::timestamp without time zone, -5), '-MM-DD/HH24:MI:SS') , 
 ddw.item_upc, 
 sum(abs(ddw.itm_qty)), 
 sum((ddw.tran_itm_total * 100::numeric)::integer), 
 'USD', '', '', '', 
 ddw.item_desc 
FROM 
 cdm.cdm_ddw_tran_item_grouped ddw 
JOIN 
 cdm.cdm_sitesales sales ON ddw.intr_xref::text = sales.order_number::text 
WHERE 
 ddw.cal_date > (CURRENT_DATE - 7) AND ddw.cal_date < CURRENT_DATE 
AND 
 ddw.intr_xref IS NOT NULL 
AND  trim(cdm.cdm_get_linkshare_id_safe(ddw.intr_xref, 10)) <> '' 
AND  cdm.cdm_utc_convert(cdm.cdm_get_linkshare_timestamp(ddw.intr_xref, 10), 
-5)::text::date = (CURRENT_DATE - 52) 
AND  sales.order_date = (CURRENT_DATE - 52) 
AND  (tran_typ_id = 'S'::bpchar) 
AND  btrim(item_group::text) <> 'EGC'::text 
AND  btrim(item_group::text) <> 'VGC'::text 
GROUP BY 
 ddw.intr_xref, 
 cdm.cdm_get_linkshare_id_safe(ddw.intr_xref, 10), 
 to_char(cdm.cdm_utc_convert(cdm.cdm_get_linkshare_timestamp(ddw.intr_xref, 
10), -5), '-MM-DD/HH24:MI:SS'), 
 to_char(cdm.cdm_utc_convert(to_char(sales.order_date, '-MM-DD 
HH24:MI:SS')::timestamp without time zone, -5), '-MM-DD/HH24:MI:SS'), 
 ddw.item_upc, 
 8, 9, 10, 11, 
 ddw.item_desc;
   
   
  HashAggregate  (cost=152555.97..152567.32 rows=267 width=162)
  -  Hash Join  (cost=139308.18..152547.96 rows=267 width=162)
Hash Cond: ((outer.intr_xref)::text = (inner.order_number)::text)
-  GroupAggregate  (cost=106793.14..109222.13 rows=4319 width=189)
  -  Sort  (cost=106793.14..106901.09 rows=43182 width=189)
Sort Key: cdm_ddw_tran_item.appl_xref, 
cdm_ddw_tran_item.intr_xref, cdm_ddw_tran_item.tran_typ_id, 
cdm_ddw_tran_item.cal_date, cdm_ddw_tran_item.cal_time, 
cdm_ddw_tran_item.tran_itm_total, cdm_ddw_tran_item.tran_tot_amt, 
cdm_ddw_tran_item.fill_store_div, cdm_ddw_tran_item.itm_price, 
cdm_ddw_tran_item.item_id, cdm_ddw_tran_item.item_upc, 
cdm_ddw_tran_item.item_pid, cdm_ddw_tran_item.item_desc, 
cdm_ddw_tran_item.nrf_color_name, cdm_ddw_tran_item.nrf_size_name, 
cdm_ddw_tran_item.dept_id, c
-  Index Scan using cdm_ddw_tranp_item_cal_date on 
cdm_ddw_tran_item  (cost=0.01..103468.52 rows=43182 width=189)
  Index Cond: ((cal_date > (('now'::text)::date - 7)) 
AND (cal_date < ('now'::text)::date))
  Filter: ((intr_xref IS NOT NULL) AND 
(btrim(cdm.cdm_get_linkshare_id_safe(intr_xref, 10)) <> ''::text) AND 
(((cdm.cdm_utc_convert(cdm.cdm_get_linkshare_timestamp(intr_xref, 10), 
-5))::text)::date = (('now'::text)::date - 52)) AND (tran_typ_id = 
'S'::bpchar) AND (btrim((item_group)::text) <> 'EGC'::text) AND 
(btrim((item_group)::text) <> 'VGC'::text))
-  Hash  (cost=31409.92..31409.92 rows=442050 width=20)
  -  Index Scan using cdm_sitesales_order_date on cdm_sitesales 
sales  (cost=0.00..31409.92 rows=442050 width=20)
Index Cond: (order_date = (('now'::text)::date - 52))
   

 

Re: [PERFORM] performance drop on 8.2.4, reverting to 8.1.4

2007-05-18 Thread Liviu Ionescu
 It is arguable, that updating the DB software version in an 
 enterprise environment requires exactly that: check all 
 production queries on the new software to identify any 
 issues. In part, this is brought on by the very tuning that 
 you performed against the previous software. Restore the 8.1 
 DB into 8.2. Then run the queries against both versions to 
 evaluate functioning and timing.

you're right. my previous message was not a complaint, it was a warning for
others to avoid the same mistake. I was overconfident and got bitten. in the
future I'll check my queries on 8.2/8.3/... on a development configuration
before upgrading the production server.

regards,

Liviu


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] CPU Intensive query

2007-05-18 Thread Steinar H. Gunderson
On Fri, May 18, 2007 at 09:02:52AM -0700, Abu Mushayeed wrote:
 I have an interesting problem. I have the following query that ran ok on
 Monday and Tuesday and it has been running ok since I have been at this
 job. I have seen it to be IO intensive, but since Wednesday it has become
 CPU intensive. Database-wise, fresh data has been put into the tables,
 vacuumed & analyzed, no other parameter has been modified.

What Postgres version is this?

   The query is as follows and it's explain plan is also attached:

Normally EXPLAIN ANALYZE data would be much better than EXPLAIN, but if the
query indeed does not finish, it's not going to help much.

   set enable_nestloop = off; 

What's the rationale for this?

   HashAggregate  (cost=152555.97..152567.32 rows=267 width=162)

152000 disk page fetches is a bit, but it shouldn't take 24 hours. There's
probably misestimation involved at some point here. Does it really return 267
rows, or many more?

/* Steinar */
-- 
Homepage: http://www.sesse.net/

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


[PERFORM] reading large BYTEA type is slower than expected

2007-05-18 Thread Mark Harris
We have recently ported our application to the postgres database. For
the most part performance has not been an issue; however there is one
situation that is a problem and that is the initial read of rows
containing BYTEA values that have an average size of 2 kilobytes or
greater. For BYTEA values postgres requires as much as 3 seconds to read
the values from disk into its buffer cache. After the initial read into
buffer cache, performance is comparable to other commercial DBMS that we
have ported to. As would be expected the commercial DBMS are also slower
to display data that is not already in the buffer cache, but the
magnitude of difference for postgres for this type of data read from
disk as opposed to read from buffer cache is much greater.

 

We have vacuumed the table and played around with the database
initialization parameters in the postgresql.conf. Neither helped with
this problem.

 

Does anyone have any tips on improving the read from disk performance of
BYTEA data that is typically 2KB or larger?

 

Mark



Re: [PERFORM] reading large BYTEA type is slower than expected

2007-05-18 Thread Tom Lane
Mark Harris [EMAIL PROTECTED] writes:
 We have recently ported our application to the postgres database. For
 the most part performance has not been an issue; however there is one
 situation that is a problem and that is the initial read of rows
 containing BYTEA values that have an average size of 2 kilobytes or
 greater. For BYTEA values postgres requires as much 3 seconds to read
 the values from disk into its buffer cache.

How large is large?

(No, I don't believe it takes 3 sec to fetch a single 2Kb value.)

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


[PERFORM] 121+ million record table perf problems

2007-05-18 Thread cyber-postgres

I need some help on recommendations to solve a perf problem.

I've got a table with ~121 million records in it.  Select count on it 
currently takes ~45 minutes, and an update to the table to set a value on 
one of the columns I finally killed after it ran 17 hours and had still 
not completed.  Queries into the table are butt slow, and


System:   SUSE LINUX 10.0 (X86-64)
Postgresql:   PostgreSQL 8.2.1
Index type:   btree

A select count took ~48 minutes before I made some changes to the 
postgresql.conf, going from default values to these:

shared_buffers = 24MB
work_mem = 256MB
maintenance_work_mem = 512MB
random_page_cost = 100
stats_start_collector = off
stats_row_level = off

As a test I am trying to do an update on state using the following queries:
update res set state=5001;
select count(resid) from res;

The update query that started this all I had to kill after 17hours.  It 
should have updated all 121+ million records.  That brought my select 
count down to 19 minutes, but still a far cry from acceptable.


The system has 2GB of RAM (more is already on order), but doesn't seem to 
show problems in TOP with running away with RAM.  If anything, I don't 
think it's using enough, as I only see about 6 processes using 26-27 MB 
each.  It is running on a single disk (guess I will likely have to at the 
minimum go to a RAID1).  Workload will primarily be comprised of queries 
against the indices (thus why so many of them) and updates to a single 
record from about 10 clients, where that one record will have md5, state, 
rval, speed, audit, and date columns updated.  Those updates don't seem to 
be a problem, and are generally processed in bulk of 500 to 5000 at a 
time.


Here is the schema for the table giving me problems:

CREATE TABLE res
(
  res_id integer NOT NULL DEFAULT nextval('result_id_seq'::regclass),
  res_client_id integer NOT NULL,
  time real DEFAULT 0,
  error integer DEFAULT 0,
  md5 character(32) DEFAULT 0,
  res_tc_id integer NOT NULL,
  state smallint DEFAULT 0,
  priority smallint,
  rval integer,
  speed real,
  audit real,
  date timestamp with time zone,
  gold_result_id integer,
  CONSTRAINT result_pkey PRIMARY KEY (res_id),
  CONSTRAINT unique_res UNIQUE (res_client_id, res_tc_id)
)
WITHOUT OIDS;
ALTER TABLE res OWNER TO postgres;

CREATE INDEX index_audit
  ON res
  USING btree
  (audit);

CREATE INDEX index_event
  ON res
  USING btree
  (error);

CREATE INDEX index_priority
  ON res
  USING btree
  (priority);

CREATE INDEX index_rval
  ON res
  USING btree
  (rval);

CREATE INDEX index_speed
  ON res
  USING btree
  (speed);

CREATE INDEX index_state
  ON res
  USING btree
  (state);

CREATE INDEX index_tc_id
  ON res
  USING btree
  (res_tc_id);

CREATE INDEX index_time
  ON res
  USING btree
  (time);

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [PERFORM] reading large BYTEA type is slower than expected

2007-05-18 Thread Y Sidhu

Mark,

I am no expert but this looks like a file system I/O thing. I set
hw.ata.wc=1 for a SATA drive and =0 for a SCSI drive in /boot/loader.conf on
my FreeBSD systems. That seems to provide some needed tweaking.

Yudhvir
==
On 5/18/07, Mark Harris [EMAIL PROTECTED] wrote:


 We have recently ported our application to the postgres database. For the
most part performance has not been an issue; however there is one situation
that is a problem and that is the initial read of rows containing BYTEA
values that have an average size of 2 kilobytes or greater. For BYTEA values
postgres requires as much 3 seconds to read the values from disk into its
buffer cache. After the initial read into buffer cache, performance is
comparable to other commercial DBMS that we have ported to. As would be
expected the commercial DBMS are also slower to display data that is not
already in the buffer cache, but the magnitude of difference for postgres
for this type of data read from disk as opposed to read from buffer cache is
much greater.



We have vacuumed the table and played around with the database
initialization parameters in the postgresql.conf. Neither helped with this
problem.



Does anyone have any tips on improving the read from disk performance of
BYTEA data that is typically 2KB or larger?



Mark





--
Yudhvir Singh Sidhu
408 375 3134 cell


Re: [PERFORM] 121+ million record table perf problems

2007-05-18 Thread Andrew Sullivan
On Fri, May 18, 2007 at 12:43:40PM -0500, [EMAIL PROTECTED] wrote:
 I've got a table with ~121 million records in it.  Select count on it 
 currently takes ~45 minutes, and an update to the table to set a value on 
 one of the columns I finally killed after it ran 17 hours and had still 
 not completed.  Queries into the table are butt slow, and

I don't think you've told us anything like enough to get started on
solving your problem.  But to start with, you know that in Postgres,
an unrestricted count() on a table always results in reading the
entire table, right?
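
If an exact figure is not required, the planner's own estimate is essentially
free (a sketch; reltuples is only as fresh as the last VACUUM or ANALYZE):

SELECT reltuples::bigint AS approx_rows
  FROM pg_class
 WHERE relname = 'res';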

Standard questions: have you performed any vacuum or analyse?

Your update statement is also a case where you have to touch every
row.  Note that, given that you seem to be setting the state field to
the same value for everything, an index on there will do you not one
jot of good until there's greater selectivity.

How fast is the disk?  Is it fast enough to read and touch every one
of those rows on the table inside of 17 hours?  

Note also that your approach of updating all 121 million records in
one statement is approximately the worst way to do this in Postgres,
because it creates 121 million dead tuples on your table.  (You've
created some number of those by killing the query as well.)

All of that said, 17 hours seems kinda long. 
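
One way to avoid the single giant statement, purely as an illustration (not 
from this thread; the res_id ranges are made up), is to update in slices and 
vacuum between them so the dead tuples can be reclaimed:

UPDATE res SET state = 5001 WHERE res_id BETWEEN 1 AND 10000000;
VACUUM res;
UPDATE res SET state = 5001 WHERE res_id BETWEEN 10000001 AND 20000000;
VACUUM res;
-- ...and so on for the remaining ranges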

 As a test I am trying to do an update on state using the following queries:
 update res set state=5001;
 select count(resid) from res;

What is this testing?

 The update query that started this all I had to kill after 17hours.  

Does that suggest that the update you're trying to make work well is
_not_ update res set state = 5001?

 each) and is running on a single disk (guess I will likely have to at the 
 minimum go to a RAID1).  Workload will primarily be comprised of queries 

I bet that single disk is your problem.  Iostat is your friend, I'd
say.

A

-- 
Andrew Sullivan  | [EMAIL PROTECTED]
Everything that happens in the world happens at some place.
--Jane Jacobs 

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] 121+ million record table perf problems

2007-05-18 Thread Joshua D. Drake

[EMAIL PROTECTED] wrote:

I need some help on recommendations to solve a perf problem.

I've got a table with ~121 million records in it.  Select count on it 
currently takes ~45 minutes, and an update to the table to set a value 
on one of the columns I finally killed after it ran 17 hours and had 
still not completed.  Queries into the table are butt slow, and


Scanning 121 million rows is going to be slow even on 16 disks.



System:   SUSE LINUX 10.0 (X86-64)
Postgresql:   PostgreSQL 8.2.1
Index type:   btree


You really should be running 8.2.4.



A select count took ~48 minutes before I made some changes to the 
postgresql.conf, going from default values to these:

shared_buffers = 24MB


This could be increased.


work_mem = 256MB
maintenance_work_mem = 512MB
random_page_cost = 100
stats_start_collector = off
stats_row_level = off

As a test I am trying to do an update on state using the following queries:
update res set state=5001;


You are updating 121 million rows; that takes a lot of time, considering 
you are actually (at a very low level) marking 121 million rows dead and 
inserting 121 million more.


The update query that started this all I had to kill after 17hours.  It 
should have updated all 121+ million records.  That brought my select 
count down to 19 minutes, but still a far cry from acceptable.


Not quite sure what you would consider acceptable based on what you 
are trying to do.



Sincerely,

Joshua D. Drake





Here is the schema for the table giving me problems:

CREATE TABLE res
(
  res_id integer NOT NULL DEFAULT nextval('result_id_seq'::regclass),
  res_client_id integer NOT NULL,
  time real DEFAULT 0,
  error integer DEFAULT 0,
  md5 character(32) DEFAULT 0,
  res_tc_id integer NOT NULL,
  state smallint DEFAULT 0,
  priority smallint,
  rval integer,
  speed real,
  audit real,
  date timestamp with time zone,
  gold_result_id integer,
  CONSTRAINT result_pkey PRIMARY KEY (res_id),
  CONSTRAINT unique_res UNIQUE (res_client_id, res_tc_id)
)
WITHOUT OIDS;
ALTER TABLE res OWNER TO postgres;

CREATE INDEX index_audit
  ON res
  USING btree
  (audit);

CREATE INDEX index_event
  ON res
  USING btree
  (error);

CREATE INDEX index_priority
  ON res
  USING btree
  (priority);

CREATE INDEX index_rval
  ON res
  USING btree
  (rval);

CREATE INDEX index_speed
  ON res
  USING btree
  (speed);

CREATE INDEX index_state
  ON res
  USING btree
  (state);

CREATE INDEX index_tc_id
  ON res
  USING btree
  (res_tc_id);

CREATE INDEX index_time
  ON res
  USING btree
  (time);

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate




--

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] reading large BYTEA type is slower than expected

2007-05-18 Thread Mark Harris
Tom,

No it is not 3 seconds to read a single value. Multiple records are
read, approximately 120 records if the raster dataset is created with
our application's default configuration.

Please read on to understand why, if you need to.

We are a GIS software company and have two basic types of data, raster
and vector. Both types of data are stored in a BYTEA.

Vector data are representations of geometry stored as a series of
vertices to represent points, lines and polygons. This type of data is
typically 30 to 200 bytes, but can be very large (consider how many
vertices would be required to represent the Pacific Ocean at a detailed
resolution). Vector data does not seem to exhibit the cold fetch issue
(fetch from disk as opposed to fetch from buffer cache).

It is with raster data that we see the problem. Raster data is image
data stored in the database. When we store a georeferenced image in the
database we block it up into tiles. The default tile size is 128 by 128
pixels.
We compress the data using either: LZ77, JPEG or JPEG2000. Typically the
pixel blocks stored in the BYTEA range in size from 800 bytes to 16000
bytes for 8-bit data stored with the default tile size, depending on the
type of compression and the variability of the data.

Our application is capable of mosaicking source images together into
huge raster datasets that can grow into terabytes. Consider the entire
landsat imagery with a resolution of 15 meters mosaicked into one raster
dataset. It requires less than a terabyte to store that data.

For practical reasons, as you can imagine, we construct a reduced
resolution pyramid on the raster base level, allowing applications to
view a reduced resolution level of the raster dataset as the user zooms
out, and a higher resolution level as the user zooms in. The pyramid
levels are also stored as pixel blocks in the table. Each pyramid level
is reduced in resolution by 1/2 in the X and Y dimension. Therefore
pyramid level 1 will be 1/4 of pyramid level 0 (the base).

As the application queries the raster blocks table which stores the
raster data tiles, it will request a raster tiles that fall within the
spatial extent of the window for a particular pyramid level. Therefore
the number of records queried from the raster blocks table containing
the BYTEA column of pixel data is fairly constant. For the screen
resolution of 1680 by 1050 that I am testing with, about 120 records will
be fetched from the raster blocks table each time the user pans or
zooms.

Mark

  




-Original Message-
From: Tom Lane [mailto:[EMAIL PROTECTED] 
Sent: Friday, May 18, 2007 10:48 AM
To: Mark Harris
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] reading large BYTEA type is slower than expected 

Mark Harris [EMAIL PROTECTED] writes:
 We have recently ported our application to the postgres database. For
 the most part performance has not been an issue; however there is one
 situation that is a problem and that is the initial read of rows
 containing BYTEA values that have an average size of 2 kilobytes or
 greater. For BYTEA values postgres requires as much 3 seconds to read
 the values from disk into its buffer cache.

How large is large?

(No, I don't believe it takes 3 sec to fetch a single 2Kb value.)

regards, tom lane


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] 121+ million record table perf problems

2007-05-18 Thread Brian Hurt

[EMAIL PROTECTED] wrote:


I need some help on recommendations to solve a perf problem.

I've got a table with ~121 million records in it.  Select count on it 
currently takes ~45 minutes, and an update to the table to set a value 
on one of the columns I finally killed after it ran 17 hours and had 
still not completed.  Queries into the table are butt slow, and


This is way too long.  I just did a select count(*) on a table of mine 
that has 48 million rows and it took only 178 seconds.  And this is on a 
serious POS disk subsystem that's giving me about 1/2 the read speed of 
a single off the shelf SATA disk.
As select count(*) has to read the whole table sequentially, the time it 
takes is linear with the size of the table (once you get large enough 
that the whole table doesn't get cached in memory).  So I'd be surprised 
if a 121 million record table took more than 500 or so seconds to read, 
and would expect it to be less.


So my advice: vacuum.  I'll bet you've got a whole boatload of dead 
tuples kicking around.  Then analyze.  Then consider firing off a 
reindex and/or cluster against the table.  The other thing I'd consider 
is dropping the money on some more hardware- a few hundred bucks to get 
a battery backed raid card and half a dozen SATA drives would probably 
do wonders for your performance.
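
In SQL that sequence is roughly the following (a sketch; CLUSTER rewrites the 
table and takes an exclusive lock, so plan for the downtime):

VACUUM ANALYZE res;
REINDEX TABLE res;
-- or rewrite the table in primary-key order, which also drops the dead space:
CLUSTER result_pkey ON res;
ANALYZE res;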




shared_buffers = 24MB


Up your shared buffers.  This is a mistake I made originally as well- 
but this is the total number of shared buffers used by the system.  I 
had originally assumed that the number of shared buffers used was this 
times the number of backends, but it's not.


With 2G of memory, I'd start with shared buffers of 512MB, and consider 
upping it to 768MB or even 1024MB.  This will also really help performance.



stats_start_collector = off
stats_row_level = off


I think I'd also recommend turning these on.

Brian


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


[PERFORM] Slow queries on big table

2007-05-18 Thread Tyrrill, Ed
I have a two column table with over 160 million rows in it.  As the size
of the table grows queries on this table get exponentially slower.  I am
using version 8.1.5 32-bit on Red Hat Enterprise Linux 3.  The hardware
is an Intel 3 Ghz Xeon with 4GB RAM, and 6 disks in a RAID 5
configuration.  For current testing I am running a single database
connection with no other applications running on the machine, and the
swap is not being used at all.

Here is the table definition:

mdsdb=# \d backup_location
 Table public.backup_location
  Column   |  Type   | Modifiers
---+-+---
 record_id | bigint  | not null
 backup_id | integer | not null
Indexes:
backup_location_pkey PRIMARY KEY, btree (record_id, backup_id)
backup_location_rid btree (record_id)
Foreign-key constraints:
backup_location_bfk FOREIGN KEY (backup_id) REFERENCES
backups(backup_id) ON DELETE CASCADE
 
Here is the table size:

mdsdb=# select count(*) from backup_location;
   count
---
 162101296
(1 row)

And here is a simple query on this table that takes nearly 20 minutes to
return less than 3000 rows.  I ran an analyze immediately before I ran
this query:

mdsdb=# explain analyze select record_id from backup_location where
backup_id = 1070;
 
QUERY PLAN



-
 Index Scan using backup_location_pkey on backup_location
(cost=0.00..1475268.53 rows=412394 width=8) (actual
time=3318.057..1196723.915 rows=2752 loops=1)
   Index Cond: (backup_id = 1070)
 Total runtime: 1196725.617 ms
(3 rows)

Obviously at this point the application is not usable.  If possible we
would like to grow this table to the 3-5 billion row range, but I don't
know if that is realistic.

Any guidance would be greatly appreciated.

Thanks,
Ed

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] 121+ million record table perf problems

2007-05-18 Thread Alan Hodgson
On Friday 18 May 2007 11:51, Joshua D. Drake [EMAIL PROTECTED] wrote:
  The update query that started this all I had to kill after 17hours.  It
  should have updated all 121+ million records.  That brought my select
  count down to 19 minutes, but still a far cry from acceptable.

You're going to want to drop all your indexes before trying to update 121 
million records.  Updates in PostgreSQL are really quite slow, mostly due 
to all the index updates. Drop indexes, do the updates, create a primary 
key, cluster the table on that key to free up the dead space, then recreate 
the rest of the indexes. That's about as fast as you can get that process.
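
Spelled out against the res table posted earlier, the process would look 
something like this (a sketch; the primary key is kept here and the table is 
clustered on it, and only one secondary index recreation is shown):

DROP INDEX index_audit;
DROP INDEX index_event;
DROP INDEX index_priority;
DROP INDEX index_rval;
DROP INDEX index_speed;
DROP INDEX index_state;
DROP INDEX index_tc_id;
DROP INDEX index_time;

UPDATE res SET state = 5001;

CLUSTER result_pkey ON res;   -- rewrites the table, reclaiming the dead tuples
ANALYZE res;

CREATE INDEX index_state ON res USING btree (state);
-- ...recreate the remaining indexes as in the original schema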

Of course, doing anything big on one disk is also going to be slow, no 
matter what you do. I don't think a table scan should take 19 minutes, 
though, not for 121 million records. You should be able to get at least 
60-70MB/sec out of anything modern. I can only assume your disk is 
thrashing doing something else at the same time as the select.

-- 
We can no more blame our loss of freedom on Congressmen than we can
prostitution on pimps.  Both simply provide broker services for their
customers. -- Dr. Walter Williams


---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [PERFORM] Background vacuum

2007-05-18 Thread Ron Mayer
Tom Lane wrote:
 Ron Mayer [EMAIL PROTECTED] writes:
 Greg Smith wrote:
 Count me on the side that agrees adjusting the vacuuming parameters is
 the more straightforward way to cope with this problem.
 
 Agreed for vacuum; but it still seems interesting to me that
 across databases and workloads high priority transactions
 tended to get through faster than low priority ones.  Is there
 any reason to believe that the drawbacks of priority inversion
 outweigh the benefits of setting priorities?
 
 Well, it's unclear, and anecdotal evidence is unlikely to convince
 anybody.  I had put some stock in the CMU paper, but if it's based
 on PG 7.3 then you've got to **seriously** question its relevance
 to the current code.

I was thinking the paper's results might apply more generally
to RDBMS-like applications since they did test 3 of them with
different locking behavior and different bottlenecks.

But true, I should stop bringing up 7.3 examples.


Anecdotally ;-) I've found renice-ing reports to help; especially
in the (probably not too uncommon case) where slow running
batch reporting queries hit different tables than interactive
reporting queries.   I guess that's why I keep defending
priorities as a useful technique.   It seems even more useful
considering the existence of schedulers that have priority
inheritance features.

I'll admit there's still the theoretical possibility that
it's a foot-gun so I don't mind people having to write
their own stored procedure to enable it - but I'd be
surprised if anyone could find a real world case where
priorities would do more harm than good.

Though, yeah, it'd be easy to construct an artificial
case that'd demonstrate priority inversion (i.e. have
a low priority process that takes a lock and sits
and spins on some CPU-intensive stored procedure
without doing any I/O).

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] Slow queries on big table

2007-05-18 Thread Scott Marlowe

Tyrrill, Ed wrote:

I have a two column table with over 160 million rows in it.  As the size
of the table grows queries on this table get exponentially slower.  I am
using version 8.1.5 32-bit on Red Hat Enterprise Linux 3.  The hardware
is an Intel 3 Ghz Xeon with 4GB RAM, and 6 disks in a RAID 5
configuration.  For current testing I am running a single database
connection with no other applications running on the machine, and the
swap is not being used at all.

Here is the table definition:

mdsdb=# \d backup_location
 Table public.backup_location
  Column   |  Type   | Modifiers
---+-+---
 record_id | bigint  | not null
 backup_id | integer | not null
Indexes:
backup_location_pkey PRIMARY KEY, btree (record_id, backup_id)
backup_location_rid btree (record_id)
Foreign-key constraints:
backup_location_bfk FOREIGN KEY (backup_id) REFERENCES
backups(backup_id) ON DELETE CASCADE
 
Here is the table size:


mdsdb=# select count(*) from backup_location;
   count
---
 162101296
(1 row)

And here is a simple query on this table that takes nearly 20 minutes to
return less then 3000 rows.  I ran an analyze immediately before I ran
this query:

mdsdb=# explain analyze select record_id from backup_location where
backup_id = 1070;
 
QUERY PLAN




-
 Index Scan using backup_location_pkey on backup_location
(cost=0.00..1475268.53 rows=412394 width=8) (actual
time=3318.057..1196723.915 rows=2752 loops=1)
   Index Cond: (backup_id = 1070)
 Total runtime: 1196725.617 ms
(3 rows)
  
I've got a few points.  Firstly, is your data amenable to partitioning?  
If so that might be a big winner.
Secondly, it might be more efficient for the planner to choose the 
backup_location_rid index than the combination primary key index.  You 
can test this theory with this cool pg trick:


begin;
alter table backup_location drop constraint backup_location_pkey;
explain analyze select record_id from backup_location where backup_id = 1070;
rollback;

to see if it's faster.


Obviously at this point the application is not usable.  If possible we
would like to grow this table to the 3-5 billion row range, but I don't
know if that is realistic.

Any guidance would be greatly appreciated.
  


Without knowing more about your usage patterns, it's hard to say.  But 
partitioning seems like your best choice at the moment.
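
For the record, 8.1-style partitioning of backup_location would mean child 
tables split on backup_id ranges, roughly like this (a sketch; the ranges and 
names are made up, routing of inserts to the right child is omitted, and 
constraint_exclusion must be on for the planner to skip irrelevant children):

CREATE TABLE backup_location_p1 (
    CHECK (backup_id >= 1 AND backup_id < 1000)
) INHERITS (backup_location);

CREATE TABLE backup_location_p2 (
    CHECK (backup_id >= 1000 AND backup_id < 2000)
) INHERITS (backup_location);

CREATE INDEX backup_location_p1_bid ON backup_location_p1 (backup_id);
CREATE INDEX backup_location_p2_bid ON backup_location_p2 (backup_id);

SET constraint_exclusion = on;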


---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [PERFORM] reading large BYTEA type is slower than expected

2007-05-18 Thread Mark Harris
Tom,

Actually the 120 records I quoted is a mistake. Since it is a three band
image the number of records should be 360 records or 120 records for
each band.

Mark

-Original Message-
From: Tom Lane [mailto:[EMAIL PROTECTED] 
Sent: Friday, May 18, 2007 10:48 AM
To: Mark Harris
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] reading large BYTEA type is slower than expected 

Mark Harris [EMAIL PROTECTED] writes:
 We have recently ported our application to the postgres database. For
 the most part performance has not been an issue; however there is one
 situation that is a problem and that is the initial read of rows
 containing BYTEA values that have an average size of 2 kilobytes or
 greater. For BYTEA values postgres requires as much 3 seconds to read
 the values from disk into its buffer cache.

How large is large?

(No, I don't believe it takes 3 sec to fetch a single 2Kb value.)

regards, tom lane


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] Slow queries on big table

2007-05-18 Thread Tom Lane
Tyrrill, Ed [EMAIL PROTECTED] writes:
  Index Scan using backup_location_pkey on backup_location
 (cost=0.00..1475268.53 rows=412394 width=8) (actual
 time=3318.057..1196723.915 rows=2752 loops=1)
Index Cond: (backup_id = 1070)
  Total runtime: 1196725.617 ms

If we take that at face value it says the indexscan is requiring 434
msec per actual row fetched.  Which is just not very credible; the worst
case should be about 1 disk seek per row fetched.  So there's something
going on that doesn't meet the eye.

What I'm wondering about is whether the table is heavily updated and
seldom vacuumed, leading to lots and lots of dead tuples being fetched
and then rejected (hence they'd not show in the actual-rows count).

The other thing that seems pretty odd is that it's not using a bitmap
scan --- for such a large estimated rowcount I'd have expected a bitmap
scan not a plain indexscan.  What do you get from EXPLAIN ANALYZE if
you force a bitmap scan?  (Set enable_indexscan off, and enable_seqscan
too if you have to.)
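
Concretely, that test is (a sketch of what is being asked for):

SET enable_indexscan = off;
SET enable_seqscan = off;   -- only if it still refuses to pick a bitmap scan
EXPLAIN ANALYZE
SELECT record_id FROM backup_location WHERE backup_id = 1070;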

regards, tom lane

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [PERFORM] Slow queries on big table

2007-05-18 Thread Andrew Kroeger
Tyrrill, Ed wrote:
 mdsdb=# \d backup_location
  Table public.backup_location
   Column   |  Type   | Modifiers
 ---+-+---
  record_id | bigint  | not null
  backup_id | integer | not null
 Indexes:
 backup_location_pkey PRIMARY KEY, btree (record_id, backup_id)
 backup_location_rid btree (record_id)
 Foreign-key constraints:
 backup_location_bfk FOREIGN KEY (backup_id) REFERENCES
 backups(backup_id) ON DELETE CASCADE

[snip]

 mdsdb=# explain analyze select record_id from backup_location where
 backup_id = 1070;
  
 QUERY PLAN
 
 
 
 -
  Index Scan using backup_location_pkey on backup_location
 (cost=0.00..1475268.53 rows=412394 width=8) (actual
 time=3318.057..1196723.915 rows=2752 loops=1)
Index Cond: (backup_id = 1070)
  Total runtime: 1196725.617 ms
 (3 rows)

The backup_location_rid index on your table is not necessary.  The
primary key index on (record_id, backup_id) can be used by Postgres,
even if the query is only constrained by record_id.  See
http://www.postgresql.org/docs/8.2/interactive/indexes-multicolumn.html
for details.

The explain plan indicates that your query is filtered on backup_id, but
is using the primary key index on (record_id, backup_id).  Based on the
table definition, you do not have any good index for filtering on backup_id.

The explain plan also seems way off, as I would expect a sequential scan
would be used without a good index for backup_id.  Did you disable
sequential scans before running this query?  Have you altered any other
configuration or planner parameters?

As your backup_location_rid is not necessary, I would recommend
dropping that index and creating a new one on just backup_id.  This
should be a net wash on space, and the new index should make for a
straight index scan for the query you presented.  Don't forget to
analyze after changing the indexes.
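
In SQL that recommendation is (a sketch; the new index name is arbitrary, and 
the later follow-up in this thread actually keeps backup_location_rid for 
another query, so the DROP is optional):

DROP INDEX backup_location_rid;
CREATE INDEX backup_location_bid ON backup_location (backup_id);
ANALYZE backup_location;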

Hope this helps.

Andrew


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] Slow queries on big table

2007-05-18 Thread Tom Lane
Scott Marlowe [EMAIL PROTECTED] writes:
 Secondly, it might be more efficient for the planner to choose the 
 backup_location_rid index than the combination primary key index.

Oh, I'm an idiot; I didn't notice the way the index was set up.  Yeah,
that index pretty well sucks for a query on backup_id --- it has to scan
the entire index, since there's no constraint on the leading column.
So that's where the time is going.

This combination of indexes:

 Indexes:
 backup_location_pkey PRIMARY KEY, btree (record_id, backup_id)
 backup_location_rid btree (record_id)

is really just silly.  You should have the pkey and then an index on
backup_id alone.  See the discussion of multiple indexes in the fine
manual:
http://www.postgresql.org/docs/8.2/static/indexes-multicolumn.html
http://www.postgresql.org/docs/8.2/static/indexes-bitmap-scans.html

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] choosing fillfactor

2007-05-18 Thread Heikki Linnakangas

Gene Hart wrote:
I've tried searching the documentation to answer this question but could
not find anything. When trying to choose the optimal fillfactor for an
index, which matters: the number of times the row is updated, or the number
of times the indexed column is updated? In my case each row is updated
about 5 times on average, but some of the indexed columns never change
after insertion. Thanks for any advice.


It's the number of times the row is updated, regardless of which columns 
are changed.
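
For reference, fillfactor is set per index; a quick sketch on 8.2 (table and
index names here are made up):

    -- leave 30% of each leaf page free to absorb future updates
    CREATE INDEX orders_status_idx ON orders (status) WITH (fillfactor = 70);

    -- or adjust an existing index; only newly built pages use the new value,
    -- so REINDEX if you want it applied to the whole index
    ALTER INDEX orders_status_idx SET (fillfactor = 70);
    REINDEX INDEX orders_status_idx;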



--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] Slow queries on big table

2007-05-18 Thread Tyrrill, Ed
Tom Lane [EMAIL PROTECTED] writes:
 
 Scott Marlowe [EMAIL PROTECTED] writes:
  Secondly, it might be more efficient for the planner to choose the 
  backup_location_rid index than the combination primary key index.
 
 Oh, I'm an idiot; I didn't notice the way the index was set up.
 Yeah, that index pretty well sucks for a query on backup_id ---
 it has to scan the entire index, since there's no constraint on the
 leading column.
 So that's where the time is going.
 
 This combination of indexes:
 
  Indexes:
  backup_location_pkey PRIMARY KEY, btree (record_id, backup_id)
  backup_location_rid btree (record_id)
 
 is really just silly.  You should have the pkey and then an index on
 backup_id alone.  See the discussion of multiple indexes in the fine
 manual:
 http://www.postgresql.org/docs/8.2/static/indexes-multicolumn.html
 http://www.postgresql.org/docs/8.2/static/indexes-bitmap-scans.html
 
   regards, tom lane

Thanks for the help guys!  That was my problem.  I actually need the
backup_location_rid index for a different query so I am going to keep
it.  Here is the result with the new index:

mdsdb=# explain analyze select record_id from backup_location where
backup_id = 1070;
                                                  QUERY PLAN
------------------------------------------------------------------------------------------------------------
 Index Scan using backup_location_bid on backup_location
(cost=0.00..9573.07 rows=415897 width=8) (actual time=0.106..3.486
rows=2752 loops=1)
   Index Cond: (backup_id = 1070)
 Total runtime: 4.951 ms
(3 rows)

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Slow queries on big table

2007-05-18 Thread Tom Lane
Tyrrill, Ed [EMAIL PROTECTED] writes:
 Tom Lane [EMAIL PROTECTED] writes:
 This combination of indexes:

 Indexes:
 backup_location_pkey PRIMARY KEY, btree (record_id, backup_id)
 backup_location_rid btree (record_id)

 is really just silly.  You should have the pkey and then an index on
 backup_id alone.

 Thanks for the help guys!  That was my problem.  I actually need the
 backup_location_rid index for a different query so I am going to keep
 it.

Well, you don't really *need* it; the two-column index on (record_id,
backup_id) will serve perfectly well for queries on its leading column
alone.  It'll be physically bigger and hence slightly slower to scan
than a single-column index; but unless the table is almost completely
read-only, the update overhead of maintaining all three indexes is
probably going to cost more than you can save with it.  Try that other
query with and without backup_location_rid and see how much you're
really saving.

  Index Scan using backup_location_bid on backup_location
 (cost=0.00..9573.07 rows=415897 width=8) (actual time=0.106..3.486
 rows=2752 loops=1)
Index Cond: (backup_id = 1070)
  Total runtime: 4.951 ms

That's more like it ;-)

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] CPU Intensive query

2007-05-18 Thread Abu Mushayeed
What Postgres version is this?
   
  8.1.3
  
 set enable_nestloop = off; 

What's the rationale for this?

  To eliminate the nested loop. It does a nested loop between two very large
tables (millions of rows).
   
   HashAggregate (cost=152555.97..152567.32 rows=267 width=162)

152000 disk page fetches is a bit, but it shouldn't take 24 hours. There's
probably misestimation involved at some point here. Does it really return 267
rows, or many more?

  It returns finally about 19-20 thousand rows.


Steinar H. Gunderson [EMAIL PROTECTED] wrote:
  On Fri, May 18, 2007 at 09:02:52AM -0700, Abu Mushayeed wrote:
 I have an interesting problem. I have the following query that ran ok on
 Monday and Tuesday and it has been running ok since I have been at this
 job. I have seen it to be IO intensive, but since Wednesday it has become
 CPU intensive. Database wise fresh data has been put into the tables,
 vacuumed  analyzed, no other parameter has been modified.

What Postgres version is this?

 The query is as follows and it's explain plan is also attached:

Normally EXPLAIN ANALYZE data would be much better than EXPLAIN, but if the
query indeed does not finish, it's not going to help much.

 set enable_nestloop = off; 

What's the rationale for this?

 HashAggregate (cost=152555.97..152567.32 rows=267 width=162)

152000 disk page fetches is a bit, but it shouldn't take 24 hours. There's
probably misestimation involved at some point here. Does it really return 267
rows, or many more?

/* Steinar */
-- 
Homepage: http://www.sesse.net/

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


 

Re: [PERFORM] CPU Intensive query

2007-05-18 Thread Tom Lane
Abu Mushayeed [EMAIL PROTECTED] writes:
   The query is as follows and it's explain plan is also attached:

The query itself seems to be a simple join over not too many rows, so
I don't see how it could be taking 24 hours.  What I suspect is you're
incurring lots and lots of invocations of those user-written functions
and one of them has suddenly decided to get very slow.

regards, tom lane

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [PERFORM] CPU Intensive query

2007-05-18 Thread Steinar H. Gunderson
On Fri, May 18, 2007 at 02:37:27PM -0700, Abu Mushayeed wrote:
 set enable_nestloop = off; 
 What's the rationale for this?
 To eliminate the nested loop. It does a nested loop between two very large
 tables (millions of rows).

If the planner chooses a nested loop, it is because it believes it is the
most efficient solution. I'd turn it back on and try to figure out why the
planner was wrong. Note that a nested loop with an index scan on one or both
sides can easily be as efficient as anything.

Did you ANALYZE your tables recently? If the joins are really between
millions of rows and the planner thinks it's a couple thousands, the stats
sound rather off... 

 HashAggregate (cost=152555.97..152567.32 rows=267 width=162)
 152000 disk page fetches is a bit, but it shouldn't take 24 hours. There's
 probably misestimation involved at some point here. Does it really return 267
 rows, or many more?
 It returns finally about 19-20 thousand rows.

So the planner is off by a factor of at least a hundred. That's a good
first-level explanation for why it's slow, at least...

If you can, please provide EXPLAIN ANALYZE output for your query (after
running ANALYZE on all your tables, if you haven't already); even though
it will take some time, it usually makes this kind of performance debugging
much easier.
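
Spelled out, that sequence is roughly (the query itself isn't quoted in this
thread, so it is left out below):

    SET enable_nestloop = on;   -- back to the default so the planner can choose freely
    ANALYZE;                    -- refresh statistics on every table in the database
    -- then capture the plan of the problem query and post it:
    -- EXPLAIN ANALYZE <query>;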

/* Steinar */
-- 
Homepage: http://www.sesse.net/

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] CPU Intensive query

2007-05-18 Thread Steinar H. Gunderson
On Sat, May 19, 2007 at 12:32:33AM +0200, Steinar H. Gunderson wrote:
 Did you ANALYZE your tables recently? If the joins are really between
 millions of rows and the planner thinks it's a couple thousands, the stats
 sound rather off... 

Sorry, I forgot your first e-mail where you said you had both vacuumed and
analyzed recently. The estimates are still off, though -- the WHERE query
might be difficult to estimate properly. (I'm not sure how Tom arrived at
his conclusion of expensive user-defined functions, but given the usual
precision of his guesses, I'd check that too...)

/* Steinar */
-- 
Homepage: http://www.sesse.net/

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] 121+ million record table perf problems

2007-05-18 Thread Craig James




I've got a table with ~121 million records in it.  Select count on it 
currently takes ~45 minutes, and an update to the table to set a value 
on one of the columns I finally killed after it ran 17 hours and had 
still not completed.  Queries into the table are butt slow, and


The update query that started this all I had to kill after 17hours.  
It should have updated all 121+ million records.  That brought my 
select count down to 19 minutes, but still a far cry from acceptable.


If you have a column that needs to be updated often for all rows, 
separate it into a different table, and create a view that joins it back 
to the main table so that your application still sees the old schema.


This will greatly speed your update since (in Postgres) an update is 
the same as a delete+insert.  By updating that one column, you're 
re-writing your entire 121 million rows.  If you separate it, you're 
only rewriting that one column.  Don't forget to vacuum/analyze and 
reindex when you're done.
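
A rough sketch of that split, using the table and column named elsewhere in
this thread and assuming res has an integer primary key called id (the real
key column wasn't posted):

    -- move the frequently-updated column into its own narrow table
    CREATE TABLE res_state (
        id    integer PRIMARY KEY REFERENCES res(id),
        state integer
    );
    INSERT INTO res_state (id, state) SELECT id, state FROM res;

    -- hide the change from the application (reads only; writes against the
    -- view would need rules or an application change)
    ALTER TABLE res RENAME TO res_base;
    ALTER TABLE res_base DROP COLUMN state;
    CREATE VIEW res AS
        SELECT b.*, s.state
        FROM res_base b JOIN res_state s ON s.id = b.id;

    -- the bulk update now rewrites ~121M narrow rows instead of wide ones
    UPDATE res_state SET state = 5001;
    VACUUM ANALYZE res_state;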


Better yet, if you can stand a short down time, you can drop indexes on 
that column, truncate, then do 121 million inserts, and finally 
reindex.  That will be MUCH faster.


Craig



---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] Slow queries on big table

2007-05-18 Thread Tyrrill, Ed
Tom Lane [EMAIL PROTECTED] writes:
 Thanks for the help guys!  That was my problem.  I actually need the 
 backup_location_rid index for a different query so I am going to keep

 it.

 Well, you don't really *need* it; the two-column index on (record_id,
 backup_id) will serve perfectly well for queries on its leading column
 alone.  It'll be physically bigger and hence slightly slower to scan
 than a single-column index; but unless the table is almost completely
 read-only, the update overhead of maintaining all three indexes is
 probably going to cost more than you can save with it.  Try that other
 query with and without backup_location_rid and see how much you're
 really saving.

Well, the query that got me to add backup_location_rid took 105 minutes
using only the primary key index.  After I added backup_location_rid
the query was down to about 45 minutes.  Still not very good, and I am
still fiddling around with it.  The query is:

mdsdb=# explain analyze select backupobjects.record_id from
backupobjects left outer join backup_location using(record_id) where
backup_id is null;
 
                                                  QUERY PLAN
---------------------------------------------------------------------------------------------------------------
 Merge Left Join  (cost=0.00..21408455.06 rows=11790970 width=8) (actual
time=2784967.410..2784967.410 rows=0 loops=1)
   Merge Cond: (outer.record_id = inner.record_id)
   Filter: (inner.backup_id IS NULL)
   -  Index Scan using backupobjects_pkey on backupobjects
(cost=0.00..443484.31 rows=11790970 width=8) (actual
time=0.073..47865.957 rows=11805996 loops=1)
   -  Index Scan using backup_location_rid on backup_location
(cost=0.00..20411495.21 rows=162435366 width=12) (actual
time=0.110..2608485.437 rows=162426837 loops=1)
 Total runtime: 2784991.612 ms
(6 rows)

It is of course the same backup_location, but backupobjects is:

mdsdb=# \d backupobjects
   Table public.backupobjects
     Column     |            Type             | Modifiers
----------------+-----------------------------+-----------
 record_id      | bigint                      | not null
 dir_record_id  | integer                     |
 name           | text                        |
 extension      | character varying(64)       |
 hash           | character(40)               |
 mtime          | timestamp without time zone |
 size           | bigint                      |
 user_id        | integer                     |
 group_id       | integer                     |
 meta_data_hash | character(40)               |
Indexes:
backupobjects_pkey PRIMARY KEY, btree (record_id)
backupobjects_meta_data_hash_key UNIQUE, btree (meta_data_hash)
backupobjects_extension btree (extension)
backupobjects_hash btree (hash)
backupobjects_mtime btree (mtime)
backupobjects_size btree (size)

record_id in backupobjects has a many-to-many relationship to record_id
in backup_location.

Ed

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] 121+ million record table perf problems

2007-05-18 Thread Alvaro Herrera
Craig James wrote:

 Better yet, if you can stand a short down time, you can drop indexes on 
 that column, truncate, then do 121 million inserts, and finally 
 reindex.  That will be MUCH faster.

Or you can do a CLUSTER, which does all the same things automatically.
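
For example, with the 8.2-era syntax (the index name below is a guess; note
that the table is exclusively locked while it runs and it needs roughly the
table's size in free disk space):

    CLUSTER res_pkey ON res;   -- rewrite the table in index order, rebuilding its indexes
    ANALYZE res;               -- CLUSTER doesn't update statistics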

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] 121+ million record table perf problems

2007-05-18 Thread Greg Smith

On Fri, 18 May 2007, [EMAIL PROTECTED] wrote:


shared_buffers = 24MB
work_mem = 256MB
maintenance_work_mem = 512MB


You should take a minute to follow the suggestions at 
http://www.westnet.com/~gsmith/content/postgresql/pg-5minute.htm and set 
dramatically higher values for shared_buffers and effective_cache_size for 
your server.  Also, your work_mem figure may be OK for now, but if ever do 
have 10 people connect to this database at once and run big queries you 
could have an issue with it set that high--that's a per client setting.
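
For scale, on a dedicated box with a few GB of RAM that guide would put you
somewhere in this neighbourhood (numbers purely illustrative -- size them to
your own machine):

    shared_buffers = 512MB        # up from 24MB; needs a server restart
    effective_cache_size = 2GB    # roughly the RAM the OS has free for caching
    work_mem = 16MB               # per sort/hash, per client -- keep it modest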


After you're done with that, you should also follow the suggestions there 
to do a VACUUM ANALYZE.  That may knock out two other potential issues at 
once.  It will take a while to run, but I think you need it badly to sort 
out what you've already done.



random_page_cost = 100


I'm not sure what logic prompted this change, but after you correct the 
above you should return this to its default; if this is helping now it's 
only because other things are so far off from where they should be.



update res set state=5001;
The update query that started this all I had to kill after 17hours.  It 
should have updated all 121+ million records.  That brought my select count 
down to 19 minutes, but still a far cry from acceptable.


You should work on the select side of this first.  If that isn't running 
in a moderate amount of time, trying to get the much more difficult update 
to happen quickly is hopeless.


Once the select is under control, there are a lot of parameters to adjust 
that will affect the speed of the updates.  The first thing to do is 
dramatically increase checkpoint_segments; I would set that to at least 30 
in your situation.
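
In postgresql.conf that's just:

    checkpoint_segments = 30      # default is 3; spreads checkpoints out during bulk writes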


Also:  going to RAID-1 won't make a bit of difference to your update 
speed; could even make it worse.  Adding more RAM may not help much 
either.  If you don't have one already, the real key to improving 
performance in a heavy update situation is to get a better disk controller 
with a cache that helps accelerate writes.  Then the next step is to 
stripe this data across multiple disks in a RAID-0 configuration to split 
the I/O up.


You have a lot of work ahead of you.  Even after you resolve the gross 
issues here, you have a table that has around 10 indexes on it. 
Maintaining those is far from free; every time you update a single record 
in that table, the system has to update each of those indexes on top of 
the record update itself.  So you're really asking your system to do 
around 1.2 billion disk-related operations when you throw out your simple 
batch update against every row, and good luck getting that to run in a 
time frame that's less than days long.


The right way to get a feel for what's going on is to drop all the indexes 
except for the constraints and see how the bulk update runs after the 
parameter changes suggested above are in place and the database has been 
cleaned up with vacuum+analyze.  Once you have a feel for that, add some 
indexes back in and see how it degrades.  Then you'll know how adding each 
one of them impacts your performance.  I suspect you're going to have to 
redesign your indexing scheme before this is over.  I don't think your 
current design is ever going to work the way you expect it to.
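
As a sketch, the test cycle looks something like this (index and column names
below are made up -- substitute your real ones, keeping the primary key and
any unique constraints):

    DROP INDEX res_col1_idx;
    DROP INDEX res_col2_idx;
    -- ...repeat for the other non-constraint indexes

    VACUUM ANALYZE res;
    \timing
    UPDATE res SET state = 5001;

    -- then add indexes back one at a time and re-run the update to see
    -- how much each one costs
    CREATE INDEX res_col1_idx ON res (col1);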


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PERFORM] Background vacuum

2007-05-18 Thread Greg Smith

On Fri, 18 May 2007, Ron Mayer wrote:


Anecdotally ;-) I've found renice-ing reports to help


Let's break this down into individual parts:

1) Is there enough CPU-intensive activity in some database tasks that they 
can usefully be controlled by tools like nice?  Sure.


2) Is it so likely that you'll fall victim to a priority inversion problem 
that you shouldn't ever consider that technique?  No.


3) Does the I/O scheduler in modern OSes deal with a lot more things than 
just the CPU?  You bet.


4) Is vacuuming a challenging I/O demand?  Quite.

Add all this up, and the fact that you're satisfied with how nice has 
worked successfully for you doesn't have to conflict with an opinion that 
it's not the best approach for controlling vacuuming.  I just wouldn't 
extrapolate your experience too far here.
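
For completeness, the knob aimed specifically at vacuum's I/O impact (rather
than nice, which only touches CPU priority) is the cost-based vacuum delay,
e.g. in postgresql.conf (values illustrative):

    vacuum_cost_delay = 20      # sleep 20ms each time the cost limit is hit; 0 disables throttling
    vacuum_cost_limit = 200     # the default; lower it to throttle vacuum harder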


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate