[PERFORM] Planner not using column limit specified for one column for another column equal to first

2010-04-16 Thread Віталій Тимчишин
Hello.

I have a query that performs very poorly because a range restriction on the
join column is not applied to the other join columns:

select * from company this_ left outer join company_tag this_1_ on
this_.id=this_1_.company_id left outer join company_measures companymea2_ on
this_.id=companymea2_.company_id left outer join company_descr ces3_ on
this_.id=ces3_.company_id where this_1_.tag_id = 7 and this_.id>5000
and this_1_.company_id>5000
order by this_.id asc limit 1000;

(plan1.txt)
Total runtime: 7794.692 ms

At the same time, if I apply the restriction (>5000) to the other columns in
the query itself, it works like a charm:

select * from company this_ left outer join company_tag this_1_ on
this_.id=this_1_.company_id left outer join company_measures companymea2_ on
this_.id=companymea2_.company_id left outer join company_descr ces3_ on
this_.id=ces3_.company_id where this_1_.tag_id = 7 and this_.id>5000
and this_1_.company_id>5000
and companymea2_.company_id>5000 and ces3_.company_id>5000
order by this_.id asc limit 1000;

(plan2.txt)
Total runtime: 27.547 ms

I thought, and someone on this list told me, that this should be done
automatically. But I have a fairly recent server:
PostgreSQL 8.4.2 on amd64-portbld-freebsd8.0, compiled by GCC cc (GCC) 4.2.1
20070719  [FreeBSD], 64-bit
and it still does not work.

Am I misunderstanding something, or does this feature not work for such a query?

Best regards, Vitalii Tymchyshyn
Limit  (cost=497.46..1464.50 rows=1000 width=693) (actual 
time=7767.721..7793.047 rows=1000 loops=1)
  ->  Merge Left Join  (cost=497.46..5521612.64 rows=5709243 width=693) (actual 
time=7767.717..7790.274 rows=1000 loops=1)
Merge Cond: (this_.id = (companymea2_.company_id)::bigint)
->  Merge Left Join  (cost=404.31..4544508.54 rows=5709243 width=625) 
(actual time=4211.501..4227.215 rows=1000 loops=1)
  Merge Cond: (this_.id = (ces3_.company_id)::bigint)
  ->  Merge Join  (cost=37.87..2435536.00 rows=5709243 width=123) 
(actual time=0.069..8.584 rows=1000 loops=1)
Merge Cond: (this_.id = (this_1_.company_id)::bigint)
->  Index Scan using pk_comp_m on company this_  
(cost=0.00..2139324.53 rows=6886598 width=107) (actual time=0.031..1.876 
rows=1054 loops=1)
  Index Cond: (id > 5000)
->  Index Scan using company_tag_company_id_idx on 
company_tag this_1_  (cost=0.00..207678.85 rows=5709243 width=16) (actual 
time=0.027..1.757 rows=1000 loops=1)
  Index Cond: ((this_1_.company_id)::bigint > 5000)
  Filter: ((this_1_.tag_id)::bigint = 7)
  ->  Index Scan using company_descrs on company_descr ces3_  
(cost=0.00..2073526.89 rows=1996612 width=502) (actual time=0.014..2576.013 
rows=1097096 loops=1)
->  Index Scan using pk_comp_ms on company_measures companymea2_  
(cost=0.00..889427.81 rows=6821672 width=68) (actual time=0.020..1946.255 
rows=1097096 loops=1)
Total runtime: 7794.692 ms

Limit  (cost=330.57..4888.97 rows=1000 width=693) (actual time=0.125..26.011 
rows=1000 loops=1)
  ->  Merge Join  (cost=330.57..5260651.29 rows=1153986 width=693) (actual 
time=0.122..23.215 rows=1000 loops=1)
Merge Cond: (this_.id = (this_1_.company_id)::bigint)
->  Merge Join  (cost=292.71..5023728.99 rows=1391960 width=677) 
(actual time=0.090..16.615 rows=1054 loops=1)
  Merge Cond: ((companymea2_.company_id)::bigint = this_.id)
  ->  Merge Join  (cost=0.00..2846769.89 rows=1685017 width=570) 
(actual time=0.063..9.534 rows=1054 loops=1)
Merge Cond: ((companymea2_.company_id)::bigint = 
(ces3_.company_id)::bigint)
->  Index Scan using pk_comp_ms on company_measures 
companymea2_  (cost=0.00..848312.00 rows=565 width=68) (actual 
time=0.033..1.973 rows=1054 loops=1)
  Index Cond: ((company_id)::bigint > 5000)
->  Index Scan using company_descrs on company_descr ces3_  
(cost=0.00..1963172.96 rows=1685017 width=502) (actual time=0.020..2.152 
rows=1054 loops=1)
  Index Cond: ((ces3_.company_id)::bigint > 5000)
  ->  Index Scan using pk_comp_m on company this_  
(cost=0.00..2139324.53 rows=6886598 width=107) (actual time=0.022..1.855 
rows=1054 loops=1)
Index Cond: (this_.id > 5000)
->  Index Scan using company_tag_company_id_idx on company_tag this_1_  
(cost=0.00..207678.85 rows=5709243 width=16) (actual time=0.025..1.742 
rows=1000 loops=1)
  Index Cond: ((this_1_.company_id)::bigint > 5000)
  Filter: ((this_1_.tag_id)::bigint = 7)
Total runtime: 27.547 ms


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Planner not using column limit specified for one column for another column equal to first

2010-04-16 Thread Yeb Havinga

Віталій Тимчишин wrote:

Hello.

I have a query that performs very poorly because a range restriction on 
the join column is not applied to the other columns:


select * from company this_ left outer join company_tag this_1_ on 
this_.id=this_1_.company_id left outer join company_measures 
companymea2_ on this_.id=companymea2_.company_id left outer join 
company_descr ces3_ on this_.id=ces3_.company_id where this_1_.tag_id 
= 7 and this_.id>5000 
and this_1_.company_id>5000

order by this_.id asc limit 1000;

(plan1.txt)
Total runtime: 7794.692 ms

At the same time if I apply the limit (>5000) to other columns in 
query itself it works like a charm:


select * from company this_ left outer join company_tag this_1_ on 
this_.id=this_1_.company_id left outer join company_measures 
companymea2_ on this_.id=companymea2_.company_id left outer join 
company_descr ces3_ on this_.id=ces3_.company_id where this_1_.tag_id 
= 7 and this_.id>5000 
and this_1_.company_id>5000

and companymea2_.company_id>5000 and ces3_.company_id>5000
order by this_.id asc limit 1000;

(plan2.txt)
Total runtime: 27.547 ms

I've thought and someone in this list've told me that this should be 
done automatically.
Yes, if a query contains a=b and b=c, then the optimizer figures out that 
a=c as well (a, b and c are then members of the same equivalence class).


However, the two queries are not equivalent, since the joins you're using 
are outer joins. In the first, rows may be returned for company records 
with no matching ces3_ record; the ces3_ columns are NULL in that case. 
The second query cannot return such NULL ces3_ rows.
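The semantic difference can be illustrated with a toy example (hypothetical tables, not from this thread):

```sql
-- Toy illustration of why the two queries are not equivalent.
CREATE TEMP TABLE parent (id int);
CREATE TEMP TABLE child  (parent_id int, val int);
INSERT INTO parent VALUES (1), (2);
INSERT INTO child  VALUES (1, 10);   -- no child row for parent 2

-- The LEFT JOIN alone keeps parent 2, NULL-extended:
SELECT p.id, c.val
FROM parent p LEFT JOIN child c ON p.id = c.parent_id;
-- returns (1, 10) and (2, NULL)

-- A WHERE condition on the nullable side discards the NULL-extended row,
-- effectively turning the outer join into an inner join:
SELECT p.id, c.val
FROM parent p LEFT JOIN child c ON p.id = c.parent_id
WHERE c.parent_id > 0;
-- returns only (1, 10)
```

This is why a restriction that is safe to add in the WHERE clause of the inner-join form changes the result set of the outer-join form unless it is placed in the ON clause instead.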


Another thing: the number of rows guessed seems far off from the actual 
number of rows. Is the number 500 artificial, or are your statistics old, 
or your histogram/MCV lists too small?


regards,
Yeb Havinga




Re: [PERFORM] Planner not using column limit specified for one column for another column equal to first

2010-04-16 Thread Віталій Тимчишин
On 16 April 2010 at 11:31, Yeb Havinga wrote:

> Віталій Тимчишин wrote:
>
>> Hello.
>>
>> I have a query that performs very poor because there is a limit on join
>> column that is not applied to other columns:
>>
>> select * from company this_ left outer join company_tag this_1_ on
>> this_.id=this_1_.company_id left outer join company_measures companymea2_ on
>> this_.id=companymea2_.company_id left outer join company_descr ces3_ on
>> this_.id=ces3_.company_id where this_1_.tag_id = 7 and this_.id>5000 and
>> this_1_.company_id>5000
>> order by this_.id asc limit 1000;
>>
>> (plan1.txt)
>> Total runtime: 7794.692 ms
>>
>> At the same time if I apply the limit (>5000) to other columns in
>> query itself it works like a charm:
>>
>> select * from company this_ left outer join company_tag this_1_ on
>> this_.id=this_1_.company_id left outer join company_measures companymea2_ on
>> this_.id=companymea2_.company_id left outer join company_descr ces3_ on
>> this_.id=ces3_.company_id where this_1_.tag_id = 7 and this_.id>5000 and
>> this_1_.company_id>5000
>> and companymea2_.company_id>5000 and ces3_.company_id>5000
>> order by this_.id asc limit 1000;
>>
>> (plan2.txt)
>> Total runtime: 27.547 ms
>>
>> I've thought and someone in this list've told me that this should be done
>> automatically.
>>
> Yes, if you have in a query a=b and b=c, then the optimizer figures out
> that a=c as well. (a,b and c are then member of the same equivalence class).
>
> However both queries are not the same, since the joins you're using are
> outer joins. In the first it's possible that records are returned for
> company records with no matching ces3_ records, the ces3_ records is null in
> that case. In the second query no NULL ces3_ information may be returned.
>

OK, but when I move the restriction into the join conditions, the query is still fast:

select * from company this_ left outer join company_tag this_1_ on
this_.id=this_1_.company_id
left outer join company_measures companymea2_ on
this_.id=companymea2_.company_id and companymea2_.company_id>5000
left outer join company_descr ces3_ on this_.id=ces3_.company_id and
ces3_.company_id>5000
where this_1_.tag_id = 7 and this_.id>5000
and this_1_.company_id>5000
order by this_.id asc limit 1000;

(plan3.txt)
Total runtime: 26.327 ms
BTW: Changing the slow query to inner joins does not make it fast.


>
> Another thing is it seems that the number of rows guessed is far off from
> the actual number of rows, is the number 500 artificial or are you're
> statistics old or too small histogram/mcv's?
>

Nope, I suppose this is because of the limit. If I remove the limit, the
estimates are quite correct. There are ~6 million rows in each table.
Limit  (cost=293.40..1028.60 rows=1000 width=1209) (actual time=0.128..24.751 
rows=1000 loops=1)
  ->  Merge Left Join  (cost=293.40..4197731.11 rows=5709243 width=1209) 
(actual time=0.124..21.968 rows=1000 loops=1)
Merge Cond: (this_.id = (companymea2_.company_id)::bigint)
->  Merge Left Join  (cost=246.59..3681230.10 rows=5709243 width=1141) 
(actual time=0.099..15.284 rows=1000 loops=1)
  Merge Cond: (this_.id = (ces3_.company_id)::bigint)
  ->  Merge Join  (cost=37.87..2435536.00 rows=5709243 width=639) 
(actual time=0.074..8.487 rows=1000 loops=1)
Merge Cond: (this_.id = (this_1_.company_id)::bigint)
->  Index Scan using pk_comp_m on company this_  
(cost=0.00..2139324.53 rows=6886598 width=623) (actual time=0.037..1.860 
rows=1054 loops=1)
  Index Cond: (id > 5000)
->  Index Scan using company_tag_company_id_idx on 
company_tag this_1_  (cost=0.00..207678.85 rows=5709243 width=16) (actual 
time=0.027..1.758 rows=1000 loops=1)
  Index Cond: ((this_1_.company_id)::bigint > 5000)
  Filter: ((this_1_.tag_id)::bigint = 7)
  ->  Index Scan using cws_company_descr_unique on company_descr 
ces3_  (cost=0.00..1169511.92 rows=5757068 width=502) (actual time=0.020..1.788 
rows=1054 loops=1)
Index Cond: ((ces3_.company_id)::bigint > 5000)
->  Index Scan using pk_comp_ms on company_measures companymea2_  
(cost=0.00..440945.79 rows=565 width=68) (actual time=0.019..1.729 
rows=1054 loops=1)
  Index Cond: ((companymea2_.company_id)::bigint > 5000)
Total runtime: 26.327 ms




Re: [PERFORM] Planner not using column limit specified for one column for another column equal to first

2010-04-16 Thread Віталій Тимчишин
On 16 April 2010 at 11:25, Hannu Krosing wrote:

> On Fri, 2010-04-16 at 11:02 +0300, Віталій Тимчишин wrote:
> > Hello.
> >
> >
> > I have a query that performs very poor because there is a limit on
> > join column that is not applied to other columns:
> >
> >
> > select * from company this_ left outer join company_tag this_1_ on
> > this_.id=this_1_.company_id left outer join company_measures
> > companymea2_ on this_.id=companymea2_.company_id left outer join
> > company_descr ces3_ on this_.id=ces3_.company_id where this_1_.tag_id
> > = 7 and this_.id>5000
> > and this_1_.company_id>5000
> > order by this_.id asc limit 1000;
> >
> >
> > (plan1.txt)
> > Total runtime: 7794.692 ms
> >
> >
> > At the same time if I apply the limit (>5000) to other columns in
> > query itself it works like a charm:
> >
> >
> > select * from company this_ left outer join company_tag this_1_ on
> > this_.id=this_1_.company_id left outer join company_measures
> > companymea2_ on this_.id=companymea2_.company_id left outer join
> > company_descr ces3_ on this_.id=ces3_.company_id where this_1_.tag_id
> > = 7 and this_.id>5000
> > and this_1_.company_id>5000
> > and companymea2_.company_id>5000 and ces3_.company_id>5000
> > order by this_.id asc limit 1000;
>
> The queries are not the same.
>
> 2nd variant will not return the rows where there are no matching rows
> inthis_1_ , companymea2_ or ces3_.company_id
>
> A query equivalent to first one would be:
>
>
> select * from company this_
>  left outer join company_tag this_1_
>   on (this_.id=this_1_.company_id
>and this_1_.company_id>5000)
>   left outer join company_measures companymea2_
>   on (this_.id=companymea2_.company_id
>and companymea2_.company_id>5000)
>   left outer join company_descr ces3_
>   on (this_.id=ces3_.company_id
>and ces3_.company_id>5000)
>  where this_1_.tag_id = 7
>   and this_.id>5000
>  order by this_.id asc
>  limit 1000;
>

And it's still fast (see the plan in another mail), while the "inner join"
variant of the original query is still slow.


>
>
> I'm not sure that planner considers the above form of plan rewrite, nor
> that it would make much sense to do so unless there was a really small
> number of rows where x_.company_id>5000
>
Actually no:
select id > 5000, count(*) from company group by 1;
 f,1096042
 t,5725630

I don't know why the planner prefers to perform several merges of 1000 rows
against a million records (and the merges are what takes the time) instead of
taking 1000 records from the main table and then doing a nested loop. And
unless it filters the secondary tables with an index during retrieval, it
must read all the records that do NOT match the criteria before getting to
the correct ones.

set enable_mergejoin=false helps the original query, but that is a separate
problem: the first solution is simpler and could be applied by the planner
automatically, while the second requires rethinking/rewriting the LIMIT
estimation logic.
(Plan of the nested loop attached)
Limit  (cost=0.00..2369.36 rows=1000 width=1209) (actual time=0.179..41.155 
rows=1000 loops=1)
  ->  Nested Loop  (cost=0.00..15727940.41 rows=6638046 width=1209) (actual 
time=0.174..38.312 rows=1000 loops=1)
->  Nested Loop  (cost=0.00..11165483.75 rows=6701224 width=1141) 
(actual time=0.134..26.421 rows=1000 loops=1)
  ->  Nested Loop  (cost=0.00..5763844.03 rows=6765004 width=639) 
(actual time=0.066..14.389 rows=1000 loops=1)
->  Index Scan using pk_comp_m on company this_  
(cost=0.00..1152936.77 rows=6886598 width=623) (actual time=0.038..1.908 
rows=1054 loops=1)
  Index Cond: (id > 5000)
->  Index Scan using company_tag_company_id_idx on 
company_tag this_1_  (cost=0.00..0.66 rows=1 width=16) (actual 
time=0.004..0.006 rows=1 loops=1054)
  Index Cond: ((this_1_.company_id)::bigint = this_.id)
  Filter: ((this_1_.tag_id)::bigint = 7)
  ->  Index Scan using cws_company_descr_unique on company_descr 
ces3_  (cost=0.00..0.79 rows=1 width=502) (actual time=0.004..0.006 rows=1 
loops=1000)
Index Cond: ((ces3_.company_id)::bigint = this_.id)
->  Index Scan using pk_comp_ms on company_measures companymea2_  
(cost=0.00..0.67 rows=1 width=68) (actual time=0.004..0.006 rows=1 loops=1000)
  Index Cond: ((companymea2_.company_id)::bigint = this_.id)
Total runtime: 42.940 ms
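As an aside (not from the thread): if disabling merge joins helps only this one query, the setting need not be changed globally; it can be scoped to a single transaction with SET LOCAL, which reverts automatically at commit or rollback. A sketch using the thread's original query:

```sql
begin;
set local enable_mergejoin = off;  -- affects only this transaction

select * from company this_ left outer join company_tag this_1_ on
this_.id=this_1_.company_id left outer join company_measures companymea2_ on
this_.id=companymea2_.company_id left outer join company_descr ces3_ on
this_.id=ces3_.company_id where this_1_.tag_id = 7 and this_.id>5000
and this_1_.company_id>5000
order by this_.id asc limit 1000;

commit;  -- enable_mergejoin is restored here
```

This keeps the planner hint from leaking into other queries on the same connection.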




Re: [PERFORM] Planner not using column limit specified for one column for another column equal to first

2010-04-16 Thread Yeb Havinga

Віталій Тимчишин wrote:


BTW: Changing the slow query to inner joins does not make it fast.

I'm interested to see the query and plan of the slow query with inner joins.




Another thing is it seems that the number of rows guessed is far
off from the actual number of rows, is the number 500
artificial or are you're statistics old or too small histogram/mcv's?


Nope, I suppose this is because of limit. If I remove the limit, the 
estimations are quite correct. There are ~6 millions of row in each table.

Yes, that makes sense.

regards,
Yeb Havinga




Re: [PERFORM] Planner not using column limit specified for one column for another column equal to first

2010-04-16 Thread Віталій Тимчишин
On 16 April 2010 at 16:21, Yeb Havinga wrote:

> Віталій Тимчишин wrote:
>
>>
>> BTW: Changing slow query to inner joins do not make it fast
>>
> I'm interested to see the query and plan of the slow query with inner joins.
>

Here you are. The query:

select * from company this_ inner join company_tag this_1_ on
this_.id=this_1_.company_id
inner join company_measures companymea2_ on
this_.id=companymea2_.company_id
inner join company_descr ces3_ on this_.id=ces3_.company_id
where this_1_.tag_id = 7 and this_.id>5000
order by this_.id asc
limit 1000
;
Total runtime: 14088.942 ms
(plan is attached)

Best regards, Vitalii Tymchyshyn
Limit  (cost=227.15..883.22 rows=1000 width=1209) (actual 
time=14062.106..14087.375 rows=1000 loops=1)
  ->  Merge Join  (cost=227.15..4355277.70 rows=6638046 width=1209) (actual 
time=14062.101..14084.577 rows=1000 loops=1)
Merge Cond: (this_.id = (this_1_.company_id)::bigint)
->  Merge Join  (cost=185.14..4025978.59 rows=6757358 width=1193) 
(actual time=10692.975..10708.923 rows=1054 loops=1)
  Merge Cond: ((companymea2_.company_id)::bigint = this_.id)
  ->  Merge Join  (cost=0.00..1784574.44 rows=6821672 width=570) 
(actual time=0.111..9138.804 rows=1097096 loops=1)
Merge Cond: ((companymea2_.company_id)::bigint = 
(ces3_.company_id)::bigint)
->  Index Scan using pk_comp_ms on company_measures 
companymea2_  (cost=0.00..456350.36 rows=6821672 width=68) (actual 
time=0.066..1747.291 rows=1097096 loops=1)
->  Index Scan using cws_company_descr_unique on 
company_descr ces3_  (cost=0.00..1225899.00 rows=6821672 width=502) (actual 
time=0.033..1822.085 rows=1097096 loops=1)
  ->  Index Scan using pk_comp_m on company this_  
(cost=0.00..2139324.53 rows=6886598 width=623) (actual time=0.026..1.853 
rows=1054 loops=1)
Index Cond: (this_.id > 5000)
->  Index Scan using company_tag_company_id_idx on company_tag this_1_  
(cost=0.00..229167.56 rows=6765004 width=16) (actual time=0.028..1859.967 
rows=1075634 loops=1)
  Filter: ((this_1_.tag_id)::bigint = 7)
Total runtime: 14088.942 ms




Re: [PERFORM] Planner not using column limit specified for one column for another column equal to first

2010-04-16 Thread Tom Lane
Віталій Тимчишин  writes:
> I thought, and someone on this list told me, that this should be done
> automatically.

No, that's not true.  We do make deductions about transitive equalities,
ie, given WHERE a=b AND b=c the planner will infer a=c and use that if
it's helpful.  We don't make deductions about inequalities such as a>c.
In theory there's enough information available to do so, but overall
trying to do that would probably waste more cycles than it would save.
You'd need a lot of expensive new planner infrastructure, and in the
vast majority of queries it wouldn't produce anything very helpful.

As was pointed out, even if we had such logic it wouldn't apply in this
example, because the equality conditions aren't real equalities but
OUTER JOIN conditions.

regards, tom lane



Re: [PERFORM] stats collector suddenly causing lots of IO

2010-04-16 Thread Josh Kupershmidt
On Thu, Apr 15, 2010 at 6:31 PM, Tom Lane  wrote:
> Chris  writes:
>> I have a lot of centos servers which are running postgres.  Postgres isn't 
>> used
>> that heavily on any of them, but lately, the stats collector process keeps
>> causing tons of IO load.  It seems to happen only on servers with centos 5.
>
> Say, I just realized that both of you are complaining about stats
> collector overhead on centos 5 servers.  I hadn't been thinking in terms
> of OS-specific causes, but maybe that is what we need to consider.
> Can you tell me the exact kernel versions you are seeing these problems
> with?

uname -a says "... 2.6.18-92.1.13.el5 #1 SMP ... x86_64", and it's CentOS 5.2.

I'm not sure whether this is related to the stats collector problems
on this machine, but I noticed alarming table bloat in the catalog
tables pg_attribute, pg_attrdef, pg_depend, and pg_type. Perhaps this
has happened slowly over the past few months, but I discovered the
bloat when I ran the query from:
http://pgsql.tapoueh.org/site/html/news/20080131.bloat.html

on the most-active database on this server (OID 16389 from the
pgstat.stat I sent in). See attached table_bloat.txt. The autovacuum
settings for this server haven't been tweaked from the default; they
probably should have been, given the heavy bulk updates/inserts done.
Maybe there's another cause for this extreme catalog bloat, besides
the weak autovacuum settings, though.

Table sizes, according to pg_size_pretty(pg_total_relation_size(...)):
 * pg_attribute: 145 GB
 * pg_attrdef: 85 GB
 * pg_depend: 38 GB
 * pg_type: 3465 MB
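The sizes above can be reproduced with a query along these lines (a sketch; the catalog-schema filter and table list are assumptions matching the tables named above):

```sql
-- Report total on-disk size (table + indexes + TOAST) of the bloated
-- catalog tables, largest first.
SELECT c.relname,
       pg_size_pretty(pg_total_relation_size(c.oid)) AS total_size
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE n.nspname = 'pg_catalog'
  AND c.relname IN ('pg_attribute', 'pg_attrdef', 'pg_depend', 'pg_type')
ORDER BY pg_total_relation_size(c.oid) DESC;
```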

I'll try to send in strace outputs later today.

Josh
 schemaname | tablename    | reltuples | relpages | otta |   tbloat | wastedpages | wastedbytes | wastedsize | iname                           | ituples |  ipages | iotta |  ibloat | wastedipages | wastedibytes | wastedisize
------------+--------------+-----------+----------+------+----------+-------------+-------------+------------+---------------------------------+---------+---------+-------+---------+--------------+--------------+-------------
 pg_catalog | pg_attribute |         0 | 12178069 |    0 |      0.0 |    12178069 | 99762741248 | 93 GB      | pg_attribute_relid_attnam_index |       0 | 4525199 |     0 |     0.0 |      4525199 |  37070430208 | 35 GB
 pg_catalog | pg_attribute |         0 | 12178069 |    0 |      0.0 |    12178069 | 99762741248 | 93 GB      | pg_attribute_relid_attnum_index |       0 |  848842 |     0 |     0.0 |       848842 |   6953713664 | 6632 MB
 pg_catalog | pg_attrdef   |       767 |  8789472 |   30 | 292982.4 |     8789442 | 72003108864 | 67 GB      | pg_attrdef_oid_index            |     767 |  361513 |    28 | 12911.2 |       361485 |   2961285120 | 2824 MB
 pg_catalog | pg_attrdef   |       767 |  8789472 |   30 | 292982.4 |     8789442 | 72003108864 | 67 GB      | pg_attrdef_adrelid_adnum_index  |     767 |  359805 |    28 | 12850.2 |       359777 |   2947293184 | 2811 MB
 pg_catalog | pg_depend    |     14270 |  1890691 |   98 |  19292.8 |     1890593 | 15487737856 | 14 GB      | pg_depend_depender_index        |   14270 | 1649557 |    65 | 25377.8 |      1649492 |  13512638464 | 13 GB
 pg_catalog | pg_depend    |     14270 |  1890691 |   98 |  19292.8 |     1890593 | 15487737856 | 14 GB      | pg_depend_reference_index       |   14270 | 1371522 |    65 | 21100.3 |      1371457 |  11234975744 | 10 GB
 pg_catalog | pg_type      |      1333 |   385445 |   31 |  12433.7 |      385414 |  3157311488 | 3011 MB    | pg_type_oid_index               |    1333 |   54965 |    24 |  2290.2 |        54941 |    450076672 | 429 MB
 pg_catalog | pg_type      |      1333 |   385445 |   31 |  12433.7 |      385414 |  3157311488 | 3011 MB    | pg_type_typname_nsp_index       |    1333 |    3129 |    24 |   130.4 |         3105 |     25436160 | 24 MB





Re: [PERFORM] stats collector suddenly causing lots of IO

2010-04-16 Thread Tom Lane
Josh Kupershmidt  writes:
> I'm not sure whether this is related to the stats collector problems
> on this machine, but I noticed alarming table bloat in the catalog
> tables pg_attribute, pg_attrdef, pg_depend, and pg_type.

Hmm.  That makes me wonder if autovacuum is functioning properly at all.
What does pg_stat_all_tables show for the last vacuum and analyze times
of those tables?  Try something like

select 
relname,n_live_tup,n_dead_tup,last_vacuum,last_autovacuum,last_analyze,last_autoanalyze
 from pg_stat_all_tables where schemaname = 'pg_catalog' order by 1;

regards, tom lane



Re: [PERFORM] stats collector suddenly causing lots of IO

2010-04-16 Thread Tom Lane
Chris  writes:
> After the file was made larger and I stopped the vacuum process, I started
> seeing the problem.  All other postgress processes were quiet, but the stats
collector was constantly causing anywhere from 20-60% of the IO on the server.
> Since all the other postgres processes weren't really doing anything, and it 
> is
> a busy web server which is predominately MySQL, I'm fairly curious as to what
> it is doing.

Yeah, the stats collector rewrites the stats file every half second, if
there have been any changes since last time --- so the bigger the file,
the more overhead.  (8.4 is smarter about this, but that doesn't help
you on 8.3.)

> I straced the stats collector process.  I wasn't sure what else to trace as
> there wasn't a single other postgres process doing anything.

That strace doesn't really prove much; it's what I'd expect.  Here's
what to do: start a PG session, and strace that session's backend *and*
the stats collector while you manually do VACUUM some-small-table.
The VACUUM command should try to send some messages to the stats collector
process.  I'm wondering if those get dropped somehow.
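Tom's procedure might look roughly like this (a sketch; the PIDs and the table name are placeholders, not from the thread):

```shell
# 1. Find the stats collector's PID:
ps aux | grep "[s]tats collector"

# 2. In one terminal, attach strace to the stats collector
#    (replace <collector_pid> with the PID found above):
strace -tt -o collector.trace -p <collector_pid>

# 3. Open a psql session, find its backend PID
#    (e.g. with "SELECT pg_backend_pid();"), and trace it too:
strace -tt -o backend.trace -p <backend_pid>

# 4. In the psql session run:
#      VACUUM some_small_table;
#    then stop both straces and compare the backend's sendto() calls
#    with the collector's recvfrom() calls.
```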

regards, tom lane



Re: [PERFORM] stats collector suddenly causing lots of IO

2010-04-16 Thread Josh Kupershmidt
On Fri, Apr 16, 2010 at 11:23 AM, Tom Lane  wrote:
> Josh Kupershmidt  writes:
>> I'm not sure whether this is related to the stats collector problems
>> on this machine, but I noticed alarming table bloat in the catalog
>> tables pg_attribute, pg_attrdef, pg_depend, and pg_type.
>
> Hmm.  That makes me wonder if autovacuum is functioning properly at all.
> What does pg_stat_all_tables show for the last vacuum and analyze times
> of those tables?  Try something like
>
> select 
> relname,n_live_tup,n_dead_tup,last_vacuum,last_autovacuum,last_analyze,last_autoanalyze
>  from pg_stat_all_tables where schemaname = 'pg_catalog' order by 1;
>

Output attached. Note that I ran pg_stat_reset() a few days ago.
Josh
# select 
relname,n_live_tup,n_dead_tup,last_vacuum,last_autovacuum,last_analyze,last_autoanalyze
 from pg_stat_all_tables where schemaname = 'pg_catalog' order by 1;
     relname      | n_live_tup | n_dead_tup | last_vacuum |        last_autovacuum        | last_analyze |       last_autoanalyze
------------------+------------+------------+-------------+-------------------------------+--------------+-------------------------------
 pg_aggregate     |          0 |          0 |             |                               |              |
 pg_am            |          0 |          0 |             |                               |              |
 pg_amop          |          0 |          0 |             |                               |              |
 pg_amproc        |          0 |          0 |             |                               |              |
 pg_attrdef       |          2 |    3527675 |             |                               |              |
 pg_attribute     |         56 |   12771286 |             |                               |              |
 pg_auth_members  |         11 |          0 |             |                               |              |
 pg_authid        |         16 |          2 |             |                               |              |
 pg_autovacuum    |          0 |          0 |             |                               |              |
 pg_cast          |          0 |          0 |             |                               |              |
 pg_class         |      30191 |     155416 |             | 2010-04-15 21:14:06.826358+00 |              | 2010-04-15 21:14:06.826358+00
 pg_constraint    |       2435 |        701 |             | 2010-04-15 16:13:33.92506+00  |              | 2010-04-15 16:13:33.92506+00
 pg_conversion    |          0 |          0 |             |                               |              |
 pg_database      |          0 |          0 |             |                               |              |
 pg_depend        |         55 |    4961774 |             |                               |              |
 pg_description   |          0 |          0 |             |                               |              |
 pg_enum          |          0 |          0 |             |                               |              |
 pg_index         |       4603 |       2272 |             | 2010-04-15 16:18:21.289279+00 |              | 2010-04-15 16:18:21.289279+00
 pg_inherits      |          2 |          0 |             |                               |              |
 pg_language      |          0 |          0 |             |                               |              |
 pg_largeobject   |          0 |          0 |             |                               |              |
 pg_listener      |         37 |     117512 |             |                               |              |
 pg_namespace     |          0 |          0 |             |                               |              |
 pg_opclass       |          0 |          0 |             |                               |              |
 pg_operator      |          0 |          0 |             |                               |              |
 pg_opfamily      |          0 |          0 |             |                               |              |
 pg_pltemplate    |          0 |          0 |             |                               |              |
 pg_proc          |          0 |          0 |             |                               |              |
 pg_rewrite       |          4 |          0 |             |                               |              |
 pg_shdepend      |        717 |     172806 |             | 2010-04-15 16:56:26.296809+00 |              | 2010-04-15 16:56:26.296809+00
 pg_shdescription |          0 |          0 |             |                               |              |
 pg_statistic     |          0 |          0 |             |                               |              |
 pg_tablespace    |          0 |          0 |             |                               |              |
 pg_trigger       |       7021 |       1232 |             | 2010-04-15 16:24:55.054294+00 |              | 2010-04-15 16:24:5

Re: [PERFORM] Planner not using column limit specified for one column for another column equal to first

2010-04-16 Thread Yeb Havinga

Tom Lane wrote:

Віталій Тимчишин  writes:
  

I've thought and someone in this list've told me that this should be done
automatically.



No, that's not true.  We do make deductions about transitive equalities,
ie, given WHERE a=b AND b=c the planner will infer a=c and use that if
it's helpful.  We don't make deductions about inequalities such as a>c.
In theory there's enough information available to do so, but overall
trying to do that would probably waste more cycles than it would save.
You'd need a lot of expensive new planner infrastructure, and in the
vast majority of queries it wouldn't produce anything very helpful.
  
New expensive planner infrastructure to support inferring a>c from a>b 
and b>c, yes.


But I wonder if something like Leibniz's principle of identity holds for 
members of the same equivalence class: if x and y are both members of 
the same EC, then for every predicate P, P(x) iff P(y). Probably not for 
every predicate (like varno = 2 or attname='x'), but for query 
evaluation the objects denoted by the variables are the same, since that 
is the standard meaning of the = operator. I cannot think of any 
standard (btree) operator where 'Leibniz' would fail in this case.


regards,
Yeb Havinga





Re: [PERFORM] stats collector suddenly causing lots of IO

2010-04-16 Thread Tom Lane
Josh Kupershmidt  writes:
> On Fri, Apr 16, 2010 at 11:23 AM, Tom Lane  wrote:
>> Hmm.  That makes me wonder if autovacuum is functioning properly at all.
>> What does pg_stat_all_tables show for the last vacuum and analyze times
>> of those tables?  Try something like
>> 
>> select 
>> relname,n_live_tup,n_dead_tup,last_vacuum,last_autovacuum,last_analyze,last_autoanalyze
>>  from pg_stat_all_tables where schemaname = 'pg_catalog' order by 1;

> Output attached. Note that I ran pg_stat_reset() a few days ago.

Wow.  Well, we have a smoking gun here: for some reason, autovacuum
isn't running, or isn't doing its job if it is.  If it's not running
at all, that would explain failure to prune the stats collector's file
too.

Is there anything in the postmaster log that would suggest autovac
difficulties?

regards, tom lane



Re: [PERFORM] Planner not using column limit specified for one column for another column equal to first

2010-04-16 Thread Tom Lane
Yeb Havinga  writes:
> New expensive planner infrastructure to support from a>b and b>c infer 
> a>c, yes.

> But I wonder if something like Leibniz's principle of identity holds for 
> members of the same equivalence class, e.g. like if x,y are both members 
> of the same EC, then for every predicate P, P(x) iff P(y).

This could only be assumed to apply for predicates constructed from
operators that are in the equivalence operator's btree opfamily.
Now, that would certainly be a large enough set of cases to sometimes
give useful results --- but I stand by the opinion that it wouldn't
win often enough to justify the added planner overhead.
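As an illustration of the opfamily restriction being discussed (a sketch with invented tables `a` and `b`; this shows the kind of deduction that would and would not be licensed):

```sql
-- a.x = b.y places a.x and b.y in one equivalence class.  Because < and >
-- belong to the same btree opfamily as =, "a.x > 5000" could in principle
-- be propagated to "b.y > 5000":
SELECT * FROM a JOIN b ON a.x = b.y WHERE a.x > 5000;

-- A predicate built from an operator outside that opfamily, e.g.
-- "a.x % 10 = 0", would not be licensed for propagation this way.
```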

regards, tom lane



Re: [PERFORM] stats collector suddenly causing lots of IO

2010-04-16 Thread Tom Lane
Josh Kupershmidt  writes:
> I made a small half-empty table like this:
>   CREATE TABLE test_vacuum (i  int  PRIMARY KEY);
>   INSERT INTO test_vacuum (i) SELECT a FROM generate_series(1,50) AS a;
>   DELETE FROM test_vacuum WHERE RANDOM() < 0.5;

> and then ran:
>   VACUUM test_vacuum;

> while an strace of the stats collector process was running. Then after
> a few seconds, found the PID of the VACUUM process, and ran strace on
> it. I killed them after the VACUUM finished. Outputs attached.

Huh.  The VACUUM strace clearly shows a boatload of TABPURGE messages
being sent:

sendto(7, "\2\0\0\0\350\3\0\...@\0\0\366\0\0\0\324\206<\24\321uC\24\320\350)\24\225\345,\24"..., 1000, 0, NULL, 0) = 1000
sendto(7, "\2\0\0\0\350\3\0\...@\0\0\366\0\0\0c\274?\24\365\323?\24\241n@\24\217\0309\24"..., 1000, 0, NULL, 0) = 1000
sendto(7, "\2\0\0\0\350\3\0\...@\0\0\366\0\0\0\375z2\24\211\f@\0241\3047\24\357mH\24"..., 1000, 0, NULL, 0) = 1000
sendto(7, "\2\0\0\0\350\3\0\...@\0\0\366\0\0\0\242\3529\24\234k\'\24\17\227)\24\300\22+\24"..., 1000, 0, NULL, 0) = 1000

and the stats collector is receiving them:

recvfrom(7, "\2\0\0\0\350\3\0\...@\0\0\366\0\0\0\324\206<\24\321uC\24\320\350)\24\225\345,\24"..., 1000, 0, NULL, NULL) = 1000
recvfrom(7, "\2\0\0\0\350\3\0\...@\0\0\366\0\0\0c\274?\24\365\323?\24\241n@\24\217\0309\24"..., 1000, 0, NULL, NULL) = 1000
recvfrom(7, "\2\0\0\0\350\3\0\...@\0\0\366\0\0\0\375z2\24\211\f@\0241\3047\24\357mH\24"..., 1000, 0, NULL, NULL) = 1000

So this *should* have resulted in the stats file shrinking.  Did you
happen to notice if it did, after you did this?

regards, tom lane



Re: [PERFORM] stats collector suddenly causing lots of IO

2010-04-16 Thread Josh Kupershmidt
On Fri, Apr 16, 2010 at 11:41 AM, Tom Lane  wrote:
> Wow.  Well, we have a smoking gun here: for some reason, autovacuum
> isn't running, or isn't doing its job if it is.  If it's not running
> at all, that would explain failure to prune the stats collector's file
> too.

Hrm, well autovacuum is at least trying to do work: it's currently
stuck on those bloated pg_catalog tables, of course. Another developer
killed an autovacuum of pg_attribute (or maybe it was pg_attrdef)
after it had been running for two weeks. See current pg_stat_activity
output attached, which shows the three autovacuum workers running plus
two manual VACUUM ANALYZEs I started yesterday.

> Is there anything in the postmaster log that would suggest autovac
> difficulties?

Yup, there are logs from April 1st which I just grepped through. I
attached the redacted output, and I see a few warnings about "[table]
contains more than "max_fsm_pages" pages with useful free space", as
well as "ERROR:  canceling autovacuum task".

Perhaps bumping up max_fsm_pages and making autovacuum settings more
aggressive will help me? I was also planning to run a CLUSTER of those
four bloated pg_catalog tables -- is this safe, particularly for
tables like pg_attrdef which rely on OIDs?

Josh
# SELECT * FROM pg_stat_activity WHERE usename = 'postgres' AND current_query ILIKE '%vacuum%';
  datid  |  datname   | procpid | usesysid | usename  |                   current_query                    | waiting |          xact_start           |          query_start          |         backend_start         |  client_addr  | client_port
---------+------------+---------+----------+----------+----------------------------------------------------+---------+-------------------------------+-------------------------------+-------------------------------+---------------+-------------
 7651893 | master_dev |   20288 |       10 | postgres | VACUUM ANALYZE myschema.mytable;                   | f       | 2010-04-14 16:24:51.421088+00 | 2010-04-14 16:24:51.421088+00 | 2010-04-14 16:24:46.875911+00 | 192.168.24.63 |       54065
[snip SELECT * FROM pg_stat_activity]
   16389 | master     |    9596 |       10 | postgres | autovacuum: VACUUM ANALYZE pg_catalog.pg_attrdef   | f       | 2010-04-15 21:18:41.583044+00 | 2010-04-15 21:18:41.583044+00 | 2010-04-01 11:25:53.903941+00 |               |
   16389 | master     |   18043 |       10 | postgres | autovacuum: VACUUM ANALYZE pg_catalog.pg_attribute | f       | 2010-04-15 17:18:36.071479+00 | 2010-04-15 17:18:36.071479+00 | 2010-04-15 16:51:50.230827+00 |               |
   16389 | master     |   19608 |       10 | postgres | autovacuum: VACUUM ANALYZE pg_catalog.pg_depend    | f       | 2010-04-15 16:26:41.941069+00 | 2010-04-15 16:26:41.941069+00 | 2010-04-08 17:08:07.417122+00 |               |
   16389 | master     |   20278 |       10 | postgres | VACUUM ANALYZE myschema.mytable;                   | f       | 2010-04-14 16:24:35.7458+00   | 2010-04-14 16:24:35.7458+00   | 2010-04-14 16:24:28.632431+00 | 192.168.24.63 |       54064
(6 rows)
# grep -A 2 -B 2 -i "vacuum" *.log
postgresql-2010-04-01_00.log-2010-04-01 06:48:40 UTC username1 1.2.3.4 master WARNING:  there is already a transaction in progress
postgresql-2010-04-01_00.log-2010-04-01 06:48:44 UTC WARNING:  relation "pg_catalog.pg_depend" contains more than "max_fsm_pages" pages with useful free space
postgresql-2010-04-01_00.log:2010-04-01 06:48:44 UTC HINT:  Consider using VACUUM FULL on this relation or increasing the configuration parameter "max_fsm_pages".
postgresql-2010-04-01_00.log-2010-04-01 06:48:51 UTC username1 1.2.3.4 master WARNING:  there is already a transaction in progress
postgresql-2010-04-01_00.log-2010-04-01 06:48:52 UTC username1 1.2.3.4 master WARNING:  there is already a transaction in progress
--
[snip connection info]
postgresql-2010-04-01_00.log-2010-04-01 10:41:38 UTC WARNING:  relation "schema.table" contains more than "max_fsm_pages" pages with useful free space
postgresql-2010-04-01_00.log:2010-04-01 10:41:38 UTC HINT:  Consider using VACUUM FULL on this relation or increasing the configuration parameter "max_fsm_pages".
postgresql-2010-04-01_00.log-2010-04-01 10:41:46 UTC username1 1.2.3.4 master WARNING:  there is already a transaction in progress
postgresql-2010-04-01_00.log-2010-04-01 10:41:47 UTC username1 1.2.3.4 master WARNING:  there is already a transaction in progress
--
postgresql-2010-04-01_00.log-2010-04-01 10:49:33 UTC username1 1.2.3.4 master WARNING:  there is already a transaction in progress
postgresql-2010-04-01_00.log-2010-04-01 10:

Re: [PERFORM] Autovaccum with cost_delay does not complete on one solaris 5.10 machine

2010-04-16 Thread Josh Berkus

Tom,

Neither database has any per-table autovacuum settings.

However, since this is a production database, I had to try something, 
and set vacuum_cost_limit up to 1000.  The issue with vacuuming one page 
at a time went away, or at least I have not seen it repeat in the last 
16 hours.


--
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com



Re: [PERFORM] stats collector suddenly causing lots of IO

2010-04-16 Thread Tom Lane
I wrote:
> So this *should* have resulted in the stats file shrinking.  Did you
> happen to notice if it did, after you did this?

Oh, never mind that --- I can see that it did shrink, just from counting
the write() calls in the collector's strace.  So what we have here is a
demonstration that the tabpurge mechanism does work for you, when it's
invoked.  Which is further evidence that for some reason autovacuum is
not running for you.

What I'd suggest at this point is cranking up log_min_messages to DEBUG2
or so in postgresql.conf, restarting the postmaster, and keeping an eye
on the log to see if you can spot anything about why autovac isn't
working.
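In postgresql.conf terms, that suggestion is just the following (temporary; DEBUG2 is very chatty, so revert once the autovacuum launcher behavior has been captured in the log):

```
log_min_messages = debug2
```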

regards, tom lane



Re: [PERFORM] stats collector suddenly causing lots of IO

2010-04-16 Thread Tom Lane
Josh Kupershmidt  writes:
> On Fri, Apr 16, 2010 at 11:41 AM, Tom Lane  wrote:
>> Wow.  Well, we have a smoking gun here: for some reason, autovacuum
>> isn't running, or isn't doing its job if it is.  If it's not running
>> at all, that would explain failure to prune the stats collector's file
>> too.

> Hrm, well autovacuum is at least trying to do work: it's currently
> stuck on those bloated pg_catalog tables, of course. Another developer
> killed an autovacuum of pg_attribute (or maybe it was pg_attrdef)
> after it had been running for two weeks. See current pg_stat_activity
> output attached, which shows the three autovacuum workers running plus
> two manual VACUUM ANALYZEs I started yesterday.

Two weeks?  What have you got the autovacuum cost delays set to?

Once you're up to three AV workers, no new ones can get launched until
one of those finishes or is killed.  So that would explain failure to
prune the stats collector's tables (the tabpurge code is only run during
AV worker launch).  So what we need to figure out is why it's taking so
obscenely long to vacuum these tables ...

regards, tom lane



Re: [PERFORM] Autovaccum with cost_delay does not complete on one solaris 5.10 machine

2010-04-16 Thread Alvaro Herrera
Josh Berkus wrote:
> Tom,
> 
> Neither database has any per-table autovacuum settings.
> 
> However, since this is a production database, I had to try
> something, and set vacuum_cost_limit up to 1000.  The issue with
> vacuuming one page at a time went away, or at least I have not seen
> it repeat in the last 16 hours.

How many autovac workers are there?

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



Re: [PERFORM] Autovaccum with cost_delay does not complete on one solaris 5.10 machine

2010-04-16 Thread Josh Berkus



> How many autovac workers are there?


Max_workers is set to 3.  However, I've never seen more than one active 
at a time.


--
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com



Re: [PERFORM] stats collector suddenly causing lots of IO

2010-04-16 Thread Scott Carey

On Apr 16, 2010, at 9:48 AM, Tom Lane wrote:

> Josh Kupershmidt  writes:
>> On Fri, Apr 16, 2010 at 11:41 AM, Tom Lane  wrote:
>>> Wow.  Well, we have a smoking gun here: for some reason, autovacuum
>>> isn't running, or isn't doing its job if it is.  If it's not running
>>> at all, that would explain failure to prune the stats collector's file
>>> too.
> 
>> Hrm, well autovacuum is at least trying to do work: it's currently
>> stuck on those bloated pg_catalog tables, of course. Another developer
>> killed an autovacuum of pg_attribute (or maybe it was pg_attrdef)
>> after it had been running for two weeks. See current pg_stat_activity
>> output attached, which shows the three autovacuum workers running plus
>> two manual VACUUM ANALYZEs I started yesterday.
> 
> Two weeks?  What have you got the autovacuum cost delays set to?
> 
> Once you're up to three AV workers, no new ones can get launched until
> one of those finishes or is killed.  So that would explain failure to
> prune the stats collector's tables (the tabpurge code is only run during
> AV worker launch).  So what we need to figure out is why it's taking so
> obscenely long to vacuum these tables ...
> 

On any large system with good I/O I have had to significantly increase the 
aggressiveness of autovacuum.
Even with the settings below, it doesn't interfere with other activity 
(~2200 iops random, ~900 MB/sec sequential capable I/O).

My relevant autovacuum parameters are (from 'show *'):
 autovacuum                      | on    | Starts the autovacuum subprocess.
 autovacuum_analyze_scale_factor | 0.1   | Number of tuple inserts, updates or deletes prior to analyze as a fraction of reltuples.
 autovacuum_analyze_threshold    | 50    | Minimum number of tuple inserts, updates or deletes prior to analyze.
 autovacuum_freeze_max_age       | 2     | Age at which to autovacuum a table to prevent transaction ID wraparound.
 autovacuum_max_workers          | 3     | Sets the maximum number of simultaneously running autovacuum worker processes.
 autovacuum_naptime              | 1min  | Time to sleep between autovacuum runs.
 autovacuum_vacuum_cost_delay    | 20ms  | Vacuum cost delay in milliseconds, for autovacuum.
 autovacuum_vacuum_cost_limit    | 2000  | Vacuum cost amount available before napping, for autovacuum.
 autovacuum_vacuum_scale_factor  | 0.2   | Number of tuple updates or deletes prior to vacuum as a fraction of reltuples.
 autovacuum_vacuum_threshold     | 50




For what it is worth, I just went onto one of my systems -- one with lots of 
partition tables and temp table creation/destruction -- and looked at the 
system tables in question there.

Postgres 8.4, using \dt+ (result below trimmed to the interesting tables)

   Schema   |     Name      | Type  |  Owner   |  Size   | Description
------------+---------------+-------+----------+---------+-------------
 pg_catalog | pg_attrdef    | table | postgres | 195 MB  |
 pg_catalog | pg_attribute  | table | postgres | 1447 MB |
 pg_catalog | pg_class      | table | postgres | 1694 MB |
 pg_catalog | pg_constraint | table | postgres | 118 MB  |
 pg_catalog | pg_depend     | table | postgres | 195 MB  |
 pg_catalog | pg_statistic  | table | postgres | 2300 MB |
 pg_catalog | pg_type       | table | postgres | 181 MB  |

So I ran a VACUUM FULL; REINDEX TABLE; ANALYZE; sequence on each of these.
I wish I could just CLUSTER them, but the above works.

now the tables are:
   Schema   |     Name      | Type  |  Owner   |  Size   | Description
------------+---------------+-------+----------+---------+-------------
 pg_catalog | pg_attrdef    | table | postgres | 44 MB   |
 pg_catalog | pg_attribute  | table | postgres | 364 MB  |
 pg_catalog | pg_class      | table | postgres | 1694 MB |
 pg_catalog | pg_constraint | table | postgres | 118 MB  |
 pg_catalog | pg_depend     | table | postgres | 195 MB  |
 pg_catalog | pg_statistic  | table | postgres | 656 MB  |
 pg_catalog | pg_type       | table | postgres | 45 MB   |


I've learned to accept about 50% bloat (2x the compacted size) in postgres as 
just the way it usually is on a busy table, but the 3x and 4x bloat of 
statistic, attrdef, and attribute have me wondering.

I have had some 'idle in transaction' connections hanging out from time to time 
that have caused issues on this machine that could explain the above 
perma-bloat.  That is one thing that could affect the case reported here as 

Re: [PERFORM] stats collector suddenly causing lots of IO

2010-04-16 Thread Scott Carey
> 
> I have had some 'idle in transaction' connections hanging out from time to 
> time that have caused issues on this machine that could explain the above 
> perma-bloat.  That is one thing that could affect the case reported here as 
> well.  The worst thing about those, is you can't even force kill those 
> connections from within postgres (pg_cancel_backend doesn't work on them, and 
> killing them via the OS bounces postgres ...) so you have to hunt down the 
> offending client.
> 

Ooh, I just noticed pg_terminate_backend() ... maybe this will let me kill 
annoying idle-in-transaction clients. I guess this arrived in 8.4? Hopefully 
it won't cause the whole thing to bounce and close all other backends.
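A sketch of how that could be used against long idle-in-transaction sessions on 8.4 (the one-hour cutoff is an invented threshold; pg_terminate_backend() sends SIGTERM to just that one backend, not the whole cluster):

```sql
-- Terminate backends that have sat idle in a transaction for over an hour:
SELECT procpid, pg_terminate_backend(procpid)
FROM pg_stat_activity
WHERE current_query = '<IDLE> in transaction'
  AND now() - query_start > interval '1 hour';
```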






Re: [PERFORM] stats collector suddenly causing lots of IO

2010-04-16 Thread Josh Kupershmidt
On Fri, Apr 16, 2010 at 12:48 PM, Tom Lane  wrote:
> Josh Kupershmidt  writes:
>> Hrm, well autovacuum is at least trying to do work: it's currently
>> stuck on those bloated pg_catalog tables, of course. Another developer
>> killed an autovacuum of pg_attribute (or maybe it was pg_attrdef)
>> after it had been running for two weeks. See current pg_stat_activity
>> output attached, which shows the three autovacuum workers running plus
>> two manual VACUUM ANALYZEs I started yesterday.
>
> Two weeks?  What have you got the autovacuum cost delays set to?

SELECT name, current_setting(name), source FROM pg_settings WHERE
source != 'default' AND name ILIKE '%vacuum%';
         name         | current_setting |       source
----------------------+-----------------+--------------------
 vacuum_cost_delay    | 200ms           | configuration file
 vacuum_cost_limit    | 100             | configuration file
 vacuum_cost_page_hit | 6               | configuration file
(3 rows)

I'm guessing these values and the default autovacuum configuration
values need to be cranked significantly to make vacuum much more
aggressive :-(

> Once you're up to three AV workers, no new ones can get launched until
> one of those finishes or is killed.  So that would explain failure to
> prune the stats collector's tables (the tabpurge code is only run during
> AV worker launch).  So what we need to figure out is why it's taking so
> obscenely long to vacuum these tables ...
>

Hopefully changing those three vacuum_cost_* params will speed up the
manual- and auto-vacuums. It'll take me a few days to see any
results, since I still need to do something about the bloat that's
already there.

Josh



Re: [PERFORM] stats collector suddenly causing lots of IO

2010-04-16 Thread Greg Smith

Josh Kupershmidt wrote:

> SELECT name, current_setting(name), source FROM pg_settings WHERE
> source != 'default' AND name ILIKE '%vacuum%';
>          name         | current_setting |       source
> ----------------------+-----------------+--------------------
>  vacuum_cost_delay    | 200ms           | configuration file
>  vacuum_cost_limit    | 100             | configuration file
>  vacuum_cost_page_hit | 6               | configuration file
>
> Hopefully changing those three vacuum_cost_* params will speed up the
> manual- and auto-vacuums..


Those only impact manual VACUUM statements.  There's a different set 
with names like autovacuum_vacuum_cost_delay that control the daemon.  
You can set those to "-1" in order to match the regular VACUUM, but 
that's not the default.
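A quick way to inspect those daemon-side knobs (a sketch; on 8.3 the shipped defaults are, to my understanding, 20ms for the delay and -1 for the limit, where -1 means "inherit the plain VACUUM value"):

```sql
-- Inspect the autovacuum-specific cost settings:
SELECT name, setting
FROM pg_settings
WHERE name IN ('autovacuum_vacuum_cost_delay',
               'autovacuum_vacuum_cost_limit');
```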


You really need to sort out the max_fsm_pages setting too, because until 
that issue goes away these tables are unlikely to ever stop growing.  
And, no, you can't use CLUSTER on the system tables to clean those up.


--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us




Re: [PERFORM] stats collector suddenly causing lots of IO

2010-04-16 Thread Josh Kupershmidt
On Fri, Apr 16, 2010 at 2:14 PM, Greg Smith  wrote:
> Josh Kupershmidt wrote:
>>
>> SELECT name, current_setting(name), source FROM pg_settings WHERE
>> source != 'default' AND name ILIKE '%vacuum%';
>>         name         | current_setting |       source
>> ----------------------+-----------------+--------------------
>>  vacuum_cost_delay    | 200ms           | configuration file
>>  vacuum_cost_limit    | 100             | configuration file
>>  vacuum_cost_page_hit | 6               | configuration file
>>
>>  Hopefully changing those three vacuum_cost_* params will speed up the
>> manual- and auto-vacuums..
>
> Those only impact manual VACUUM statements.  There's a different set with
> names like autovacuum_vacuum_cost_delay that control the daemon.  You can
> set those to "-1" in order to match the regular VACUUM, but that's not the
> default.

It looks like I have the default autovacuum_vacuum_cost_limit = -1, which
means it's inheriting the vacuum_cost_limit of 100 I had set. I'll try
bumping vacuum_cost_limit up to 1000 or so.

> You really need to sort out the max_fsm_pages setting too, because until
> that issue goes away these tables are unlikely to ever stop growing.  And,
> no, you can't use CLUSTER on the system tables to clean those up.

I have max_fsm_pages = 524288 , but from the hints in the logfiles
this obviously needs to go up much higher. And it seems the only way
to compact the pg_catalog tables is VACUUM FULL + REINDEX on 8.3 -- I
had tried the CLUSTER on my 9.0 machine and wrongly assumed it would
work on 8.3, too.
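For reference, the 8.3-era compaction amounts to something like this per table (a sketch only; VACUUM FULL takes an exclusive lock on the table, so it needs a maintenance window):

```sql
-- Compact one bloated catalog table on 8.3 (repeat per table):
VACUUM FULL pg_catalog.pg_attribute;   -- rewrites the heap, exclusive lock
REINDEX TABLE pg_catalog.pg_attribute; -- rebuild indexes bloated by VACUUM FULL
ANALYZE pg_catalog.pg_attribute;       -- refresh planner stats afterwards
```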

Josh



Re: [PERFORM] stats collector suddenly causing lots of IO

2010-04-16 Thread Greg Smith

Josh Kupershmidt wrote:

> And it seems the only way
> to compact the pg_catalog tables is VACUUM FULL + REINDEX on 8.3 -- I
> had tried the CLUSTER on my 9.0 machine and wrongly assumed it would
> work on 8.3, too.


Right; that just got implemented a couple of months ago.  See the news 
from http://www.postgresql.org/community/weeklynews/pwn20100214 for a 
summary of how the code was gyrated around to support that.  This is a 
tough situation to get out of in <9.0 because VACUUM FULL is slow and 
takes an exclusive lock on the table.  That tends to lead toward an 
unpredictable window for required downtime, which is never good.


--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us




Re: [PERFORM] stats collector suddenly causing lots of IO

2010-04-16 Thread Tom Lane
Josh Kupershmidt  writes:
>          name         | current_setting |       source
> ----------------------+-----------------+--------------------
>  vacuum_cost_delay    | 200ms           | configuration file
>  vacuum_cost_limit    | 100             | configuration file
>  vacuum_cost_page_hit | 6               | configuration file
> 
> It looks like the default which I have of autovacuum_vacuum_cost_limit
> = -1, which means it's inheriting the vacuum_cost_limit of 100 I had
> set. I'll try bumping vacuum_cost_limit up to 1000 or so.

Actually I think the main problem is that cost_delay value, which is
probably an order of magnitude too high.  The way to limit vacuum's
I/O impact on other stuff is to make it take frequent short delays,
not have it run full speed and then sleep a long time.  In any case,
your current settings have got it sleeping way too much.  Two WEEKS !!!??
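A rough back-of-the-envelope for the reported settings (assuming every page is a buffer hit at vacuum_cost_page_hit = 6; real runs with cache misses and dirtied pages are charged more per page and go slower still):

```sql
-- Upper bound on vacuum throughput with cost_limit=100, cost_delay=200ms:
SELECT floor(100.0 / 6)                     AS pages_per_cycle, -- 16 pages, then sleep
       floor(100.0 / 6) / 0.200             AS pages_per_sec,   -- 80 pages/s
       floor(100.0 / 6) / 0.200 * 8 / 1024  AS mb_per_sec;      -- 8 kB pages: ~0.6 MB/s
```

At well under 1 MB/s, a multi-GB catalog table would indeed take days to weeks.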

regards, tom lane



Re: [PERFORM] stats collector suddenly causing lots of IO

2010-04-16 Thread Josh Kupershmidt
On Fri, Apr 16, 2010 at 3:22 PM, Tom Lane  wrote:
> Josh Kupershmidt  writes:
>>         name         | current_setting |       source
>> ----------------------+-----------------+--------------------
>>  vacuum_cost_delay    | 200ms           | configuration file
>>  vacuum_cost_limit    | 100             | configuration file
>>  vacuum_cost_page_hit | 6               | configuration file
>>
>> It looks like the default which I have of autovacuum_vacuum_cost_limit
>> = -1, which means it's inheriting the vacuum_cost_limit of 100 I had
>> set. I'll try bumping vacuum_cost_limit up to 1000 or so.
>
> Actually I think the main problem is that cost_delay value, which is
> probably an order of magnitude too high.  The way to limit vacuum's
> I/O impact on other stuff is to make it take frequent short delays,
> not have it run full speed and then sleep a long time.  In any case,
> your current settings have got it sleeping way too much.  Two WEEKS !!!??

Yup, I was going to turn vacuum_cost_delay down to 20. The two weeks
was for the pg_catalog table which has bloated to 145 GB, I think. One
of those manual VACUUMs I kicked off just finished, after 48 hours --
and that table was only 25 GB or so. I wasn't the one who set up this
postgresql.conf, but I am stuck fixing things :/



Re: [PERFORM] Planner not using column limit specified for one column for another column equal to first

2010-04-16 Thread Віталій Тимчишин
On 16 April 2010 at 17:19, Tom Lane wrote:

> Віталій Тимчишин  writes:
> > I've thought and someone in this list've told me that this should be done
> > automatically.
>
> As was pointed out, even if we had such logic it wouldn't apply in this
> example, because the equality conditions aren't real equalities but
> OUTER JOIN conditions.
>
>
In this case you can copy the condition into the "ON" clause instead of the
WHERE clause, and this works correctly, e.g. "select something from a join b
on a.x=b.y where a.x > n" is equivalent to "select something from a join b
on a.x=b.y and b.y > n where a.x > n".
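Applied to the query at the top of this thread, that workaround would look roughly like this (a sketch reusing the original aliases and constants):

```sql
select * from company this_
left outer join company_tag this_1_
  on this_.id = this_1_.company_id and this_1_.company_id > 5000
left outer join company_measures companymea2_
  on this_.id = companymea2_.company_id and companymea2_.company_id > 5000
left outer join company_descr ces3_
  on this_.id = ces3_.company_id and ces3_.company_id > 5000
where this_1_.tag_id = 7 and this_.id > 5000
order by this_.id asc limit 1000;
```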

As for making the planner more clever, maybe it is possible to introduce a
division into "fast queries" and "long queries", so that if, after fast
planning, the cost is greater than some configurable threshold, advanced
planning techniques (or settings) are used. As far as I have seen on this
list, many techniques are not used simply because they are too complex and
could make planning take too much time for really fast queries, but they are
vital for long ones.
The same (or a similar) threshold could also be used to enable replanning on
each run of a prepared query; another frequent complaint is that a prepared
query's plan is not as fast as it could be.

-- 
Best regards,
Vitalii Tymchyshyn