Re: [PERFORM] pg_dump and thousands of schemas

2012-05-31 Thread Tatsuo Ishii
>>> We recently fixed a couple of O(N^2) loops in pg_dump, but those covered
>>> extremely specific cases that might or might not have anything to do
>>> with what you're seeing.  The complainant was extremely helpful about
>>> tracking down the problems:
>>> http://archives.postgresql.org/pgsql-general/2012-03/msg00957.php
>>> http://archives.postgresql.org/pgsql-committers/2012-03/msg00225.php
>>> http://archives.postgresql.org/pgsql-committers/2012-03/msg00230.php
>> 
>> I'm wondering if these fixes (or today's commit) cover the case of
>> a database with ~100,000 tables and indexes. One of my customers
>> has had trouble running pg_dump against such a database; it takes over 10
>> hours.
> 
> So I did a quick test with old PostgreSQL 9.0.2 and current (as of
> commit 2755abf386e6572bad15cb6a032e504ad32308cc). In a freshly initdb-ed
> database I created 100,000 tables, each with two integer
> attributes, one of them a primary key. Creating the tables was
> reasonably fast, as expected (18-20 minutes). This produced a 1.4GB
> database cluster.
> 
> pg_dump dbname >/dev/null took 188 minutes on 9.0.2, which is as
> painfully long as the customer complained. And what about current? Well, it
> took 125 minutes. ps showed that most of the time was spent in the backend.
> 
> Below is the script to create tables.
> 
> # create the 100,000 two-column tables, each with an integer primary key
> cnt=100000
> while [ $cnt -gt 0 ]
> do
> psql -e -p 5432 -c "create table t$cnt(i int primary key, j int);" test
> cnt=`expr $cnt - 1`
> done
> 
> p.s. You need to increase max_locks_per_transaction before running
> pg_dump (I raised it to 640 in my case).

Just for the record, I reran the test with my single-LOCK patch, and
now the total runtime of pg_dump is 113 minutes:
188 minutes (9.0) -> 125 minutes (git master) -> 113 minutes (with my patch).

I'm glad to see a 40% time saving at this point.
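
In case it is not obvious what the patch changes, here is a rough
sketch (with made-up table names, not actual pg_dump output): instead
of issuing one LOCK TABLE statement per relation to be dumped, pg_dump
sends a single statement naming all of them, so the whole lock
acquisition phase becomes one statement instead of 100,000 separate
ones:

    -- current behaviour: one statement (and one round trip) per table
    LOCK TABLE public.t1 IN ACCESS SHARE MODE;
    LOCK TABLE public.t2 IN ACCESS SHARE MODE;
    -- ...repeated for every table being dumped

    -- single-LOCK approach: one statement covering every table
    LOCK TABLE public.t1, public.t2, public.t3 IN ACCESS SHARE MODE;
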
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp



Re: [PERFORM] pg_dump and thousands of schemas

2012-05-31 Thread Robert Klemme
On Thu, May 31, 2012 at 10:45 AM, Tatsuo Ishii  wrote:
> Just for the record, I reran the test with my single-LOCK patch, and
> now the total runtime of pg_dump is 113 minutes:
> 188 minutes (9.0) -> 125 minutes (git master) -> 113 minutes (with my patch).
>
> I'm glad to see a 40% time saving at this point.

I see only 9.6% savings (100 * (113/125 - 1)).  What am I missing?

Cheers

robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/



Re: [PERFORM] pg_dump and thousands of schemas

2012-05-31 Thread Tatsuo Ishii
> On Thu, May 31, 2012 at 10:45 AM, Tatsuo Ishii  wrote:
>> Just for the record, I reran the test with my single-LOCK patch, and
>> now the total runtime of pg_dump is 113 minutes:
>> 188 minutes (9.0) -> 125 minutes (git master) -> 113 minutes (with my patch).
>>
>> I'm glad to see a 40% time saving at this point.
> 
> I see only 9.6% savings (100 * (113/125 - 1)).  What am I missing?

What I meant was (100 * (113/188 - 1)).
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp



Re: [PERFORM] pg_dump and thousands of schemas

2012-05-31 Thread Robert Klemme
On Thu, May 31, 2012 at 4:07 PM, Tatsuo Ishii  wrote:
>> On Thu, May 31, 2012 at 10:45 AM, Tatsuo Ishii  wrote:
>>> Just for the record, I reran the test with my single-LOCK patch, and
>>> now the total runtime of pg_dump is 113 minutes:
>>> 188 minutes (9.0) -> 125 minutes (git master) -> 113 minutes (with my patch).
>>>
>>> I'm glad to see a 40% time saving at this point.
>>
>> I see only 9.6% savings (100 * (113/125 - 1)).  What am I missing?
>
> What I meant was (100 * (113/188 - 1)).

OK, my mistake was to assume you wanted to measure only your part, while
apparently you meant the overall savings.  But Tom had asked for separate
measurements, if I understood him correctly.  Also, the measurement of
your change should be taken after the O(N^2) fix.  It could actually turn out
to be much more than 9%, because the overall time would be reduced even
more dramatically.  So it might actually be good for your fix to wait a
bit. ;-)

Kind regards

robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/



Re: [PERFORM] pg_dump and thousands of schemas

2012-05-31 Thread Claudio Freire
On Thu, May 31, 2012 at 11:17 AM, Robert Klemme wrote:
>
> OK, my mistake was to assume you wanted to measure only your part, while
> apparently you meant the overall savings.  But Tom had asked for separate
> measurements, if I understood him correctly.  Also, the measurement of
> your change should be taken after the O(N^2) fix.  It could actually turn out
> to be much more than 9%, because the overall time would be reduced even
> more dramatically.  So it might actually be good for your fix to wait a
> bit. ;-)

It's not clear whether Tom is already working on that O(N^2) fix in locking.

I'm asking because it doesn't seem like a complicated patch;
contributors may want to get working on it if not. ;-)



Re: [HACKERS] [PERFORM] pg_dump and thousands of schemas

2012-05-31 Thread Tom Lane
Claudio Freire  writes:
> It's not clear whether Tom is already working on that O(N^2) fix in locking.

I'm not; Jeff Janes is.  But you shouldn't be holding your breath
anyway, since it's 9.3 material at this point.

regards, tom lane



Re: [HACKERS] [PERFORM] pg_dump and thousands of schemas

2012-05-31 Thread Robert Haas
On Thu, May 31, 2012 at 10:31 AM, Tom Lane  wrote:
> Claudio Freire  writes:
>> It's not clear whether Tom is already working on that O(N^2) fix in locking.
>
> I'm not; Jeff Janes is.  But you shouldn't be holding your breath
> anyway, since it's 9.3 material at this point.

I agree we can't back-patch that change, but then I think we ought to
consider back-patching some variant of Tatsuo's patch.  Maybe it's not
reasonable to thunk an arbitrary number of relation names in there on
one line, but how about 1000 relations per LOCK statement or so?  I
guess we'd need to see how much that erodes the benefit, but we've
certainly done back-branch rearrangements in pg_dump in the past to
fix various kinds of issues, and this is pretty non-invasive.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] [PERFORM] pg_dump and thousands of schemas

2012-05-31 Thread Tom Lane
Robert Haas  writes:
> On Thu, May 31, 2012 at 10:31 AM, Tom Lane  wrote:
>> I'm not; Jeff Janes is.  But you shouldn't be holding your breath
>> anyway, since it's 9.3 material at this point.

> I agree we can't back-patch that change, but then I think we ought to
> consider back-patching some variant of Tatsuo's patch.  Maybe it's not
> reasonable to thunk an arbitrary number of relation names in there on
> one line, but how about 1000 relations per LOCK statement or so?  I
> guess we'd need to see how much that erodes the benefit, but we've
> certainly done back-branch rearrangements in pg_dump in the past to
> fix various kinds of issues, and this is pretty non-invasive.

I am not convinced either that this patch will still be useful after
Jeff's fix goes in, or that it provides any meaningful savings when
you consider a complete pg_dump run.  Yeah, it will make the lock
acquisition phase faster, but that's not a big part of the runtime
except in very limited scenarios (--schema-only, perhaps).

The performance patches we applied to pg_dump over the past couple weeks
were meant to relieve pain in situations where the big server-side
lossage wasn't the dominant factor in runtime (ie, partial dumps).
But this one is targeting exactly that area, which is why it looks like
a band-aid and not a fix to me.

regards, tom lane



Re: [HACKERS] [PERFORM] pg_dump and thousands of schemas

2012-05-31 Thread Bruce Momjian
On Thu, May 31, 2012 at 10:50:51AM -0400, Tom Lane wrote:
> Robert Haas  writes:
> > On Thu, May 31, 2012 at 10:31 AM, Tom Lane  wrote:
> >> I'm not; Jeff Janes is.  But you shouldn't be holding your breath
> >> anyway, since it's 9.3 material at this point.
> 
> > I agree we can't back-patch that change, but then I think we ought to
> > consider back-patching some variant of Tatsuo's patch.  Maybe it's not
> > reasonable to thunk an arbitrary number of relation names in there on
> > one line, but how about 1000 relations per LOCK statement or so?  I
> > guess we'd need to see how much that erodes the benefit, but we've
> > certainly done back-branch rearrangements in pg_dump in the past to
> > fix various kinds of issues, and this is pretty non-invasive.
> 
> I am not convinced either that this patch will still be useful after
> Jeff's fix goes in, or that it provides any meaningful savings when
> you consider a complete pg_dump run.  Yeah, it will make the lock
> acquisition phase faster, but that's not a big part of the runtime
> except in very limited scenarios (--schema-only, perhaps).

FYI, that is the pg_upgrade use-case, and pg_dump/restore time is
reportedly taking the majority of time in many cases.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] [PERFORM] pg_dump and thousands of schemas

2012-05-31 Thread Claudio Freire
On Thu, May 31, 2012 at 11:50 AM, Tom Lane  wrote:
> The performance patches we applied to pg_dump over the past couple weeks
> were meant to relieve pain in situations where the big server-side
> lossage wasn't the dominant factor in runtime (ie, partial dumps).
> But this one is targeting exactly that area, which is why it looks like
> a band-aid and not a fix to me.

No, Tatsuo's patch attacks a phase dominated by latency in some
setups. That this phase is currently also slow because of the locking
cost is irrelevant; with locking sped up, the patch should still
improve that phase even further. Imagine the current timeline:

* = locking
. = waiting

*.*.**.**.***.***...*.

Tatsuo's patch converts it to:

*.**

The locking fix would turn the timeline into:

*.*.*.*.*.*.*

Tatsuo's patch would turn that into:

***

And, as noted before, pg_dump --schema-only is a key bottleneck in pg_upgrade.



Re: [HACKERS] [PERFORM] pg_dump and thousands of schemas

2012-05-31 Thread Robert Haas
On Thu, May 31, 2012 at 10:50 AM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Thu, May 31, 2012 at 10:31 AM, Tom Lane  wrote:
>>> I'm not; Jeff Janes is.  But you shouldn't be holding your breath
>>> anyway, since it's 9.3 material at this point.
>
>> I agree we can't back-patch that change, but then I think we ought to
>> consider back-patching some variant of Tatsuo's patch.  Maybe it's not
>> reasonable to thunk an arbitrary number of relation names in there on
>> one line, but how about 1000 relations per LOCK statement or so?  I
>> guess we'd need to see how much that erodes the benefit, but we've
>> certainly done back-branch rearrangements in pg_dump in the past to
>> fix various kinds of issues, and this is pretty non-invasive.
>
> I am not convinced either that this patch will still be useful after
> Jeff's fix goes in, ...

But people on older branches are not going to GET Jeff's fix.

> or that it provides any meaningful savings when
> you consider a complete pg_dump run.  Yeah, it will make the lock
> acquisition phase faster, but that's not a big part of the runtime
> except in very limited scenarios (--schema-only, perhaps).

That is not a borderline scenario, as others have also pointed out.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] [PERFORM] pg_dump and thousands of schemas

2012-05-31 Thread Bruce Momjian
On Thu, May 31, 2012 at 11:04:12AM -0400, Robert Haas wrote:
> On Thu, May 31, 2012 at 10:50 AM, Tom Lane  wrote:
> > Robert Haas  writes:
> >> On Thu, May 31, 2012 at 10:31 AM, Tom Lane  wrote:
> >>> I'm not; Jeff Janes is.  But you shouldn't be holding your breath
> >>> anyway, since it's 9.3 material at this point.
> >
> >> I agree we can't back-patch that change, but then I think we ought to
> >> consider back-patching some variant of Tatsuo's patch.  Maybe it's not
> >> reasonable to thunk an arbitrary number of relation names in there on
> >> one line, but how about 1000 relations per LOCK statement or so?  I
> >> guess we'd need to see how much that erodes the benefit, but we've
> >> certainly done back-branch rearrangements in pg_dump in the past to
> >> fix various kinds of issues, and this is pretty non-invasive.
> >
> > I am not convinced either that this patch will still be useful after
> > Jeff's fix goes in, ...
> 
> But people on older branches are not going to GET Jeff's fix.

FYI, if it got into Postgres 9.2, everyone upgrading to Postgres 9.2
would benefit because pg_upgrade uses the new cluster's pg_dumpall.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] [PERFORM] pg_dump and thousands of schemas

2012-05-31 Thread Tom Lane
Claudio Freire  writes:
> On Thu, May 31, 2012 at 11:50 AM, Tom Lane  wrote:
>> The performance patches we applied to pg_dump over the past couple weeks
>> were meant to relieve pain in situations where the big server-side
>> lossage wasn't the dominant factor in runtime (ie, partial dumps).
>> But this one is targeting exactly that area, which is why it looks like
>> a band-aid and not a fix to me.

> No, Tatsuo's patch attacks a phase dominated by latency in some
> setups.

No, it does not.  The reason it's a win is that it avoids the O(N^2)
behavior in the server.  Whether the bandwidth savings is worth worrying
about cannot be proven one way or the other as long as that elephant
is in the room.

regards, tom lane



Re: [HACKERS] [PERFORM] pg_dump and thousands of schemas

2012-05-31 Thread Claudio Freire
On Thu, May 31, 2012 at 12:25 PM, Tom Lane  wrote:
>> No, Tatsuo's patch attacks a phase dominated by latency in some
>> setups.
>
> No, it does not.  The reason it's a win is that it avoids the O(N^2)
> behavior in the server.  Whether the bandwidth savings is worth worrying
> about cannot be proven one way or the other as long as that elephant
> is in the room.
>
>                        regards, tom lane

I understand that, but if the locking is fixed and made O(N) overall
(and hence O(1) per table), then latency suddenly becomes the
dominating factor.

I'm thinking, though, that pg_upgrade runs locally, unlike typical pg_dump
backups, so in that case latency would be negligible and Tatsuo's
patch inconsequential.

I'm also wondering whether the ResourceOwner patch you've proposed
would be negated by Tatsuo's patch, because suddenly a "portal"
(IIRC) holds far more locks than the ResourceOwner could accommodate,
forcing a reversal to O(N²) behavior. In that case, that patch would
in fact be detrimental... huh... quite the 180.



[PERFORM] Trouble with plan statistics for behaviour for query.

2012-05-31 Thread Trevor Campbell

We are having trouble with a particular query being slow in a strange manner.

The query is a join over two large tables that are suitably indexed.

select CG.ID, CG.ISSUEID, CG.AUTHOR, CG.CREATED, CI.ID, CI.FIELDTYPE, CI.FIELD,
       CI.OLDVALUE, CI.OLDSTRING, CI.NEWVALUE, CI.NEWSTRING
  from PUBLIC.CHANGEGROUP CG
  inner join PUBLIC.CHANGEITEM CI on CG.ID = CI.GROUPID
 where CG.ISSUEID=?
 order by CG.CREATED asc, CI.ID asc


For some tasks we run this particular query a very large number of times and it has a significant performance impact 
when it runs slowly.


If we run ANALYSE over the CHANGEITEM table then the performance picks up by a factor of 5 or more.  The problem is that 
a day later the performance will have dropped back to its previously slow state.


The reason this is so hard to understand is that the activity on this table is very low, with no updates and only a 
relatively small number of inserts each day, < 0.1% of the table size.


Explain output:
Sort  (cost=86.90..86.93 rows=11 width=118) (actual time=0.086..0.087 rows=14 loops=1)
  Sort Key: cg.created, ci.id
  Sort Method: quicksort  Memory: 26kB
  ->  Nested Loop  (cost=0.00..86.71 rows=11 width=118) (actual time=0.022..0.061 rows=14 loops=1)
        ->  Index Scan using chggroup_issue on changegroup cg  (cost=0.00..17.91 rows=8 width=33) (actual time=0.012..0.015 rows=7 loops=1)
              Index Cond: (issueid = 81001::numeric)
        ->  Index Scan using chgitem_chggrp on changeitem ci  (cost=0.00..8.58 rows=2 width=91) (actual time=0.005..0.005 rows=2 loops=7)
              Index Cond: (groupid = cg.id)
Total runtime: 0.116 ms

The explain output always seems the same even when the performance is poor, but 
I can't be sure of that.

Overall it seems like PostgreSQL just forgets about the statistics it has 
gathered after a short while.

Schema details:
CREATE TABLE changegroup
(
  id numeric(18,0) NOT NULL,
  issueid numeric(18,0),
  author character varying(255),
  created timestamp with time zone,
  CONSTRAINT pk_changegroup PRIMARY KEY (id )
)
WITH (
  OIDS=FALSE
);
CREATE INDEX chggroup_issue
  ON changegroup
  USING btree
  (issueid );

CREATE TABLE changeitem
(
  id numeric(18,0) NOT NULL,
  groupid numeric(18,0),
  fieldtype character varying(255),
  field character varying(255),
  oldvalue text,
  oldstring text,
  newvalue text,
  newstring text,
  CONSTRAINT pk_changeitem PRIMARY KEY (id )
)
WITH (
  OIDS=FALSE
);

CREATE INDEX chgitem_chggrp
  ON changeitem
  USING btree
  (groupid );

CREATE INDEX chgitem_field
  ON changeitem
  USING btree
  (field COLLATE pg_catalog."default" );

Table sizes:
changegroup: 2,000,000 rows
changeitem:  2,500,000 rows

The changegroup table has on average about 4 rows per issueid value, which is 
the query parameter.

We run autovacuum and autoanalyse, but as the activity in the table is low 
these are rarely if ever invoked on these tables.
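
For what it's worth, that seems expected with the default settings:
autoanalyze only fires after roughly autovacuum_analyze_threshold +
autovacuum_analyze_scale_factor * reltuples row changes, i.e. on the
order of 250,000 modified rows for tables this size, so < 0.1% churn
per day will not trigger it for a long time. As a quick sanity check
(a sketch only, assuming the tables are on the default search_path),
the last analyze times can be read from the statistics views:

    SELECT relname, last_analyze, last_autoanalyze, n_live_tup
      FROM pg_stat_user_tables
     WHERE relname IN ('changegroup', 'changeitem');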

Environment.
Testing using PostgreSQL 9.1.3 on x86_64-redhat-linux-gnu,  although this is a problem across a variety of postgres 
versions.




Re: [PERFORM] Trouble with plan statistics for behaviour for query.

2012-05-31 Thread Craig James
On Thu, May 31, 2012 at 3:29 PM, Trevor Campbell wrote:

>  We are having trouble with a particular query being slow in a strange
> manner.
>
> The query is a join over two large tables that are suitably indexed.
>
> select CG.ID, CG.ISSUEID, CG.AUTHOR, CG.CREATED, CI.ID, CI.FIELDTYPE,
> CI.FIELD, CI.OLDVALUE, CI.OLDSTRING, CI.NEWVALUE, CI.NEWSTRING
>from PUBLIC.CHANGEGROUP CG inner join PUBLIC.CHANGEITEM CI on CG.ID =
> CI.GROUPID where CG.ISSUEID=? order by CG.CREATED asc, CI.ID asc
>

This has an unbound variable '?' in it.


> For some tasks we run this particular query a very large number of times
> and it has a significant performance impact when it runs slowly.
>
> If we run ANALYSE over the CHANGEITEM table then the performance picks up
> by a factor of 5 or more.  The problem is that a day later the performance
> will have dropped back to its previously slow state.
>
> The reason this is so hard to understand is that the activity on this
> table is very low, with no updates and only a relatively small number of
> inserts each day, < 0.1% of the table size.
>
> Explain output:
> Sort  (cost=86.90..86.93 rows=11 width=118) (actual time=0.086..0.087
> rows=14 loops=1)
>   Sort Key: cg.created, ci.id
>   Sort Method: quicksort  Memory: 26kB
>   ->  Nested Loop  (cost=0.00..86.71 rows=11 width=118) (actual
> time=0.022..0.061 rows=14 loops=1)
> ->  Index Scan using chggroup_issue on changegroup cg
> (cost=0.00..17.91 rows=8 width=33) (actual time=0.012..0.015 rows=7 loops=1)
>   Index Cond: (issueid = 81001::numeric)
> ->  Index Scan using chgitem_chggrp on changeitem ci
> (cost=0.00..8.58 rows=2 width=91) (actual time=0.005..0.005 rows=2 loops=7)
>   Index Cond: (groupid = cg.id)
> Total runtime: 0.116 ms
>

What's the exact SQL you used to get this ... did you use a specific
CG.ISSUEID to run your test?  If that's the case, this EXPLAIN ANALYZE
won't be the same as the one generated for your actual application.

Craig



>
> The explain output always seems the same even when the performance is
> poor, but I can't be sure of that.
>
> Overall it seems like PostgreSQL just forgets about the statistics it has
> gathered after a short while.
>
> Schema details:
> CREATE TABLE changegroup
> (
>   id numeric(18,0) NOT NULL,
>   issueid numeric(18,0),
>   author character varying(255),
>   created timestamp with time zone,
>   CONSTRAINT pk_changegroup PRIMARY KEY (id )
> )
> WITH (
>   OIDS=FALSE
> );
> CREATE INDEX chggroup_issue
>   ON changegroup
>   USING btree
>   (issueid );
>
> CREATE TABLE changeitem
> (
>   id numeric(18,0) NOT NULL,
>   groupid numeric(18,0),
>   fieldtype character varying(255),
>   field character varying(255),
>   oldvalue text,
>   oldstring text,
>   newvalue text,
>   newstring text,
>   CONSTRAINT pk_changeitem PRIMARY KEY (id )
> )
> WITH (
>   OIDS=FALSE
> );
>
> CREATE INDEX chgitem_chggrp
>   ON changeitem
>   USING btree
>   (groupid );
>
> CREATE INDEX chgitem_field
>   ON changeitem
>   USING btree
>   (field COLLATE pg_catalog."default" );
>
> Table sizes
> changegroup  :  2,000,000 rows
> changeitem:  2,500,000  rows
>
> The changegroup table has on average about 4 rows per issueid value, which
> is the query parameter.
>
> We run autovacuum and autoanalyse, but as the activity in the table is low
> these are rarely if ever invoked on these tables.
>
> Environment.
> Testing using PostgreSQL 9.1.3 on x86_64-redhat-linux-gnu,  although this
> is a problem across a variety of postgres versions.
>
>


Re: [PERFORM] does the query planner consider work_mem?

2012-05-31 Thread Mark Kirkwood

On 31/05/12 05:57, Murat Tasan wrote:

Hi all - I'm having a bit of trouble with some queries that are
running painfully slowly after migrating my database from one machine
using PostgreSQL 8.2 to another machine with PostgreSQL 8.4.
As far as I can tell, the two *servers* (not the physical machines)
are set up pretty close to identically, and as far as query planning
is concerned, the only setting that seems to be different is
'default_statistics_target', which is 10 on the 8.2 server and 100 on
the 8.4 server (the defaults)... and presumably this should be giving
the 8.4 server more data to work with for better query plans (AFAIK).
(Other settings, e.g. cost estimates for page/row/etc. access, are
identical between the servers.)


It would probably be useful to know exactly which release of 8.4 you have,
i.e. 8.4.x. There were some significant planner changes around 8.4.9 or thereabouts.


I think it would also be useful to know all of your non-default
parameters for 8.4 (SELECT name, setting FROM pg_settings WHERE source !=
'default').



3) Here's the biggest problem/issue in my brain: work_mem on the 8.2
server was also set to the 1 MB default, but it ran quite speedily!
The full migration will take a while, so there will still be query
development/optimization on one system, and I'd love for those many
hours of testing to be worth something when ported over to the other
system.
In this particular example, the Nested Loop seems to fit in the 1 MB
work_mem space on the 8.2 server, but not on the 8.4. Does this seem
right to anybody?





Well, 8.4 has 100 statistics buckets for distribution info, so it typically
has a better idea about things; however, sometimes the extra info is just
enough to tip the planner into believing that it needs more space to do
something. The other possibility is that the 8.2 box is 32-bit and the
8.4 one is 64-bit, so it really does need more memory to hold the loop data
structures.
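
If you want to test whether the larger statistics target is what tips
the plan, one option (a sketch only, with hypothetical table and
column names) is to pin the target back to the 8.2 default for the
columns involved, or lower the session default, and re-analyze:

    -- per column: match the old 8.2 default of 10 buckets, then re-gather stats
    ALTER TABLE some_table ALTER COLUMN some_column SET STATISTICS 10;
    ANALYZE some_table;

    -- or lower the session-wide default before running ANALYZE
    SET default_statistics_target = 10;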


Regards

Mark



Re: [PERFORM] Trouble with plan statistics for behaviour for query.

2012-05-31 Thread Trevor Campbell

On 01/06/12 08:55, Craig James wrote:



On Thu, May 31, 2012 at 3:29 PM, Trevor Campbell wrote:

We are having trouble with a particular query being slow in a strange 
manner.

The query is a join over two large tables that are suitably indexed.

select CG.ID, CG.ISSUEID, CG.AUTHOR, CG.CREATED, CI.ID, CI.FIELDTYPE, CI.FIELD,
       CI.OLDVALUE, CI.OLDSTRING, CI.NEWVALUE, CI.NEWSTRING
  from PUBLIC.CHANGEGROUP CG
  inner join PUBLIC.CHANGEITEM CI on CG.ID = CI.GROUPID
 where CG.ISSUEID=?
 order by CG.CREATED asc, CI.ID asc


This has an unbound variable '?' in it.

These queries are being run from a Java application using JDBC, and when run the variable is bound to a long integer
value.  While trying to investigate the problem, I have just been hard-coding a value in the statement.



For some tasks we run this particular query a very large number of times 
and it has a significant performance
impact when it runs slowly.

If we run ANALYSE over the CHANGEITEM table then the performance picks up 
by a factor of 5 or more.  The problem
is that a day later the performance will have dropped back to its 
previously slow state.

The reason this is so hard to understand is that the activity on this table 
is very low, with no updates and only
a relatively small number of inserts each day, < 0.1% of the table size.

Explain output:
Sort  (cost=86.90..86.93 rows=11 width=118) (actual time=0.086..0.087 rows=14 loops=1)
  Sort Key: cg.created, ci.id
  Sort Method: quicksort  Memory: 26kB
  ->  Nested Loop  (cost=0.00..86.71 rows=11 width=118) (actual time=0.022..0.061 rows=14 loops=1)
        ->  Index Scan using chggroup_issue on changegroup cg  (cost=0.00..17.91 rows=8 width=33) (actual time=0.012..0.015 rows=7 loops=1)
              Index Cond: (issueid = 81001::numeric)
        ->  Index Scan using chgitem_chggrp on changeitem ci  (cost=0.00..8.58 rows=2 width=91) (actual time=0.005..0.005 rows=2 loops=7)
              Index Cond: (groupid = cg.id)
Total runtime: 0.116 ms


What's the exact SQL you used to get this ... did you use a specific CG.ISSUEID to run your test?  If that's the case, 
this EXPLAIN ANALYZE won't be the same as the one generated for your actual application.


Craig


The explain output always seems the same even when the performance is poor, 
but I can't be sure of that.

Overall it seems like PostgreSQL just forgets about the statistics it has 
gathered after a short while.

Schema details:
CREATE TABLE changegroup
(
  id numeric(18,0) NOT NULL,
  issueid numeric(18,0),
  author character varying(255),
  created timestamp with time zone,
  CONSTRAINT pk_changegroup PRIMARY KEY (id )
)
WITH (
  OIDS=FALSE
);
CREATE INDEX chggroup_issue
  ON changegroup
  USING btree
  (issueid );

CREATE TABLE changeitem
(
  id numeric(18,0) NOT NULL,
  groupid numeric(18,0),
  fieldtype character varying(255),
  field character varying(255),
  oldvalue text,
  oldstring text,
  newvalue text,
  newstring text,
  CONSTRAINT pk_changeitem PRIMARY KEY (id )
)
WITH (
  OIDS=FALSE
);

CREATE INDEX chgitem_chggrp
  ON changeitem
  USING btree
  (groupid );

CREATE INDEX chgitem_field
  ON changeitem
  USING btree
  (field COLLATE pg_catalog."default" );

Table sizes
changegroup  : 2,000,000 rows
changeitem:  2,500,000  rows

The changegroup table has on average about 4 rows per issueid value, which 
is the query parameter.

We run autovacuum and autoanalyse, but as the activity in the table is low 
these are rarely if ever invoked on
these tables.

Environment.
Testing using PostgreSQL 9.1.3 on x86_64-redhat-linux-gnu,  although this 
is a problem across a variety of
postgres versions.




Re: [PERFORM] Trouble with plan statistics for behaviour for query.

2012-05-31 Thread Craig James
On Thu, May 31, 2012 at 4:01 PM, Trevor Campbell wrote:

>  On 01/06/12 08:55, Craig James wrote:
>
>
>
> On Thu, May 31, 2012 at 3:29 PM, Trevor Campbell 
> wrote:
>
>>  We are having trouble with a particular query being slow in a strange
>> manner.
>>
>> The query is a join over two large tables that are suitably indexed.
>>
>> select CG.ID, CG.ISSUEID, CG.AUTHOR, CG.CREATED, CI.ID, CI.FIELDTYPE,
>> CI.FIELD, CI.OLDVALUE, CI.OLDSTRING, CI.NEWVALUE, CI.NEWSTRING
>>from PUBLIC.CHANGEGROUP CG inner join PUBLIC.CHANGEITEM CI on CG.ID =
>> CI.GROUPID where CG.ISSUEID=? order by CG.CREATED asc, CI.ID asc
>>
>
> This has an unbound variable '?' in it.
>
> These queries are being run from a Java application using JDBC, and when
> run the variable is bound to a long integer value.  While trying to
> investigate the problem, I have just been hard-coding a value in the
> statement.
>

I use Perl, not JDBC, but this thread may be relevant to your problem.

http://postgresql.1045698.n5.nabble.com/Slow-statement-when-using-JDBC-td3368379.html

Craig

>
>
>> For some tasks we run this particular query a very large number of times
>> and it has a significant performance impact when it runs slowly.
>>
>> If we run ANALYSE over the CHANGEITEM table then the performance picks
>> up by a factor of 5 or more.  The problem is that a day later the
>> performance will have dropped back to its previously slow state.
>>
>> The reason this is so hard to understand is that the activity on this
>> table is very low, with no updates and only a relatively small number of
>> inserts each day, < 0.1% of the table size.
>>
>> Explain output:
>> Sort  (cost=86.90..86.93 rows=11 width=118) (actual time=0.086..0.087
>> rows=14 loops=1)
>>   Sort Key: cg.created, ci.id
>>   Sort Method: quicksort  Memory: 26kB
>>   ->  Nested Loop  (cost=0.00..86.71 rows=11 width=118) (actual
>> time=0.022..0.061 rows=14 loops=1)
>> ->  Index Scan using chggroup_issue on changegroup cg
>> (cost=0.00..17.91 rows=8 width=33) (actual time=0.012..0.015 rows=7 loops=1)
>>   Index Cond: (issueid = 81001::numeric)
>> ->  Index Scan using chgitem_chggrp on changeitem ci
>> (cost=0.00..8.58 rows=2 width=91) (actual time=0.005..0.005 rows=2 loops=7)
>>   Index Cond: (groupid = cg.id)
>> Total runtime: 0.116 ms
>>
>
> What's the exact SQL you used to get this ... did you use a specific
> CG.ISSUEID to run your test?  If that's the case, this EXPLAIN ANALYZE
> won't be the same as the one generated for your actual application.
>
> Craig
>
>
>
>>
>> The explain output always seems the same even when the performance is
>> poor, but I can't be sure of that.
>>
>> Overall it seems like PostgreSQL just forgets about the statistics it has
>> gathered after a short while.
>>
>> Schema details:
>> CREATE TABLE changegroup
>> (
>>   id numeric(18,0) NOT NULL,
>>   issueid numeric(18,0),
>>   author character varying(255),
>>   created timestamp with time zone,
>>   CONSTRAINT pk_changegroup PRIMARY KEY (id )
>> )
>> WITH (
>>   OIDS=FALSE
>> );
>> CREATE INDEX chggroup_issue
>>   ON changegroup
>>   USING btree
>>   (issueid );
>>
>> CREATE TABLE changeitem
>> (
>>   id numeric(18,0) NOT NULL,
>>   groupid numeric(18,0),
>>   fieldtype character varying(255),
>>   field character varying(255),
>>   oldvalue text,
>>   oldstring text,
>>   newvalue text,
>>   newstring text,
>>   CONSTRAINT pk_changeitem PRIMARY KEY (id )
>> )
>> WITH (
>>   OIDS=FALSE
>> );
>>
>> CREATE INDEX chgitem_chggrp
>>   ON changeitem
>>   USING btree
>>   (groupid );
>>
>> CREATE INDEX chgitem_field
>>   ON changeitem
>>   USING btree
>>   (field COLLATE pg_catalog."default" );
>>
>> Table sizes
>> changegroup  :  2,000,000 rows
>> changeitem:  2,500,000  rows
>>
>> The changegroup table has on average about 4 rows per issueid value,
>> which is the query parameter.
>>
>> We run autovacuum and autoanalyse, but as the activity in the table is
>> low these are rarely if ever invoked on these tables.
>>
>> Environment.
>> Testing using PostgreSQL 9.1.3 on x86_64-redhat-linux-gnu,  although this
>> is a problem across a variety of postgres versions.
>>
>>
>


Re: [PERFORM] Trouble with plan statistics for behaviour for query.

2012-05-31 Thread Trevor Campbell

Thanks Craig, that certainly leads down the right path.

The following is all done in pgAdmin3:

Using an actual value, I get the plan I expect:

explain analyze select CG.ID, CG.ISSUEID, CG.AUTHOR, CG.CREATED, CI.ID, CI.FIELDTYPE, CI.FIELD,
                       CI.OLDVALUE, CI.OLDSTRING, CI.NEWVALUE, CI.NEWSTRING
  from PUBLIC.CHANGEGROUP CG
  inner join PUBLIC.CHANGEITEM CI on CG.ID = CI.GROUPID
 where CG.ISSUEID=10006
 order by CG.CREATED asc, CI.ID asc


"Sort  (cost=106.18..106.22 rows=13 width=434) (actual time=0.115..0.115 rows=12 
loops=1)"
"  Sort Key: cg.created, ci.id"
"  Sort Method: quicksort  Memory: 29kB"
"  ->  Nested Loop  (cost=0.00..105.94 rows=13 width=434) (actual time=0.019..0.067 
rows=12 loops=1)"
"->  Index Scan using chggroup_issue on changegroup cg  (cost=0.00..19.73 rows=10 width=29) (actual 
time=0.009..0.013 rows=10 loops=1)"

"  Index Cond: (issueid = 10006::numeric)"
"->  Index Scan using chgitem_chggrp on changeitem ci  (cost=0.00..8.58 rows=3 width=411) (actual 
time=0.004..0.005 rows=1 loops=10)"

"  Index Cond: (groupid = cg.id)"
"Total runtime: 0.153 ms"

Using a prepared statement with a variable, I get a poor plan requiring a sequential scan:

prepare t2(real) as
  select CG.ID, CG.ISSUEID, CG.AUTHOR, CG.CREATED, CI.ID, CI.FIELDTYPE, CI.FIELD,
         CI.OLDVALUE, CI.OLDSTRING, CI.NEWVALUE, CI.NEWSTRING
    from PUBLIC.CHANGEGROUP CG
    inner join PUBLIC.CHANGEITEM CI on CG.ID = CI.GROUPID
   where CG.ISSUEID=$1
   order by CG.CREATED asc, CI.ID asc;


  explain analyze execute t2 (10006);

"Sort  (cost=126448.89..126481.10 rows=12886 width=434) (actual 
time=1335.615..1335.616 rows=12 loops=1)"
"  Sort Key: cg.created, ci.id"
"  Sort Method: quicksort  Memory: 29kB"
"  ->  Nested Loop  (cost=0.00..125569.19 rows=12886 width=434) (actual 
time=0.046..1335.556 rows=12 loops=1)"
"->  Seq Scan on changegroup cg  (cost=0.00..44709.26 rows=10001 width=29) (actual time=0.026..1335.460 rows=10 
loops=1)"

"  Filter: ((issueid)::double precision = $1)"
"->  Index Scan using chgitem_chggrp on changeitem ci  (cost=0.00..8.05 rows=3 width=411) (actual 
time=0.007..0.008 rows=1 loops=10)"

"  Index Cond: (groupid = cg.id)"
"Total runtime: 1335.669 ms"

Using a prepared statement with a cast of the variable to the right type, I get the good plan back:

prepare t2(real) as
  select CG.ID, CG.ISSUEID, CG.AUTHOR, CG.CREATED, CI.ID, CI.FIELDTYPE, CI.FIELD,
         CI.OLDVALUE, CI.OLDSTRING, CI.NEWVALUE, CI.NEWSTRING
    from PUBLIC.CHANGEGROUP CG
    inner join PUBLIC.CHANGEITEM CI on CG.ID = CI.GROUPID
   where CG.ISSUEID=cast($1 as numeric)
   order by CG.CREATED asc, CI.ID asc;


  explain analyze execute t2 (10006);

"Sort  (cost=106.19..106.22 rows=13 width=434) (actual time=0.155..0.156 rows=12 
loops=1)"
"  Sort Key: cg.created, ci.id"
"  Sort Method: quicksort  Memory: 29kB"
"  ->  Nested Loop  (cost=0.00..105.95 rows=13 width=434) (actual time=0.048..0.111 
rows=12 loops=1)"
"->  Index Scan using chggroup_issue on changegroup cg  (cost=0.00..19.73 rows=10 width=29) (actual 
time=0.031..0.042 rows=10 loops=1)"

"  Index Cond: (issueid = ($1)::numeric)"
"->  Index Scan using chgitem_chggrp on changeitem ci  (cost=0.00..8.58 rows=3 width=411) (actual 
time=0.006..0.006 rows=1 loops=10)"

"  Index Cond: (groupid = cg.id)"
"Total runtime: 0.203 ms"

Now the challenge is to get Java/JDBC to do this right.  We make a big effort to ensure we always use prepared
statements and variable bindings to help protect against SQL injection vulnerabilities.
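
One possibility (a sketch only; the '?' placeholder stays a bound JDBC
parameter, so the injection protection is unchanged) is to fold the
cast into the SQL text the application sends, so the plan no longer
depends on the parameter type the driver reports:

    select CG.ID, CG.ISSUEID, CG.AUTHOR, CG.CREATED, CI.ID, CI.FIELDTYPE, CI.FIELD,
           CI.OLDVALUE, CI.OLDSTRING, CI.NEWVALUE, CI.NEWSTRING
      from PUBLIC.CHANGEGROUP CG
      inner join PUBLIC.CHANGEITEM CI on CG.ID = CI.GROUPID
     where CG.ISSUEID = cast(? as numeric)
     order by CG.CREATED asc, CI.ID asc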




On 01/06/12 09:08, Craig James wrote:

I use Perl, not JDBC, but this thread may be relevant to your problem.

http://postgresql.1045698.n5.nabble.com/Slow-statement-when-using-JDBC-td3368379.html






Re: [PERFORM] Trouble with plan statistics for behaviour for query.

2012-05-31 Thread Craig James
On Thu, May 31, 2012 at 4:34 PM, Trevor Campbell wrote:

> Thanks Craig, that certainly leads down the right path.
>
> The following is all done in pgAdmin3:
>
> Using an actual value we I get the plan I expect
> explain analyze select CG.ID, CG.ISSUEID, CG.AUTHOR, CG.CREATED, CI.ID,
> CI.FIELDTYPE, CI.FIELD, CI.OLDVALUE, CI.OLDSTRING, CI.NEWVALUE, CI.NEWSTRING
>   from PUBLIC.CHANGEGROUP CG inner join PUBLIC.CHANGEITEM CI on CG.ID =
> CI.GROUPID where CG.ISSUEID=10006 order by CG.CREATED asc, CI.ID asc
>
> "Sort  (cost=106.18..106.22 rows=13 width=434) (actual time=0.115..0.115
> rows=12 loops=1)"
> "  Sort Key: cg.created, ci.id"
> "  Sort Method: quicksort  Memory: 29kB"
> "  ->  Nested Loop  (cost=0.00..105.94 rows=13 width=434) (actual
> time=0.019..0.067 rows=12 loops=1)"
> "->  Index Scan using chggroup_issue on changegroup cg
>  (cost=0.00..19.73 rows=10 width=29) (actual time=0.009..0.013 rows=10
> loops=1)"
> "  Index Cond: (issueid = 10006::numeric)"
> "->  Index Scan using chgitem_chggrp on changeitem ci
>  (cost=0.00..8.58 rows=3 width=411) (actual time=0.004..0.005 rows=1
> loops=10)"
> "  Index Cond: (groupid = cg.id)"
> "Total runtime: 0.153 ms"
>
> Using a prepared statement with a variable , I get a poor plan requiring a
> sequential scan
> prepare t2(real) as
>  select CG.ID, CG.ISSUEID, CG.AUTHOR, CG.CREATED, CI.ID, CI.FIELDTYPE,
> CI.FIELD, CI.OLDVALUE, CI.OLDSTRING, CI.NEWVALUE, CI.NEWSTRING
>   from PUBLIC.CHANGEGROUP CG inner join PUBLIC.CHANGEITEM CI on CG.ID =
> CI.GROUPID where CG.ISSUEID=$1 order by CG.CREATED asc, CI.ID asc;
>
>  explain analyze execute t2 (10006);
>
> "Sort  (cost=126448.89..126481.10 rows=12886 width=434) (actual
> time=1335.615..1335.616 rows=12 loops=1)"
> "  Sort Key: cg.created, ci.id"
> "  Sort Method: quicksort  Memory: 29kB"
> "  ->  Nested Loop  (cost=0.00..125569.19 rows=12886 width=434) (actual
> time=0.046..1335.556 rows=12 loops=1)"
> "->  Seq Scan on changegroup cg  (cost=0.00..44709.26 rows=10001
> width=29) (actual time=0.026..1335.460 rows=10 loops=1)"
> "  Filter: ((issueid)::double precision = $1)"
> "->  Index Scan using chgitem_chggrp on changeitem ci
>  (cost=0.00..8.05 rows=3 width=411) (actual time=0.007..0.008 rows=1
> loops=10)"
> "  Index Cond: (groupid = cg.id)"
> "Total runtime: 1335.669 ms"
>
> Using a prepared statement with a cast of the variable to the right type,
> I get the good plan back
> prepare t2(real) as
>  select CG.ID, CG.ISSUEID, CG.AUTHOR, CG.CREATED, CI.ID, CI.FIELDTYPE,
> CI.FIELD, CI.OLDVALUE, CI.OLDSTRING, CI.NEWVALUE, CI.NEWSTRING
>   from PUBLIC.CHANGEGROUP CG inner join PUBLIC.CHANGEITEM CI on CG.ID =
> CI.GROUPID where CG.ISSUEID=cast($1 as numeric) order by CG.CREATED asc,
> CI.ID asc;
>
>  explain analyze execute t2 (10006);
>
> "Sort  (cost=106.19..106.22 rows=13 width=434) (actual time=0.155..0.156
> rows=12 loops=1)"
> "  Sort Key: cg.created, ci.id"
> "  Sort Method: quicksort  Memory: 29kB"
> "  ->  Nested Loop  (cost=0.00..105.95 rows=13 width=434) (actual
> time=0.048..0.111 rows=12 loops=1)"
> "->  Index Scan using chggroup_issue on changegroup cg
>  (cost=0.00..19.73 rows=10 width=29) (actual time=0.031..0.042 rows=10
> loops=1)"
> "  Index Cond: (issueid = ($1)::numeric)"
> "->  Index Scan using chgitem_chggrp on changeitem ci
>  (cost=0.00..8.58 rows=3 width=411) (actual time=0.006..0.006 rows=1
> loops=10)"
> "  Index Cond: (groupid = cg.id)"
> "Total runtime: 0.203 ms"
>
> Now the challenge is to get java/jdbc to get this done right.  We make a
> big effort to ensure we always use prepared statements and variable
> bindings to help protect from SQL injection vulnerabilities.
>

JDBC has some features that are supposed to be convenient (automatic
preparing based on a number-of-executions threshold) that strike me as
misguided.  It's one thing to hide irrelevant details from the app, and
another thing entirely to cause a huge change in the exact SQL that's sent
to the server ... which is what JDBC seems to do.

I think the trick is that if you use JDBC prepared statements, you have to
understand how it's trying to be tricky and circumvent it so that you're
always in full control of what it's doing.

Craig


>
>
>
> On 01/06/12 09:08, Craig James wrote:
>
>> I use Perl, not JDBC, but this thread may be relevant to your problem.
>>
>> http://postgresql.1045698.n5.nabble.com/Slow-statement-when-using-JDBC-td3368379.html
>>
>>
>>