Re: [HACKERS] Performance problem in PLPgSQL

2013-09-19 Thread dlight
We installed the PostgreSQL 9.3.0 server from the FreeBSD ports tree
(http://svnweb.freebsd.org/ports/head/databases/postgresql93-server/); the
administrator says it already contains this patch.







Re: [HACKERS] Performance problem in PLPgSQL

2013-09-18 Thread dlight
So if I run function 1 (with a variable inside the query) in one session,
it is replanned each time.

But if I want to tell Postgres not to do this, what should I do?

We have more than one run in a session with a variable inside the SQL, and
we have a big performance problem on 9.2 and 9.3.

Here are my tests:
http://postgresql.1045698.n5.nabble.com/function-execute-on-v-9-2-slow-down-tp5767170p5771403.html

We want to upgrade our PostgreSQL version but can't do that because of the
performance regression in 9.2 and 9.3. :(







Re: [HACKERS] Performance problem in PLPgSQL

2013-09-18 Thread Pavel Stehule
Hello

Can you try this patch (commit)?
https://github.com/postgres/postgres/commit/a5f11e24a4d1afb213c780812a3df14c04d7f845#diff-fc73a24ee7d0692c2a0c639870223d70

Regards

Pavel


2013/9/18 dlight avinf...@gmail.com

 So if I run function 1 (with a variable inside the query) in one session,
 it is replanned each time.

 But if I want to tell Postgres not to do this, what should I do?

 We have more than one run in a session with a variable inside the SQL, and
 we have a big performance problem on 9.2 and 9.3.

 Here are my tests:

 http://postgresql.1045698.n5.nabble.com/function-execute-on-v-9-2-slow-down-tp5767170p5771403.html

 We want to upgrade our PostgreSQL version but can't do that because of the
 performance regression in 9.2 and 9.3. :(








Re: [HACKERS] Performance problem in PLPgSQL

2013-09-18 Thread Andrew Gierth
 "dlight" == dlight <avinf...@gmail.com> writes:

 dlight So if I run function 1 (with a variable inside the query) in
 dlight one session, it is replanned each time.

 dlight But if I want to tell Postgres not to do this, what should
 dlight I do?

 dlight We have more than one run in a session with a variable inside
 dlight the SQL, and we have a big performance problem on 9.2 and 9.3.

 dlight Here are my tests:
 dlight http://postgresql.1045698.n5.nabble.com/function-execute-on-v-9-2-slow-down-tp5767170p5771403.html

One interesting question is: why on earth is that query taking ~500ms
just to plan?

(Second question is, what are the plans and costs for the generic and
custom plans being generated, but that's harder to determine)

-- 
Andrew (irc:RhodiumToad)




Re: [HACKERS] Performance problem in PLPgSQL

2013-08-25 Thread Marc Cousin

On 24/08/2013 21:16, Tom Lane wrote:

Marc Cousin cousinm...@gmail.com writes:

On 23/08/2013 23:55, Tom Lane wrote:

My previous suggestion was to estimate planning cost as
10 * (length(plan->rangetable) + 1)
but on reflection it ought to be scaled by one of the cpu cost constants,
so perhaps
1000 * cpu_operator_cost * (length(plan->rangetable) + 1)
which'd mean a custom plan has to be estimated to save a minimum of
about 5 cost units (more if more than 1 table is used) before it'll
be chosen.  I'm tempted to make the multiplier be 1 not 1000,
but it seems better to be conservative about changing the behavior
until we see how well this works in practice.

Objections, better ideas?

No better idea as far as I'm concerned, of course :)
But it is a bit tricky to understand what is going on when you get
hit by it, and using even a rough approximation of the planning cost
seems the most logical approach to me. So I'm all for this solution.

I've pushed a patch along this line.  I verified it fixes your original
example, but maybe you could try it on your real application?
http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=005f583ba4e6d4d19b62959ef8e70a3da4d188a5

regards, tom lane

I think that won't be possible :(

It's one of those environments where you have to ask for lots of permissions
before doing anything. I'll do my best to have them run a test with this
patch.


Thanks a lot.

Marc




Re: [HACKERS] Performance problem in PLPgSQL

2013-08-25 Thread Pavel Stehule
2013/8/23 Tom Lane t...@sss.pgh.pa.us

 Pavel Stehule pavel.steh...@gmail.com writes:
  Please, can you send a self-explained test?
  This issue should be fixed, and we need examples.

 We already had a perfectly good example at the beginning of this thread.
 What's missing is a decision on how we ought to approximate the cost of
 planning (relative to execution costs).

 As I mentioned upthread, it doesn't seem unreasonable to me to do
 something quick-and-dirty based on the length of the plan's rangetable.
 Pretty nearly anything would fix these specific situations where the
 estimated execution cost is negligible.  It's possible that there are
 more complicated cases where we'll need a more accurate estimate, but
 we've not seen an example of that yet.

 My previous suggestion was to estimate planning cost as
10 * (length(plan->rangetable) + 1)
 but on reflection it ought to be scaled by one of the cpu cost constants,
 so perhaps
1000 * cpu_operator_cost * (length(plan->rangetable) + 1)
 which'd mean a custom plan has to be estimated to save a minimum of
 about 5 cost units (more if more than 1 table is used) before it'll
 be chosen.  I'm tempted to make the multiplier be 1 not 1000,
 but it seems better to be conservative about changing the behavior
 until we see how well this works in practice.

 Objections, better ideas?


I think this problem is a little more complex, and using only this formula
is too much of a simplification - although it is a big step forward.

* The first 5 executions use a custom plan - which means that a lot of
procedures use custom plans forever (when they are executed without pooling,
because almost no function containing SQL is called twice in one connection)
- and there are really only a few reports related to prepared-statement or
PL/pgSQL performance. So this suggests that planning in PostgreSQL is a
relatively fast process, and we probably don't need to be afraid of wider
use of custom plans. Custom plans also have a nice secondary effect - they
solve problems with predicates of the form "field = some_constant OR field
IS NULL" without any special support in the planner. But sometimes the 6th
execution can be slow, because a generic plan is used.
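For illustration, that switch-over can be observed from psql with a prepared
statement; the table below is hypothetical, not taken from the thread:

    -- hypothetical table, for illustration only
    CREATE TEMP TABLE t (id int, status int);
    PREPARE q(int) AS SELECT count(*) FROM t WHERE status = $1 OR $1 IS NULL;
    EXPLAIN EXECUTE q(1);   -- executions 1..5 are planned with the actual
    EXPLAIN EXECUTE q(1);   -- value (custom plans), so "$1 IS NULL" is
    EXPLAIN EXECUTE q(1);   -- folded away
    EXPLAIN EXECUTE q(1);
    EXPLAIN EXECUTE q(1);
    EXPLAIN EXECUTE q(1);   -- from the 6th execution on, the cached generic
                            -- plan may be chosen if its estimated cost is
                            -- not much worse than the average custom plan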

Where can we expect performance problems?

* frequently executed, fast, simple statements:

** INSERT INTO table, or { UPDATE | DELETE | SELECT } ... WHERE pk = const -
these queries can use a generic plan directly (see the sketch below)

* queries that are expensive to plan but fast to execute - these can use a
generic plan too, with some logic like the one you describe.
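A minimal sketch of the first case (the table and names here are assumptions
for illustration): a primary-key lookup produces essentially the same plan
for every parameter value, so a generic plan gives up nothing:

    -- hypothetical schema, for illustration only
    CREATE TEMP TABLE accounts (account_id int PRIMARY KEY, balance numeric);
    PREPARE get_acct(int) AS SELECT * FROM accounts WHERE account_id = $1;
    EXECUTE get_acct(42);   -- an index scan on the PK whatever the value is,
                            -- so a generic plan could be used directly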

In other cases, using a custom plan probably doesn't cause a performance
issue, so we can use it directly.

From what I see, prepared plans (prepared statements) are nowadays used more
for protection against SQL injection than for saving planner time - and
using a generic plan badly is worse than repeated planning.

P.S. Could the magic constant 5 (the number of initial custom plans) be
controlled via a GUC? Then we would have very good control for special cases
where the default mechanism fails (0 .. always use a generic plan, -1 ..
never use a generic plan).

Regards

Pavel






 regards, tom lane



Re: [HACKERS] Performance problem in PLPgSQL

2013-08-24 Thread Marc Cousin

On 23/08/2013 23:55, Tom Lane wrote:

Pavel Stehule pavel.steh...@gmail.com writes:

Please, can you send a self-explained test?
This issue should be fixed, and we need examples.

We already had a perfectly good example at the beginning of this thread.
What's missing is a decision on how we ought to approximate the cost of
planning (relative to execution costs).

As I mentioned upthread, it doesn't seem unreasonable to me to do
something quick-and-dirty based on the length of the plan's rangetable.
Pretty nearly anything would fix these specific situations where the
estimated execution cost is negligible.  It's possible that there are
more complicated cases where we'll need a more accurate estimate, but
we've not seen an example of that yet.

My previous suggestion was to estimate planning cost as
10 * (length(plan->rangetable) + 1)
but on reflection it ought to be scaled by one of the cpu cost constants,
so perhaps
1000 * cpu_operator_cost * (length(plan->rangetable) + 1)
which'd mean a custom plan has to be estimated to save a minimum of
about 5 cost units (more if more than 1 table is used) before it'll
be chosen.  I'm tempted to make the multiplier be 1 not 1000,
but it seems better to be conservative about changing the behavior
until we see how well this works in practice.

Objections, better ideas?

regards, tom lane

No better idea as far as I'm concerned, of course :)

But it is a bit tricky to understand what is going on when you get
hit by it, and using even a rough approximation of the planning cost
seems the most logical approach to me. So I'm all for this solution.




Re: [HACKERS] Performance problem in PLPgSQL

2013-08-24 Thread Tom Lane
Marc Cousin cousinm...@gmail.com writes:
 On 23/08/2013 23:55, Tom Lane wrote:
 My previous suggestion was to estimate planning cost as
 10 * (length(plan->rangetable) + 1)
 but on reflection it ought to be scaled by one of the cpu cost constants,
 so perhaps
 1000 * cpu_operator_cost * (length(plan->rangetable) + 1)
 which'd mean a custom plan has to be estimated to save a minimum of
 about 5 cost units (more if more than 1 table is used) before it'll
 be chosen.  I'm tempted to make the multiplier be 1 not 1000,
 but it seems better to be conservative about changing the behavior
 until we see how well this works in practice.
 
 Objections, better ideas?

 No better idea as far as I'm concerned, of course :)

 But it is a bit tricky to understand what is going on when you get
 hit by it, and using even a rough approximation of the planning cost
 seems the most logical approach to me. So I'm all for this solution.

I've pushed a patch along this line.  I verified it fixes your original
example, but maybe you could try it on your real application?
http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=005f583ba4e6d4d19b62959ef8e70a3da4d188a5

regards, tom lane




Re: [HACKERS] Performance problem in PLPgSQL

2013-08-23 Thread Fábio Telles Rodriguez
 It just inserts nb records in a loop in 4 different manners:
 - Directly into an int field
 - Then into a numeric field (that's where we're having problems)
 - Then into the same numeric field, but trying a cast (it doesn't change a
 thing)
 - Then with an intermediary temp variable of numeric type (which
 solves the problem).
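A minimal PL/pgSQL sketch of such a test function (the table and column
names below are assumptions for illustration; the original code is not in
this excerpt) could look like:

    CREATE TABLE test_int (val int);
    CREATE TABLE test_num (val numeric);

    CREATE OR REPLACE FUNCTION test_insert(nb int) RETURNS void AS $$
    DECLARE
        t   timestamptz;
        tmp numeric;
    BEGIN
        t := clock_timestamp();
        FOR i IN 1..nb LOOP
            INSERT INTO test_int VALUES (i);          -- int into int column
        END LOOP;
        RAISE NOTICE 'time for int:%', clock_timestamp() - t;

        t := clock_timestamp();
        FOR i IN 1..nb LOOP
            INSERT INTO test_num VALUES (i);          -- int into numeric column
        END LOOP;
        RAISE NOTICE 'time for numeric:%', clock_timestamp() - t;

        t := clock_timestamp();
        FOR i IN 1..nb LOOP
            INSERT INTO test_num VALUES (i::numeric); -- explicit cast
        END LOOP;
        RAISE NOTICE 'time for numeric, casted:%', clock_timestamp() - t;

        t := clock_timestamp();
        FOR i IN 1..nb LOOP
            tmp := i;                                 -- temp variable of the
            INSERT INTO test_num VALUES (tmp);        -- column's own type
        END LOOP;
        RAISE NOTICE 'time for numeric with tmp variable:%', clock_timestamp() - t;
    END;
    $$ LANGUAGE plpgsql;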


 Here are the runtimes (tables were truncated beforehand):

 9.1.9:
 select test_insert(100);
 NOTICE:  time for int:00:00:09.526009
 NOTICE:  time for numeric:00:00:10.557126
 NOTICE:  time for numeric, casted:00:00:10.821369
 NOTICE:  time for numeric with tmp variable:00:00:10.850847


 9.2.4:
 select test_insert(100);
 NOTICE:  time for int:00:00:09.477044
 NOTICE:  time for numeric:00:00:24.757032  
 NOTICE:  time for numeric, casted:00:00:24.791016  
 NOTICE:  time for numeric with tmp variable:00:00:10.89332


 I really don't know exactly where the problem comes from… but it's been
 hurting a function very badly (there are several of these static queries
 with type mismatches). And of course, the problem is not limited to
 numeric… text has exactly the same problem.

 Regards,

 Marc


I hit the same problem today. Unfortunately, we had to roll back to 9.1 on
our production site. Of course the team should have tested better before
going to production, and of course they really need to write better
functions in PL/pgSQL, but this problem was a real showstopper for us.

Just don't let this gotcha drop off our to-do list.

-- 
Regards,
Fábio Telles Rodriguez
blog: http://www.midstorm.org/~telles/ | http://tellesr.wordpress.com/ |
http://savepoint.blog.br
e-mail / gtalk / MSN: fabio.tel...@gmail.com
Skype: fabio_telles

Timbira - The Brazilian Postgres Company
http://www.timbira.com.br


Re: [HACKERS] Performance problem in PLPgSQL

2013-08-23 Thread Pavel Stehule
2013/8/23 Fábio Telles Rodriguez fabio.tel...@gmail.com


 It just inserts nb records in a loop in 4 different manners:
 - Directly into an int field
 - Then into a numeric field (that's where we're having problems)
 - Then into the same numeric field, but trying a cast (it doesn't change a
 thing)
 - Then with an intermediary temp variable of numeric type (which
 solves the problem).


 Here are the runtimes (tables were truncated beforehand):

 9.1.9:
 select test_insert(100);
 NOTICE:  time for int:00:00:09.526009
 NOTICE:  time for numeric:00:00:10.557126
 NOTICE:  time for numeric, casted:00:00:10.821369
 NOTICE:  time for numeric with tmp variable:00:00:10.850847


 9.2.4:
 select test_insert(100);
 NOTICE:  time for int:00:00:09.477044
 NOTICE:  time for numeric:00:00:24.757032  
 NOTICE:  time for numeric, casted:00:00:24.791016  
 NOTICE:  time for numeric with tmp variable:00:00:10.89332


 I really don't know exactly where the problem comes from… but it's been
 hurting a function very badly (there are several of these static queries
 with type mismatches). And of course, the problem is not limited to
 numeric… text has exactly the same problem.

 Regards,

 Marc


 I hit the same problem today. Unfortunately, we had to roll back to 9.1 on
 our production site. Of course the team should have tested better before
 going to production, and of course they really need to write better
 functions in PL/pgSQL, but this problem was a real showstopper for us.


Please, can you send a self-explained test?

This issue should be fixed, and we need examples.



 Just don't let this gotcha drop off our to-do list.

 --
 Regards,
 Fábio Telles Rodriguez
 blog: http://www.midstorm.org/~telles/ | http://tellesr.wordpress.com/ |
 http://savepoint.blog.br
 e-mail / gtalk / MSN: fabio.tel...@gmail.com
 Skype: fabio_telles

 Timbira - The Brazilian Postgres Company
 http://www.timbira.com.br



Re: [HACKERS] Performance problem in PLPgSQL

2013-08-23 Thread Tom Lane
Pavel Stehule pavel.steh...@gmail.com writes:
 Please, can you send a self-explained test?
 This issue should be fixed, and we need examples.

We already had a perfectly good example at the beginning of this thread.
What's missing is a decision on how we ought to approximate the cost of
planning (relative to execution costs).

As I mentioned upthread, it doesn't seem unreasonable to me to do
something quick-and-dirty based on the length of the plan's rangetable.
Pretty nearly anything would fix these specific situations where the
estimated execution cost is negligible.  It's possible that there are
more complicated cases where we'll need a more accurate estimate, but
we've not seen an example of that yet.

My previous suggestion was to estimate planning cost as
   10 * (length(plan->rangetable) + 1)
but on reflection it ought to be scaled by one of the cpu cost constants,
so perhaps
   1000 * cpu_operator_cost * (length(plan->rangetable) + 1)
which'd mean a custom plan has to be estimated to save a minimum of
about 5 cost units (more if more than 1 table is used) before it'll
be chosen.  I'm tempted to make the multiplier be 1 not 1000,
but it seems better to be conservative about changing the behavior
until we see how well this works in practice.
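For scale (this is just the arithmetic implied by the formula above, using
the stock default of cpu_operator_cost):

    SHOW cpu_operator_cost;   -- 0.0025 by default
    -- For a single-table statement the proposed surcharge would be
    --   1000 * 0.0025 * (1 + 1) = 5 cost units,
    -- which is where the "about 5 cost units" figure above comes from.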

Objections, better ideas?

regards, tom lane




Re: [HACKERS] Performance problem in PLPgSQL

2013-07-24 Thread Amit Kapila
On Wednesday, July 24, 2013 4:23 AM Tom Lane wrote:
 I wrote:
  Marc Cousin cousinm...@gmail.com writes:
  The example below is of course extremely simplified, and obviously not
  what we are really doing in the database, but it exhibits the slowdown
  between 9.1.9 and 9.2.4.

  Hm.  Some experimentation shows that the plan cache is failing to switch
  to a generic plan, but I'm not sure why the cast would have anything to
  do with that ...

 Hah, I see why:

 (gdb) s
 1009    if (plansource->generic_cost < avg_custom_cost * 1.1)
 (gdb) p plansource->generic_cost
 $18 = 0.012501
 (gdb) p avg_custom_cost
 $19 = 0.01
 (gdb) p avg_custom_cost * 1.1
 $20 = 0.011001

 That is, the estimated cost of the custom plan is just the evaluation
 time for a simple constant, while the estimated cost of the generic plan
 includes a charge for evaluation of int4_numeric().  That makes the
 latter more than ten percent more expensive, and since this logic isn't
 considering anything else at all (particularly not the cost of
 planning), it thinks that makes the custom plan worth picking.

 We've been around on this before, but not yet thought of a reasonable
 way to estimate planning cost, let alone compare it to estimated
 execution costs.  Up to now I hadn't thought that was a particularly
 urgent issue, but this example shows it's worth worrying about.

 One thing that was suggested in the previous discussion is to base the
 planning cost estimate on the length of the rangetable.  We could
 do something trivial like add 10 * (length(plan->rangetable) + 1)
 in this comparison.

 Another thing that would fix this particular issue, though perhaps not
 related ones, is to start charging something nonzero for ModifyTable
 nodes, say along the lines of one seq_page_cost per inserted/modified
 row.  That would knock the total estimated cost for an INSERT up enough
 that the ten percent threshold wouldn't be exceeded.

Shouldn't it consider the new value of the bound parameter to decide whether
a new custom plan is needed, as that can be one of the main reasons why a
different plan would be required?

The current behavior is that it will either choose the generic plan or build
a new custom plan with the new parameters, based on choose_custom_plan().

Shouldn't the behavior instead be:
a. choose the generic plan
b. choose one of the existing custom plans
c. create a new custom plan

The choice can be made based on the new value of the bound parameter.

With Regards,
Amit Kapila.





Re: [HACKERS] Performance problem in PLPgSQL

2013-07-24 Thread Amit Kapila
On Wednesday, July 24, 2013 11:38 AM Amit Kapila wrote:
 On Wednesday, July 24, 2013 4:23 AM Tom Lane wrote:
  I wrote:
   Marc Cousin cousinm...@gmail.com writes:
   The example below is of course extremely simplified, and obviously not
   what we are really doing in the database, but it exhibits the slowdown
   between 9.1.9 and 9.2.4.

   Hm.  Some experimentation shows that the plan cache is failing to switch
   to a generic plan, but I'm not sure why the cast would have anything to
   do with that ...

  Hah, I see why:

  (gdb) s
  1009    if (plansource->generic_cost < avg_custom_cost * 1.1)
  (gdb) p plansource->generic_cost
  $18 = 0.012501
  (gdb) p avg_custom_cost
  $19 = 0.01
  (gdb) p avg_custom_cost * 1.1
  $20 = 0.011001

  That is, the estimated cost of the custom plan is just the evaluation
  time for a simple constant, while the estimated cost of the generic plan
  includes a charge for evaluation of int4_numeric().  That makes the
  latter more than ten percent more expensive, and since this logic isn't
  considering anything else at all (particularly not the cost of
  planning), it thinks that makes the custom plan worth picking.

  We've been around on this before, but not yet thought of a reasonable
  way to estimate planning cost, let alone compare it to estimated
  execution costs.  Up to now I hadn't thought that was a particularly
  urgent issue, but this example shows it's worth worrying about.

  One thing that was suggested in the previous discussion is to base the
  planning cost estimate on the length of the rangetable.  We could
  do something trivial like add 10 * (length(plan->rangetable) + 1)
  in this comparison.

  Another thing that would fix this particular issue, though perhaps not
  related ones, is to start charging something nonzero for ModifyTable
  nodes, say along the lines of one seq_page_cost per inserted/modified
  row.  That would knock the total estimated cost for an INSERT up enough
  that the ten percent threshold wouldn't be exceeded.

 Shouldn't it consider the new value of the bound parameter to decide
 whether a new custom plan is needed, as that can be one of the main
 reasons why a different plan would be required?

 The current behavior is that it will either choose the generic plan or
 build a new custom plan with the new parameters, based on
 choose_custom_plan().

 Shouldn't the behavior instead be:
 a. choose the generic plan
 b. choose one of the existing custom plans
 c. create a new custom plan

 The choice can be made based on the new value of the bound parameter.

For the case of an INSERT where no scan is involved (as in this case), do we
ever need a different plan?
Can we always use a generic plan for such cases?

With Regards,
Amit Kapila.






Re: [HACKERS] Performance problem in PLPgSQL

2013-07-23 Thread Tom Lane
Marc Cousin cousinm...@gmail.com writes:
 The example below is of course extremely simplified, and obviously not
 what we are really doing in the database, but it exhibits the slowdown
 between 9.1.9 and 9.2.4.

Hm.  Some experimentation shows that the plan cache is failing to switch
to a generic plan, but I'm not sure why the cast would have anything to
do with that ...

regards, tom lane




Re: [HACKERS] Performance problem in PLPgSQL

2013-07-23 Thread Tom Lane
I wrote:
 Marc Cousin cousinm...@gmail.com writes:
 The example below is of course extremely simplified, and obviously not
 what we are really doing in the database, but it exhibits the slowdown
 between 9.1.9 and 9.2.4.

 Hm.  Some experimentation shows that the plan cache is failing to switch
 to a generic plan, but I'm not sure why the cast would have anything to
 do with that ...

Hah, I see why:

(gdb) s
1009    if (plansource->generic_cost < avg_custom_cost * 1.1)
(gdb) p plansource->generic_cost
$18 = 0.012501
(gdb) p avg_custom_cost
$19 = 0.01
(gdb) p avg_custom_cost * 1.1
$20 = 0.011001

That is, the estimated cost of the custom plan is just the evaluation
time for a simple constant, while the estimated cost of the generic plan
includes a charge for evaluation of int4_numeric().  That makes the
latter more than ten percent more expensive, and since this logic isn't
considering anything else at all (particularly not the cost of
planning), it thinks that makes the custom plan worth picking.
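Spelled out with the values printed above (nothing here beyond the gdb
output):

    -- generic_cost    = 0.012501
    -- avg_custom_cost = 0.010000
    -- The generic plan is kept only if generic_cost < avg_custom_cost * 1.1,
    -- i.e. if it is at most ~10% more expensive than the average custom plan:
    SELECT 0.012501 < 0.010000 * 1.1 AS prefer_generic_plan;   -- false,
    -- so a fresh custom plan is built on every call.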

We've been around on this before, but not yet thought of a reasonable
way to estimate planning cost, let alone compare it to estimated
execution costs.  Up to now I hadn't thought that was a particularly
urgent issue, but this example shows it's worth worrying about.

One thing that was suggested in the previous discussion is to base the
planning cost estimate on the length of the rangetable.  We could
do something trivial like add 10 * (length(plan->rangetable) + 1)
in this comparison.

Another thing that would fix this particular issue, though perhaps not
related ones, is to start charging something nonzero for ModifyTable
nodes, say along the lines of one seq_page_cost per inserted/modified
row.  That would knock the total estimated cost for an INSERT up enough
that the ten percent threshold wouldn't be exceeded.
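With the stock seq_page_cost and the numbers from the gdb session above, the
arithmetic would work out roughly as follows:

    SHOW seq_page_cost;   -- 1.0 by default
    -- Charging one seq_page_cost per inserted row lifts both estimates:
    --   generic: 0.012501 + 1.0 = 1.012501
    --   custom:  0.010000 + 1.0 = 1.010000,  and 1.010000 * 1.1 = 1.111
    SELECT 1.012501 < 1.010000 * 1.1 AS prefer_generic_plan;   -- true,
    -- so the ten percent threshold is no longer exceeded and the generic
    -- plan would be used.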

Thoughts?

regards, tom lane

