Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-18 Thread Tom Lane
Robert Haas  writes:
> This isn't the first complaint about this mechanism that we've gotten,
> and it won't be the last.  Way too many of our users are way more
> aware than they should be that the threshold here is five rather than
> any other number, which to me is a clear-cut sign that this needs to
> be improved.

No argument there.

> How to improve it is a harder question.

Exactly.  I'm very suspicious of any easy answers to this; they'll
most likely just shift the pain around.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-18 Thread Robert Haas
On Wed, Jan 13, 2016 at 10:47 AM, Tom Lane  wrote:
> Vladimir Sitnikov  writes:
>> Note: I state that mixing "kinds" of bind values is a bad application
>> design anyway. In other words, application developer should understand
>> if a query is DWH-like (requires replans) or OLTP-like (does not
>> require replans). Agreed?
>
> No, not agreed.  As was already pointed out upthread, such information
> is not available in many use-cases for the plancache.
>
> The real problem here IMO is inaccurate plan cost estimates, and that's
> not something that there is any easy fix for.

Not really.  Even if the cost estimates for all of the plans tried are
perfectly accurate, you'll have only seen 5 values when you decide to
switch to a generic plan.  If the 6th, 60th, 600th, or 6000th
execution uses a parameter where a custom plan would have been a big
win, you will blindly use the generic plan anyway and lose bigtime.
On the other hand, if first five plans are all equivalent to each
other and to the generic plan, then you've spent the cost of uselessly
replanning six times instead of just caching the first plan and being
done with it.  I'm aware of an actual case where that extra
re-planning causes a serious performance problem, aggregated across
many queries and many backends.

This isn't the first complaint about this mechanism that we've gotten,
and it won't be the last.  Way too many of our users are way more
aware than they should be that the threshold here is five rather than
any other number, which to me is a clear-cut sign that this needs to
be improved.  How to improve it is a harder question.  We lack the
ability to do any kind of sensitivity analysis on a plan, so we can't
know whether there are other parameter values that would have resulted
in a different plan, nor can we test whether a particular set of
parameter values would have changed the outcome.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-18 Thread Bruce Momjian
On Mon, Jan 18, 2016 at 02:14:11PM -0500, Tom Lane wrote:
> Bruce Momjian  writes:
> > I never understood why we don't just keep the selectivity estimates of
> > previous plans and just reuse the plan if the selectivity estimates are
> > similar.  Isn't parameter selectivity the only thing that distinguishes
> > on plan with parameter from another?
> 
> > Checking selectivity estimates must be cheaper than replanning.  This
> > could be done at the second use of the prepared plan, and maybe for all
> > plan reuses, rather than waiting for five and then perhaps getting this
> > bad behavior.
> 
> You're imagining that a selectivity recheck could be separated out from
> the rest of the planner.  That's nowhere near feasible, IMO.  Even if it

I think you would have to do the checks before entering the planner and
save them off for use in the planner.

> were, what would we do with it?  There's no reliable way to determine
> whether X% change in one or another selectivity number would change the
> selected plan, other than by redoing practically all of the planning work.

Well, if it is +/-1%, I think we can assume we can reuse the plan just
fine from the second prepared call until we see a major selectivity
change.  While we have never exposed the count of prepared queries
before we choose a generic plan, I can see us exposing this percentage.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Roman grave inscription +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-18 Thread Tom Lane
Bruce Momjian  writes:
> I never understood why we don't just keep the selectivity estimates of
> previous plans and just reuse the plan if the selectivity estimates are
> similar.  Isn't parameter selectivity the only thing that distinguishes
> on plan with parameter from another?

> Checking selectivity estimates must be cheaper than replanning.  This
> could be done at the second use of the prepared plan, and maybe for all
> plan reuses, rather than waiting for five and then perhaps getting this
> bad behavior.

You're imagining that a selectivity recheck could be separated out from
the rest of the planner.  That's nowhere near feasible, IMO.  Even if it
were, what would we do with it?  There's no reliable way to determine
whether X% change in one or another selectivity number would change the
selected plan, other than by redoing practically all of the planning work.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-18 Thread Bruce Momjian
On Wed, Jan 13, 2016 at 10:47:18AM -0500, Tom Lane wrote:
> Vladimir Sitnikov  writes:
> > Note: I state that mixing "kinds" of bind values is a bad application
> > design anyway. In other words, application developer should understand
> > if a query is DWH-like (requires replans) or OLTP-like (does not
> > require replans). Agreed?
> 
> No, not agreed.  As was already pointed out upthread, such information
> is not available in many use-cases for the plancache.
> 
> The real problem here IMO is inaccurate plan cost estimates, and that's
> not something that there is any easy fix for.
> 
> However ... one specific aspect of that is that to some extent, the cost
> estimate made for the generic plan is incommensurate with the estimates
> for the custom plans because the latter are made with more information.
> I don't remember the details of your specific case anymore, but we've
> seen cases where the generic plan is falsely estimated to be cheaper
> than custom plans because of this.

I never understood why we don't just keep the selectivity estimates of
previous plans and just reuse the plan if the selectivity estimates are
similar.  Isn't parameter selectivity the only thing that distinguishes
on plan with parameter from another?

Checking selectivity estimates must be cheaper than replanning.  This
could be done at the second use of the prepared plan, and maybe for all
plan reuses, rather than waiting for five and then perhaps getting this
bad behavior.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Roman grave inscription +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-13 Thread Pavel Stehule
2016-01-13 17:12 GMT+01:00 Vladimir Sitnikov :

> >If plan is ok for one value parameters, then can be pretty bad for
> following parameters.
>
> Happy statements are all alike; every unhappy statement is unhappy in
> its own way (see [1]).
> If user is sending different kinds of parameters, he is shooting in the
> foot.
>
> >Albe's proposal can be good enough for 2/3 cases and it doesn't block any
> other enhancing
>
> Albe's proposal effectively disables plan cache, thus it blocks enhancing.
> If a user goes "replan every time" route, there is no way you
> introduce plan caching there.
>

I am sorry, I disagree. Albe's proposal should be compatible with current
state, so your argument is too strong. Default is same.

Pavel


>
> [1]: https://en.wikipedia.org/wiki/Anna_Karenina_principle
>
> Vladimir
>


Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-13 Thread Vladimir Sitnikov
>If plan is ok for one value parameters, then can be pretty bad for following 
>parameters.

Happy statements are all alike; every unhappy statement is unhappy in
its own way (see [1]).
If user is sending different kinds of parameters, he is shooting in the foot.

>Albe's proposal can be good enough for 2/3 cases and it doesn't block any 
>other enhancing

Albe's proposal effectively disables plan cache, thus it blocks enhancing.
If a user goes "replan every time" route, there is no way you
introduce plan caching there.

[1]: https://en.wikipedia.org/wiki/Anna_Karenina_principle

Vladimir


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-13 Thread Vladimir Sitnikov
>(1) the generic plan is falsely optimistic

That is my case.
Application is sending most common value on every execution while
backend is optimistic and it things that the app would stop sending
MCVs.

Costs for the plans are OK. However, there is a data skew, so it is
hard to tell what is the "true" selectivity of the skewed column in
general, thus the discussion.

VS>>In other words, application developer should understand
VS>> if a query is DWH-like (requires replans) or OLTP-like (does not
VS>> require replans). Agreed?
Tom>No, not agreed.  As was already pointed out upthread, such information
Tom>is not available in many use-cases for the plancache.

I think you answer the wrong question.
I was asking if you agree that _application_ developer (not pg backed
developer) should know if a query is OLTP or DWH like.

Do you really think app developer should not care which plan would be
chosen for a particular query he is working on?
Why all that "explain" stuff in documentation then?

In the plancache.c you have CURSOR_OPT_GENERIC_PLAN and
CURSOR_OPT_CUSTOM_PLAN flags.
It is obvious that those flags are not yet exposed/used by
applications, but my message is that "one should *not* think that DB
has artificial intelligence to properly identify a plan for each bind
sets and cache plans at the same time".

Vladimir


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-13 Thread Tom Lane
Vladimir Sitnikov  writes:
> Note: I state that mixing "kinds" of bind values is a bad application
> design anyway. In other words, application developer should understand
> if a query is DWH-like (requires replans) or OLTP-like (does not
> require replans). Agreed?

No, not agreed.  As was already pointed out upthread, such information
is not available in many use-cases for the plancache.

The real problem here IMO is inaccurate plan cost estimates, and that's
not something that there is any easy fix for.

However ... one specific aspect of that is that to some extent, the cost
estimate made for the generic plan is incommensurate with the estimates
for the custom plans because the latter are made with more information.
I don't remember the details of your specific case anymore, but we've
seen cases where the generic plan is falsely estimated to be cheaper
than custom plans because of this.

I wonder whether it would be useful to reject a generic plan anytime its
estimate is less than the average (or minimum?) estimate for the custom
plans.  If it is less, then either (1) the generic plan is falsely
optimistic, or (2) the specific parameter values provided for the custom
plans were all ones for which the planner could see that the generic plan
was non-optimal.  If (2) holds for the first few custom plans then it's
not unreasonable to suppose that it will keep on holding, and we had
better not use the generic plan.

Basically, the case we're *expecting* to see is that a custom plan is the
same or better cost as the generic plan --- same cost if it's really the
same plan, better cost if knowing the parameter values allows some
optimization to be performed (LIKE-pattern-to-index conversion, partition
scan suppression via constraint exclusion, etc).  If we get a higher cost
estimate for the custom plan then something is fishy and we shouldn't
believe it.

Maybe I'm missing some case where that situation would arise naturally.
Or maybe such a rule wouldn't actually help in very many real-world
cases.  But it seems worth looking into.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-13 Thread Pavel Stehule
2016-01-13 16:22 GMT+01:00 Vladimir Sitnikov :

> >A value of -1 could disable generic plans
>
> I do not like the idea.
>
> I've seen dramatic performance improvements from using cached plans.
> The numbers are like "20ms to plan vs 1ms to execute" for an often
> used OLTP query. Query text is involved (~5-10KiB).
>

but currently we have not any tool how to check the quality of plan for new
set of parameters. If plan is ok for one value parameters, then can be
pretty bad for following parameters.

Albe's proposal can be good enough for 2/3 cases and it doesn't block any
other enhancing. There is still 1/3 of queries - too complex (slow
planning) too dynamic plan (the generic plan doesn't work).

Regards

Pavel


>
> Vladimir
>


Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-13 Thread Pavel Stehule
2016-01-13 16:18 GMT+01:00 Albe Laurenz :

> Pavel Stehule wrote:
> > I like a strategy based on risks. Probably there are situation, when the
> generic plan is great every
> > time - INSERTs, UPDATEs via PK, simple SELECTs via PK. generic plan can
> be well if almost all data has
> > similar probability. Elsewhere on bigger data, the probability of pretty
> slow plan is higher, and then
> > we should to prefer custom plan.
> >
> > so the strategy - if cost of generic plan is less than some MAGIC
> CONSTANT (can be specified by GUC),
> > then use generic plan. Elsewhere use a custom plan everytime.
> >
> > It allow to controll the plan reusing. When MAGIC CONSTANT = 0 .. use
> custom plan everytime, When
> > MAGIC CONSTANT = M, then use generic plan always.
>
> I have a different idea:
>
> What about a GUC "custom_plan_threshold" that controls after how many
> executions a generic plan will be considered, with a default value of 5.
>
> A value of -1 could disable generic plans.
>

yes, I though about it too - it is simple, and almost deterministic

Pavel


>
> Yours,
> Laurenz Albe
>


Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-13 Thread Vladimir Sitnikov
>A value of -1 could disable generic plans

I do not like the idea.

I've seen dramatic performance improvements from using cached plans.
The numbers are like "20ms to plan vs 1ms to execute" for an often
used OLTP query. Query text is involved (~5-10KiB).

Vladimir


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-13 Thread Albe Laurenz
Pavel Stehule wrote:
> I like a strategy based on risks. Probably there are situation, when the 
> generic plan is great every
> time - INSERTs, UPDATEs via PK, simple SELECTs via PK. generic plan can be 
> well if almost all data has
> similar probability. Elsewhere on bigger data, the probability of pretty slow 
> plan is higher, and then
> we should to prefer custom plan.
> 
> so the strategy - if cost of generic plan is less than some MAGIC CONSTANT 
> (can be specified by GUC),
> then use generic plan. Elsewhere use a custom plan everytime.
> 
> It allow to controll the plan reusing. When MAGIC CONSTANT = 0 .. use custom 
> plan everytime, When
> MAGIC CONSTANT = M, then use generic plan always.

I have a different idea:

What about a GUC "custom_plan_threshold" that controls after how many
executions a generic plan will be considered, with a default value of 5.

A value of -1 could disable generic plans.

Yours,
Laurenz Albe

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-13 Thread Vladimir Sitnikov
>Basically you're arguing to fix one specific edge case which bugs you
>personally, by creating a lot of others, which don't bug you. Not
>convincing.

It bugs me.
It bugs clients of pgjdbc (e.g. Thomas who started the thread).

Note: support of prepared statements for java applications has just
landed. The release dates are 2015-08-27 for pgjdbc, and 2014-02-24
for pgjdbc-ng.
I think current report is just a tip of the iceberg.

> by creating a lot of others, which don't bug you

I think it will not create "lots of others".
Do you have any statistics why particular flavour of generic plan was
committed in 9.2?

Suppose there are two type of binds: "non_unique" (N) and "unique" (U)
that require different plans for perfect response times.

I see the following sequences
 -- all clear, all the approaches would converge to plan for
"unique values".

 -- query for non-unique value is executed again and again.
  Perfect optimizer would either replan or reuse plan with regard to "MCV"
  Current behaviour would switch to "optimistic" plan at 6th
iteration. It is the case of the thread.
  My suggestion is to learn that "MCV is used -> use plan optimized for MCV"

^^^ note that above are "recommended" uses of the database. Each
statement is used for its own purpose: one for MCVs, another for "good
values".

Then there are cases of mixed executions.
Note: I state that mixing "kinds" of bind values is a bad application
design anyway. In other words, application developer should understand
if a query is DWH-like (requires replans) or OLTP-like (does not
require replans). Agreed?

NUUU
  Current behavior optimized for exactly this pattern.
  Well, why was it chosen over "UNNN"?

In other words, a pattern like UNNN would "create a lot of
others" as you say.

NUNUNUNUNUN -- perfect optimizer would replan every time (or have two
sets of plans, but let's leave that out)
  Neither my suggestion nor current behaviour properly covers the case.


I suggest to spare "NUUU" pattern in order to improve "".

Vladimir


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-13 Thread Vladimir Sitnikov
> Generally using a very small sample

That is another issue. Inventing some other algorithm instead of
current "cache after 5 executions" is another effort.

However, I suggest to "learn" from what client is sending.
You suggest to completely ignore that and just prepare for the case
he/she will send "a random value".
Why expect client would stop sending MCVs if we have already seen them
during previous 5 executions?

> That'd not change with the change you propose.

It will.
In my suggestion, the first "explain analyze execute" will match the
"finally cached plan" provided the plan is not treated in a special
way (e.g. replan every time, etc).

> That a prepared statement suddenly performs way differently
>depending on which the first bind values are is not, in any way, easier
>to debug.

It is way easier to debug since *the first* execution plan you get out
of "explain" *matches* the one that will finally be used.
Lots of developers are just not aware of "5 replans by backend".
Lots of the remaining confuse it with "5 non-server-prepared
executions by pgjdbc driver".

In other way: in order to identify a root cause of a slow query you
find bind values. Then you perform explain analyze and you find shiny
fast plan.
Does that immediately ring bells that you need to execute it 6 times
to ensure the plan would still be good?
Well, try being someone not that smart as you are when you answering
this question.

2) In my suggestion, "the first execution would likely to match the plan".

VS>> 3) What about "client sends top most common value 5 times in a row"?
VS>> Why assume "it will stop doing that"?
AF>If 20% of your values are nonunique and the rest is unique you'll get
AF>*drastically* different plans, each performing badly for the other case;
AF>with the unique cardinality plan being extremly bad.

Can you elaborate a bit? I can hardly follow that.



Vladimir


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-13 Thread Vladimir Sitnikov
>so you don't get to (or want to) have any control over the underlying prepared 
>statement.

That is pl/pgsql's problem, isn't it?
In the mean time, user can use different query texts (e.g. by adding
offset 0, offset 0*1, offset 0*2, etc kind of stuff they typically use
to tune queries) to convince plpgsql to use different statement ids.

Vladimir


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-13 Thread Andres Freund
On 2016-01-13 17:38:22 +0300, Vladimir Sitnikov wrote:
> >so you don't get to (or want to) have any control over the underlying 
> >prepared statement.
> 
> That is pl/pgsql's problem, isn't it?
> In the mean time, user can use different query texts (e.g. by adding
> offset 0, offset 0*1, offset 0*2, etc kind of stuff they typically use
> to tune queries) to convince plpgsql to use different statement ids.

Basically you're arguing to fix one specific edge case which bugs you
personally, by creating a lot of others, which don't bug you. Not
convincing.

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-13 Thread Marko Tiikkaja

On 13/01/16 15:34, Vladimir Sitnikov wrote:

This completely screws over PL/PgSQL, among other things.


Can you elaborate a bit?


You just write a query like this:

  SELECT * FROM foo WHERE bar = _Variable;

so you don't get to (or want to) have any control over the underlying 
prepared statement.



.m


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-13 Thread Andres Freund
On 2016-01-13 17:26:43 +0300, Vladimir Sitnikov wrote:
> > consider e.g a table with one somewhat common and otherwise just unique 
> > values.> 
> So what?
> If I understand you properly, you mean: "if client sends unique binds
> first 5-6 executions and bad non-unique afterwards, then cached plan
> would be bad". Is that what you are saying?

That's one of several problems, yes. Generally using a very small sample
("bind values in the the first query"), to plan every future query isn't
going to be fun.

> I agree that is the corner-case for my suggestion.
> Is is really happening often?

Yes.

> I state the following:
> 1) It is way easier to debug & analyze.

Meh. That a prepared statement suddenly performs way differently
depending on which the first bind values are is not, in any way, easier
to debug.

> For instance: current documentation does *not* list a way to get a
> *generic plan*.

Which doensn't have anything to do with your proposal. That'd not change
with the change you propose.

> Is that obvious that "you just need to EXPLAIN ANALYZE EXECUTE *6
> times in a row*" just to get a generic plan?

No. And I hate that. I think it'd be very good to expand EXPLAIN's
output to include information about custom/generic plans.


> 3) What about "client sends top most common value 5 times in a row"?
> Why assume "it will stop doing that"?
> I think the better assumption is "it will continue doing that".

If 20% of your values are nonunique and the rest is unique you'll get
*drastically* different plans, each performing badly for the other case;
with the unique cardinality plan being extremly bad.

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-13 Thread Vladimir Sitnikov
>This completely screws over PL/PgSQL, among other things.

Can you elaborate a bit?

Vladimir


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-13 Thread Marko Tiikkaja

On 13/01/16 15:26, Vladimir Sitnikov wrote:

2) It is likely to be more performant. We just need to explain users
that "if different plans required, just use different statements".


This completely screws over PL/PgSQL, among other things.


.m


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-13 Thread Vladimir Sitnikov
> for one custom plans can be much better than the generic plan, independent of 
> cardinalities

So what? I do not suggest dropping custom plans entirely.
I perfectly understand there are cases when better replan every time.

> consider e.g a table with one somewhat common and otherwise just unique 
> values.

So what?
If I understand you properly, you mean: "if client sends unique binds
first 5-6 executions and bad non-unique afterwards, then cached plan
would be bad". Is that what you are saying?
I agree that is the corner-case for my suggestion.
Is is really happening often?

I state the following:
1) It is way easier to debug & analyze.
For instance: current documentation does *not* list a way to get a
*generic plan*.
Is that obvious that "you just need to EXPLAIN ANALYZE EXECUTE *6
times in a row*" just to get a generic plan?

2) It is likely to be more performant. We just need to explain users
that "if different plans required, just use different statements".
Isn't that obvious?
Frankly speaking, I do not like "plug&pray" kind of code that just
sends bind values and expects magically optimized plan for each bind
combination.

3) What about "client sends top most common value 5 times in a row"?
Why assume "it will stop doing that"?
I think the better assumption is "it will continue doing that".

At the end, if a client wants specific treatment of a query, then
he/she might be better using separate server-prepared statements (the
one for "unique values", and another one for "non-unique").

Vladimir


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-13 Thread Andres Freund
On January 13, 2016 3:02:27 PM GMT+01:00, Vladimir Sitnikov 
 wrote:
>> The custom plan is *more expensive*;
>
>You compare costs of custom vs generic plans.
>I suggest: do not compare costs *at all*.
>
>>I don't know, it's your proposal :-)  But it looks like I
>misunderstood.
>
>It is not.
>
>My suggestion is: build a generic plan (that is the plan that will
>return proper result for every possible bind value), yet refer to the
>values of 6th binds when estimating cardinalitites.
>Is it clear now?

That's not going to fly for two reasons: for one custom plans can be much 
better than the generic plan, independent of cardinalities. Consider e.g. a 
partitioned table, where the generic scan will scan all partitions. For 
another, just using the specific values for the generic plan will have horrible 
results if the distribution isn't entirely boring, consider e.g a table with 
one somewhat common and otherwise just unique values.

--- 
Please excuse brevity and formatting - I am writing this on my mobile phone.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-13 Thread Vladimir Sitnikov
> The custom plan is *more expensive*;

You compare costs of custom vs generic plans.
I suggest: do not compare costs *at all*.

>I don't know, it's your proposal :-)  But it looks like I misunderstood.

It is not.

My suggestion is: build a generic plan (that is the plan that will
return proper result for every possible bind value), yet refer to the
values of 6th binds when estimating cardinalitites.
Is it clear now?

Vladimir


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-13 Thread Marko Tiikkaja

On 13/01/16 14:36, Vladimir Sitnikov wrote:

Say you already have a plan which looks like this:
Now the plan gets invoked with  $1 = 5.  What exactly in your mind would happen 
here?


A sequential scan with $1=5 condition. What else could be there?


I don't know, it's your proposal :-)  But it looks like I misunderstood.


Note: I do not suggest changing already cached plans yet.
I suggest looking into "6th bind values" when building a cached plan.


But that wouldn't have helped your case.  The custom plan is *more 
expensive*; I'm guessing because the generic plan gambles on a better 
average case instead of preparing for the worst case.



.m


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-13 Thread Vladimir Sitnikov
Note: I do not suggest changing already cached plans yet.
I suggest looking into "6th bind values" when building a cached plan.
In other words, "if first 5 execution do not reveal dependence on bind
values, then cache the generated plan".

>Say you already have a plan which looks like this:
>Now the plan gets invoked with  $1 = 5.  What exactly in your mind would 
>happen here?

A sequential scan with $1=5 condition. What else could be there?

Vladimir


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-13 Thread Marko Tiikkaja

On 13/01/16 14:27, Vladimir Sitnikov wrote:

TL;DR: I suggest to create "generic plan" with regard to current bind values.
What's wrong with that approach?


I don't understand how this is supposed to work.  Say you already have a 
plan which looks like this:


 Seq Scan on foo  (cost=0.00..100.00 rows=1 width=630)
   Filter: (bar = $1)

Now the plan gets invoked with  $1 = 5.  What exactly in your mind would 
happen here?



.m


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-13 Thread Marko Tiikkaja

On 13/01/16 14:12, Pavel Stehule wrote:

so the strategy - if cost of generic plan is less than some MAGIC CONSTANT
(can be specified by GUC), then use generic plan. Elsewhere use a custom
plan everytime.

It allow to controll the plan reusing. When MAGIC CONSTANT = 0 .. use
custom plan everytime, When MAGIC CONSTANT = M, then use generic plan
always.


I don't think that would solve even the original problem without 
effectively disabling generic plans, despite the problem being 
relatively simple.  The generic plan appears to be really cheap, because 
the planner doesn't have the concept of a "worst case".



.m


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-13 Thread Vladimir Sitnikov
Please, read my suggestion again.

TL;DR: I suggest to create "generic plan" with regard to current bind values.
What's wrong with that approach?

Vladimir


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-13 Thread Pavel Stehule
2016-01-13 11:44 GMT+01:00 Vladimir Sitnikov :

> >the implementation is simply - but it hard to design some really general
> - it is task for UI
>
> Can you please rephrase?
>

Sorry - It is task for artifical inteligency

>
> Current design is "if the cost of a generic plan is less than the one
> of a custom plan+replan, prefer generic".
> I think that is wrong.
>
> "Generic plan" misunderestimates a cost in a sense that it assumes
> some pre-defined selectivities.
>

Generic plan in Postgres is optimized for most common values - so in
avarage it should be optimal. But the reality is different - the wrong
estimation can be everywhere and the estimation can be lower or upper than
reality.


> In other words, if "skewed" values are used, "custom plan" would
> likely to have *worse cost* than the one of a generic plan, yet custom
> plan is much more suitable for a particular parameter set.
> As backend refers to boundParams, it does see that particular
> condition is tough, while generic estimator just the cost.
>

And there is a second issue - you have not a idea, what parameter vector
will follow. You cannot to check and optimize plans every where, because a
planning can be expensive, and you should to reuse plan more times. What
was true, for first iterations, then it should not be true in following
iterations.

I like a strategy based on risks. Probably there are situation, when the
generic plan is great every time - INSERTs, UPDATEs via PK, simple SELECTs
via PK. generic plan can be well if almost all data has similar
probability. Elsewhere on bigger data, the probability of pretty slow plan
is higher, and then we should to prefer custom plan.

so the strategy - if cost of generic plan is less than some MAGIC CONSTANT
(can be specified by GUC), then use generic plan. Elsewhere use a custom
plan everytime.

It allow to controll the plan reusing. When MAGIC CONSTANT = 0 .. use
custom plan everytime, When MAGIC CONSTANT = M, then use generic plan
always.

Regards

Pavel


> Looking into plancache.c comments I see 3 possible plans:
> 1) custom plan with PARAM_FLAG_CONST=1. It should probably
> constant-fold based on input parameters.
>
> 2) custom plan with PARAM_FLAG_CONST=0. I think it should just use
> given parameters for selectivity estimations. The generated plan
> should still be valid for use with other input values.
> 3) generic plan. The plan with all variables. <-- here's current behavior
>
> 1 has a replan cost.
>
> 2&3 can be cached and reused.
>
> Is that correct?
> I think #2 is better option than #3 since it gives better plan
> stability, thus it is much easier to test and reason about.
>
> This all boils down to adjustment in a single line:
>
> https://github.com/postgres/postgres/blob/ee943004466418595363d567f18c053bae407792/src/backend/utils/cache/plancache.c#L1151-L1152
>
> Does that make sense?
>
> Vladimir
>


Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-13 Thread Vladimir Sitnikov
>the implementation is simply - but it hard to design some really general - it 
>is task for UI

Can you please rephrase?

Current design is "if the cost of a generic plan is less than the one
of a custom plan+replan, prefer generic".
I think that is wrong.

"Generic plan" misunderestimates a cost in a sense that it assumes
some pre-defined selectivities.
In other words, if "skewed" values are used, "custom plan" would
likely to have *worse cost* than the one of a generic plan, yet custom
plan is much more suitable for a particular parameter set.
As backend refers to boundParams, it does see that particular
condition is tough, while generic estimator just the cost.

Looking into plancache.c comments I see 3 possible plans:
1) custom plan with PARAM_FLAG_CONST=1. It should probably
constant-fold based on input parameters.

2) custom plan with PARAM_FLAG_CONST=0. I think it should just use
given parameters for selectivity estimations. The generated plan
should still be valid for use with other input values.
3) generic plan. The plan with all variables. <-- here's current behavior

1 has a replan cost.

2&3 can be cached and reused.

Is that correct?
I think #2 is better option than #3 since it gives better plan
stability, thus it is much easier to test and reason about.

This all boils down to adjustment in a single line:
https://github.com/postgres/postgres/blob/ee943004466418595363d567f18c053bae407792/src/backend/utils/cache/plancache.c#L1151-L1152

Does that make sense?

Vladimir


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-12 Thread Pavel Stehule
2016-01-12 17:01 GMT+01:00 Vladimir Sitnikov :

> VS>>Why doesn't backend reuse already existing good plan?
> PS>this logic is driven by plan cost, not by plan execution time.
>
> It do understand that currently PG replans with $1, $2 and uses
> default selectivities for that.
>
> What I am asking is to make PG aware of "previously used bind values",
> so it would calculate proper selectivities for $1, $2.
>

the implementation is simply - but it hard to design some really general -
it is task for UI

Pavel



>
> PS. It is not the first time the problem bites me, so I hand-crafted a
> testcase.
>



>
> Vladimir
>


Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-12 Thread Vladimir Sitnikov
VS>>Why doesn't backend reuse already existing good plan?
PS>this logic is driven by plan cost, not by plan execution time.

It do understand that currently PG replans with $1, $2 and uses
default selectivities for that.

What I am asking is to make PG aware of "previously used bind values",
so it would calculate proper selectivities for $1, $2.

PS. It is not the first time the problem bites me, so I hand-crafted a testcase.

Vladimir


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-12 Thread Pavel Stehule
2016-01-12 16:52 GMT+01:00 Vladimir Sitnikov :

> > I don't know if there is a good solution except disabling server
> prepared statements.
>
> Why doesn't backend reuse already existing good plan?
>

probably good plan is more expensive than wrong plan :(

this logic is driven by plan cost, not by plan execution time.



> The plan does account for the skew.
>
> Can backend take selectivities from the original bind values?
>

> Vladimir
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>


Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-12 Thread Vladimir Sitnikov
> I don't know if there is a good solution except disabling server prepared 
> statements.

Why doesn't backend reuse already existing good plan?
The plan does account for the skew.

Can backend take selectivities from the original bind values?

Vladimir


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-12 Thread Albe Laurenz
Vladimir Sitnikov wrote:
> Here's the simplified testcase:
> https://gist.github.com/vlsi/df08cbef370b2e86a5c1
> 
> It reproduces the problem in both 9.4.4 and 9.5rc1.
> It is reproducible via both psql and pgjdbc.
> 
> I use a single table, however my production case includes a join of
> two tables and the query is like
> select ... from foo, bar where foo.skewed=? and bar.non_skewed=? and
> foo.bar_id=bar.id
> 
> Note: my application _always_ sends *the same* *bad* value for skewed
> column (it effectively is used as a filtering column in the particular
> query).
> Unfortunately, on 6th execution backend switches to the plan that uses
> skewed index access.
> 
> Is it something that can be fixed/improved?
> 
> Good plan (the first 5 executions):
> Index Scan using non_skewed__flipper on plan_flipper
> (cost=0.43..42.77 rows=10 width=113) (actual time=0.030..0.072 rows=10
> loops=1)
>   Index Cond: (non_skewed = 42)
>   Filter: (skewed = 0)
>   Rows Removed by Filter: 10
>   Buffers: shared hit=20 read=3
> Execution time: 0.094 ms
> 
> Bad plan (all the subsequent executions):
> Index Scan using skewed__flipper on plan_flipper  (cost=0.43..6.77
> rows=1 width=113) (actual time=0.067..355.867 rows=10 loops=1)
>   Index Cond: (skewed = $1)
>   Filter: (non_skewed = $2)
>   Rows Removed by Filter: 90
>   Buffers: shared hit=18182 read=2735
> Execution time: 355.901 ms

The problem is that the index "skewed__flipper" is more selective than
"non_skewed__flipper" except when "skewed = 0", so the generic plan prefers it.

I don't know if there is a good solution except disabling server prepared 
statements.

Yours,
Laurenz Albe

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-12 Thread Vladimir Sitnikov
Here's the simplified testcase:
https://gist.github.com/vlsi/df08cbef370b2e86a5c1

It reproduces the problem in both 9.4.4 and 9.5rc1.
It is reproducible via both psql and pgjdbc.

I use a single table, however my production case includes a join of
two tables and the query is like
select ... from foo, bar where foo.skewed=? and bar.non_skewed=? and
foo.bar_id=bar.id

Note: my application _always_ sends *the same* *bad* value for skewed
column (it effectively is used as a filtering column in the particular
query).
Unfortunately, on 6th execution backend switches to the plan that uses
skewed index access.

Is it something that can be fixed/improved?

Good plan (the first 5 executions):
Index Scan using non_skewed__flipper on plan_flipper
(cost=0.43..42.77 rows=10 width=113) (actual time=0.030..0.072 rows=10
loops=1)
  Index Cond: (non_skewed = 42)
  Filter: (skewed = 0)
  Rows Removed by Filter: 10
  Buffers: shared hit=20 read=3
Execution time: 0.094 ms

Bad plan (all the subsequent executions):
Index Scan using skewed__flipper on plan_flipper  (cost=0.43..6.77
rows=1 width=113) (actual time=0.067..355.867 rows=10 loops=1)
  Index Cond: (skewed = $1)
  Filter: (non_skewed = $2)
  Rows Removed by Filter: 90
  Buffers: shared hit=18182 read=2735
Execution time: 355.901 ms


Vladimir


plan_flipper.sql
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-12 Thread Dave Cramer
Hi Marko,

Interesting so why would it choose a worse plan at that point ? Why would
it change at all if the current plan is working well ?

Dave Cramer

da...@postgresintl.com
www.postgresintl.com

On 12 January 2016 at 07:15, Marko Tiikkaja  wrote:

> On 12/01/16 13:00, Dave Cramer wrote:
>
>> We have an interesting problem, and the reporter has been kind enough to
>> provide logs for which we can't explain.
>>
>> I'd be interested to hear any plausible explanations for a prepared plan
>> suddenly going from 2ms to 60ms for the same input values ?
>>
>
> This is a new feature in 9.2, where on the fifth (or sixth, not sure)
> execution the planner might choose to use a generic plan.  From the 9.2
> release notes (though I'm fairly certain this is documented somewhere in
> the manual as well):
>
> In the past, a prepared statement always had a single "generic" plan that
> was used for all parameter values, which was frequently much inferior to
> the plans used for non-prepared statements containing explicit constant
> values. Now, the planner attempts to generate custom plans for specific
> parameter values. A generic plan will only be used after custom plans have
> repeatedly proven to provide no benefit. This change should eliminate the
> performance penalties formerly seen from use of prepared statements
> (including non-dynamic statements in PL/pgSQL).
>
>
> .m
>


Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-12 Thread Marko Tiikkaja

On 12/01/16 13:00, Dave Cramer wrote:

We have an interesting problem, and the reporter has been kind enough to
provide logs for which we can't explain.

I'd be interested to hear any plausible explanations for a prepared plan
suddenly going from 2ms to 60ms for the same input values ?


This is a new feature in 9.2, where on the fifth (or sixth, not sure) 
execution the planner might choose to use a generic plan.  From the 9.2 
release notes (though I'm fairly certain this is documented somewhere in 
the manual as well):


In the past, a prepared statement always had a single "generic" plan 
that was used for all parameter values, which was frequently much 
inferior to the plans used for non-prepared statements containing 
explicit constant values. Now, the planner attempts to generate custom 
plans for specific parameter values. A generic plan will only be used 
after custom plans have repeatedly proven to provide no benefit. This 
change should eliminate the performance penalties formerly seen from use 
of prepared statements (including non-dynamic statements in PL/pgSQL).



.m


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-12 Thread Dave Cramer
We have an interesting problem, and the reporter has been kind enough to
provide logs for which we can't explain.

I'd be interested to hear any plausible explanations for a prepared plan
suddenly going from 2ms to 60ms for the same input values ?


Dave Cramer

da...@postgresintl.com
www.postgresintl.com

-- Forwarded message --
From: Thomas Kellerer 
Date: 12 January 2016 at 04:03
Subject: [JDBC] Re: 9.4-1207 behaves differently with server side prepared
statements compared to 9.2-1102
To: pgsql-j...@postgresql.org


> Is it possible to get server logs ?

I have picked out one of the statements that suffered from this and
sanitized the logfile.

   http://sql-workbench.net/pg_jdbc_94.log

The complete statement is at the top of the file and in the messages
themselves,
I have replaced each occurrence of the statement with "select "

The interesting thing (at least for me) is that the first few executions of
the
server side statement have pretty much the same runtime as the ones before
the prepare.
And then suddenly the runtime shoots through the rough going up from 1ms to
40ms or even 60ms

Regards
Thomas




--
Sent via pgsql-jdbc mailing list (pgsql-j...@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-jdbc