Re: [HACKERS] ExecGather() + nworkers

2016-01-13 Thread Amit Kapila
On Mon, Jan 11, 2016 at 9:16 AM, Amit Kapila wrote:
> On Mon, Jan 11, 2016 at 3:14 AM, Peter Geoghegan  wrote:
> >
> > On Sun, Jan 10, 2016 at 9:13 AM, Robert Haas wrote:
> > >> I'm not sure why the test for nworkers following the
> > >> LaunchParallelWorkers() call doesn't look like this, though:
> > >>
> > >> /* Set up tuple queue readers to read the results. */
> > >> if (pcxt->nworkers_launched > 0)
> > >> {
> > >> ...
> > >> }
> > >
> > > Hmm, yeah, I guess it could do that.
> >
> > That would make it clearer as an example.
> >
> > >> But going to this additional trouble (detecting no workers launched on
> > >> the basis of !nworkers_launched) suggests that simply testing
> > >> nworkers_launched would be wrong, which AFAICT it isn't. Can't we just
> > >> do that, and in so doing also totally remove the "for" loop shown
> > >> here?
> > >
> > > I don't see how the for loop goes away.
> >
> > I meant that some code in the "for" loop goes away. Not all of it.
> > Just the more obscure code. As I said, I'm mostly pointing this out
> > out of concern for making it clearer as example code.
> >
>
> Right, I can write a patch to do it in a way you are suggesting if you
> are not planning to do it.
>

Changed the code such that nworkers_launched gets used wherever
appropriate instead of nworkers.  This includes places other than the
one pointed out above.
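For readers following the thread: the point is that LaunchParallelWorkers() may start fewer workers than requested, so nworkers_launched (not nworkers) is what the follow-up code must consult. A minimal standalone sketch of the pattern (ParallelContext is reduced to the two counters and the launcher and reader setup are mocked; the real code lives in execParallel.c):

```c
#include <assert.h>

/* Reduced mock of the ParallelContext fields under discussion. */
typedef struct ParallelContext
{
	int		nworkers;			/* workers requested */
	int		nworkers_launched;	/* workers actually started */
} ParallelContext;

/* Stand-in for LaunchParallelWorkers(): may start fewer than requested. */
static void
LaunchParallelWorkers(ParallelContext *pcxt, int can_start)
{
	pcxt->nworkers_launched =
		(can_start < pcxt->nworkers) ? can_start : pcxt->nworkers;
}

/*
 * Set up one tuple queue reader per *launched* worker.  Testing
 * nworkers_launched directly (rather than inferring it from nworkers)
 * skips the whole setup when nothing could be started.
 */
static int
SetUpTupleQueueReaders(ParallelContext *pcxt)
{
	int		nreaders = 0;

	if (pcxt->nworkers_launched > 0)
	{
		int		i;

		for (i = 0; i < pcxt->nworkers_launched; i++)
			nreaders++;			/* stands in for CreateTupleQueueReader() */
	}
	return nreaders;
}
```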


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


optimize_parallelism_code_for_launched_workers_usage_v1.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-13 Thread Vladimir Sitnikov
>the implementation is simply - but it hard to design some really general - it 
>is task for UI

Can you please rephrase?

The current design is "if the cost of a generic plan is less than the one
of a custom plan + replan, prefer generic".
I think that is wrong.

The "generic plan" underestimates the cost in the sense that it assumes
some pre-defined selectivities.
In other words, if "skewed" values are used, the "custom plan" will
likely have a *worse cost* than the generic plan, yet the custom plan
is much more suitable for that particular parameter set.
As the backend refers to boundParams, it does see that a particular
condition is tough, while the generic estimator sees just the cost.

Looking into plancache.c comments I see 3 possible plans:
1) custom plan with PARAM_FLAG_CONST=1. It should probably
constant-fold based on input parameters.

2) custom plan with PARAM_FLAG_CONST=0. I think it should just use
given parameters for selectivity estimations. The generated plan
should still be valid for use with other input values.
3) generic plan. The plan with all variables. <-- here's current behavior

1 has a replan cost.

2&3 can be cached and reused.

Is that correct?
I think #2 is a better option than #3 since it gives better plan
stability, and thus it is much easier to test and reason about.

This all boils down to an adjustment in a single line:
https://github.com/postgres/postgres/blob/ee943004466418595363d567f18c053bae407792/src/backend/utils/cache/plancache.c#L1151-L1152
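For reference, the test at those two lines can be modeled roughly as follows (a simplification of plancache.c's choose_custom_plan(); the real function also forces custom plans for the first five executions and folds a planning-cost charge into the custom side):

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Simplified model of the decision Vladimir points at: once the
 * estimated generic-plan cost drops below the average cost of the
 * custom plans seen so far, stop replanning and reuse the generic
 * plan.  The complaint in this thread is that the two costs are
 * computed from different selectivity information, so the comparison
 * can be misleading.
 */
static bool
choose_custom_plan(double generic_cost, double avg_custom_cost)
{
	if (generic_cost < avg_custom_cost)
		return false;			/* cache and reuse the generic plan */

	return true;				/* keep planning with the bound parameters */
}
```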

Does that make sense?

Vladimir




Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-13 Thread Marko Tiikkaja

On 13/01/16 14:27, Vladimir Sitnikov wrote:

TL;DR: I suggest to create "generic plan" with regard to current bind values.
What's wrong with that approach?


I don't understand how this is supposed to work.  Say you already have a 
plan which looks like this:


 Seq Scan on foo  (cost=0.00..100.00 rows=1 width=630)
   Filter: (bar = $1)

Now the plan gets invoked with $1 = 5.  What exactly in your mind would
happen here?



.m




Re: [HACKERS] pgbench stats per script & other stuff

2016-01-13 Thread Fabien COELHO


Hello Alvaro,


Here is a two part v12, which:

part a (refactoring of scripts and their stats):
 - fix option checks (-i alone)
 - s/repleacable/replaceable/ in doc
 - keep small description in doc and help for -S & -N
 - fix 2 comments for pg style
 - show builtin list if not found


I'm looking at this part of your patch and I think it's far too big to
be a simple refactoring.  Would you split it up please?
I think the StatsData / SimpleStat addition should be one patch;
then there's the -b changes.  Then there may (or may not) be a bunch of 
other minor cleanups, not sure.


I'm willing to commit these patches if I can easily review what they do,
which I cannot with the current state.


Hmmm. ISTM that other people already reviewed it.

I can try to separate (again) some stuff, but there will be no miracle.

The overdue refactoring is because pgbench collects statistics at various 
levels, and each time this is done in a different way. Cleaning this up 
requires touching the code in many places, which means a "big" patch, 
although ISTM a straightforward one; but that is already the case with 
this one.



Please pgindent; make sure to add /*--- here to avoid pgindent mangling
the comment:


Ok.


part b (weight)
 - check that the weight is an int


This part looks okay to me.  Minor nitpick,

+   int i = 0, w = 0, wc = (int) getrand(thread, 0, total_weight - 1);

should be three lines, not one.


Ok.
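For context, that one-liner drives the weighted script draw; a standalone sketch of the selection logic, with the getrand() call replaced by a caller-supplied draw wc in [0, total_weight):

```c
#include <assert.h>

/*
 * Pick a script index with probability proportional to its weight.
 * weights[] holds the per-script @W values; wc is a uniform draw in
 * [0, sum of weights), which pgbench would obtain from getrand().
 */
static int
choose_script(const int *weights, int nscripts, int wc)
{
	int		i = 0;
	int		w = weights[0];

	while (wc >= w && i + 1 < nscripts)
		w += weights[++i];

	return i;
}
```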

Also the @W part in the --help output should be in brackets, as 
FILE[@W], right?


Why not.

--
Fabien.




Re: [HACKERS] WIP: Covering + unique indexes.

2016-01-13 Thread Anastasia Lubennikova



13.01.2016 04:47, David Rowley:
On 13 January 2016 at 06:47, Jeff Janes wrote:



Why is omit_opclass a separate patch?  If the included columns now
never participate in the index ordering, shouldn't it be an inherent
property of the main patch that you can "cover" things without btree
opclasses?


I don't personally think the covering_unique_4.0.patch is that close 
to being too big to review; I think things would make more sense if 
the omit_opclass_4.0.patch was included together with this.




I agree that these patches should be merged. It'll be fixed in the next 
updates.
I kept them separate only for historical reasons; it was more convenient 
for me to debug them. Furthermore, I wanted to show some performance 
degradation caused by "omit_opclass" and give a way to reproduce it by 
performing the test with and without the patch.


--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-13 Thread Vladimir Sitnikov
Note: I do not suggest changing already-cached plans yet.
I suggest looking into the "6th bind values" when building a cached plan.
In other words, "if the first 5 executions do not reveal a dependence on
bind values, then cache the generated plan".
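A sketch of how such a rule could look (the five-execution threshold mirrors plancache.c's existing "try five custom plans first" behavior; the 10% spread test is purely an illustrative stand-in for "no dependence on bind values"):

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Illustrative heuristic: after five executions with custom plans,
 * cache a reusable plan only if the custom-plan costs stayed within
 * 10% of each other, i.e. the plans did not depend much on the
 * particular bind values seen.
 */
static bool
ready_to_cache(const double *custom_costs, int nexecs)
{
	double	lo,
			hi;
	int		i;

	if (nexecs < 5)
		return false;

	lo = hi = custom_costs[0];
	for (i = 1; i < nexecs; i++)
	{
		if (custom_costs[i] < lo)
			lo = custom_costs[i];
		if (custom_costs[i] > hi)
			hi = custom_costs[i];
	}

	return hi <= lo * 1.10;
}
```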

>Say you already have a plan which looks like this:
>Now the plan gets invoked with  $1 = 5.  What exactly in your mind would 
>happen here?

A sequential scan with $1=5 condition. What else could be there?

Vladimir




Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-13 Thread Vladimir Sitnikov
> The custom plan is *more expensive*;

You compare costs of custom vs generic plans.
I suggest: do not compare costs *at all*.

>I don't know, it's your proposal :-)  But it looks like I misunderstood.

It is not.

My suggestion is: build a generic plan (that is, a plan that will
return the proper result for every possible bind value), yet refer to
the values of the 6th bind when estimating cardinalities.
Is it clear now?

Vladimir




Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-13 Thread Pavel Stehule
2016-01-13 11:44 GMT+01:00 Vladimir Sitnikov :

> >the implementation is simply - but it hard to design some really general
> - it is task for UI
>
> Can you please rephrase?
>

Sorry - it is a task for artificial intelligence

>
> Current design is "if the cost of a generic plan is less than the one
> of a custom plan+replan, prefer generic".
> I think that is wrong.
>
> "Generic plan" misunderestimates a cost in a sense that it assumes
> some pre-defined selectivities.
>

The generic plan in Postgres is optimized for the most common values - so
on average it should be optimal. But the reality is different - the wrong
estimation can be everywhere, and the estimation can be lower or higher
than reality.


> In other words, if "skewed" values are used, "custom plan" would
> likely to have *worse cost* than the one of a generic plan, yet custom
> plan is much more suitable for a particular parameter set.
> As backend refers to boundParams, it does see that particular
> condition is tough, while generic estimator just the cost.
>

And there is a second issue - you have no idea what parameter vector
will follow. You cannot check and optimize plans everywhere, because
planning can be expensive, and you should reuse a plan multiple times.
What was true for the first iterations may not be true in the following
iterations.

I like a strategy based on risks. There are probably situations when the
generic plan is great every time - INSERTs, UPDATEs via PK, simple SELECTs
via PK. A generic plan can work well if almost all data has similar
probability. Otherwise, on bigger data, the probability of a pretty slow
plan is higher, and then we should prefer a custom plan.

So the strategy: if the cost of the generic plan is less than some MAGIC
CONSTANT (which can be specified by a GUC), then use the generic plan.
Otherwise use a custom plan every time.

This allows controlling plan reuse. When MAGIC CONSTANT = 0, use a
custom plan every time; when MAGIC CONSTANT = M (some very large value),
always use the generic plan.
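In code, the proposed rule is just a threshold test (the parameter names here are invented for illustration):

```c
#include <assert.h>
#include <float.h>
#include <stdbool.h>

/*
 * Pavel's proposal: trust the generic plan only when its estimated
 * cost is below a configurable limit.  A limit of 0 means "custom
 * plan every time"; DBL_MAX means "generic plan always".
 */
static bool
use_generic_plan(double generic_cost, double magic_constant)
{
	return generic_cost < magic_constant;
}
```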

Regards

Pavel


> Looking into plancache.c comments I see 3 possible plans:
> 1) custom plan with PARAM_FLAG_CONST=1. It should probably
> constant-fold based on input parameters.
>
> 2) custom plan with PARAM_FLAG_CONST=0. I think it should just use
> given parameters for selectivity estimations. The generated plan
> should still be valid for use with other input values.
> 3) generic plan. The plan with all variables. <-- here's current behavior
>
> 1 has a replan cost.
>
> 2&3 can be cached and reused.
>
> Is that correct?
> I think #2 is better option than #3 since it gives better plan
> stability, thus it is much easier to test and reason about.
>
> This all boils down to adjustment in a single line:
>
> https://github.com/postgres/postgres/blob/ee943004466418595363d567f18c053bae407792/src/backend/utils/cache/plancache.c#L1151-L1152
>
> Does that make sense?
>
> Vladimir
>


Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-13 Thread Vladimir Sitnikov
Please, read my suggestion again.

TL;DR: I suggest to create "generic plan" with regard to current bind values.
What's wrong with that approach?

Vladimir




Re: [HACKERS] WIP: Covering + unique indexes.

2016-01-13 Thread Anastasia Lubennikova

13.01.2016 04:27, David Rowley:

I've also done some testing:

create table ab (a int, b int);
insert into ab select a,b from generate_Series(1,10) a(a), 
generate_series(1,1) b(b);

set enable_bitmapscan=off;
set enable_indexscan=off;

select * from ab where a = 1 and b=1;
 a | b
---+---
 1 | 1
(1 row)

set enable_indexscan = on;
select * from ab where a = 1 and b=1;
 a | b
---+---
(0 rows)

This is broken. I've not looked into why yet, but from looking at the 
EXPLAIN output I was a bit surprised to see b=1 as an index condition. 
I'd have expected a Filter maybe, but I've not looked at the EXPLAIN 
code to see how those are determined yet.


Hmm... Do you use both patches?
And could you provide the index definition? I can't reproduce the problem 
assuming that the index is created by the statement

CREATE INDEX idx ON ab (a) INCLUDING (b);

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: [HACKERS] extend pgbench expressions with functions

2016-01-13 Thread Fabien COELHO


Hello Michaël,


ISTM that if pgbench is to be stopped, the simplest option is just to abort
with a nicer error message from the get*Rand function, there is no need to
change the function signature and transfer the error management upwards.


That's fine to me, as long as the solution is elegant.


Hmmm, this is subjective:-)

I've decided to stay with the current behavior (\setrandom), that is, to 
abort the current transaction on errors but not to abort pgbench itself. 
The check is done before calling the functions, and I left an assert in the 
functions just to be sure. It is going to loop on errors, but this is 
what it already does anyway.



Ok. I'm hesitating about removing the operator management, especially if I'm
told to put it back afterwards.


I can understand that, things like that happen all the time here and
that's not a straight-forward patch that we have here. I am sure that
additional opinions here would be good to have before taking one
decision or another. With the current statu-quo, let's just do what
you think is best.


I left the operators alone and just added function management next to them. 
I'll merge operators into functions only if it is a blocker.


I have assumed that your v4 is really v17, and this is v18...

--
Fabien.

diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 541d17b..a8bfc79 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -771,17 +771,21 @@ pgbench  options  dbname
   Sets variable varname to an integer value calculated
   from expression.
   The expression may contain integer constants such as 5432,
-  references to variables :variablename,
+  double constants such as 3.14159,
+  references to integer variables :variablename,
   and expressions composed of unary (-) or binary operators
-  (+, -, *, /, %)
-  with their usual associativity, and parentheses.
+  (+, -, *, /,
+  %) with their usual associativity, function calls and
+  parentheses.
+   shows the available
+  functions.
  
 
  
   Examples:
 
 \set ntellers 10 * :scale
-\set aid (1021 * :aid) % (10 * :scale) + 1
+\set aid (1021 * random(1, 10 * :scale)) % (10 * :scale) + 1
 
 

@@ -801,66 +805,35 @@ pgbench  options  dbname
  
 
  
-  By default, or when uniform is specified, all values in the
-  range are drawn with equal probability.  Specifying gaussian
-  or  exponential options modifies this behavior; each
-  requires a mandatory parameter which determines the precise shape of the
-  distribution.
- 
+  
+   
+
+ \setrandom n 1 10 or \setrandom n 1 10 uniform
+ is equivalent to \set n random(1, 10) and uses a uniform
+ distribution.
+
+   
 
- 
-  For a Gaussian distribution, the interval is mapped onto a standard
-  normal distribution (the classical bell-shaped Gaussian curve) truncated
-  at -parameter on the left and +parameter
-  on the right.
-  Values in the middle of the interval are more likely to be drawn.
-  To be precise, if PHI(x) is the cumulative distribution
-  function of the standard normal distribution, with mean mu
-  defined as (max + min) / 2.0, with
-
- f(x) = PHI(2.0 * parameter * (x - mu) / (max - min + 1)) /
-(2.0 * PHI(parameter) - 1.0)
-
-  then value i between min and
-  max inclusive is drawn with probability:
-  f(i + 0.5) - f(i - 0.5).
-  Intuitively, the larger parameter, the more
-  frequently values close to the middle of the interval are drawn, and the
-  less frequently values close to the min and
-  max bounds. About 67% of values are drawn from the
-  middle 1.0 / parameter, that is a relative
-  0.5 / parameter around the mean, and 95% in the middle
-  2.0 / parameter, that is a relative
-  1.0 / parameter around the mean; for instance, if
-  parameter is 4.0, 67% of values are drawn from the
-  middle quarter (1.0 / 4.0) of the interval (i.e. from
-  3.0 / 8.0 to 5.0 / 8.0) and 95% from
-  the middle half (2.0 / 4.0) of the interval (second and
-  third quartiles). The minimum parameter is 2.0 for
-  performance of the Box-Muller transform.
- 
+  
+   
+\setrandom n 1 10 exponential 3.0 is equivalent to
+\set n random_exponential(1, 10, 3.0) and uses an
+exponential distribution.
+   
+  
 
- 
-  For an exponential distribution, parameter
-  controls the distribution by truncating a quickly-decreasing
-  exponential distribution at parameter, and then
-  projecting onto integers between the bounds.
-  To be precise, with
-
-f(x) = exp(-parameter * (x - min) / (max - min + 1)) / (1.0 - exp(-parameter))
-
-  Then value i between min and
-  max inclusive is drawn with probability:
-  f(x) - f(x + 1).
-  Intuitively, the larger 

Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-13 Thread Marko Tiikkaja

On 13/01/16 14:12, Pavel Stehule wrote:

so the strategy - if cost of generic plan is less than some MAGIC CONSTANT
(can be specified by GUC), then use generic plan. Elsewhere use a custom
plan everytime.

It allow to controll the plan reusing. When MAGIC CONSTANT = 0 .. use
custom plan everytime, When MAGIC CONSTANT = M, then use generic plan
always.


I don't think that would solve even the original problem without 
effectively disabling generic plans, despite the problem being 
relatively simple.  The generic plan appears to be really cheap, because 
the planner doesn't have the concept of a "worst case".



.m




Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-13 Thread Marko Tiikkaja

On 13/01/16 14:36, Vladimir Sitnikov wrote:

Say you already have a plan which looks like this:
Now the plan gets invoked with  $1 = 5.  What exactly in your mind would happen 
here?


A sequential scan with $1=5 condition. What else could be there?


I don't know, it's your proposal :-)  But it looks like I misunderstood.


Note: I do not suggest changing already cached plans yet.
I suggest looking into "6th bind values" when building a cached plan.


But that wouldn't have helped your case.  The custom plan is *more 
expensive*; I'm guessing because the generic plan gambles on a better 
average case instead of preparing for the worst case.



.m




Re: [HACKERS] pgindent-polluted commits

2016-01-13 Thread Mark Dilger

> On Jan 13, 2016, at 9:13 AM, Tom Lane  wrote:
> 
> Simon Riggs  writes:
>> On 13 January 2016 at 14:48, Noah Misch  wrote:
>>> I've noticed commits, from a few of you, carrying pgindent changes to lines
>>> the patch would not otherwise change.
> 
>> Could we review again why this matters?
> 
> Basically this is trading off convenience of the committer (all of the
> alternatives Noah mentions are somewhat annoying) versus the convenience
> of post-commit reviewers.  I'm not sure that his recommendation is the
> best trade-off, nor that the situation is precisely comparable to
> pre-commit review.  There definitely will be pre-commit review, there
> may or may not be any post-commit review.
> 
> I'm willing to go with the "separate commit to reindent individual files"
> approach if there's a consensus that that makes for a cleaner git history.
> But I'm not 100% convinced it matters.

As somebody who maintains a fork of the code base, I can say it is easier
to deal with merge conflicts when work is broken out into smaller commits.
Separating whitespace and formatting changes into their own commits
would make my life a little easier.

OTOH, I don't know if the core developer community cares about the ease
of maintaining code forks.


mark





Re: [HACKERS] Weird behavior during CREATE EXTENSION

2016-01-13 Thread Jim Nasby

On 1/12/16 5:00 PM, Tom Lane wrote:

There's certainly room to improve error reporting for extension scripts,
but I think you are heading into a dead end in the name of saving a little
time coding.  I'd suggest looking into an errcontext callback, instead.

Also, there's some technology in CREATE FUNCTION that deals with the fact
that we may be calling the parser on a string different from the original
user command, which might be worth borrowing here --- at least part of
the confusion is that it's evidently reporting a cursor position relative
to the extension script as though it applied to the CREATE EXTENSION.


Are you talking about plpgsql_compile_error_callback()? It looks like it 
does its magic by relying on the plpgsql parser to keep track of where 
it is.


ISTM part of the goal here should be to show what the actual command was 
that failed (i.e. the command in the extension file). I'm guessing the 
way to do that would be to have pg_parse_query() keep the original 
statement in the parse nodes?


I guess if this was easy it would already have been fixed...
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [HACKERS] WIP: Covering + unique indexes.

2016-01-13 Thread David Rowley
On 14 January 2016 at 02:58, Anastasia Lubennikova <a.lubennik...@postgrespro.ru> wrote:

> 13.01.2016 04:27, David Rowley:
>
>> I've also done some testing:
>>
>> create table ab (a int, b int);
>> insert into ab select a,b from generate_Series(1,10) a(a),
>> generate_series(1,1) b(b);
>> set enable_bitmapscan=off;
>> set enable_indexscan=off;
>>
>> select * from ab where a = 1 and b=1;
>>  a | b
>> ---+---
>>  1 | 1
>> (1 row)
>>
>> set enable_indexscan = on;
>> select * from ab where a = 1 and b=1;
>>  a | b
>> ---+---
>> (0 rows)
>>
>> This is broken. I've not looked into why yet, but from looking at the
>> EXPLAIN output I was a bit surprised to see b=1 as an index condition. I'd
>> have expected a Filter maybe, but I've not looked at the EXPLAIN code to
>> see how those are determined yet.
>>
>
> Hmm... Do you use both patches?
> And could you provide index definition, I can't reproduce the problem
> assuming that index is created by the statement
> CREATE INDEX idx ON ab (a) INCLUDING (b);


Sorry, I forgot the index, and yes you guessed correctly about that.

The problem only exists without the omit_opclass_4.0.patch and with the
covering_unique_4.0.patch, so please ignore.

I will try to review the omit_opclass_4.0.patch soon.

David

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Fuzzy substring searching with the pg_trgm extension

2016-01-13 Thread Jeff Janes
On Sat, Dec 26, 2015 at 9:12 PM, Jeff Janes  wrote:
> On Fri, Dec 18, 2015 at 11:43 AM, Artur Zakirov wrote:
>> Hello.
>>
>> PostgreSQL has a contrib module named pg_trgm. It is used to the fuzzy text
>> search. It provides some functions and operators for determining the
>> similarity of the given texts using trigram matching.
>>
>> At the moment, in pg_trgm both the similarity function and the % operator
>> match two strings expecting that they are similar entirely. But they give
>> bad results if we want to find documents by a query which is substring of a
>> document.
>
> This is very interesting.  I suspect the index will not be very useful
> in cases where the full string is much larger than the substring,
> because the limit will not be met often enough to rule out many rows
> just based on the index data.  I have a pretty good test case to see.

My test case is turning out to be harder to evaluate than I thought,
because it was more complicated than I remembered it being.  Hopefully
I can provide some more feedback on it next week.

In the meantime, I had a question about bumping the version to 1.3.

Version 1.2 of pg_trgm has never been included in a community release
(because it didn't make the 9.5 cutoff).  So should we really bump the
version to 1.3, or just merge the changes here directly into the
existing 1.2 HEAD code?

I think it would be pretty odd for 9.5 to come with pg_trgm 1.1 and
for 9.6 to come with pg_trgm 1.3.

Thanks,

Jeff




Re: [HACKERS] proposal: PL/Pythonu - function ereport

2016-01-13 Thread Jim Nasby

On 1/12/16 11:25 AM, Catalin Iacob wrote:

>The differentiation between Error and SPIError is wrong, because there isn't
>any difference in reality.

They're similar but not really the same thing. raise Error and
plpy.error are both ways to call ereport(ERROR, ...) while SPIError is
raised when coming back after calling into Postgres to execute SQL
that itself raises an error. Now indeed, that executed SQL raised an
error itself via another ereport(ERROR, ...) somewhere but that's a
different thing.


Why should they be different? An error is an error. You either want to 
trap a specific type of error or you don't. Having two completely 
different ways to do the same thing is just confusing.


IMHO the Error and Fatal classes should never have been committed, 
especially since they're undocumented. It's not the responsibility of 
this patch to remove them, but it certainly shouldn't dig the hole deeper.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com




[HACKERS] Removing service-related code in pg_ctl for Cygwin

2016-01-13 Thread Michael Paquier
Hi all,

Beginning a new thread seems more appropriate regarding $subject, but
that's mentioned here as well:
http://www.postgresql.org/message-id/CAB7nPqQXghm_SdB5iniupz1atzMxk=95gv9a8ocdo83sxcn...@mail.gmail.com

Based on some investigation from Andrew, it turns out that Cygwin does
not need to use the service-related code, aka the register/unregister
options and similar that are proper to WIN32, and relies instead on
cygrunsrv with a SysV-like init file to manage the service. Based on
my tests with Cygwin, I am able to see as well that a server started
within a Cygwin session is able to persist even after the session is
closed, which is kind of nice, and I think that it is an additional
reason not to use the service-related code paths. Hence, what about
the following patch, which makes Cygwin behave like any *nix OS when
using pg_ctl? This simplifies the code a bit.

Marco, as I think you do some packaging for Postgres in Cygwin, and
others, does that sound acceptable?

Regards,
-- 
Michael


cygwin-removal-service-master.patch
Description: binary/octet-stream



[HACKERS] Truncation of identifiers

2016-01-13 Thread Thomas Munro
Hi hackers

Wouldn't it be better to raise an error when identifiers are too long,
rather than accepting but truncating them?  I'm not aware of any other
database that does this.  If you're using oversized identifiers you
could finish up using more than one way to refer to the same database
object, and then your queries will have a different meaning if
NAMEDATALEN ever changes.  If you're automatically generating
identifier names (say for partitioning), wouldn't you rather find out
about the size being too long immediately, rather than later when you
try to generate a second long name that collides with the first after
truncation?  I suppose there could be a GUC
truncate_oversized_identifiers defaulting to off, which could be
turned on by those who really prefer the current
truncate-but-raise-NOTICE behaviour.
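To make the suggestion concrete, the GUC would select between the two behaviors at the point where the lexer currently truncates (a standalone sketch; NAMEDATALEN is 64 in a stock build, the GUC name is invented, and the real code must also take care to truncate at a multibyte character boundary):

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NAMEDATALEN 64

/*
 * Sketch: with truncate_oversized_identifiers on, behave as today and
 * cut the identifier to NAMEDATALEN - 1 bytes (the backend also emits
 * a NOTICE); with it off, reject the identifier.  Returning false
 * stands in for ereport(ERROR, ...).
 */
static bool
check_identifier(char *ident, bool truncate_oversized_identifiers)
{
	if (strlen(ident) >= NAMEDATALEN)
	{
		if (!truncate_oversized_identifiers)
			return false;
		ident[NAMEDATALEN - 1] = '\0';
	}
	return true;
}
```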

-- 
Thomas Munro
http://www.enterprisedb.com




Re: [HACKERS] extend pgbench expressions with functions

2016-01-13 Thread Michael Paquier
On Wed, Jan 13, 2016 at 10:28 PM, Fabien COELHO  wrote:
>
> Hello Michaël,
>
>>> ISTM that if pgbench is to be stopped, the simplest option is just to
>>> abort
>>> with a nicer error message from the get*Rand function, there is no need
>>> to
>>> change the function signature and transfer the error management upwards.
>>
>>
>> That's fine to me, as long as the solution is elegant.
>
>
> Hmmm, this is subjective:-)

And dependent on personal opinions and views.

> I've decided to stay with the current behavior (\setrandom), that is to
> abort the current transaction on errors but not to abort pgbench itself. The
> check is done before calling the functions, and I let an assert in the
> functions just to be sure. It is going to loop on errors anyway, but this is
> what it already does anyway.

OK, yes, I see now that I missed that during my last review. This has the
disadvantage of doubling the amount of code dedicated to parameter
checks, though :(
But based on the feedback, perhaps other folks would feel that it would
actually be worth simply dropping the existing \setrandom command. I
won't object to that personally; such pgbench features are mainly for
hackers and devs, so I guess that we could survive without a
deprecation period here.

>>> Ok. I'm hesitating about removing the operator management, especially if
>>> I'm
>>> told to put it back afterwards.
>>
>>
>> I can understand that, things like that happen all the time here and
>> that's not a straight-forward patch that we have here. I am sure that
>> additional opinions here would be good to have before taking one
>> decision or another. With the current statu-quo, let's just do what
>> you think is best.
>
>
> I let the operators alone and just adds functions management next to it.
> I'll merge operators as functions only if it is a blocker.

I think that's a blocker, but I am not the only one here and not a committer.

-        fprintf(stderr, "gaussian parameter must be at least %f (not \"%s\")\n",
-                MIN_GAUSSIAN_PARAM, argv[5]);
+        fprintf(stderr,
+                "random gaussian parameter must be greater than %f "
+                "(got %f)\n", MIN_GAUSSIAN_PARAM, parameter);
This looks like useless noise to me. Why change those messages?

- if (parameter <= 0.0)
+if (parameter < 0.0)
  {

This change is incorrect, and triggers an assertion failure when the
exponential parameter is 0.
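
To make the intended checks concrete, here is a minimal standalone sketch (the helper names are hypothetical, not the patch's code, and the gaussian bound value is assumed from pgbench's documented minimum); the point is that the exponential parameter must be strictly positive, so the test has to be "<= 0.0", not "< 0.0":

```c
#include <stdbool.h>

#define MIN_GAUSSIAN_PARAM 2.0  /* assumed lower bound, as documented by pgbench */

/* gaussian threshold must be at least MIN_GAUSSIAN_PARAM */
bool
gaussian_param_ok(double parameter)
{
    return parameter >= MIN_GAUSSIAN_PARAM;
}

/* exponential parameter must be strictly positive: 0.0 is rejected too */
bool
exponential_param_ok(double parameter)
{
    return parameter > 0.0;
}
```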

         fprintf(stderr,
-                "exponential parameter must be greater than zero (not \"%s\")\n",
-                argv[5]);
+                "random exponential parameter must be greater than 0.0 "
+                "(got %f)\n", parameter);
         st->ecnt++;
-        return true;
+        return false;
This diff is noise as well, and should be removed.

+       /*
+        * Note: this section could be removed, as the same functionnality
+        * is available through \set xxx random_gaussian(...)
+        */
I think that you are right to do that. It's no fun to break existing
scripts, even if people doing that with pgbench are surely experienced
hackers.

}
-
case ENODE_VARIABLE:
Such diffs are noise as well.

int() should be strengthened regarding bounds. For example:
\set cid debug(int(int(9223372036854775807) +
double(9223372036854775807)))
 debug(script=0,command=1): int -9223372036854775808
-- 
Michael




[HACKERS] About get_relation_constraints and include_notnull

2016-01-13 Thread Amit Langote
Why does the include_notnull argument exist if
get_relation_constraints() is being called from only one place? Perhaps we
could remove it and add the IS NOT NULL test expression unconditionally if
there are any NOT NULL columns.

Thanks,
Amit






Re: [HACKERS] FDW join pushdown and scanclauses

2016-01-13 Thread Ashutosh Bapat
On Thu, Jan 14, 2016 at 9:31 AM, Etsuro Fujita 
wrote:

> On 2016/01/08 22:05, Ashutosh Bapat wrote:
>
>> In add_paths_to_joinrel(), the FDW specific hook GetForeignJoinPaths()
>> is called. This hook if implemented should add ForeignPaths for pushed
>> down joins. create_plan_recurse() calls create_scan_plan() on seeing
>> these paths.
>>
>
> create_scan_plan() generates a list of clauses to be applied on scan
>> from rel->baserestrictinfo and parameterization clauses. This list is
>> passed to create_*scan_plan routine as scanclauses. This code is very
>> specific for single relations scans. Now that we are using
>> create_scan_plan() for creating plan for join relations, it needs some
>> changes so that quals relevant to a join can be passed to
>> create_foreignscan_plan().
>>
>
> Do we really need that?  The relevant join quals are passed to
> GetForeignJoinPaths as extra->restrictlist, so I think we can get those
> quals during GetForeignPlan, by looking at the selected ForeignPath that is
> passed to that function as a parameter.
>

Right! You are suggesting that an FDW should just ignore scan_clauses for
a join relation and manage those by itself. That looks fine.


>
> A related problem is in
>> create_foreignscan_plan(), which sets ForeignScan::fsSystemCol if a
>> system column is being used in the targetlist or quals. Right now it
>> only checks rel->baserestrictinfo, which is NULL for a joinrel. Thus in
>> case a system column appears in the joinclauses it will not be considered.
>>
>
> IIUC, we assume that such system columns are assumed to be contained in
> fdw_scan_tlist in the joinrel case.
>

From another mail thread [1] that you started, I understood that
fsSystemCol should not be set for a join relation, so the bug doesn't exist.

[1] http://www.postgresql.org/message-id/559f9323.4080...@lab.ntt.co.jp

Thanks for your inputs.
-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


[HACKERS] PgConf.US Hackers Track CFP

2016-01-13 Thread Joshua D. Drake
The organization committee for PGConf.US invites all Pg Hackers to the 
PGConf.US 2016 hackers track. We want to showcase your talents to the 
general PostgreSQL community and potential new hackers, plus provide a 
forum for feedback and deep discussion by having you present!


We have also brought back several events and amenities over the course 
of the 3-day event to stimulate ideas and collaboration amongst the 
PostgreSQL developers. These include the PgConf.US Hackers Lounge and 
the Regulated Industry Summit.


Bring your ideas, prowess, laptops and submit your talk at:

http://www.pgconf.us/2016/submit/

The CFP closes on January 31st, 2016, 11:59pm EST

Sincerely,

JD
--
Command Prompt, Inc.  http://the.postgres.company/
 +1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.




Re: [HACKERS] Removing Functionally Dependent GROUP BY Columns

2016-01-13 Thread David Rowley
Many thanks for the thorough review!

On 12 January 2016 at 23:37, Julien Rouhaud 
wrote:

>
> In identify_key_vars()
>
> +   /* Only PK constraints are of interest for now, see comment above
> */
> +   if (con->contype != CONSTRAINT_PRIMARY)
> +   continue;
>
> I think the comment should better refer to
> remove_useless_groupby_columns() (don't optimize further than what
> check_functional_grouping() does).
>
>
I've improved this by explaining it more clearly.


>
> +   /*
> +* Exit if the constraint is deferrable, there's no point in
> looking
> +* for another PK constriant, as there can only be one.
> +*/
>
> small typo on "constriant"
>
>
Fixed.


>
>
> +   {
> +   varlistmatches = bms_add_member(varlistmatches,
> varlistidx);
> +   found_col = true;
> +   /* don't break, there might be dupicates */
> +   }
>
> small typo on "dupicates". Also, I thought transformGroupClauseExpr()
> called in the parse analysis make sure that there isn't any duplicate in
> the GROUP BY clause. Do I miss something?
>
>
No I missed something :) You are right. I've put a break; here and a
comment to explain why.


>
> in remove_useless_groupby_columns()
>
> +   /*
> +* Skip if there were no Vars found in the GROUP BY clause which
> belong
> +* to this relation. We can also skip if there's only 1 Var, as
> there
> +* needs to be more than one Var for there to be any columns
> which are
> +* functionally dependent on another column.
> +*/
>
>
> This comment doesn't sound very clear. I'm not a native english speaker,
> so maybe it's just me.
>
>
Yes it was badly worded. I've fixed in the attached.


> +   /*
> +* If we found any surplus Vars in the GROUP BY clause, then we'll
> build
> +* a new GROUP BY clause without these surplus Vars.
> +*/
> +   if (anysurplus)
> +   {
> +   List *new_groupby = NIL;
> +
> +   foreach (lc, root->parse->groupClause)
> +   {
> +   SortGroupClause *sgc = (SortGroupClause *) lfirst(lc);
> +   TargetEntry *tle;
> +   Var *var;
> +
> +   tle = get_sortgroupclause_tle(sgc, root->parse->targetList);
> +   var = (Var *) tle->expr;
> +
> +   if (!IsA(var, Var))
> +   continue;
> + [...]
>
> if var isn't a Var, it needs to be added in new_groupby.
>
>
Yes you're right, well, at least I've written enough code to ensure that
it's not needed.
Technically we could look inside non-Vars and check if the Expr is just
made up of Vars from a single relation, and then we could also make that
surplus if we find other Vars which make up the table's primary key.  I
didn't make these changes as I thought it was a less likely scenario. It
wouldn't be too much extra code to add, however. I've gone and added an XXX
comment to explain that there might be optimisation opportunities here in
the future.

I've attached an updated patch.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


prune_group_by_clause_90b5f3c_2016-01-14.patch
Description: Binary data



Re: [HACKERS] Weird behavior during CREATE EXTENSION

2016-01-13 Thread Tom Lane
Jim Nasby  writes:
> On 1/12/16 5:00 PM, Tom Lane wrote:
>> Also, there's some technology in CREATE FUNCTION that deals with the fact
>> that we may be calling the parser on a string different from the original
>> user command, which might be worth borrowing here --- at least part of
>> the confusion is that it's evidently reporting a cursor position relative
>> to the extension script as though it applied to the CREATE EXTENSION.

> Are you talking about plpgsql_compile_error_callback()? It looks like it 
> does it's magic by relying on the plpgsql parser to keep track of where 
> it's at.

Not really.  It's about transposing the error to an "internal query"
instead of reporting it against the user's text.

I was imagining something roughly like the attached.  Experimenting with
this, the good news is that you get a decent error position:

regression=# create extension pg_trgm ;
ERROR:  syntax error at or near "USINGgist"
LINE 129: ALTER OPERATOR FAMILY gist_trgm_ops USINGgist ADD
  ^
QUERY:  /* contrib/pg_trgm/pg_trgm--1.2.sql */

-- complain if script is sourced in psql, rather than via CREATE EXTENSION
...
ALTER OPERATOR FAMILY gin_trgm_ops USING gin ADD
FUNCTION 6 (text,text) gin_trgm_triconsistent (internal, int2, text, 
int4, internal, internal, internal);

CONTEXT:  extension script file 
"/home/postgres/testversion/share/extension/pg_trgm--1.2.sql"
regression=# 

The bad news is that psql regurgitates the whole script in the QUERY
field, because that's what we said was the internal query.

There are various ways you could imagine printing just part of the script;
for instance, starting from the cursor position, scan forward and back for
semicolons at line ends, and trim the query string to report just that
much.  Or maybe it's worth teaching the grammar to report the first token
location of each parsetree, and then the loop in execute_sql_string could
stash that away for use by the error reporter.
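
The trimming idea could be sketched roughly like this (a toy standalone function, not part of the patch; a real version would also have to skip semicolons inside string literals and comments):

```c
#include <stdlib.h>
#include <string.h>

/*
 * Toy sketch of trimming a script to the statement containing an error
 * cursor: scan backward and forward from "errpos" for a ';' that ends a
 * line, and keep only the text in between.
 */
char *
trim_to_statement(const char *script, size_t errpos)
{
    size_t len = strlen(script);
    size_t start = 0;
    size_t end = len;
    size_t i;

    if (errpos > len)
        errpos = len;

    /* scan backward for a line-ending ';' -- the statement starts after it */
    for (i = errpos; i > 0; i--)
    {
        if (script[i - 1] == ';' && i < len && script[i] == '\n')
        {
            start = i + 1;
            break;
        }
    }

    /* scan forward for the next line-ending ';' -- the statement ends there */
    for (i = errpos; i < len; i++)
    {
        if (script[i] == ';' && (i + 1 == len || script[i + 1] == '\n'))
        {
            end = i + 1;
            break;
        }
    }

    char *result = malloc(end - start + 1);
    memcpy(result, script + start, end - start);
    result[end - start] = '\0';
    return result;
}
```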

I don't really have time to fool with this, but if someone else wants
to run with it ...

regards, tom lane

diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index 9d84b79..b4b9286 100644
*** a/src/backend/commands/extension.c
--- b/src/backend/commands/extension.c
*** typedef struct ExtensionVersionInfo
*** 96,102 
--- 96,110 
  	struct ExtensionVersionInfo *previous;		/* current best predecessor */
  } ExtensionVersionInfo;
  
+ /* State needed by script_error_callback */
+ typedef struct
+ {
+ 	const char *script;
+ 	const char *filename;
+ }	script_error_callback_arg;
+ 
  /* Local functions */
+ static void script_error_callback(void *arg);
  static List *find_update_path(List *evi_list,
   ExtensionVersionInfo *evi_start,
   ExtensionVersionInfo *evi_target,
*** read_extension_script_file(const Extensi
*** 682,692 
--- 690,713 
  static void
  execute_sql_string(const char *sql, const char *filename)
  {
+ 	script_error_callback_arg callback_arg;
+ 	ErrorContextCallback sqlerrcontext;
  	List	   *raw_parsetree_list;
  	DestReceiver *dest;
  	ListCell   *lc1;
  
  	/*
+ 	 * Setup error traceback support for ereport().
+ 	 */
+ 	callback_arg.script = sql;
+ 	callback_arg.filename = filename;
+ 
+ 	sqlerrcontext.callback = script_error_callback;
+ 	sqlerrcontext.arg = (void *) &callback_arg;
+ 	sqlerrcontext.previous = error_context_stack;
+ 	error_context_stack = &sqlerrcontext;
+ 
+ 	/*
  	 * Parse the SQL string into a list of raw parse trees.
  	 */
  	raw_parsetree_list = pg_parse_query(sql);
*** execute_sql_string(const char *sql, cons
*** 755,765 
--- 776,809 
  		}
  	}
  
+ 	error_context_stack = sqlerrcontext.previous;
+ 
  	/* Be sure to advance the command counter after the last script command */
  	CommandCounterIncrement();
  }
  
  /*
+  * error context callback for errors within an extension script
+  */
+ static void
+ script_error_callback(void *arg)
+ {
+ 	script_error_callback_arg *callback_arg = (script_error_callback_arg *) arg;
+ 	int			syntaxerrposition;
+ 
+ 	/* If it's a syntax error, convert to internal syntax error report */
+ 	syntaxerrposition = geterrposition();
+ 	if (syntaxerrposition > 0)
+ 	{
+ 		errposition(0);
+ 		internalerrposition(syntaxerrposition);
+ 		internalerrquery(callback_arg->script);
+ 	}
+ 
+ 	errcontext("extension script file \"%s\"", callback_arg->filename);
+ }
+ 
+ /*
   * Execute the appropriate script file for installing or updating the extension
   *
   * If from_version isn't NULL, it's an update



Re: [HACKERS] Truncation of identifiers

2016-01-13 Thread Tom Lane
Thomas Munro  writes:
> Wouldn't it be better to raise an error when identifiers are too long,
> rather than accepting but truncating them?

I wouldn't think so.

> I'm not aware of any other database that does this.

It's standard practice in most programming languages AFAIK.  And SQL is
surely a programming language.

> If you're using oversized identifiers you
> could finish up using more than one way to refer to the same database
> object, and then your queries will have a different meaning if
> NAMEDATALEN ever changes.

No, they'd just start failing if they didn't match the object (which
there can be only one of, else you'd have gotten other errors).

Another argument against comes from the fact that NAMEDATALEN is usually
less than what SQL says is the minimum required length (viz, 128
characters).  Your proposal would have us throwing entirely needless
errors on queries that are fully spec-conformant.
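
For reference, the behavior being defended is roughly this (a simplified standalone sketch; the real truncate_identifier() in src/backend/parser/scansup.c also honors multibyte character boundaries and can emit a NOTICE):

```c
#include <string.h>

#define NAMEDATALEN 64  /* PostgreSQL's compile-time default */

/*
 * Simplified sketch: an oversized identifier is silently cut down to
 * NAMEDATALEN - 1 bytes rather than rejected with an error.
 */
void
truncate_identifier_sketch(char *ident)
{
    if (strlen(ident) >= NAMEDATALEN)
        ident[NAMEDATALEN - 1] = '\0';
}
```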

regards, tom lane




Re: [HACKERS] Truncation of identifiers

2016-01-13 Thread Gavin Flower

On 14/01/16 13:05, Tom Lane wrote:

Thomas Munro  writes:

Wouldn't it be better to raise an error when identifiers are too long,
rather than accepting but truncating them?

I wouldn't think so.


I'm not aware of any other database that does this.

It's standard practice in most programming languages AFAIK.  And SQL is
surely a programming language.


If you're using oversized identifiers you
could finish up using more than one way to refer to the same database
object, and then your queries will have a different meaning if
NAMEDATALEN ever changes.

No, they'd just start failing if they didn't match the object (which
there can be only one of, else you'd have gotten other errors).

Another argument against comes from the fact that NAMEDATALEN is usually
less than what SQL says is the minimum required length (viz, 128
characters).  Your proposal would have us throwing entirely needless
errors on queries that are fully spec-conformant.

regards, tom lane



I would prefer a database to be more strict.

How about a GUC to control this behaviour, with the default being lax?


Cheers,
Gavin




Re: [HACKERS] Python 3.x versus PG 9.1 branch

2016-01-13 Thread Michael Paquier
On Thu, Jan 14, 2016 at 12:37 PM, Noah Misch  wrote:
> On Wed, Jan 13, 2016 at 11:46:07AM -0500, Tom Lane wrote:
>> [...] we've repeatedly not bothered
>> to back-port regression test fixes for newer Pythons into that branch.
>> I could just omit Python 3 coverage for that branch in the critter's
>> configuration, but I wonder exactly why things are that way.
>>
>> For clarity, to cover 9.1 I think we'd need to back-patch some subset
>> of these commits:
>>
>> f16d52269 ff2faeec5 d0765d50f 6bff0e7d9 527ea6684 8182ffde5
>> 45d1f1e02 2cfb1c6f7
>>
>> The precedent of not fixing 9.1 started with the last of these.
>
>> Or we could just blow it off on the grounds that 9.1 is not long
>> for this world anyhow.
>>
>> Opinions anyone?
>
> I respect the 2012-era decision to have 9.1 not support newer Python, and I
> think the lack of user complaints validates it.  I wouldn't object to
> overturning the decision, either.  The biggest risk, albeit still a small
> risk, is that newer Python is incompatible with 9.1 in a way that the test
> suite does not catch.

Given the lack of user complaints regarding 9.1 with newer Python, and the
fact that 9.1 will be EOL in 8~9 months, it does not sound worth it to me.

A couple of days ago I bumped into this article, leading to the
thought that Python 4.0 may induce as much breakage as 3.5 did :(
http://astrofrog.github.io/blog/2016/01/12/stop-writing-python-4-incompatible-code/
Just something to keep in mind.
-- 
Michael




Re: [HACKERS] Python 3.x versus PG 9.1 branch

2016-01-13 Thread Noah Misch
On Wed, Jan 13, 2016 at 11:46:07AM -0500, Tom Lane wrote:
> [...] we've repeatedly not bothered
> to back-port regression test fixes for newer Pythons into that branch.
> I could just omit Python 3 coverage for that branch in the critter's
> configuration, but I wonder exactly why things are that way.
> 
> For clarity, to cover 9.1 I think we'd need to back-patch some subset
> of these commits:
> 
> f16d52269 ff2faeec5 d0765d50f 6bff0e7d9 527ea6684 8182ffde5
> 45d1f1e02 2cfb1c6f7
> 
> The precedent of not fixing 9.1 started with the last of these.

> Or we could just blow it off on the grounds that 9.1 is not long
> for this world anyhow.
> 
> Opinions anyone?

I respect the 2012-era decision to have 9.1 not support newer Python, and I
think the lack of user complaints validates it.  I wouldn't object to
overturning the decision, either.  The biggest risk, albeit still a small
risk, is that newer Python is incompatible with 9.1 in a way that the test
suite does not catch.




Re: [HACKERS] FDW join pushdown and scanclauses

2016-01-13 Thread Etsuro Fujita

On 2016/01/08 22:05, Ashutosh Bapat wrote:

In add_paths_to_joinrel(), the FDW specific hook GetForeignJoinPaths()
is called. This hook if implemented should add ForeignPaths for pushed
down joins. create_plan_recurse() calls create_scan_plan() on seeing
these paths.



create_scan_plan() generates a list of clauses to be applied on scan
from rel->baserestrictinfo and parameterization clauses. This list is
passed to create_*scan_plan routine as scanclauses. This code is very
specific for single relations scans. Now that we are using
create_scan_plan() for creating plan for join relations, it needs some
changes so that quals relevant to a join can be passed to
create_foreignscan_plan().


Do we really need that?  The relevant join quals are passed to 
GetForeignJoinPaths as extra->restrictlist, so I think we can get those 
quals during GetForeignPlan, by looking at the selected ForeignPath that 
is passed to that function as a parameter.



A related problem is in
create_foreignscan_plan(), which sets ForeignScan::fsSystemCol if a
system column is being used in the targetlist or quals. Right now it
only checks rel->baserestrictinfo, which is NULL for a joinrel. Thus in
case a system column appears in the joinclauses it will not be considered.


IIUC, such system columns are assumed to be contained in 
fdw_scan_tlist in the joinrel case.


Best regards,
Etsuro Fujita






Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-13 Thread Pavel Stehule
2016-01-13 16:22 GMT+01:00 Vladimir Sitnikov :

> >A value of -1 could disable generic plans
>
> I do not like the idea.
>
> I've seen dramatic performance improvements from using cached plans.
> The numbers are like "20ms to plan vs 1ms to execute" for an often
> used OLTP query. Query text is involved (~5-10KiB).
>

but currently we don't have any tool to check the quality of a plan for a
new set of parameters. If a plan is OK for one set of parameter values, it
can be pretty bad for the following ones.

Albe's proposal can be good enough for 2/3 of the cases, and it doesn't
block any other enhancement. There is still the 1/3 of queries that are
too complex (slow planning) or too dynamic (the generic plan doesn't work).

Regards

Pavel


>
> Vladimir
>


Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-13 Thread Tom Lane
Vladimir Sitnikov  writes:
> Note: I state that mixing "kinds" of bind values is a bad application
> design anyway. In other words, application developer should understand
> if a query is DWH-like (requires replans) or OLTP-like (does not
> require replans). Agreed?

No, not agreed.  As was already pointed out upthread, such information
is not available in many use-cases for the plancache.

The real problem here IMO is inaccurate plan cost estimates, and that's
not something that there is any easy fix for.

However ... one specific aspect of that is that to some extent, the cost
estimate made for the generic plan is incommensurate with the estimates
for the custom plans because the latter are made with more information.
I don't remember the details of your specific case anymore, but we've
seen cases where the generic plan is falsely estimated to be cheaper
than custom plans because of this.

I wonder whether it would be useful to reject a generic plan anytime its
estimate is less than the average (or minimum?) estimate for the custom
plans.  If it is less, then either (1) the generic plan is falsely
optimistic, or (2) the specific parameter values provided for the custom
plans were all ones for which the planner could see that the generic plan
was non-optimal.  If (2) holds for the first few custom plans then it's
not unreasonable to suppose that it will keep on holding, and we had
better not use the generic plan.

Basically, the case we're *expecting* to see is that a custom plan is the
same or better cost as the generic plan --- same cost if it's really the
same plan, better cost if knowing the parameter values allows some
optimization to be performed (LIKE-pattern-to-index conversion, partition
scan suppression via constraint exclusion, etc).  If we get a higher cost
estimate for the custom plan then something is fishy and we shouldn't
believe it.

Maybe I'm missing some case where that situation would arise naturally.
Or maybe such a rule wouldn't actually help in very many real-world
cases.  But it seems worth looking into.

regards, tom lane




Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-13 Thread Pavel Stehule
2016-01-13 16:18 GMT+01:00 Albe Laurenz :

> Pavel Stehule wrote:
> > I like a strategy based on risks. Probably there are situation, when the
> generic plan is great every
> > time - INSERTs, UPDATEs via PK, simple SELECTs via PK. generic plan can
> be well if almost all data has
> > similar probability. Elsewhere on bigger data, the probability of pretty
> slow plan is higher, and then
> > we should to prefer custom plan.
> >
> > so the strategy - if cost of generic plan is less than some MAGIC
> CONSTANT (can be specified by GUC),
> > then use generic plan. Elsewhere use a custom plan everytime.
> >
> > It allow to controll the plan reusing. When MAGIC CONSTANT = 0 .. use
> custom plan everytime, When
> > MAGIC CONSTANT = M, then use generic plan always.
>
> I have a different idea:
>
> What about a GUC "custom_plan_threshold" that controls after how many
> executions a generic plan will be considered, with a default value of 5.
>
> A value of -1 could disable generic plans.
>

yes, I thought about it too - it is simple, and almost deterministic

Pavel


>
> Yours,
> Laurenz Albe
>


Re: [HACKERS] pg_dump and premature optimizations for objects not to be dumped

2016-01-13 Thread Tom Lane
I wrote:
> Attached is a draft patch that does things this way.

BTW, the bulk of the changes in this patch are just there to pass down
the DumpOptions struct to where it's needed, following some pre-existing
work that passed that around as a separate pointer.  I wonder why things
were done that way, rather than adding a DumpOptions pointer in the
Archive struct?  The vast majority of the functions I had to touch already
had an Archive* parameter, and would not have needed to be touched if it
were done the other way.  I'm tempted to push a preliminary patch that
adds such a field and gets rid of DumpOptions as a separate parameter
to anything that also takes an Archive* pointer.

regards, tom lane




Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-13 Thread Marko Tiikkaja

On 13/01/16 15:26, Vladimir Sitnikov wrote:

2) It is likely to be more performant. We just need to explain users
that "if different plans required, just use different statements".


This completely screws over PL/PgSQL, among other things.


.m




Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-13 Thread Andres Freund
On 2016-01-13 17:26:43 +0300, Vladimir Sitnikov wrote:
> > consider e.g a table with one somewhat common and otherwise just unique 
> > values.> 
> So what?
> If I understand you properly, you mean: "if client sends unique binds
> first 5-6 executions and bad non-unique afterwards, then cached plan
> would be bad". Is that what you are saying?

That's one of several problems, yes. Generally, using a very small sample
("bind values in the first query") to plan every future query isn't
going to be fun.

> I agree that is the corner-case for my suggestion.
> Is is really happening often?

Yes.

> I state the following:
> 1) It is way easier to debug & analyze.

Meh. That a prepared statement suddenly performs way differently
depending on what the first bind values are is not, in any way, easier
to debug.

> For instance: current documentation does *not* list a way to get a
> *generic plan*.

Which doesn't have anything to do with your proposal. That'd not change
with the change you propose.

> Is that obvious that "you just need to EXPLAIN ANALYZE EXECUTE *6
> times in a row*" just to get a generic plan?

No. And I hate that. I think it'd be very good to expand EXPLAIN's
output to include information about custom/generic plans.


> 3) What about "client sends top most common value 5 times in a row"?
> Why assume "it will stop doing that"?
> I think the better assumption is "it will continue doing that".

If 20% of your values are nonunique and the rest are unique you'll get
*drastically* different plans, each performing badly for the other case,
with the unique cardinality plan being extremely bad.

Andres




Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-13 Thread Vladimir Sitnikov
>so you don't get to (or want to) have any control over the underlying prepared 
>statement.

That is pl/pgsql's problem, isn't it?
In the meantime, the user can use different query texts (e.g. by adding
"offset 0", "offset 0*1", "offset 0*2", and similar tricks typically used
to tune queries) to convince plpgsql to use different statement ids.

Vladimir




Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-13 Thread Vladimir Sitnikov
>A value of -1 could disable generic plans

I do not like the idea.

I've seen dramatic performance improvements from using cached plans.
The numbers are like "20ms to plan vs 1ms to execute" for an often
used OLTP query. Query text is involved (~5-10KiB).

Vladimir




Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-13 Thread Vladimir Sitnikov
>(1) the generic plan is falsely optimistic

That is my case.
The application is sending the most common value on every execution,
while the backend is optimistic and thinks that the app will stop
sending MCVs.

Costs for the plans are OK. However, there is a data skew, so it is
hard to tell what the "true" selectivity of the skewed column is in
general; hence the discussion.

VS>>In other words, application developer should understand
VS>> if a query is DWH-like (requires replans) or OLTP-like (does not
VS>> require replans). Agreed?
Tom>No, not agreed.  As was already pointed out upthread, such information
Tom>is not available in many use-cases for the plancache.

I think you answered the wrong question.
I was asking if you agree that the _application_ developer (not the pg
backend developer) should know if a query is OLTP-like or DWH-like.

Do you really think an app developer should not care which plan is
chosen for a particular query he is working on?
Why have all that "explain" stuff in the documentation, then?

In the plancache.c you have CURSOR_OPT_GENERIC_PLAN and
CURSOR_OPT_CUSTOM_PLAN flags.
It is obvious that those flags are not yet exposed/used by
applications, but my message is that "one should *not* think that the DB
has artificial intelligence to properly identify a plan for each bind
set and cache plans at the same time".

Vladimir




Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-13 Thread Vladimir Sitnikov
>If plan is ok for one value parameters, then can be pretty bad for following 
>parameters.

Happy statements are all alike; every unhappy statement is unhappy in
its own way (see [1]).
If a user is sending different kinds of parameters, he is shooting
himself in the foot.

>Albe's proposal can be good enough for 2/3 cases and it doesn't block any 
>other enhancing

Albe's proposal effectively disables the plan cache, so it blocks further
enhancement. If a user goes the "replan every time" route, there is no
way to introduce plan caching there.

[1]: https://en.wikipedia.org/wiki/Anna_Karenina_principle

Vladimir




Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-13 Thread Vladimir Sitnikov
> for one custom plans can be much better than the generic plan, independent of 
> cardinalities

So what? I do not suggest dropping custom plans entirely.
I perfectly understand there are cases when better replan every time.

> consider e.g a table with one somewhat common and otherwise just unique 
> values.

So what?
If I understand you properly, you mean: "if the client sends unique binds
for the first 5-6 executions and bad non-unique ones afterwards, then the
cached plan would be bad". Is that what you are saying?
I agree that is the corner case for my suggestion.
Is it really happening often?

I state the following:
1) It is way easier to debug & analyze.
For instance: current documentation does *not* list a way to get a
*generic plan*.
Is it obvious that "you just need to EXPLAIN ANALYZE EXECUTE *6
times in a row*" just to get a generic plan?

2) It is likely to be more performant. We just need to explain users
that "if different plans required, just use different statements".
Isn't that obvious?
Frankly speaking, I do not like "plug" kind of code that just
sends bind values and expects a magically optimized plan for each bind
combination.

3) What about "client sends top most common value 5 times in a row"?
Why assume "it will stop doing that"?
I think the better assumption is "it will continue doing that".

At the end, if a client wants specific treatment of a query, then
he/she might be better using separate server-prepared statements (the
one for "unique values", and another one for "non-unique").
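For readers not familiar with the behaviour being debated: the "replan the
first five executions, then maybe cache a generic plan" rule can be modelled
roughly like this (a toy Python sketch of the choose_custom_plan() logic in
plancache.c; the cost accounting is simplified and all numbers are invented):

```python
class PlanSource:
    """Toy model of a prepared statement's plan cache entry."""
    CUSTOM_PLAN_TRIES = 5  # the backend replans the first five executions

    def __init__(self, generic_cost):
        self.generic_cost = generic_cost   # estimated cost of the generic plan
        self.num_custom = 0
        self.total_custom_cost = 0.0

    def choose_plan(self, custom_cost):
        """Decide which plan the next EXECUTE would use."""
        self.num_custom += 1
        self.total_custom_cost += custom_cost
        if self.num_custom <= self.CUSTOM_PLAN_TRIES:
            return "custom"
        # from the sixth execution on, prefer the generic plan if it is
        # estimated to be no more expensive than the average custom plan
        avg_custom = self.total_custom_cost / self.num_custom
        return "generic" if self.generic_cost <= avg_custom else "custom"
```

This is why EXPLAIN ANALYZE EXECUTE shows a custom plan for the first five
runs and may silently flip to a different (generic) plan on the sixth.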

Vladimir




Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-13 Thread Vladimir Sitnikov
>This completely screws over PL/PgSQL, among other things.

Can you elaborate a bit?

Vladimir




Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-13 Thread Andres Freund
On 2016-01-13 17:38:22 +0300, Vladimir Sitnikov wrote:
> >so you don't get to (or want to) have any control over the underlying 
> >prepared statement.
> 
> That is pl/pgsql's problem, isn't it?
> In the mean time, user can use different query texts (e.g. by adding
> offset 0, offset 0*1, offset 0*2, etc kind of stuff they typically use
> to tune queries) to convince plpgsql to use different statement ids.

Basically you're arguing to fix one specific edge case which bugs you
personally, by creating a lot of others, which don't bug you. Not
convincing.

Andres Freund




Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-13 Thread Albe Laurenz
Pavel Stehule wrote:
> I like a strategy based on risks. Probably there are situation, when the 
> generic plan is great every
> time - INSERTs, UPDATEs via PK, simple SELECTs via PK. generic plan can be 
> well if almost all data has
> similar probability. Elsewhere on bigger data, the probability of pretty slow 
> plan is higher, and then
> we should to prefer custom plan.
> 
> so the strategy - if cost of generic plan is less than some MAGIC CONSTANT 
> (can be specified by GUC),
> then use generic plan. Elsewhere use a custom plan everytime.
> 
> It allow to controll the plan reusing. When MAGIC CONSTANT = 0 .. use custom 
> plan everytime, When
> MAGIC CONSTANT = M, then use generic plan always.

I have a different idea:

What about a GUC "custom_plan_threshold" that controls after how many
executions a generic plan will be considered, with a default value of 5.

A value of -1 could disable generic plans.
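A sketch of how such a GUC could steer the decision (hypothetical parameter
name and a simplified cost comparison, just to illustrate the proposal):

```python
def choose_plan(n_executions, custom_plan_threshold,
                generic_cost, avg_custom_cost):
    """Pick a plan under the proposed custom_plan_threshold GUC.

    custom_plan_threshold = -1 disables generic plans entirely;
    otherwise a generic plan is considered only after that many executions.
    """
    if custom_plan_threshold == -1 or n_executions < custom_plan_threshold:
        return "custom"
    # past the threshold, keep the existing cost-based comparison
    return "generic" if generic_cost <= avg_custom_cost else "custom"
```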

Yours,
Laurenz Albe



[HACKERS] Re: Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-13 Thread Chapman Flack
Maybe more work, but:

Could a custom plan be cached /along with the values of
the parameters for which it was planned/?  Then it's kind of
a no-brainer to say "if the parameters are the same this time,
I'm probably looking at the best plan already."  Pretty simpleminded,
but at least it would catch the testcase where the plan flips to
a bad one even when literally the very same parameters are used.

Generalizing a little, how about caching a plan along with a
boolean expression tree over the parameters, evaluating to true
if the new parameters are "close enough" to the planned ones so
that the plan is probably still better than generic?

Version zero could just create the expression p1 = oldp1
AND p2 = oldp2 AND p3 = oldp3, and be simply the same as the
first suggestion. But then, how to generate more interesting
and useful validity-expressions for different plan types and
statistics could be a future-work area with some meat to it,
promising successive improvements. Maybe plugins could
supplement it for particular characterized workloads.
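The "version zero" idea above could look roughly like this (a Python sketch
with invented names; the real implementation would live in the backend's plan
cache, not in application code):

```python
class CachedPlan:
    """A plan cached together with the parameter values it was built for."""
    def __init__(self, plan, params, is_valid=None):
        self.plan = plan
        self.params = params
        # version-zero validity test: p1 = oldp1 AND p2 = oldp2 AND ...
        self.is_valid = is_valid or (lambda new_params: new_params == params)

def get_plan(cache, params, planner):
    """Reuse a cached plan whose validity expression accepts these params."""
    for entry in cache:
        if entry.is_valid(params):
            return entry.plan          # params are "close enough": reuse
    entry = CachedPlan(planner(params), params)
    cache.append(entry)
    return entry.plan
```

Smarter validity expressions (say, "both values fall outside the MCV list")
would slot in via is_valid without changing the lookup.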

-Chap


On 01/13/2016 08:28 AM, Marko Tiikkaja wrote:
> On 13/01/16 14:12, Pavel Stehule wrote:
>> so the strategy - if cost of generic plan is less than some MAGIC
>> CONSTANT
>> (can be specified by GUC), then use generic plan. Elsewhere use a custom
>> plan everytime.
>>
>> It allow to controll the plan reusing. When MAGIC CONSTANT = 0 .. use
>> custom plan everytime, When MAGIC CONSTANT = M, then use generic plan
>> always.
> 
> I don't think that would solve even the original problem without
> effectively disabling generic plans, despite the problem being
> relatively simple.  The generic plan appears to be really cheap, because
> the planner doesn't have the concept of a "worst case".
> 
> 
> .m
> 





Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-13 Thread Vladimir Sitnikov
>Basically you're arguing to fix one specific edge case which bugs you
>personally, by creating a lot of others, which don't bug you. Not
>convincing.

It bugs me.
It bugs clients of pgjdbc (e.g. Thomas who started the thread).

Note: support of prepared statements for Java applications has just
landed. The release dates are 2015-08-27 for pgjdbc, and 2014-02-24
for pgjdbc-ng.
I think the current report is just the tip of the iceberg.

> by creating a lot of others, which don't bug you

I think it will not create "lots of others".
Do you have any statistics on why the particular flavour of generic plan
was committed in 9.2?

Suppose there are two types of binds: "non_unique" (N) and "unique" (U)
that require different plans for perfect response times.

I see the following sequences
 UUUU -- all clear, all the approaches would converge to the plan for
"unique values".

 NNNN -- query for non-unique value is executed again and again.
  Perfect optimizer would either replan or reuse plan with regard to "MCV"
  Current behaviour would switch to "optimistic" plan at 6th
iteration. It is the case of the thread.
  My suggestion is to learn that "MCV is used -> use plan optimized for MCV"

^^^ note that above are "recommended" uses of the database. Each
statement is used for its own purpose: one for MCVs, another for "good
values".

Then there are cases of mixed executions.
Note: I state that mixing "kinds" of bind values is a bad application
design anyway. In other words, application developer should understand
if a query is DWH-like (requires replans) or OLTP-like (does not
require replans). Agreed?

NUUU
  Current behavior is optimized for exactly this pattern.
  Well, why was it chosen over "UNNN"?

In other words, a pattern like UNNN would "create a lot of
others" as you say.

NUNUNUNUNUN -- perfect optimizer would replan every time (or have two
sets of plans, but let's leave that out)
  Neither my suggestion nor current behaviour properly covers the case.


I suggest to spare the "NUUU" pattern in order to improve "NNNN".
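The point that the switch depends only on the execution count (plus
accumulated cost estimates), not on which kinds of binds were actually seen,
can be illustrated with a tiny simulation. All costs here are invented; the
custom plan for the most-common value is assumed to look expensive (e.g. a
seqscan), the generic plan to look moderately cheap:

```python
def simulate(pattern, generic_cost=50.0, custom_costs=None):
    """Plan kinds chosen per execution for a bind pattern like "NUUU".

    'N' = most-common (non-unique) value, 'U' = unique value.
    """
    custom_costs = custom_costs or {"N": 80.0, "U": 50.0}
    plans, total = [], 0.0
    for n, bind in enumerate(pattern, start=1):
        total += custom_costs[bind]
        if n <= 5:                       # first five: always custom
            plans.append("custom")
        else:                            # afterwards: the cost-based switch
            plans.append("generic" if generic_cost <= total / n else "custom")
    return plans
```

With these numbers the all-N stream flips to the generic plan at the sixth
execution despite only ever having seen the MCV, which is the case this
thread started from.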

Vladimir




Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-13 Thread Andres Freund
On January 13, 2016 3:02:27 PM GMT+01:00, Vladimir Sitnikov 
 wrote:
>> The custom plan is *more expensive*;
>
>You compare costs of custom vs generic plans.
>I suggest: do not compare costs *at all*.
>
>>I don't know, it's your proposal :-)  But it looks like I
>misunderstood.
>
>It is not.
>
>My suggestion is: build a generic plan (that is the plan that will
>return proper result for every possible bind value), yet refer to the
>values of 6th binds when estimating cardinalitites.
>Is it clear now?

That's not going to fly, for two reasons: for one, custom plans can be much 
better than the generic plan, independent of cardinalities. Consider e.g. a 
partitioned table, where the generic scan will scan all partitions. For 
another, just using the specific values for the generic plan will have horrible 
results if the distribution isn't entirely boring; consider e.g. a table with 
one somewhat common value and otherwise just unique values.

--- 
Please excuse brevity and formatting - I am writing this on my mobile phone.




Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-13 Thread Marko Tiikkaja

On 13/01/16 15:34, Vladimir Sitnikov wrote:

This completely screws over PL/PgSQL, among other things.


Can you elaborate a bit?


You just write a query like this:

  SELECT * FROM foo WHERE bar = _Variable;

so you don't get to (or want to) have any control over the underlying 
prepared statement.



.m




[HACKERS] pgindent-polluted commits

2016-01-13 Thread Noah Misch
I've noticed commits, from a few of you, carrying pgindent changes to lines
the patch would not otherwise change.  (That is to say, the next pgindent run
would have made the same changes anyway.)  From
https://wiki.postgresql.org/wiki/Submitting_a_Patch#Reasons_your_patch_might_be_returned:

  The fastest way to get your patch rejected is to make unrelated changes.
  Reformatting lines that haven't changed, changing unrelated comments you
  felt were poorly worded, touching code not necessary to your change, etc.

Commits should follow the same high standards we ask of submissions.  Several
pgindent strategies do conform:

1) Run pgindent, then manually reduce its changes to the subset that your
   patch caused.
2) Don't run pgindent yourself; commit code that pgindent may later change.
3) Push a commit containing nothing but a pgindent run of the files you care
   about, then push a second commit for your feature/bugfix.

Please use one of those next time you'd run pgindent.

Thanks,
nm




Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-13 Thread Vladimir Sitnikov
> Generally using a very small sample

That is another issue. Inventing some other algorithm instead of the
current "cache after 5 executions" is another effort.

However, I suggest to "learn" from what the client is sending.
You suggest to completely ignore that and just prepare for the case
where he/she will send "a random value".
Why expect the client to stop sending MCVs if we have already seen them
during the previous 5 executions?

> That'd not change with the change you propose.

It will.
In my suggestion, the first "explain analyze execute" will match the
"finally cached plan" provided the plan is not treated in a special
way (e.g. replan every time, etc).

> That a prepared statement suddenly performs way differently
>depending on which the first bind values are is not, in any way, easier
>to debug.

It is way easier to debug since *the first* execution plan you get out
of "explain" *matches* the one that will finally be used.
Lots of developers are just not aware of the "5 replans by the backend".
Lots of the remaining ones confuse it with the "5 non-server-prepared
executions by the pgjdbc driver".

In other words: in order to identify the root cause of a slow query you
find the bind values. Then you perform EXPLAIN ANALYZE and you find a shiny
fast plan.
Does that immediately ring a bell that you need to execute it 6 times
to ensure the plan will still be good?
Well, try being someone not as smart as you are when answering
this question.

2) In my suggestion, "the first execution would likely match the plan".

VS>> 3) What about "client sends top most common value 5 times in a row"?
VS>> Why assume "it will stop doing that"?
AF>If 20% of your values are nonunique and the rest is unique you'll get
AF>*drastically* different plans, each performing badly for the other case;
AF>with the unique cardinality plan being extremly bad.

Can you elaborate a bit? I can hardly follow that.



Vladimir




[HACKERS] Python 3.x versus PG 9.1 branch

2016-01-13 Thread Tom Lane
In view of our rather embarrassing failure to cover the back branches
with Python 3.5-related regression test adjustments, I think there is
a clear need for a buildfarm critter that's testing with Python 3.5,
and I've been working on setting one up.  It's passing at the moment
for 9.2 and up, but not for 9.1, because we've repeatedly not bothered
to back-port regression test fixes for newer Pythons into that branch.
I could just omit Python 3 coverage for that branch in the critter's
configuration, but I wonder exactly why things are that way.

For clarity, to cover 9.1 I think we'd need to back-patch some subset
of these commits:

f16d52269 ff2faeec5 d0765d50f 6bff0e7d9 527ea6684 8182ffde5
45d1f1e02 2cfb1c6f7

The precedent of not fixing 9.1 started with the last of these.

I haven't looked into the details of which changes would actually
apply to 9.1, I just searched for commits that touched the Python
regression tests.  We'd really only need enough changes to address
the regression failures I'm getting, which are attached below.

Or we could just blow it off on the grounds that 9.1 is not long
for this world anyhow.

Opinions anyone?

regards, tom lane

*** /Users/tgl/pgsql/src/pl/plpython/./python3/expected/plpython_do.out	Wed Jan 13 00:46:31 2016
--- /Users/tgl/pgsql/src/pl/plpython/./python3/results/plpython_do.out	Wed Jan 13 00:46:31 2016
***
*** 2,8 
  NOTICE:  This is plpythonu.
  CONTEXT:  PL/Python anonymous code block
  DO $$ nonsense $$ LANGUAGE plpython3u;
! ERROR:  NameError: global name 'nonsense' is not defined
  CONTEXT:  Traceback (most recent call last):
    PL/Python anonymous code block, line 1, in <module>
  nonsense 
--- 2,8 
  NOTICE:  This is plpythonu.
  CONTEXT:  PL/Python anonymous code block
  DO $$ nonsense $$ LANGUAGE plpython3u;
! ERROR:  NameError: name 'nonsense' is not defined
  CONTEXT:  Traceback (most recent call last):
    PL/Python anonymous code block, line 1, in <module>
  nonsense 

==

*** /Users/tgl/pgsql/src/pl/plpython/./python3/expected/plpython_import.out	Wed Jan 13 00:46:31 2016
--- /Users/tgl/pgsql/src/pl/plpython/./python3/results/plpython_import.out	Wed Jan 13 00:46:32 2016
***
*** 51,57 
  -- import python modules
  --
  SELECT import_fail();
! NOTICE:  import socket failed -- No module named foosocket
  CONTEXT:  PL/Python function "import_fail"
  import_fail 
  
--- 51,57 
  -- import python modules
  --
  SELECT import_fail();
! NOTICE:  import socket failed -- No module named 'foosocket'
  CONTEXT:  PL/Python function "import_fail"
  import_fail 
  

==

*** /Users/tgl/pgsql/src/pl/plpython/./python3/expected/plpython_params.out	Wed Jan 13 00:46:31 2016
--- /Users/tgl/pgsql/src/pl/plpython/./python3/results/plpython_params.out	Wed Jan 13 00:46:32 2016
***
*** 37,46 
  SELECT test_param_names2(users) from users;
 test_param_names2   
  ---
!  {'lname': 'doe', 'username': 'j_doe', 'userid': 1, 'fname': 'jane'}
!  {'lname': 'doe', 'username': 'johnd', 'userid': 2, 'fname': 'john'}
!  {'lname': 'doe', 'username': 'w_doe', 'userid': 3, 'fname': 'willem'}
!  {'lname': 'smith', 'username': 'slash', 'userid': 4, 'fname': 'rick'}
  (4 rows)
  
  SELECT test_param_names2(NULL);
--- 37,46 
  SELECT test_param_names2(users) from users;
 test_param_names2   
  ---
!  {'userid': 1, 'username': 'j_doe', 'lname': 'doe', 'fname': 'jane'}
!  {'userid': 2, 'username': 'johnd', 'lname': 'doe', 'fname': 'john'}
!  {'userid': 3, 'username': 'w_doe', 'lname': 'doe', 'fname': 'willem'}
!  {'userid': 4, 'username': 'slash', 'lname': 'smith', 'fname': 'rick'}
  (4 rows)
  
  SELECT test_param_names2(NULL);

==

*** /Users/tgl/pgsql/src/pl/plpython/./python3/expected/plpython_trigger.out	Wed Jan 13 00:46:31 2016
--- /Users/tgl/pgsql/src/pl/plpython/./python3/results/plpython_trigger.out	Wed Jan 13 00:46:32 2016
***
*** 119,125 
  CONTEXT:  PL/Python function "trigger_data"
  NOTICE:  TD[name] => show_trigger_data_trig_before
  CONTEXT:  PL/Python function "trigger_data"
! NOTICE:  TD[new] => {'i': 1, 'v': 'insert'}
  CONTEXT:  PL/Python function "trigger_data"
  NOTICE:  TD[old] => None
  CONTEXT:  PL/Python function "trigger_data"
--- 119,125 
  CONTEXT:  PL/Python function "trigger_data"
  NOTICE:  TD[name] => show_trigger_data_trig_before
  CONTEXT:  PL/Python function "trigger_data"
! NOTICE:  TD[new] => {'v': 'insert', 'i': 1}
  CONTEXT:  PL/Python function "trigger_data"
  NOTICE:  

Re: [HACKERS] pgindent-polluted commits

2016-01-13 Thread Tom Lane
Simon Riggs  writes:
> On 13 January 2016 at 14:48, Noah Misch  wrote:
>> I've noticed commits, from a few of you, carrying pgindent changes to lines
>> the patch would not otherwise change.

> Could we review again why this matters?

Basically this is trading off convenience of the committer (all of the
alternatives Noah mentions are somewhat annoying) versus the convenience
of post-commit reviewers.  I'm not sure that his recommendation is the
best trade-off, nor that the situation is precisely comparable to
pre-commit review.  There definitely will be pre-commit review, there
may or may not be any post-commit review.

I'm willing to go with the "separate commit to reindent individual files"
approach if there's a consensus that that makes for a cleaner git history.
But I'm not 100% convinced it matters.

regards, tom lane




Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-13 Thread Pavel Stehule
2016-01-13 17:12 GMT+01:00 Vladimir Sitnikov :

> >If plan is ok for one value parameters, then can be pretty bad for
> following parameters.
>
> Happy statements are all alike; every unhappy statement is unhappy in
> its own way (see [1]).
> If user is sending different kinds of parameters, he is shooting in the
> foot.
>
> >Albe's proposal can be good enough for 2/3 cases and it doesn't block any
> other enhancing
>
> Albe's proposal effectively disables plan cache, thus it blocks enhancing.
> If a user goes "replan every time" route, there is no way you
> introduce plan caching there.
>

I am sorry, I disagree. Albe's proposal should be compatible with the current
state, so your argument is too strong. The default is the same.

Pavel


>
> [1]: https://en.wikipedia.org/wiki/Anna_Karenina_principle
>
> Vladimir
>


Re: [HACKERS] pgindent-polluted commits

2016-01-13 Thread Simon Riggs
On 13 January 2016 at 14:48, Noah Misch  wrote:


> I've noticed commits, from a few of you, carrying pgindent changes to lines
> the patch would not otherwise change.


Could we review again why this matters?

-- 
Simon Riggs    http://www.2ndQuadrant.com/

PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services