Re: [HACKERS] Transforming IN (...) to ORs, volatility

2011-09-06 Thread Bruce Momjian

Uh, have we addressed this?  I don't think so.

---

Heikki Linnakangas wrote:
 On 02.04.2011 20:48, Tom Lane wrote:
  Heikki Linnakangasheikki.linnakan...@enterprisedb.com  writes:
  We sometimes transform IN-clauses to a list of ORs:
  postgres=# explain SELECT * FROM foo WHERE a IN (b, c);
  QUERY PLAN
  --
 Seq Scan on foo  (cost=0.00..39.10 rows=19 width=12)
   Filter: ((a = b) OR (a = c))
  (2 rows)
 
  But what if you replace a with a volatile function? It doesn't seem
  legal to do that transformation in that case, but we do it:
 
  This is the fault of transformAExprIn().  But please let's *not* fix
  this by adding volatility to the set of heuristics used there.  Looking
  at this again, it seems to me that most of the problem with this code
  is that we're trying to make optimization decisions in the parser.
 
 Agreed. The history of this is that before 8.2 all IN clauses were 
 transformed to OR clauses straight in the grammar. 8.2 added the code to 
 represent IN clause as a ScalarArrayOpExpr, but it was changed in 8.2.10 
 to use the OR-form again for Vars 
 (http://archives.postgresql.org/pgsql-hackers/2008-10/msg01269.php)
 
  I think what we ought to do is have the parser emit a full-fledged
  InExpr node type (with semantics rather like CaseExpr) and then teach
  the planner to optimize that to something else when it seems
  safe/prudent to do so.  One nontrivial advantage of that is that
  rules/views containing IN constructs would start to reverse-parse
  in the same fashion, instead of introducing weird substitute
  expressions.
 
 Here's my first cut at that. The lefthand expression is now evaluated 
 only once, and stored in econtext-caseValue. Parse analysis turns the 
 righthand expressions into a list of comparison expressions like 
 CaseTestExpr = value1. It's perhaps time that we rename CaseTestExpr 
 into something more generic, now that it's used not only in CASE 
 expressions, but also in IN and in UPDATE targets, but I didn't do that 
 in this patch.
 
 eval_const_expressions checks the lefthand expression for volatile 
 functions. If there aren't any, it transform the InExprs to a list of ORs.
 
 This isn't finished, because it doesn't yet do the transformation to 
 ScalarArrayOpExpr. The OR form is much slower to plan, which is why the 
 ScalarArrayOpExpr transformation was introduced in 8.2. I'll continue 
 hacking on that, but please let me know if you have a better idea on how 
 to handle that. One alternative is to teach the machinery that matches 
 restrictinfos to usable indexes to handle InExpr like it does 
 ScalarArrayOpExprs, but I don't know that code very well.
 
 -- 
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com

[ Attachment, skipping... ]

 
 -- 
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Transforming IN (...) to ORs, volatility

2011-09-06 Thread Heikki Linnakangas
Nope, this hasn't been addressed. FWIW, I put it on the todo list when I 
stopped working on it.


On 06.09.2011 20:48, Bruce Momjian wrote:


Uh, have we addressed this?  I don't think so.

---

Heikki Linnakangas wrote:

On 02.04.2011 20:48, Tom Lane wrote:

Heikki Linnakangasheikki.linnakan...@enterprisedb.com   writes:

We sometimes transform IN-clauses to a list of ORs:
postgres=# explain SELECT * FROM foo WHERE a IN (b, c);
 QUERY PLAN
--
Seq Scan on foo  (cost=0.00..39.10 rows=19 width=12)
  Filter: ((a = b) OR (a = c))
(2 rows)



But what if you replace a with a volatile function? It doesn't seem
legal to do that transformation in that case, but we do it:


This is the fault of transformAExprIn().  But please let's *not* fix
this by adding volatility to the set of heuristics used there.  Looking
at this again, it seems to me that most of the problem with this code
is that we're trying to make optimization decisions in the parser.


Agreed. The history of this is that before 8.2 all IN clauses were
transformed to OR clauses straight in the grammar. 8.2 added the code to
represent IN clause as a ScalarArrayOpExpr, but it was changed in 8.2.10
to use the OR-form again for Vars
(http://archives.postgresql.org/pgsql-hackers/2008-10/msg01269.php)


I think what we ought to do is have the parser emit a full-fledged
InExpr node type (with semantics rather like CaseExpr) and then teach
the planner to optimize that to something else when it seems
safe/prudent to do so.  One nontrivial advantage of that is that
rules/views containing IN constructs would start to reverse-parse
in the same fashion, instead of introducing weird substitute
expressions.


Here's my first cut at that. The lefthand expression is now evaluated
only once, and stored in econtext-caseValue. Parse analysis turns the
righthand expressions into a list of comparison expressions like
CaseTestExpr = value1. It's perhaps time that we rename CaseTestExpr
into something more generic, now that it's used not only in CASE
expressions, but also in IN and in UPDATE targets, but I didn't do that
in this patch.

eval_const_expressions checks the lefthand expression for volatile
functions. If there aren't any, it transform the InExprs to a list of ORs.

This isn't finished, because it doesn't yet do the transformation to
ScalarArrayOpExpr. The OR form is much slower to plan, which is why the
ScalarArrayOpExpr transformation was introduced in 8.2. I'll continue
hacking on that, but please let me know if you have a better idea on how
to handle that. One alternative is to teach the machinery that matches
restrictinfos to usable indexes to handle InExpr like it does
ScalarArrayOpExprs, but I don't know that code very well.

--
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com


[ Attachment, skipping... ]



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers





--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Transforming IN (...) to ORs, volatility

2011-09-06 Thread Bruce Momjian
Heikki Linnakangas wrote:
 Nope, this hasn't been addressed. FWIW, I put it on the todo list when I 
 stopped working on it.

Oh, I see it now. Thanks.

---


 
 On 06.09.2011 20:48, Bruce Momjian wrote:
 
  Uh, have we addressed this?  I don't think so.
 
  ---
 
  Heikki Linnakangas wrote:
  On 02.04.2011 20:48, Tom Lane wrote:
  Heikki Linnakangasheikki.linnakan...@enterprisedb.com   writes:
  We sometimes transform IN-clauses to a list of ORs:
  postgres=# explain SELECT * FROM foo WHERE a IN (b, c);
   QUERY PLAN
  --
  Seq Scan on foo  (cost=0.00..39.10 rows=19 width=12)
Filter: ((a = b) OR (a = c))
  (2 rows)
 
  But what if you replace a with a volatile function? It doesn't seem
  legal to do that transformation in that case, but we do it:
 
  This is the fault of transformAExprIn().  But please let's *not* fix
  this by adding volatility to the set of heuristics used there.  Looking
  at this again, it seems to me that most of the problem with this code
  is that we're trying to make optimization decisions in the parser.
 
  Agreed. The history of this is that before 8.2 all IN clauses were
  transformed to OR clauses straight in the grammar. 8.2 added the code to
  represent IN clause as a ScalarArrayOpExpr, but it was changed in 8.2.10
  to use the OR-form again for Vars
  (http://archives.postgresql.org/pgsql-hackers/2008-10/msg01269.php)
 
  I think what we ought to do is have the parser emit a full-fledged
  InExpr node type (with semantics rather like CaseExpr) and then teach
  the planner to optimize that to something else when it seems
  safe/prudent to do so.  One nontrivial advantage of that is that
  rules/views containing IN constructs would start to reverse-parse
  in the same fashion, instead of introducing weird substitute
  expressions.
 
  Here's my first cut at that. The lefthand expression is now evaluated
  only once, and stored in econtext-caseValue. Parse analysis turns the
  righthand expressions into a list of comparison expressions like
  CaseTestExpr = value1. It's perhaps time that we rename CaseTestExpr
  into something more generic, now that it's used not only in CASE
  expressions, but also in IN and in UPDATE targets, but I didn't do that
  in this patch.
 
  eval_const_expressions checks the lefthand expression for volatile
  functions. If there aren't any, it transform the InExprs to a list of ORs.
 
  This isn't finished, because it doesn't yet do the transformation to
  ScalarArrayOpExpr. The OR form is much slower to plan, which is why the
  ScalarArrayOpExpr transformation was introduced in 8.2. I'll continue
  hacking on that, but please let me know if you have a better idea on how
  to handle that. One alternative is to teach the machinery that matches
  restrictinfos to usable indexes to handle InExpr like it does
  ScalarArrayOpExprs, but I don't know that code very well.
 
  --
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
 
  [ Attachment, skipping... ]
 
 
  --
  Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
  To make changes to your subscription:
  http://www.postgresql.org/mailpref/pgsql-hackers
 
 
 
 -- 
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Transforming IN (...) to ORs, volatility

2011-09-06 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 Uh, have we addressed this?  I don't think so.

No.  IIRC, I didn't like Heikki's proposed patch, so it's on my head
to come up with something better.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Transforming IN (...) to ORs, volatility

2011-09-06 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian br...@momjian.us writes:
  Uh, have we addressed this?  I don't think so.
 
 No.  IIRC, I didn't like Heikki's proposed patch, so it's on my head
 to come up with something better.

You can blame me for getting it into the parser.  It used to be in
gram.y!

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Transforming IN (...) to ORs, volatility

2011-09-06 Thread Heikki Linnakangas

On 06.09.2011 20:53, Bruce Momjian wrote:

Tom Lane wrote:

Bruce Momjianbr...@momjian.us  writes:

Uh, have we addressed this?  I don't think so.


No.  IIRC, I didn't like Heikki's proposed patch, so it's on my head
to come up with something better.


You can blame me for getting it into the parser.  It used to be in
gram.y!


Huh? Isn't the parser and gram.y more or less the same thing? 
Anyway, it needs to be somewhere else..


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Transforming IN (...) to ORs, volatility

2011-09-06 Thread Bruce Momjian
Heikki Linnakangas wrote:
 On 06.09.2011 20:53, Bruce Momjian wrote:
  Tom Lane wrote:
  Bruce Momjianbr...@momjian.us  writes:
  Uh, have we addressed this?  I don't think so.
 
  No.  IIRC, I didn't like Heikki's proposed patch, so it's on my head
  to come up with something better.
 
  You can blame me for getting it into the parser.  It used to be in
  gram.y!
 
 Huh? Isn't the parser and gram.y more or less the same thing? 
 Anyway, it needs to be somewhere else..

I meant the '/parser' directory.  It actually created AND nodes in gram.y
so the rest of the parser didn't even know a BETWEEN was used.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Transforming IN (...) to ORs, volatility

2011-04-19 Thread Heikki Linnakangas

On 11.04.2011 19:33, Heikki Linnakangas wrote:

On 11.04.2011 19:06, Kevin Grittner wrote:

Heikki Linnakangasheikki.linnakan...@enterprisedb.com wrote:

Hmm, the SQL specification explicitly says that

X BETWEEN Y AND Z

is equal to

X= Y AND X= Z

It doesn't say anything about side-effects of X. Seems like an
oversight in the specification. I would not expect X to be
evaluated twice, and I think we should change BETWEEN to not do
that.


Does the SQL spec explicitly say anything about how many times X
should be evaluated if you were to code it as?:

X= Y AND X= Z


Not explicitly. However, it does say that:


NOTE 258 — Since between predicate is an ordering operation, the
Conformance Rules of Subclause 9.12, “Ordering
operations”, also apply.


If I'm reading those ordering operation conformance rules correctly, it
only allows the operand to be a simple column or an expression that's
specified in the ORDER BY or similar, not an arbitrary expression. Which
seems quite restrictive, but it would dodge the whole issue..


Another data point on this: DB2 disallow volatile left-operand to BETWEEN

db2 = SELECT * FROM atable WHERE smallint(rand()*10) BETWEEN 4 AND 5
SQL0583N  The use of routine or expression SYSFUN.RAND is invalid 
because it

is not deterministic or has an external action.  SQLSTATE=42845

I'd like us to still fix this so that there's no multiple evaluation - 
that would actually make BETWEEN more useful than it is today. I'm 
working on a patch to handle both BETWEEN and IN.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Transforming IN (...) to ORs, volatility

2011-04-19 Thread Tom Lane
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 I'd like us to still fix this so that there's no multiple evaluation - 
 that would actually make BETWEEN more useful than it is today. I'm 
 working on a patch to handle both BETWEEN and IN.

One other issue here is that the parser has traditionally handled
BETWEEN by multiply linking the same expression tree.  I've wanted to
get rid of that behavior for a long time, but never got round to it.
It causes a lot of headaches for later processing, because you have to
be wary of multiply processing the same tree.  If we switch BETWEEN
to something with a dedicated representation we could probably get rid
of all multiple-linking in the parser's output, allowing ensuing
simplifications downstream.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Transforming IN (...) to ORs, volatility

2011-04-11 Thread Heikki Linnakangas

On 05.04.2011 18:42, Heikki Linnakangas wrote:

On 05.04.2011 13:19, Marti Raudsepp wrote:

On Fri, Apr 1, 2011 at 14:24, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:

We sometimes transform IN-clauses to a list of ORs:

postgres=# explain SELECT * FROM foo WHERE a IN (b, c);
QUERY PLAN
Seq Scan on foo (cost=0.00..39.10 rows=19 width=12)
Filter: ((a = b) OR (a = c))

But what if you replace a with a volatile function? It doesn't seem
legal
to do that transformation in that case, but we do it:

postgres=# explain SELECT * FROM foo WHERE (random()*2)::integer IN
(b, c);
QUERY PLAN

Seq Scan on foo (cost=0.00..68.20 rows=19 width=12)
Filter: random() * 2::double precision))::integer = b) OR
(((random()
* 2::double precision))::integer = c))


Is there a similar problem with the BETWEEN clause transformation into
AND expressions?

marti= explain verbose select random() between 0.25 and 0.75;
Result (cost=0.00..0.02 rows=1 width=0)
Output: ((random()= 0.25::double precision) AND (random()=
0.75::double precision))


Yes, good point.


Hmm, the SQL specification explicitly says that

X BETWEEN Y AND Z

is equal to

X = Y AND X = Z

It doesn't say anything about side-effects of X. Seems like an oversight 
in the specification. I would not expect X to be evaluated twice, and I 
think we should change BETWEEN to not do that.



Does anyone object to making BETWEEN and IN more strict about the data 
types? At the moment, you can do this:


postgres=# SELECT '1234' BETWEEN '10001'::text AND 10002::int4;
 ?column?
--
 t
(1 row)

I'm thinking that it should throw an error. Same with IN, if the values 
in the IN-list can't be coerced to a common type. That will probably 
simplify the code a lot, and is what the SQL standard assumes anyway AFAICS.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Transforming IN (...) to ORs, volatility

2011-04-11 Thread Kevin Grittner
Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote:
 On 05.04.2011 18:42, Heikki Linnakangas wrote:
 On 05.04.2011 13:19, Marti Raudsepp wrote:
 On Fri, Apr 1, 2011 at 14:24, Heikki Linnakangas
 heikki.linnakan...@enterprisedb.com wrote:
 We sometimes transform IN-clauses to a list of ORs:

 postgres=# explain SELECT * FROM foo WHERE a IN (b, c);
 QUERY PLAN
 Seq Scan on foo (cost=0.00..39.10 rows=19 width=12)
 Filter: ((a = b) OR (a = c))

 But what if you replace a with a volatile function? It
 doesn't seem legal to do that transformation in that case, but
 we do it:

 postgres=# explain SELECT * FROM foo WHERE
 (random()*2)::integer IN (b, c);
 QUERY PLAN

 Seq Scan on foo (cost=0.00..68.20 rows=19 width=12)
 Filter: random() * 2::double precision))::integer = b) OR
 (((random()
 * 2::double precision))::integer = c))

 Is there a similar problem with the BETWEEN clause
 transformation into AND expressions?

 marti= explain verbose select random() between 0.25 and 0.75;
 Result (cost=0.00..0.02 rows=1 width=0)
 Output: ((random()= 0.25::double precision) AND (random()=
 0.75::double precision))

 Yes, good point.
 
 Hmm, the SQL specification explicitly says that
 
 X BETWEEN Y AND Z
 
 is equal to
 
 X = Y AND X = Z
 
 It doesn't say anything about side-effects of X. Seems like an
 oversight in the specification. I would not expect X to be
 evaluated twice, and I think we should change BETWEEN to not do
 that.
 
Does the SQL spec explicitly say anything about how many times X
should be evaluated if you were to code it as?:
 
X = Y AND X = Z
 
If it does, evaluating it a different number of times for BETWEEN
would seem to be a deviation from standard.  Evaluating it once seem
less surprising, but if we're going to deviate from the standard in
doing that, it at least deserves a clear note to that effect in the
docs.
 
Evaluating X once for BETWEEN seems better from a POLA perspective,
unless you happen to be massaging a query to another form and
trusting that the equivalence defined in the standard will always
hold.
 
 Does anyone object to making BETWEEN and IN more strict about the
 data types? At the moment, you can do this:
 
 postgres=# SELECT '1234' BETWEEN '10001'::text AND 10002::int4;
   ?column?
 --
   t
 (1 row)
 
 I'm thinking that it should throw an error. Same with IN, if the
 values in the IN-list can't be coerced to a common type. That will
 probably simplify the code a lot, and is what the SQL standard
 assumes anyway AFAICS.
 
+1 for more strict.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Transforming IN (...) to ORs, volatility

2011-04-11 Thread Heikki Linnakangas

On 11.04.2011 19:06, Kevin Grittner wrote:

Heikki Linnakangasheikki.linnakan...@enterprisedb.com  wrote:

On 05.04.2011 18:42, Heikki Linnakangas wrote:

On 05.04.2011 13:19, Marti Raudsepp wrote:

On Fri, Apr 1, 2011 at 14:24, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com  wrote:

We sometimes transform IN-clauses to a list of ORs:

postgres=# explain SELECT * FROM foo WHERE a IN (b, c);
QUERY PLAN
Seq Scan on foo (cost=0.00..39.10 rows=19 width=12)
Filter: ((a = b) OR (a = c))

But what if you replace a with a volatile function? It
doesn't seem legal to do that transformation in that case, but
we do it:

postgres=# explain SELECT * FROM foo WHERE
(random()*2)::integer IN (b, c);
QUERY PLAN

Seq Scan on foo (cost=0.00..68.20 rows=19 width=12)
Filter: random() * 2::double precision))::integer = b) OR
(((random()
* 2::double precision))::integer = c))


Is there a similar problem with the BETWEEN clause
transformation into AND expressions?

marti=  explain verbose select random() between 0.25 and 0.75;
Result (cost=0.00..0.02 rows=1 width=0)
Output: ((random()= 0.25::double precision) AND (random()=
0.75::double precision))


Yes, good point.


Hmm, the SQL specification explicitly says that

X BETWEEN Y AND Z

is equal to

X= Y AND X= Z

It doesn't say anything about side-effects of X. Seems like an
oversight in the specification. I would not expect X to be
evaluated twice, and I think we should change BETWEEN to not do
that.


Does the SQL spec explicitly say anything about how many times X
should be evaluated if you were to code it as?:

X= Y AND X= Z


Not explicitly. However, it does say that:


NOTE 258 — Since between predicate is an ordering operation, the 
Conformance Rules of Subclause 9.12, “Ordering

operations”, also apply.


If I'm reading those ordering operation conformance rules correctly, it 
only allows the operand to be a simple column or an expression that's 
specified in the ORDER BY or similar, not an arbitrary expression. Which 
seems quite restrictive, but it would dodge the whole issue..


The spec also has that:

“X BETWEEN SYMMETRIC Y AND Z” is equivalent to “((X BETWEEN ASYMMETRIC Y 
AND Z)

OR (X BETWEEN ASYMMETRIC Z AND Y))”.

So if you take that into account too, X is evaluated four times. The SQL 
standard can be funny sometimes, but I can't believe that they intended 
that.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Transforming IN (...) to ORs, volatility

2011-04-11 Thread Tom Lane
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 Does anyone object to making BETWEEN and IN more strict about the data 
 types? At the moment, you can do this:

 postgres=# SELECT '1234' BETWEEN '10001'::text AND 10002::int4;
   ?column?
 --
   t
 (1 row)

 I'm thinking that it should throw an error. Same with IN, if the values 
 in the IN-list can't be coerced to a common type.

You *will* get push-back on that ... maybe from people with badly coded
applications, but I guarantee there will be complaints.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Transforming IN (...) to ORs, volatility

2011-04-05 Thread Marti Raudsepp
On Fri, Apr 1, 2011 at 14:24, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 We sometimes transform IN-clauses to a list of ORs:

 postgres=# explain SELECT * FROM foo WHERE a IN (b, c);
                      QUERY PLAN
  Seq Scan on foo  (cost=0.00..39.10 rows=19 width=12)
   Filter: ((a = b) OR (a = c))

 But what if you replace a with a volatile function? It doesn't seem legal
 to do that transformation in that case, but we do it:

 postgres=# explain SELECT * FROM foo WHERE (random()*2)::integer IN (b, c);
                                                     QUERY PLAN

  Seq Scan on foo  (cost=0.00..68.20 rows=19 width=12)
   Filter: random() * 2::double precision))::integer = b) OR (((random()
 * 2::double precision))::integer = c))

Is there a similar problem with the BETWEEN clause transformation into
AND expressions?

marti= explain verbose select random() between 0.25 and 0.75;
 Result  (cost=0.00..0.02 rows=1 width=0)
   Output: ((random() = 0.25::double precision) AND (random() =
0.75::double precision))

As expected, I get a statistical skew of 0.4375 / 0.5625, whereas the
correct would be 50/50:

marti= select random() between 0.25 and 0.75 as result, count(*) from
generate_series(1,100) i group by 1;
 result | count
+
 f  | 437262
 t  | 562738

I also always noticed that BETWEEN with subqueries produces two
subplan nodes, this seems suboptimal.

marti= explain verbose select (select random()) between 0.25 and 0.75;
 Result  (cost=0.03..0.04 rows=1 width=0)
   Output: (($0 = 0.25::double precision) AND ($1 = 0.75::double precision))
   InitPlan 1 (returns $0)
 -  Result  (cost=0.00..0.01 rows=1 width=0)
   Output: random()
   InitPlan 2 (returns $1)
 -  Result  (cost=0.00..0.01 rows=1 width=0)
   Output: random()


Regards,
Marti

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Transforming IN (...) to ORs, volatility

2011-04-05 Thread Heikki Linnakangas

On 05.04.2011 13:19, Marti Raudsepp wrote:

On Fri, Apr 1, 2011 at 14:24, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com  wrote:

We sometimes transform IN-clauses to a list of ORs:

postgres=# explain SELECT * FROM foo WHERE a IN (b, c);
  QUERY PLAN
  Seq Scan on foo  (cost=0.00..39.10 rows=19 width=12)
   Filter: ((a = b) OR (a = c))

But what if you replace a with a volatile function? It doesn't seem legal
to do that transformation in that case, but we do it:

postgres=# explain SELECT * FROM foo WHERE (random()*2)::integer IN (b, c);
 QUERY PLAN

  Seq Scan on foo  (cost=0.00..68.20 rows=19 width=12)
   Filter: random() * 2::double precision))::integer = b) OR (((random()
* 2::double precision))::integer = c))


Is there a similar problem with the BETWEEN clause transformation into
AND expressions?

marti=  explain verbose select random() between 0.25 and 0.75;
  Result  (cost=0.00..0.02 rows=1 width=0)
Output: ((random()= 0.25::double precision) AND (random()=
0.75::double precision))


Yes, good point.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Transforming IN (...) to ORs, volatility

2011-04-02 Thread Tom Lane
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 We sometimes transform IN-clauses to a list of ORs:
 postgres=# explain SELECT * FROM foo WHERE a IN (b, c);
QUERY PLAN
 --
   Seq Scan on foo  (cost=0.00..39.10 rows=19 width=12)
 Filter: ((a = b) OR (a = c))
 (2 rows)

 But what if you replace a with a volatile function? It doesn't seem 
 legal to do that transformation in that case, but we do it:

This is the fault of transformAExprIn().  But please let's *not* fix
this by adding volatility to the set of heuristics used there.  Looking
at this again, it seems to me that most of the problem with this code
is that we're trying to make optimization decisions in the parser.

I think what we ought to do is have the parser emit a full-fledged
InExpr node type (with semantics rather like CaseExpr) and then teach
the planner to optimize that to something else when it seems
safe/prudent to do so.  One nontrivial advantage of that is that
rules/views containing IN constructs would start to reverse-parse
in the same fashion, instead of introducing weird substitute
expressions.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Transforming IN (...) to ORs, volatility

2011-04-01 Thread Heikki Linnakangas

We sometimes transform IN-clauses to a list of ORs:

postgres=# explain SELECT * FROM foo WHERE a IN (b, c);
  QUERY PLAN
--
 Seq Scan on foo  (cost=0.00..39.10 rows=19 width=12)
   Filter: ((a = b) OR (a = c))
(2 rows)

But what if you replace a with a volatile function? It doesn't seem 
legal to do that transformation in that case, but we do it:


postgres=# explain SELECT * FROM foo WHERE (random()*2)::integer IN (b, c);
 QUERY PLAN 



-
---
 Seq Scan on foo  (cost=0.00..68.20 rows=19 width=12)
   Filter: random() * 2::double precision))::integer = b) OR 
(((random() * 2::double precision))::integer = c))

(2 rows)

I tried to read the SQL spec to see if it has anything to say about 
that, but I couldn't find anything. My common sense says that that 
transformation is not legal.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Transforming IN (...) to ORs, volatility

2011-04-01 Thread Gianni Ciolli
On Fri, Apr 01, 2011 at 02:24:53PM +0300, Heikki Linnakangas wrote:

 I tried to read the SQL spec to see if it has anything to say about
 that, but I couldn't find anything. My common sense says that that
 transformation is not legal.

Your feeling is correct; I would motivate it as follows.

  random() IN (b,c)

is not equivalent to

  (random() = b) OR (random() = c)

because the two random() will evaluate to two different numbers.  So,
for instance, if you define random_boolean() as either true or false
randomly (and VOLATILEly), then

  random_boolean() IN (true, false)

is always true, while

  (random_boolean() = true) OR (random_boolean() = false)

is not (has probability 75%). For instance, the first random_boolean()
might return false while the second one returns true.

Best regards,
Dr. Gianni Ciolli - 2ndQuadrant Italia
PostgreSQL Training, Services and Support
gianni.cio...@2ndquadrant.it | www.2ndquadrant.it

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Transforming IN (...) to ORs, volatility

2011-04-01 Thread Robert Haas
On Fri, Apr 1, 2011 at 7:24 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 My common sense says that that transformation
 is not legal.

+1.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers