Re: [HACKERS] Supporting multiple column assignment in UPDATE (9.5 project)

2014-05-05 Thread Merlin Moncure
On Sat, May 3, 2014 at 5:48 AM, Marko Tiikkaja ma...@joh.to wrote:
 On 5/2/14, 10:10 PM, Merlin Moncure wrote:

 On Fri, May 2, 2014 at 3:03 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Meh.  Then you could have a query that works fine until you add a column
 to the table, and it stops working.  If nobody ever used column names
 identical to table names it'd be all right, but unfortunately people
 seem to do that a lot...


 That's already the case with select statements

 I don't think that's true if you table-qualify your column references and
 don't use SELECT *.


 and, if a user were
 concerned about that, always have the option of aliasing the table as
 nearly 100% of professional developers do:

 SELECT f FROM foo f;
 etc.


 So e.g.:

   UPDATE foo f SET f = ..;

 would resolve to the table, despite there being a column called f? That
 would break backwards compatibility.

 How about:

   UPDATE foo SET ROW(foo) = (1,2,3);

 ISTM that this could be parsed unambiguously, though it's perhaps a bit
 ugly.

Hm, that's a bit too ugly: row(foo) in this case means 'do special
behavior X' whereas in all other cases it means make an anonymous
rowtype with one attribute of type 'foo'.

How about:
UPDATE foo SET (foo).* = (1,2,3);

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Supporting multiple column assignment in UPDATE (9.5 project)

2014-05-05 Thread Pavel Stehule
2014-05-05 17:02 GMT+02:00 Merlin Moncure mmonc...@gmail.com:

 On Sat, May 3, 2014 at 5:48 AM, Marko Tiikkaja ma...@joh.to wrote:
  On 5/2/14, 10:10 PM, Merlin Moncure wrote:
 
  On Fri, May 2, 2014 at 3:03 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 
  Meh.  Then you could have a query that works fine until you add a
 column
  to the table, and it stops working.  If nobody ever used column names
  identical to table names it'd be all right, but unfortunately people
  seem to do that a lot...
 
 
  That's already the case with select statements
 
  I don't think that's true if you table-qualify your column references and
  don't use SELECT *.
 
 
  and, if a user were
  concerned about that, always have the option of aliasing the table as
  nearly 100% of professional developers do:
 
  SELECT f FROM foo f;
  etc.
 
 
  So e.g.:
 
UPDATE foo f SET f = ..;
 
  would resolve to the table, despite there being a column called f? That
  would break backwards compatibility.
 
  How about:
 
UPDATE foo SET ROW(foo) = (1,2,3);
 
  ISTM that this could be parsed unambiguously, though it's perhaps a bit
  ugly.

 Hm, that's a bit too ugly: row(foo) in this case means 'do special
 behavior X' whereas in all other cases it means make an anonymous
 rowtype with one attribute of type 'foo'.

 How about:
 UPDATE foo SET (foo).* = (1,2,3);


It is looking little bit strange

I like previous proposal UPDATE foo SET foo = (1,2,3);

Pavel



 merlin


 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers



Re: [HACKERS] Supporting multiple column assignment in UPDATE (9.5 project)

2014-05-05 Thread Andrew Dunstan


On 05/05/2014 11:20 AM, Pavel Stehule wrote:




How about:
UPDATE foo SET (foo).* = (1,2,3);


It is looking little bit strange

I like previous proposal UPDATE foo SET foo = (1,2,3);



What if the table has a field called foo? Won't it then be ambiguous?

cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Supporting multiple column assignment in UPDATE (9.5 project)

2014-05-05 Thread Merlin Moncure
On Mon, May 5, 2014 at 10:32 AM, Andrew Dunstan and...@dunslane.net wrote:

 On 05/05/2014 11:20 AM, Pavel Stehule wrote:




 How about:
 UPDATE foo SET (foo).* = (1,2,3);


 It is looking little bit strange

 I like previous proposal UPDATE foo SET foo = (1,2,3);


 What if the table has a field called foo? Won't it then be ambiguous?

See upthread: it prefers the field to the table if both are there
(exactly as SELECT does).

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Supporting multiple column assignment in UPDATE (9.5 project)

2014-05-05 Thread David G Johnston
Merlin Moncure-2 wrote
 On Sat, May 3, 2014 at 5:48 AM, Marko Tiikkaja lt;

 marko@

 gt; wrote:
 On 5/2/14, 10:10 PM, Merlin Moncure wrote:

 On Fri, May 2, 2014 at 3:03 PM, Tom Lane lt;

 tgl@.pa

 gt; wrote:

 Meh.  Then you could have a query that works fine until you add a
 column
 to the table, and it stops working.  If nobody ever used column names
 identical to table names it'd be all right, but unfortunately people
 seem to do that a lot...


 That's already the case with select statements

 I don't think that's true if you table-qualify your column references and
 don't use SELECT *.


 and, if a user were
 concerned about that, always have the option of aliasing the table as
 nearly 100% of professional developers do:

 SELECT f FROM foo f;
 etc.


 So e.g.:

   UPDATE foo f SET f = ..;

 would resolve to the table, despite there being a column called f? That
 would break backwards compatibility.

 How about:

   UPDATE foo SET ROW(foo) = (1,2,3);

 ISTM that this could be parsed unambiguously, though it's perhaps a bit
 ugly.
 
 Hm, that's a bit too ugly: row(foo) in this case means 'do special
 behavior X' whereas in all other cases it means make an anonymous
 rowtype with one attribute of type 'foo'.
 
 How about:
 UPDATE foo SET (foo).* = (1,2,3);

Wouldn't

UPDATE foo SET (foo.*) = (1,2,3)

be better since it would cleanly support non-complete types like

UPDATE foo SET (foo.col1, foo.col3) = (1,3)

Though I am not that concerned about overloading the use of ROW in context
of an UPDATE.

As with normal usage of ROW why not make its presence optional - support
both syntaxes?

Keywords like USING and SET have different meanings when used in
DELETE/UPDATE so having ROW behave similarly wouldn't be that confusing -
and it does seem to have an ambiguity if you restrict this interpretation of
ROW to only the SET of the update statement.

Is there any need or requirement for (or against) interleaving normal and
row-valued, or even multiple row-valued, SET expressions?

UPDATE foo SET (foo.col1, foo.col3) = (1,3), foo.col2 = 2

David J.





--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Supporting-multiple-column-assignment-in-UPDATE-9-5-project-tp5802240p5802471.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Supporting multiple column assignment in UPDATE (9.5 project)

2014-05-03 Thread Marko Tiikkaja

On 5/2/14, 10:10 PM, Merlin Moncure wrote:

On Fri, May 2, 2014 at 3:03 PM, Tom Lane t...@sss.pgh.pa.us wrote:

Meh.  Then you could have a query that works fine until you add a column
to the table, and it stops working.  If nobody ever used column names
identical to table names it'd be all right, but unfortunately people
seem to do that a lot...


That's already the case with select statements


I don't think that's true if you table-qualify your column references 
and don't use SELECT *.



and, if a user were
concerned about that, always have the option of aliasing the table as
nearly 100% of professional developers do:

SELECT f FROM foo f;
etc.


So e.g.:

  UPDATE foo f SET f = ..;

would resolve to the table, despite there being a column called f? 
That would break backwards compatibility.


How about:

  UPDATE foo SET ROW(foo) = (1,2,3);

ISTM that this could be parsed unambiguously, though it's perhaps a bit 
ugly.



Regards,
Marko Tiikkaja


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Supporting multiple column assignment in UPDATE (9.5 project)

2014-05-02 Thread Merlin Moncure
On Fri, May 2, 2014 at 1:15 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I've been thinking about how we might implement the multiple column
 assignment UPDATE syntax that was introduced in SQL:2003.  This feature
 allows you to do

 UPDATE table SET ..., (column, column, ...) = row-valued expression, ...

 where the system arranges to evaluate the row-valued expression just
 once per row and then assign its fields into the specified target columns.

 Back in commit 6e8596a146c9b16f2c053ea8c6e361a114c0b65c we introduced
 some limited support for this syntax, but it only handles a row-valued
 expression that is a ROW() constructor, and it just does a simple
 syntactic transformation of pulling apart the ROW() constructor and
 building an independent assignment to each target column.  The actually
 interesting uses for this feature don't work with that implementation
 approach.  The most common case I've seen asked for is where the
 expression is a sub-SELECT returning multiple columns (but at most one
 row).

 As far as the parser is concerned, the main hurdle to supporting this
 feature is that the representation of an UPDATE's targetlist assumes that
 each list element is an independent TargetEntry representing a single
 assignment.  Now, there is a heck of a lot of code that knows what
 targetlists look like, so I'm not eager to try to change that basic
 assumption.  What seems like probably a better idea is to represent
 SET (target1, target2, target3) = foo
 as though it were
 SET target1 = fooref.col1, target2 = fooref.col2, target3 = fooref.col3
 where fooref is some Param-like reference to a separate list of
 expressions that have composite outputs.  It would be understood that this
 separate targetlist would be evaluated just once before evaluating the
 main tlist.  This approach would allow all the existing targetlist
 manipulation code to stay about the same.  It would be a bit of a
 challenge for ruleutils.c to reconstruct the original syntax when printing
 an UPDATE in a rule, but I think that's just a small matter of
 programming.  (Possibly it would help if the elements of the separate
 composite-values targetlist contained markers as to which main-tlist
 elements they were for.)

 Now, we could probably implement it straightforwardly just based on that
 idea, though it's not quite clear where to shoehorn evaluation of the
 separate targetlist into the constructed plan.  One way would be to
 insert an additional evaluation level by adding a Result node on top
 of the normal plan, and then have the lower level compute the composite
 values as resjunk tlist elements, while the upper level does FieldSelects
 from the composite values to implement the fooref.colN references.

 However, I'm mainly interested in the sub-SELECT case; indeed, anything
 else you might want to do could be transformed into a sub-SELECT, so
 I wouldn't feel bad if we just restricted the new feature to that.
 And this doesn't seem like quite the right way to do it for sub-SELECTs.

 In the case of sub-SELECTs, we have almost the right execution mechanism
 already, in that initPlans are capable of setting multiple PARAM_EXEC
 runtime Params, one for each output column of the sub-SELECT.  So what
 I called fooref.col1 etc above could just be PARAM_EXEC Params referring
 to the subplan outputs --- except that initPlans are only for uncorrelated
 subqueries (those without any outer references to Vars of the parent query
 level).  And the interesting cases for UPDATE generally involve correlated
 subqueries.

 What I'm thinking about this is that we ought to make an effort to unify
 the currently separate implementation paths for correlated and
 uncorrelated subqueries.  Instead of SubPlans in the expression tree for
 correlated subqueries, I think they should all be treated much like
 initPlans are now, ie, there are PARAM_EXEC Params referencing outputs
 from a list of subqueries that are attached to the expression tree's
 parent plan node, and we lazily evaluate the subqueries upon first use of
 one of their output parameters.  What would be different from the current
 handling of initPlans is that each time we advance to a new input row,
 we'd need to reset the evaluation state of the subqueries that are
 correlated.  The reason for changing it like that is so that we can have
 multiple separate Params referencing different output columns of a single
 correlated subquery, and be sure that we evaluate the correlated subquery
 only once; the current SubPlan mechanism can't support separate references
 to the same subplan.  Now, this would add a small amount of new bookkeeping
 overhead to use of correlated subqueries, but I find it hard to believe
 that that'd be noticeable compared to the startup/shutdown cost of the
 subquery.

 So, if we were to revise the handling of correlated subqueries like that,
 then for the case of a row-valued expression that is a sub-SELECT we
 wouldn't need any explicit runtime evaluation of 

Re: [HACKERS] Supporting multiple column assignment in UPDATE (9.5 project)

2014-05-02 Thread Tom Lane
Merlin Moncure mmonc...@gmail.com writes:
 On Fri, May 2, 2014 at 1:15 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I've been thinking about how we might implement the multiple column
 assignment UPDATE syntax that was introduced in SQL:2003.  This feature
 allows you to do
 UPDATE table SET ..., (column, column, ...) = row-valued expression, ...

 Couple quick questions:
 1) how does this interplay with RETURNING?  I guess it probably
 doesn't change, but I imagine there's be no way to reference the
 composite result in the RETURNING statement?

Not as such; obviously you could reference the assigned-to columns
in RETURNING and thereby reconstruct the composite value.

 2) I often wish that you could reference the table (or it's alias)
 directly as the field list.

 UPDATE foo f set f = (...)::foo;
 or even
 UPDATE foo SET foo = foo;

Hm.  You could get there with this syntax as long as you didn't mind
writing out the field list explicitly.  Arguments why you should
want to do that are the same as for avoiding SELECT *, with maybe
a bit more urgency since at least SELECT * won't trash your data
if you get it wrong.  However, assuming that that argument doesn't
impress you ...

My draft copy of SQL99 mentions a syntax

UPDATE table SET ROW = row-valued expression [ WHERE ... ]

which does not appear in later editions of the spec, and probably wasn't
in SQL99 final either (since SQL:2003 does not mention it as a removed
feature).  I'm not sure we'd want to implement that; it would require
making ROW into a fully-reserved word, which it is not today, and that
seems rather a high price for implementing a not-per-spec feature.
But I don't think your suggestions of the table name or alias work;
they could conflict with an actual column name.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Supporting multiple column assignment in UPDATE (9.5 project)

2014-05-02 Thread Merlin Moncure
On Fri, May 2, 2014 at 2:47 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Merlin Moncure mmonc...@gmail.com writes:
 2) I often wish that you could reference the table (or it's alias)
 directly as the field list.

 UPDATE foo f set f = (...)::foo;
 or even
 UPDATE foo SET foo = foo;

 Hm.  You could get there with this syntax as long as you didn't mind
 writing out the field list explicitly.  Arguments why you should
 want to do that are the same as for avoiding SELECT *, with maybe
 a bit more urgency since at least SELECT * won't trash your data
 if you get it wrong.  However, assuming that that argument doesn't
 impress you ...

 My draft copy of SQL99 mentions a syntax

 UPDATE table SET ROW = row-valued expression [ WHERE ... ]

 which does not appear in later editions of the spec, and probably wasn't
 in SQL99 final either (since SQL:2003 does not mention it as a removed
 feature).  I'm not sure we'd want to implement that; it would require
 making ROW into a fully-reserved word, which it is not today, and that
 seems rather a high price for implementing a not-per-spec feature.
 But I don't think your suggestions of the table name or alias work;
 they could conflict with an actual column name.

Presumably it'd follow similar rules to SELECT -- resolve the column
name in the face of ambiguity.

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Supporting multiple column assignment in UPDATE (9.5 project)

2014-05-02 Thread Tom Lane
Merlin Moncure mmonc...@gmail.com writes:
 On Fri, May 2, 2014 at 2:47 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 But I don't think your suggestions of the table name or alias work;
 they could conflict with an actual column name.

 Presumably it'd follow similar rules to SELECT -- resolve the column
 name in the face of ambiguity.

Meh.  Then you could have a query that works fine until you add a column
to the table, and it stops working.  If nobody ever used column names
identical to table names it'd be all right, but unfortunately people
seem to do that a lot...

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Supporting multiple column assignment in UPDATE (9.5 project)

2014-05-02 Thread Merlin Moncure
On Fri, May 2, 2014 at 3:03 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Merlin Moncure mmonc...@gmail.com writes:
 On Fri, May 2, 2014 at 2:47 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 But I don't think your suggestions of the table name or alias work;
 they could conflict with an actual column name.

 Presumably it'd follow similar rules to SELECT -- resolve the column
 name in the face of ambiguity.

 Meh.  Then you could have a query that works fine until you add a column
 to the table, and it stops working.  If nobody ever used column names
 identical to table names it'd be all right, but unfortunately people
 seem to do that a lot...

That's already the case with select statements and, if a user were
concerned about that, always have the option of aliasing the table as
nearly 100% of professional developers do:

SELECT f FROM foo f;
etc.

Now, I need this feature a lot less than I used to (although I do like
the symmetry with SELECT); hstore and jsonb have matured to the point
that they can handle most trigger function operations that you'd want
to abstract over multiple tables without expensive calls to
information_schema.  The main advantages for a native approach would
be type safety (although even that situation is improving at long
last), performance, and code complexity.

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers