subject:"Re\: \[HACKERS\] MERGE Specification"

On Fri, 2010-08-06 at 09:39 +0800, Boxuan Zhai wrote:
 
 Besides, (I mean no offense, but) can this method really avoid losing
 row? 

Not as you just specified, no.

You need *both* actions of RAISE ERROR and DO NOTHING, or you may as
well have neither.

(1) Natural style allows missing rows if you are not careful - and also
allows missing rows in future when COL is allowed to take value 'C',
which may not have been originally considered when SQL first written

WHEN NOT MATCHED AND COL = 'A'
  INSERT...
WHEN NOT MATCHED AND COL = 'B'
  INSERT...

(2) Shows code style required to explicitly avoid missing rows

WHEN NOT MATCHED AND COL = 'A'
  INSERT...
WHEN NOT MATCHED AND COL = 'B'
  INSERT...
WHEN NOT MATCHED
  RAISE ERROR

(3) More complex example, with explicit DO NOTHING, showing how it can
provide well structured code

WHEN NOT MATCHED AND COL = 'A'
  DO NOTHING
WHEN NOT MATCHED AND COL = 'B'
  INSERT...
WHEN NOT MATCHED
  RAISE ERROR


So DO NOTHING is the default and implies silently ignoring rows. RAISE
ERROR is the opposite.

Coding for those seems very easy, its just a question of should we do
it?. DB2 has it; SQL:2008 does not. But then SQL:2008 followed the DB2
introduction of AND clauses, and SQL:2011 has so far followed the DB2
introduction of DELETE action also.

Given that Peter is now attending SQL Standards meetings, I would
suggest we leave out my suggestion above, for now. We have time to raise
this at standards meetings and influence the outcome and then follow
later.

There is a workaround:

WHEN NOT MATCHED AND COL = 'A'
  DO NOTHING
WHEN NOT MATCHED AND COL = 'B'
  INSERT...
WHEN NOT MATCHED AND TRUE
  INSERT INTO ERROR_TABLE (errortext);

where ERROR_TABLE has an INSERT trigger which throws an ERROR with given
text.

SQL:2011 makes no mention of how MERGE should react to statement level
triggers. MERGE is not a trigger action even. Given considerable
confusion in this area, IMHO we should just say the MERGE does not call
statement triggers at all, of any kind.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Development, 24x7 Support, Training and Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] MERGE Specification

2010-08-06 Thread Heikki Linnakangas


On 06/08/10 10:12, Simon Riggs wrote:

So DO NOTHING is the default and implies silently ignoring rows. RAISE
ERROR is the opposite.

Coding for those seems very easy, its just a question of should we do
it?. DB2 has it; SQL:2008 does not. But then SQL:2008 followed the DB2
introduction of AND clauses, and SQL:2011 has so far followed the DB2
introduction of DELETE action also.


I see neither DO NOTHING or RAISE ERROR in the documentation of DB2, 
Oracle, or MSSQL server.



Given that Peter is now attending SQL Standards meetings, I would
suggest we leave out my suggestion above, for now. We have time to raise
this at standards meetings and influence the outcome and then follow
later.


Ok, fair enough.


SQL:2011 makes no mention of how MERGE should react to statement level
triggers. MERGE is not a trigger action even. Given considerable
confusion in this area, IMHO we should just say the MERGE does not call
statement triggers at all, of any kind.


IMO the UPDATE/DELETE/INSERT actions should fire the respective 
statement level triggers, but the MERGE itself should not.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] MERGE Specification

On Fri, 2010-08-06 at 10:28 +0300, Heikki Linnakangas wrote:

  SQL:2011 makes no mention of how MERGE should react to statement level
  triggers. MERGE is not a trigger action even. Given considerable
  confusion in this area, IMHO we should just say the MERGE does not call
  statement triggers at all, of any kind.
 
 IMO the UPDATE/DELETE/INSERT actions should fire the respective 
 statement level triggers, but the MERGE itself should not.

When, and how?

If an UPDATE is mentioned 5 times, do we call the trigger 5 times? What
happens if none of the UPDATEs are ever executed? 

Best explain exactly what you mean.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Development, 24x7 Support, Training and Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] MERGE Specification

On Fri, 2010-08-06 at 10:28 +0300, Heikki Linnakangas wrote:
 On 06/08/10 10:12, Simon Riggs wrote:
  So DO NOTHING is the default and implies silently ignoring rows. RAISE
  ERROR is the opposite.
 
  Coding for those seems very easy, its just a question of should we do
  it?. DB2 has it; SQL:2008 does not. But then SQL:2008 followed the DB2
  introduction of AND clauses, and SQL:2011 has so far followed the DB2
  introduction of DELETE action also.
 
 I see neither DO NOTHING or RAISE ERROR in the documentation of DB2, 
 Oracle, or MSSQL server.

Agreed, Oracle and MSSQL server does not have these. 

However, DB2 very clearly does have these features

* SIGNAL which raises an error and can be used in place of any action,
at any point in sequence of WHEN clauses. DB2 already supports SIGNAL as
part of SQL/PSM, which we do not, so RAISE ERROR was the nearest
equivalent command for PostgreSQL.

* ELSE IGNORE which does same thing as DO NOTHING, except it must always
be last statement in a sequence of WHEN clauses. DO NOTHING is already a
phrase with exactly this meaning in PostgreSQL, so I suggest that.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Development, 24x7 Support, Training and Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] MERGE Specification

2010-08-06 Thread Boxuan Zhai

On Fri, Aug 6, 2010 at 3:41 PM, Simon Riggs si...@2ndquadrant.com wrote:

 On Fri, 2010-08-06 at 10:28 +0300, Heikki Linnakangas wrote:

   SQL:2011 makes no mention of how MERGE should react to statement level
   triggers. MERGE is not a trigger action even. Given considerable
   confusion in this area, IMHO we should just say the MERGE does not call
   statement triggers at all, of any kind.
 
  IMO the UPDATE/DELETE/INSERT actions should fire the respective
  statement level triggers, but the MERGE itself should not.

 When, and how?

 If an UPDATE is mentioned 5 times, do we call the trigger 5 times?


My current process for BEFOR / AFTER STATEMENT trigger on MERGE is to fire
the triggers for all action types that appears in the command, unless it is
replaced by a INSTEAD rule. But the triggers for one action type will be
fired only once. That means you will get both UPDATE and INSERT triggers be
activated for only once if you are executing a MERGE command with 5 UPDATEs
and 10 INSERTs.


 What happens if none of the UPDATEs are ever executed?

 The triggers (I mean triggers for statement) will be fired anyway even the
UPDATE action matches no tuple. This is not for MERGE only. If you update a
table with the command
UPDATE foo SET ... WHERE false;
It will also fire the STATEMENT triggers of UPDATE type on foo (I think so).


And, even not been asked, I want to say that, in current implementation of
MERGE,  the row level triggers are fired by the actions that take the
tuples. If one tuple is caught by an UPDATE action, then the UPDATE row
trigger will be fired on this tuple. If it is handled by INSERT action, then
the INSRET row triggers are on.

Hope you agree with my designs.


 Best explain exactly what you mean.

 --
  Simon Riggs   www.2ndQuadrant.com http://www.2ndquadrant.com/
  PostgreSQL Development, 24x7 Support, Training and Services

Re: [HACKERS] MERGE Specification

2010-08-06 Thread Boxuan Zhai

On Fri, Aug 6, 2010 at 3:53 PM, Simon Riggs si...@2ndquadrant.com wrote:

 On Fri, 2010-08-06 at 10:28 +0300, Heikki Linnakangas wrote:
  On 06/08/10 10:12, Simon Riggs wrote:
   So DO NOTHING is the default and implies silently ignoring rows. RAISE
   ERROR is the opposite.
  
   Coding for those seems very easy, its just a question of should we do
   it?. DB2 has it; SQL:2008 does not. But then SQL:2008 followed the DB2
   introduction of AND clauses, and SQL:2011 has so far followed the DB2
   introduction of DELETE action also.
 
  I see neither DO NOTHING or RAISE ERROR in the documentation of DB2,
  Oracle, or MSSQL server.

 Agreed, Oracle and MSSQL server does not have these.

 However, DB2 very clearly does have these features

 * SIGNAL which raises an error and can be used in place of any action,
 at any point in sequence of WHEN clauses. DB2 already supports SIGNAL as
 part of SQL/PSM, which we do not, so RAISE ERROR was the nearest
 equivalent command for PostgreSQL.

 * ELSE IGNORE which does same thing as DO NOTHING, except it must always
 be last statement in a sequence of WHEN clauses. DO NOTHING is already a
 phrase with exactly this meaning in PostgreSQL, so I suggest that.


So, we need to add both DO NOTHING and RAISE ERROR actions in the MERGE
command now !? What will RAISE ERROR do? To stop the whole MERGE command OR,
just throw an error notice for the row and move on.



  --
  Simon Riggs   www.2ndQuadrant.com http://www.2ndquadrant.com/
  PostgreSQL Development, 24x7 Support, Training and Services

Re: [HACKERS] MERGE Specification

On Fri, 2010-08-06 at 16:26 +0800, Boxuan Zhai wrote:

 So, we need to add both DO NOTHING and RAISE ERROR actions in the
 MERGE command now !? What will RAISE ERROR do?

Let's get the rest of it working first. This would be a later extension,
though I think an important one for our developers.

 To stop the whole MERGE command OR, just throw an error notice for the
 row and move on.

As you say, it would be even better to be able to report errors in some
way and move onto next row. NOTICE is not the way though.

Maybe one of the actions would be to EXECUTE a procedure, so we can call
an error logging function.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Development, 24x7 Support, Training and Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] MERGE Specification

2010-08-06 Thread Peter Eisentraut

On tor, 2010-08-05 at 16:35 +0100, Simon Riggs wrote:
 * DELETE is an extension to the standard, though supported by Oracle,
 DB2 and SQLServer and damn useful

- SQL:2011


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] MERGE Specification

On Mon, 2008-04-21 at 21:08 +0100, Simon Riggs wrote:
 The following two files specify the behaviour of the MERGE statement and
 how it will work in the world of PostgreSQL. 

 The HTML file was generated from SGML source, though the latter is not
 included here for clarity.

Enclose merge.sgml docs for forthcoming MERGE command, as originally
written.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Development, 24x7 Support, Training and Services
!--
$PostgreSQL$
--

refentry id=SQL-MERGE
 refmeta
  refentrytitle id=SQL-MERGE-TITLEMERGE/refentrytitle
  refmiscinfoSQL - Language Statements/refmiscinfo
 /refmeta

 refnamediv
  refnameMERGE/refname
  refpurposeupdate, insert or delete rows of a table based upon source data/refpurpose
 /refnamediv

 indexterm zone=sql-merge
  primaryMERGE/primary
 /indexterm

 refsynopsisdiv
synopsis
MERGE INTO replaceable class=PARAMETERtable/replaceable [ [ AS ] replaceable class=parameteralias/replaceable ]
USING replaceable class=PARAMETERsource-query/replaceable
ON replaceable class=PARAMETERjoin_condition/replaceable
[replaceable class=PARAMETERwhen_clause/replaceable [...]]

where replaceable class=PARAMETERwhen_clause/replaceable is

{ WHEN MATCHED [ AND replaceable class=PARAMETERcondition/replaceable ] THEN { replaceable class=PARAMETERmerge_update/replaceable | DELETE }
  WHEN NOT MATCHED [ AND replaceable class=PARAMETERcondition/replaceable ] THEN { replaceable class=PARAMETERmerge_insert/replaceable | DO NOTHING } }

where replaceable class=PARAMETERmerge_update/replaceable is

UPDATE SET { replaceable class=PARAMETERcolumn/replaceable = { replaceable class=PARAMETERexpression/replaceable | DEFAULT } |
   ( replaceable class=PARAMETERcolumn/replaceable [, ...] ) = ( { replaceable class=PARAMETERexpression/replaceable | DEFAULT } [, ...] ) } [, ...]

and replaceable class=PARAMETERmerge_insert/replaceable is

INSERT [( replaceable class=PARAMETERcolumn/replaceable [, ...] )] { VALUES ( { replaceable class=PARAMETERexpression/replaceable | DEFAULT } [, ...] ) | DEFAULT VALUES }
/synopsis
 /refsynopsisdiv

 refsect1
  titleDescription/title

  para
   commandMERGE/command performs at most one action on each row from
   the target table, driven by the rows from the source query. This
   provides a way to specify a single SQL statement that can conditionally
   commandUPDATE/command or commandINSERT/command rows, a task
   that would otherwise require multiple procedural language statements.
  /para

  para
   First, the commandMERGE/command command performs a left outer join
   from source query to target table, producing zero or more merged rows. For 
   each merged row, literalWHEN/ clauses are evaluated in the
   specified order until one of them is activated. The corresponding action
   is then applied and processing continues for the next row.
  /para

  para
   commandMERGE/command actions have the same effect as 
   regular commandUPDATE/command, commandINSERT/command, or
   commandDELETE/command commands of the same names, though the syntax
   is slightly different.
  /para

  para
   If no literalWHEN/ clause activates then an implicit action of 
   literalINSERT DEFAULT VALUES/ is performed for that row. If that
   implicit action is not desirable an explicit action of 
   literalDO NOTHING/ may be specified instead.
  /para

  para
   commandMERGE/command will only affect rows only in the specified table.
  /para

  para
   There is no literalRETURNING/ clause with commandMERGE/command.
  /para

  para
   There is no MERGE privilege.
   You must have the literalUPDATE/literal privilege on the table
   if you specify an update action, the literalINSERT/literal privilege if
   you specify an insert action and/or the literalDELETE/literal privilege
   if you wish to delete. You will also require the 
   literalSELECT/literal privilege to any table whose values are read 
   in the replaceable class=parameterexpressions/replaceable or
   replaceable class=parametercondition/replaceable.
  /para
 /refsect1

 refsect1
  titleParameters/title

  variablelist
   varlistentry
termreplaceable class=PARAMETERtable/replaceable/term
listitem
 para
  The name (optionally schema-qualified) of the table to merge into.
 /para
/listitem
   /varlistentry

   varlistentry
termreplaceable class=parameteralias/replaceable/term
listitem
 para
  A substitute name for the target table. When an alias is
  provided, it completely hides the actual name of the table.  For
  example, given literalMERGE foo AS f/, the remainder of the
  commandMERGE/command statement must refer to this table as
  literalf/ not literalfoo/.
 /para
/listitem
   /varlistentry

   varlistentry
termreplaceable class=PARAMETERsource-query/replaceable/term
listitem
 para
  A query (commandSELECT/command statement or commandVALUES/command
  statement) that supplies the rows to be merged into the target

Re: [HACKERS] MERGE Specification

2010-08-05 Thread Heikki Linnakangas


On 05/08/10 10:46, Simon Riggs wrote:

On Mon, 2008-04-21 at 21:08 +0100, Simon Riggs wrote:

The following two files specify the behaviour of the MERGE statement and
how it will work in the world of PostgreSQL.



The HTML file was generated from SGML source, though the latter is not
included here for clarity.


Enclose merge.sgml docs for forthcoming MERGE command, as originally
written.


Oh, cool, I wasn't aware you had written that already. Boxuan, please 
include this in your patch, after reviewing and removing/editing 
anything that doesn't apply to your patch.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] MERGE Specification

On Thu, 2010-08-05 at 12:29 +0300, Heikki Linnakangas wrote:
 On 05/08/10 10:46, Simon Riggs wrote:
  On Mon, 2008-04-21 at 21:08 +0100, Simon Riggs wrote:
  The following two files specify the behaviour of the MERGE statement and
  how it will work in the world of PostgreSQL.
 
  The HTML file was generated from SGML source, though the latter is not
  included here for clarity.
 
  Enclose merge.sgml docs for forthcoming MERGE command, as originally
  written.
 
 Oh, cool, I wasn't aware you had written that already. Boxuan, please 
 include this in your patch, after reviewing and removing/editing 
 anything that doesn't apply to your patch.

Also had these fragments as well, if they're still useful. Probably just
useful as pointers as to what else to change to include the docs.


The tests and docs were written from SQL standard, so any deviations
would need to be flagged. The idea of writing the tests first was that
they provide an objective test of whether the implementation works
according to spec.

I'd quite like a commentary on anything that needs changing. Not saying
I will necessarily object to differences, but knowing the differences
sounds important for us.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Development, 24x7 Support, Training and Services
Index: doc/src/sgml/reference.sgml
===
RCS file: /home/sriggs/pg/REPOSITORY/pgsql/doc/src/sgml/reference.sgml,v
retrieving revision 1.66
diff -c -r1.66 reference.sgml
*** doc/src/sgml/reference.sgml	27 Mar 2008 17:24:16 -	1.66
--- doc/src/sgml/reference.sgml	18 Apr 2008 17:50:31 -
***
*** 135,140 
--- 135,141 
 listen;
 load;
 lock;
+merge;
 move;
 notify;
 prepare;
Index: doc/src/sgml/ref/allfiles.sgml
===
RCS file: /home/sriggs/pg/REPOSITORY/pgsql/doc/src/sgml/ref/allfiles.sgml,v
retrieving revision 1.73
diff -c -r1.73 allfiles.sgml
*** doc/src/sgml/ref/allfiles.sgml	27 Mar 2008 17:24:16 -	1.73
--- doc/src/sgml/ref/allfiles.sgml	18 Apr 2008 11:10:16 -
***
*** 107,112 
--- 107,113 
  !entity listen system listen.sgml
  !entity load   system load.sgml
  !entity lock   system lock.sgml
+ !entity merge  system merge.sgml
  !entity move   system move.sgml
  !entity notify system notify.sgml
  !entity preparesystem prepare.sgml
Index: doc/src/sgml/ref/update.sgml
===
RCS file: /home/sriggs/pg/REPOSITORY/pgsql/doc/src/sgml/ref/update.sgml,v
retrieving revision 1.46
diff -c -r1.46 update.sgml
*** doc/src/sgml/ref/update.sgml	15 Feb 2008 22:17:06 -	1.46
--- doc/src/sgml/ref/update.sgml	21 Apr 2008 19:01:36 -
***
*** 322,327 
--- 322,330 
  -- continue with other operations, and eventually
  COMMIT;
  /programlisting
+ 
+This operation can be executed in a single statement using
+xref linkend=sql-merge endterm=sql-merge-title.
/para
  
para

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] MERGE Specification

2010-08-05 Thread Boxuan Zhai

On Thu, Aug 5, 2010 at 7:25 PM, Simon Riggs si...@2ndquadrant.com wrote:

 On Thu, 2010-08-05 at 12:29 +0300, Heikki Linnakangas wrote:
  On 05/08/10 10:46, Simon Riggs wrote:
   On Mon, 2008-04-21 at 21:08 +0100, Simon Riggs wrote:
   The following two files specify the behaviour of the MERGE statement
 and
   how it will work in the world of PostgreSQL.
  
   The HTML file was generated from SGML source, though the latter is not
   included here for clarity.
  
   Enclose merge.sgml docs for forthcoming MERGE command, as originally
   written.
 
  Oh, cool, I wasn't aware you had written that already. Boxuan, please
  include this in your patch, after reviewing and removing/editing
  anything that doesn't apply to your patch.


Thanks a lot for the instruction file of MERGE command. I have read through
it carefully. It is really a great work. I have to admit that I am not
familiar with the sgml language, and I cannot write the instruction by
myself.

All features of MERGE demonstrated in this file are consistent with my
implementation, EXCEPT the DO NOTHING option. In current edition, we don't
have the DO NOTHING action type. That is, during the execution of MERGE
commands, if one tuple is not caught by any of the merge actions, it will be
ignored. In another word, DO NOTING (although cannot be specified explicitly
by user) is the DEFAULT action for tuples.

In the contrary, Simon's instruction says that the DEFAULT action for the
tuple caught by no actions is
WHEN NOT MATCHED THEN INSERT DEFAULT VALUES

From the user's point of view, these two kinds of MERGE command may have
not much differences. But, as the coder, I prefer current setting, because
we can save the implementation for a new type of MERGE actions (DO
NOTHING is a special merge action type). And, thus, no checks and special
process for it. (For example, we need to make sure that DO NOTHING is the
last WHEN clause, and it has no additional qual. And we have to generate a
INSERT DEFAULT VALUES action for the MERGE command if we don't find the DO
NOTHING action)

Well, if people want the DO NOTHING action, I will add it in the system.


 Now, I have changed the RULE strategy of MERGE to the better logic. And I
am working on triggers for MERGE, which is also mentioned in the instruction
file. I will build a new patch with no long comment and blank line around
functions, and possibly contain the regress test file and this sgml
instructions in it.

I wish we can reach a agreement on the DO NOTHING thing before my next
submission, so I can make necessary modification on my code for it. (the new
patch may be finished in one or two days, I think)

Thanks!

PS: I have an embarrassing question: how to view the sgml instructions of
postgres in web page form, rather than read the source code of them?


 Also had these fragments as well, if they're still useful. Probably just
 useful as pointers as to what else to change to include the docs.


 The tests and docs were written from SQL standard, so any deviations
 would need to be flagged. The idea of writing the tests first was that
 they provide an objective test of whether the implementation works
 according to spec.

 I'd quite like a commentary on anything that needs changing. Not saying
 I will necessarily object to differences, but knowing the differences
 sounds important for us.

 --
  Simon Riggs   www.2ndQuadrant.com http://www.2ndquadrant.com/
  PostgreSQL Development, 24x7 Support, Training and Services

Re: [HACKERS] MERGE Specification

2010-08-05 Thread David Fetter

On Thu, Aug 05, 2010 at 09:55:29PM +0800, Boxuan Zhai wrote:
 On Thu, Aug 5, 2010 at 7:25 PM, Simon Riggs si...@2ndquadrant.com wrote:
 
  On Thu, 2010-08-05 at 12:29 +0300, Heikki Linnakangas wrote:
   On 05/08/10 10:46, Simon Riggs wrote:
On Mon, 2008-04-21 at 21:08 +0100, Simon Riggs wrote:
The following two files specify the behaviour of the MERGE statement
  and
how it will work in the world of PostgreSQL.
   
The HTML file was generated from SGML source, though the latter is not
included here for clarity.
   
Enclose merge.sgml docs for forthcoming MERGE command, as originally
written.
  
   Oh, cool, I wasn't aware you had written that already. Boxuan, please
   include this in your patch, after reviewing and removing/editing
   anything that doesn't apply to your patch.
 
 Thanks a lot for the instruction file of MERGE command. I have read through
 it carefully. It is really a great work. I have to admit that I am not
 familiar with the sgml language, and I cannot write the instruction by
 myself.

It's really not super complicated.  It's quite like (and ancestral to)
HTML.

 All features of MERGE demonstrated in this file are consistent with my
 implementation, EXCEPT the DO NOTHING option. In current edition, we don't
 have the DO NOTHING action type. That is, during the execution of MERGE
 commands, if one tuple is not caught by any of the merge actions, it will be
 ignored. In another word, DO NOTING (although cannot be specified explicitly
 by user) is the DEFAULT action for tuples.
 
 In the contrary, Simon's instruction says that the DEFAULT action for the
 tuple caught by no actions is
 WHEN NOT MATCHED THEN INSERT DEFAULT VALUES

I believe that the SQL standard specifies this behavior, and I don't
think we have a compelling reason to do something different from what
the SQL standard specifies.

 Well, if people want the DO NOTHING action, I will add it in the system.

That'd be great :)

  Now, I have changed the RULE strategy of MERGE to the better logic. And I
 am working on triggers for MERGE, which is also mentioned in the instruction
 file. I will build a new patch with no long comment and blank line around
 functions, and possibly contain the regress test file and this sgml
 instructions in it.
 
 I wish we can reach a agreement on the DO NOTHING thing before my next
 submission, so I can make necessary modification on my code for it. (the new
 patch may be finished in one or two days, I think)
 
 Thanks!
 
 PS: I have an embarrassing question: how to view the sgml instructions of
 postgres in web page form, rather than read the source code of them?

After you've built postgresql, do this:

cd doc/src/sgml
make

Then you can point a web browser at the doc/src/sgml/html/index.html
(and similar)

http://www.postgresql.org/docs/current/static/docguide.html

has information about the tools you will need for the above to work.

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] MERGE Specification

On Thu, 2010-08-05 at 21:55 +0800, Boxuan Zhai wrote:

 In the contrary, Simon's instruction says that the DEFAULT action for
 the tuple caught by no actions is 
 WHEN NOT MATCHED THEN INSERT DEFAULT VALUES
  
 From the user's point of view, these two kinds of MERGE command may
 have not much differences. But, as the coder, I prefer current
 setting, because we can save the implementation for a new type
 of MERGE actions (DO NOTHING is a special merge action type). And,
 thus, no checks and special process for it. (For example, we need to
 make sure that DO NOTHING is the last WHEN clause, and it has no
 additional qual. And we have to generate a INSERT DEFAULT VALUES
 action for the MERGE command if we don't find the DO NOTHING action)
  
 Well, if people want the DO NOTHING action, I will add it in the
 system. 

This is only important when using AND search condition, so its not
important for the common UPSERT case of unconditional UPDATE/INSERT.

Personally, I would prefer the default action to be RAISE ERROR or
similar. Otherwise its just too easy to get complex logic wrong and lose
a few rows without noticing. If that was the case then you would
definitely need DO NOTHING when you explicitly wanted to lose a few
rows.

You may think that's a bit strong, but consider that PostgreSQL uses
default = ERROR in vast majority of switch() statements. I think its a
safe coding practice and the annoyance of having run-time errors is much
better than losing rows.
 
The INSERT DEFAULT VALUES was behaviour taken from another DBMS, its not
part of the standard AFAICS.

 Now, I have changed the RULE strategy of MERGE to the better logic.
 And I am working on triggers for MERGE, which is also mentioned in the
 instruction file. I will build a new patch with no long comment and
 blank line around functions, and possibly contain the regress test
 file and this sgml instructions in it.
  
 I wish we can reach a agreement on the DO NOTHING thing before my next
 submission, so I can make necessary modification on my code for
 it. (the new patch may be finished in one or two days, I think)
  
 Thanks!
  
 PS: I have an embarrassing question: how to view the sgml instructions
 of postgres in web page form, rather than read the source code of
 them?

If you edit the files, as shown in the patches here, then you just need
to drop into the doc/sgml/src directory and type make. The SGML will
then be compiled into HTML and you can view the resulting file directly
in your web browser.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Development, 24x7 Support, Training and Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] MERGE Specification

2010-08-05 Thread Heikki Linnakangas


On 05/08/10 17:22, Simon Riggs wrote:

On Thu, 2010-08-05 at 21:55 +0800, Boxuan Zhai wrote:


In the contrary, Simon's instruction says that the DEFAULT action for
the tuple caught by no actions is
WHEN NOT MATCHED THEN INSERT DEFAULT VALUES

 From the user's point of view, these two kinds of MERGE command may
have not much differences. But, as the coder, I prefer current
setting, because we can save the implementation for a new type
of MERGE actions (DO NOTHING is a special merge action type). And,
thus, no checks and special process for it. (For example, we need to
make sure that DO NOTHING is the last WHEN clause, and it has no
additional qual. And we have to generate a INSERT DEFAULT VALUES
action for the MERGE command if we don't find the DO NOTHING action)

Well, if people want the DO NOTHING action, I will add it in the
system.


This is only important when using ANDsearch condition, so its not
important for the common UPSERT case of unconditional UPDATE/INSERT.


Assuming the default action if no other action matches is to do nothing, 
then an explicit DO NOTHING is just a convenience. You can have the same 
effect by having an AND NOT condition to all the actions following 
the DO NOTHING action. I admit it's quite handy, but let's avoid 
PostgreSQL extensions at this point.



Personally, I would prefer the default action to be RAISE ERROR or
similar. Otherwise its just too easy to get complex logic wrong and lose
a few rows without noticing. If that was the case then you would
definitely need DO NOTHING when you explicitly wanted to lose a few
rows.

You may think that's a bit strong, but consider that PostgreSQL uses
default =  ERROR in vast majority of switch() statements. I think its a
safe coding practice and the annoyance of having run-time errors is much
better than losing rows.

The INSERT DEFAULT VALUES was behaviour taken from another DBMS, its not
part of the standard AFAICS.


What does the standard say about this? We should follow the standard, I 
don't see enough reason to deviate here.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] MERGE Specification

On Thu, 2010-08-05 at 18:17 +0300, Heikki Linnakangas wrote:
 On 05/08/10 17:22, Simon Riggs wrote:
  On Thu, 2010-08-05 at 21:55 +0800, Boxuan Zhai wrote:
 
  In the contrary, Simon's instruction says that the DEFAULT action for
  the tuple caught by no actions is
  WHEN NOT MATCHED THEN INSERT DEFAULT VALUES
 
   From the user's point of view, these two kinds of MERGE command may
  have not much differences. But, as the coder, I prefer current
  setting, because we can save the implementation for a new type
  of MERGE actions (DO NOTHING is a special merge action type). And,
  thus, no checks and special process for it. (For example, we need to
  make sure that DO NOTHING is the last WHEN clause, and it has no
  additional qual. And we have to generate a INSERT DEFAULT VALUES
  action for the MERGE command if we don't find the DO NOTHING action)
 
  Well, if people want the DO NOTHING action, I will add it in the
  system.
 
  This is only important when using ANDsearch condition, so its not
  important for the common UPSERT case of unconditional UPDATE/INSERT.
 
 Assuming the default action if no other action matches is to do nothing, 
 then an explicit DO NOTHING is just a convenience. You can have the same 
 effect by having an AND NOT condition to all the actions following 
 the DO NOTHING action. I admit it's quite handy, but let's avoid 
 PostgreSQL extensions at this point.

err...

* DELETE is an extension to the standard, though supported by Oracle,
DB2 and SQLServer and damn useful

* INSERT DEFAULT VALUES is an extension to the standard, though matches
options on the normal INSERT clause

* rule support is an extension to the standard

* It appears we would be in violation of the standard on 
14.12 General Rule 6 a) i) 2) B), p.890
(Oh, I wish I was joking, there really is such a paragraph number)
which specifies that the join between source and target table must not
return multiple rows or must return cardinality violation. That's
pretty difficult thing to check and not very useful if it does do that.

anyway, that list isn't an argument in favour of change. The argument in
favour of a fail-safe default is that it is a safe coding practice that
the PostgreSQL project already uses itself. The only way to write a safe
MERGE SQL statement is with an extension to the standard...

Principle of minimal extension would mean we only need to support RAISE
ERROR, to allow people to specify they actively want statement to fail
if the list of WHEN clauses does not produce a match.

  Personally, I would prefer the default action to be RAISE ERROR or
  similar. Otherwise its just too easy to get complex logic wrong and lose
  a few rows without noticing. If that was the case then you would
  definitely need DO NOTHING when you explicitly wanted to lose a few
  rows.
 
  You may think that's a bit strong, but consider that PostgreSQL uses
  default =  ERROR in vast majority of switch() statements. I think its a
  safe coding practice and the annoyance of having run-time errors is much
  better than losing rows.
 
  The INSERT DEFAULT VALUES was behaviour taken from another DBMS, its not
  part of the standard AFAICS.
 
 What does the standard say about this? We should follow the standard, I 
 don't see enough reason to deviate here.

I checked the standard before commenting previously and have done so
again here. I can't see anything that refers to this (in SQL:2008),
either way.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Development, 24x7 Support, Training and Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] MERGE Specification

2010-08-05 Thread Robert Haas

On Thu, Aug 5, 2010 at 11:35 AM, Simon Riggs si...@2ndquadrant.com wrote:
 * It appears we would be in violation of the standard on
 14.12 General Rule 6 a) i) 2) B), p.890
 (Oh, I wish I was joking, there really is such a paragraph number)

Just shoot me.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] MERGE Specification

2010-08-05 Thread Merlin Moncure

On Thu, Aug 5, 2010 at 7:25 AM, Simon Riggs si...@2ndquadrant.com wrote:
Also had these fragments as well, if they're still useful. Probably just
useful as pointers as to what else to change to include the docs.

The tests and docs were written from SQL standard, so any deviations
would need to be flagged. The idea of writing the tests first was that
they provide an objective test of whether the implementation works
according to spec.

I'd quite like a commentary on anything that needs changing. Not saying
I will necessarily object to differences, but knowing the differences
sounds important for us.

I think this is a wonderful feature. A couple of thoughts:

*) Would however very much like to see RETURNING support if it's not
there. Our other DML statements support it, and this one should too.
wCTE if/when we get it will make the lack of it especially glaring.
(OTOH, no issue if there is no rule support...they should be
deprecated)

*) The decision to stay on the standard and not do a 'race free'
version was IMO a good one. I am starting to come around to the point
of view that the *only* safe way to guarantee race free merge with the
current locking model is to take an appropriate table lock. BTW, our
pl/pgsql upsert example we've been encouraging people to use has a
horrible bug (see:
http://postgresql.1045698.n5.nabble.com/Danger-of-idiomatic-plpgsql-loop-for-merging-data-td2257700.html).

If we want to rework the locking model to support anticipatory locks
then fine (but that has nothing to do with MERGE specifically).

merlin

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] MERGE Specification

2010-08-05 Thread Boxuan Zhai

Dear All,

I have seen a lively discussion about the DO NOTING action in MERGE command.
And, I think most people want it. So it will be added to my next patch.

Before the implementation, I still have some questions to confirm:

1. If we have a DO NOTHING action specified, it should be the last WHEN
clause. It must be of the NOT MATCHED cases, and it CANNOT have any
additional action qualifications. Am I correct?

2. If no DO NOTHING specified, we will imply a INSERT DEFAULT VALUES action
as the end of MERGE.
My question is, is this action taken only for the NOT MATCHED tuples?  If
this is the case, then what about the MATCHED tuples that match not previous
actions? Ignore them?
That means we are in fact going to add two implicit WHEN clause:
   a) WHEN NOT MATCHED INSERT default values;
   b) WHEN MATCHED THEN DO NOTHING.
OR, is the INSERT DEFAULT VALUES applied to ALL tuples not matter they are
MATCHED or not?


Besides, (I mean no offense, but) can this method really avoid losing row?

So far as I know, the DEFAULT values for table attributes are defined when
the table is created and no variables are allowed in the default value
expressions. That means, they are usually constants or simple serial
numbers.

Image that we have a MERGE command that has thousands of NOT MATCHED tuples
going to the implicit action. Then, the target table will inserted with
thousands of rows with DEAULT VALUES. These row will have similar (if not
exactly the same) simple content, which contains NO information from the
source table of MERGE. Is this really what we want? If it is not, then what
is the use of the INSERT DEFAULT VALUES action?

Regards

Re: [HACKERS] MERGE Specification

2008-04-28 Thread Simon Riggs

On Fri, 2008-04-25 at 09:10 -0400, Alvaro Herrera wrote:
 Simon Riggs wrote:
 
  I'm now happy that we can get a spec-compliant end result by always
  forcing NOT MATCHED rules to occur before MATCHED rules, when we have at
  least one unique index.
 
 ... and raise an ERROR when there is no unique index?

No, I think an ERROR is not required, nor desirable.

In the absence of a unique index we allow exactly duplicate rows to
exist in a table. This is effectively user defined behaviour, albeit the
default setting.

We have two choices of behaviour:

1. If a MERGE statement runs and sees a row in the target table is NOT
MATCHED then it will insert a row. It is possible that a concurrent
MERGE statement could also see the row in the target table as NOT
MATCHED and then insert a duplicate row.

2. In the absence of a Unique Index, throw an ERROR because a concurrent
MERGE *might* result in duplicate Inserts. (i.e. prevent the above).

(1) is a situation possible with concurrent INSERTs into a table without
a unique index, so I see no reason to make MERGE follow (2) when INSERTs
do not.

Also, it is possible for a MERGE to generate duplicate rows in a table
if the INSERT clause contained constants for example. In the absence of
an applicable rule the MERGE will generate INSERT DEFAULT VALUES, i.e.
an all-constant insert will take place. So the MERGE spec allows the
inserting of duplicate rows without error.

We could include additional options to control this behaviour, if anyone
thinks it worthwhile, but ISTM more restrictive than protective.

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] MERGE Specification

2008-04-28 Thread Marko Kreen

On 4/25/08, Robert Treat [EMAIL PROTECTED] wrote:
 On Thursday 24 April 2008 23:40, Tom Lane wrote:
   Robert Treat [EMAIL PROTECTED] writes:
Perhaps a better option would be to implement Merge per spec, and then
implement a replace into command for the oltp scenario.  This way you
keep the spec behavior for the spec syntax, and have a clearly non-spec
command for non-spec behavior.
  
   In that case, it's a fair question to ask just who will use the spec
   syntax.  As far as I can tell from years of watching the mailing lists,
   there is plenty of demand for a concurrent-safe insert-or-update
   behavior, and *exactly zero* demand for the other.  I challenge you to
   find even one request for the spec behavior in the mailing list
   archives.  (Simon doesn't count.)
  


 AIUI the current implementation is designed to avoid race conditions partially
  at the cost of being insert friendly and somewhat update unfriendly. My guess
  is that most of the people wanting this for OLTP use want an update friendly
  implementation (that's certainly been the majority of cases I've needed
  myself, and that I have seen others use).

This seems to hint that there should be 2 variants of merge/upsert
- insert-friendly and update-friendly...  It seems unlikely one implementation
can be both.  And especially bad would be implementation that is neither.

-- 
marko

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] MERGE Specification

2008-04-28 Thread Simon Riggs

On Mon, 2008-04-28 at 11:57 +0300, Marko Kreen wrote:
 On 4/25/08, Robert Treat [EMAIL PROTECTED] wrote:
  On Thursday 24 April 2008 23:40, Tom Lane wrote:
Robert Treat [EMAIL PROTECTED] writes:
 Perhaps a better option would be to implement Merge per spec, and then
 implement a replace into command for the oltp scenario.  This way you
 keep the spec behavior for the spec syntax, and have a clearly non-spec
 command for non-spec behavior.
   
In that case, it's a fair question to ask just who will use the spec
syntax.  As far as I can tell from years of watching the mailing lists,
there is plenty of demand for a concurrent-safe insert-or-update
behavior, and *exactly zero* demand for the other.  I challenge you to
find even one request for the spec behavior in the mailing list
archives.  (Simon doesn't count.)
   
 
 
  AIUI the current implementation is designed to avoid race conditions 
  partially
   at the cost of being insert friendly and somewhat update unfriendly. My 
  guess
   is that most of the people wanting this for OLTP use want an update 
  friendly
   implementation (that's certainly been the majority of cases I've needed
   myself, and that I have seen others use).
 
 This seems to hint that there should be 2 variants of merge/upsert
 - insert-friendly and update-friendly...  It seems unlikely one implementation
 can be both.  And especially bad would be implementation that is neither.

Not sure what an option that was neither would look like ...

I would summarise the two MERGE behaviour proposals as

1. Correctly protects against concurrent inserts. Uses one
sub-transaction per row and leaves 2 dead rows per update. Requires us
to perform tasks in different order than required by SQL spec, but the
end result seems identical to me (now).
Has been noted as suitable for OLTP, and poor for bulk data maintenance.
Has been described as insert-friendly and non-spec.

2. Does not protect against concurrent inserts. Leaves 1 dead row per
update. Much more efficient for updates, not sure about any efficiency
gain for inserts.
Has been noted as being unsuitable for OLTP, though likely to offer more
acceptable performance for bulk operations.
Has been described as update-friendly.

By consensus, I'm doing (1). 

It looks likely that doing (2) should be fairly small change and so can
be offered as an option. For example, we can have an additional
action-order clause with two options (the first of which is default)
[INSERT BEFORE UPDATE ACTION ORDER | DEFAULT ACTION ORDER]
So the default is to force inserts to occur before updates, as required
by (1). The other option DEFAULT ACTION ORDER tests the WHEN clauses
in the order specified in the statement, allowing the user to choose
whether they want to test for updates or inserts first.

Overall, the difference between these behaviours is small in comparison
with making MERGE work in the first place...

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] MERGE Specification

2008-04-26 Thread Hannes Dorbath


Simon Riggs wrote:

On Fri, 2008-04-25 at 10:03 +0300, Hannu Krosing wrote:

On Tue, 2008-04-22 at 00:24 +0100, Simon Riggs wrote:

On Mon, 2008-04-21 at 16:38 -0400, A.M. wrote:

MERGE will not invoke Rules. Does this imply that MERGE cannot be  
used on views or that the resulting INSERTs or UPDATEs do not work on  
views?

Yes, that's right. Just like COPY. That seems fine to me because you're
likely to be doing a MERGE immediately after a COPY anyway, so the
restriction just continues.

May be the bulk data merging variant of MERGE to be used after initial
COPY should be a variant of COPY with special keyword(s) instead of
MERGE ?


That does sound like a good way of differentiating between the OLTP and
bulk loading cases.

I'll bear that in mind as we develop.


To me, a simple user, it'd be important that MERGE implementation does 
not place any unpredictable restrictions. For example in Oracle you can 
break any MERGE statement by placing a full text index on the table. So 
I'd really expect a MERGE in PostgreSQL to be fine with views, rules, 
tsearch, etc.


Just my 2 cent.


--
Best regards,
Hannes Dorbath

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] MERGE Specification

2008-04-25 Thread Hannu Krosing


On Tue, 2008-04-22 at 00:24 +0100, Simon Riggs wrote:
 On Mon, 2008-04-21 at 16:38 -0400, A.M. wrote:
 
  MERGE will not invoke Rules. Does this imply that MERGE cannot be  
  used on views or that the resulting INSERTs or UPDATEs do not work on  
  views?
 
 Yes, that's right. Just like COPY. That seems fine to me because you're
 likely to be doing a MERGE immediately after a COPY anyway, so the
 restriction just continues.

May be the bulk data merging variant of MERGE to be used after initial
COPY should be a variant of COPY with special keyword(s) instead of
MERGE ?


Hannu



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] MERGE Specification

On Thu, 2008-04-24 at 23:40 -0400, Tom Lane wrote:
 Robert Treat [EMAIL PROTECTED] writes:
  Perhaps a better option would be to implement Merge per spec, and then 
  implement a replace into command for the oltp scenario.  This way you 
  keep 
  the spec behavior for the spec syntax, and have a clearly non-spec command 
  for non-spec behavior. 
 
 In that case, it's a fair question to ask just who will use the spec
 syntax.  As far as I can tell from years of watching the mailing lists,
 there is plenty of demand for a concurrent-safe insert-or-update
 behavior, and *exactly zero* demand for the other.  I challenge you to
 find even one request for the spec behavior in the mailing list
 archives.  (Simon doesn't count.)

A Freudian slip? Hopefully, you meant apart from Simon's request.  ;-)

 I recently came across the expression YAGNI, and think it's probably
 pretty relevant to this discussion:
 http://en.wikipedia.org/wiki/You_Ain't_Gonna_Need_It

In matters of technical implementation, I follow you almost without
question, and very happily so.

I think all of us should be careful when expressing views on what other
people need or don't need. We sleep soundly after having given such an
opinion, but that doesn't make those opinions valid. I'm not sure if
there is a pithy acronym for that thought.

In this case, I had already agreed to do it the safe way, for OLTP. I
believe there is a need for other behaviour as well, but that isn't the
use case that the majority are expressing at this time.

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] MERGE Specification

2008-04-25 Thread Martijn van Oosterhout

On Thu, Apr 24, 2008 at 11:40:22PM -0400, Tom Lane wrote:
 In that case, it's a fair question to ask just who will use the spec
 syntax.  As far as I can tell from years of watching the mailing lists,
 there is plenty of demand for a concurrent-safe insert-or-update
 behavior, and *exactly zero* demand for the other.  I challenge you to
 find even one request for the spec behavior in the mailing list
 archives.  (Simon doesn't count.)

I could have used something like this a few years ago. I don't think it
would get mentioned on the lists, because frankly it's not something I
would've expected a DBMS to handle internally. Certainly I'd never
heard of the MERGE command until recently. I just wrote a program to do
it (and no, race conditions wern't an issue).

Making a race condition free version is fine, just as long as merging
on a condition without a unique index is also supported.

Have a nice day,
-- 
Martijn van Oosterhout   [EMAIL PROTECTED]   http://svana.org/kleptog/
 Please line up in a tree and maintain the heap invariant while 
 boarding. Thank you for flying nlogn airlines.


signature.asc
Description: Digital signature

Re: [HACKERS] MERGE Specification

2008-04-25 Thread Petr Jelinek


Tom Lane wrote:

Robert Treat [EMAIL PROTECTED] writes:
Perhaps a better option would be to implement Merge per spec, and then 
implement a replace into command for the oltp scenario.  This way you keep 
the spec behavior for the spec syntax, and have a clearly non-spec command 
for non-spec behavior. 


In that case, it's a fair question to ask just who will use the spec
syntax.  As far as I can tell from years of watching the mailing lists,
there is plenty of demand for a concurrent-safe insert-or-update
behavior, and *exactly zero* demand for the other.  I challenge you to
find even one request for the spec behavior in the mailing list
archives.  (Simon doesn't count.)


While I agree that there is zero demand for anything else then UPSERT 
version of MERGE INTO, I don't think that doing the *exact opposite* of 
what SQL standard says without any change of syntax to clearly 
differentiate the behavior is best thing to do.


Another thing is, the table on which we do SELECT (the one in USING) can 
be different from target table and we can use columns from that table in 
INSERT/UPDATE statement (probably one the reasons why spec says the 
SELECT query has to be executed before any changes). How you want to 
use the INSERT first implementation in this scenario ? IMHO you still 
need to have both implementations in the end. So we probably need to 
implement the standard one first and then implement our version and put 
some restrictions of what can be in USING or INSERT part when using it.


Regards
Petr Jelinek
--
Regards
Petr Jelinek (PJMODOS)

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] MERGE Specification

On Fri, 2008-04-25 at 11:07 +0200, Petr Jelinek wrote:

 Another thing is, the table on which we do SELECT (the one in USING) can 
 be different from target table and we can use columns from that table in 
 INSERT/UPDATE statement (probably one the reasons why spec says the 
 SELECT query has to be executed before any changes). How you want to 
 use the INSERT first implementation in this scenario ? IMHO you still 
 need to have both implementations in the end. So we probably need to 
 implement the standard one first and then implement our version and put 
 some restrictions of what can be in USING or INSERT part when using it.

We do this:

1. Run using clause,

then for each row

2. NOT MATCHED rules
3. MATCHED

I'm now happy that we can get a spec-compliant end result by always
forcing NOT MATCHED rules to occur before MATCHED rules, when we have at
least one unique index.

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] MERGE Specification

On Thu, 2008-04-24 at 20:28 +0100, Simon Riggs wrote:

 With MERGE, we would need to do it like this:
 
 1. If there are any WHEN NOT MATCHED clauses that trigger INSERTs, then
 attempt to apply them first, no matter what order they were in with
 respect to the WHEN MATCHED clauses. Start loop at step (3) every time.
 If there aren't any, start loop straight at step (5). Note that we would
 need to check to see that INSERTs had not been removed by Rules.

This would only be needed when there is at least one unique index.

MERGE will work, whether or not a unique index exists. If there is no
unique index then MERGE will silently ignore duplicate rows produced by
concurrent INSERTs, though that is no different from what can occur now.

No special action will be required to handle DELETEs, since attempting
to delete a row twice doesn't produce an error.

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] MERGE Specification

2008-04-25 Thread Decibel!


On Apr 25, 2008, at 3:28 AM, Simon Riggs wrote:
I recently came across the expression YAGNI, and think it's  
probably

pretty relevant to this discussion:
http://en.wikipedia.org/wiki/You_Ain't_Gonna_Need_It


In matters of technical implementation, I follow you almost without
question, and very happily so.

I think all of us should be careful when expressing views on what  
other

people need or don't need. We sleep soundly after having given such an
opinion, but that doesn't make those opinions valid. I'm not sure if
there is a pithy acronym for that thought.



Agreed. Many people on hackers don't actually deal with production  
systems, so it's easy to look at things from an academic standpoint  
and not a practical one. This is generally a Good Thing (I think  
MySQL is an example of what happens when you don't do that), but it  
does need to be balanced by real-world needs. And not all of those  
needs are always well represented on the lists.


In this case, I have bulk-load code that could certainly use MERGE.  
It's not that hard to write code that will handle this in a way  
that's not safe from race conditions, so it's unlikely that we'll see  
that many requests, but that doesn't mean a fast MERGE wouldn't be  
useful. It certainly would have saved me some effort, and it would  
probably out-perform the current code.

--
Decibel!, aka Jim C. Nasby, Database Architect  [EMAIL PROTECTED]
Give your computer some brain candy! www.distributed.net Team #1828




smime.p7s
Description: S/MIME cryptographic signature

Re: [HACKERS] MERGE Specification

On Fri, 2008-04-25 at 10:03 +0300, Hannu Krosing wrote:
 On Tue, 2008-04-22 at 00:24 +0100, Simon Riggs wrote:
  On Mon, 2008-04-21 at 16:38 -0400, A.M. wrote:
  
   MERGE will not invoke Rules. Does this imply that MERGE cannot be  
   used on views or that the resulting INSERTs or UPDATEs do not work on  
   views?
  
  Yes, that's right. Just like COPY. That seems fine to me because you're
  likely to be doing a MERGE immediately after a COPY anyway, so the
  restriction just continues.
 
 May be the bulk data merging variant of MERGE to be used after initial
 COPY should be a variant of COPY with special keyword(s) instead of
 MERGE ?

That does sound like a good way of differentiating between the OLTP and
bulk loading cases.

I'll bear that in mind as we develop.

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] MERGE Specification

2008-04-25 Thread Andrew Dunstan




Robert Treat wrote:


Perhaps a better option would be to implement Merge per spec, and then 
implement a replace into command for the oltp scenario.  This way you keep 
the spec behavior for the spec syntax, and have a clearly non-spec command 
for non-spec behavior. 
  


MySQL's REPLACE INTO is *NOT* semantically equivalent to any flavor of 
insert or update. It is delete plus insert. They do have INSERT ... 
ON DUPLICATE KEY UPDATE ...



Presumably, if we implement MERGE with transaction-safe semantics, which 
Simon has agreed to do, we would not need to consider anything like the 
latter, but we might still want to consider REPLACE INTO (in the MySQL 
sense).


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] MERGE Specification

2008-04-25 Thread Alvaro Herrera

Simon Riggs wrote:

 I'm now happy that we can get a spec-compliant end result by always
 forcing NOT MATCHED rules to occur before MATCHED rules, when we have at
 least one unique index.

... and raise an ERROR when there is no unique index?

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] MERGE Specification

2008-04-25 Thread Robert Treat

On Thursday 24 April 2008 23:40, Tom Lane wrote:
 Robert Treat [EMAIL PROTECTED] writes:
  Perhaps a better option would be to implement Merge per spec, and then
  implement a replace into command for the oltp scenario.  This way you
  keep the spec behavior for the spec syntax, and have a clearly non-spec
  command for non-spec behavior.

 In that case, it's a fair question to ask just who will use the spec
 syntax.  As far as I can tell from years of watching the mailing lists,
 there is plenty of demand for a concurrent-safe insert-or-update
 behavior, and *exactly zero* demand for the other.  I challenge you to
 find even one request for the spec behavior in the mailing list
 archives.  (Simon doesn't count.)


AIUI the current implementation is designed to avoid race conditions partially 
at the cost of being insert friendly and somewhat update unfriendly. My guess 
is that most of the people wanting this for OLTP use want an update friendly 
implementation (that's certainly been the majority of cases I've needed 
myself, and that I have seen others use). 

-- 
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] MERGE Specification

2008-04-24 Thread Decibel!


On Apr 22, 2008, at 3:20 PM, Martijn van Oosterhout wrote:


On Tue, Apr 22, 2008 at 02:19:24PM -0500, Decibel! wrote:

But no matter how this is done, I think we need to handle the race
conditions, and handle them by default. If people *really* know what
they're doing, they can disable the row locking (perhaps one way to
do this would be to grab an explicit lock on the table and have merge
check for that...).


I disagree. The spec doesn't require it and MERGE is useful without  
it.
For a first cut I would say implement as the spec says, race  
conditions

and all. Later we can think on whether it's worth handling them
directly.



That really strikes me as taking the MySQL route. If push comes to  
shove, I'll take a MERGE with race conditions over no merge at all,  
but I think it's very important that it does the right thing. Just  
because the spec doesn't say anything about it doesn't mean it's ok.

--
Decibel!, aka Jim C. Nasby, Database Architect  [EMAIL PROTECTED]
Give your computer some brain candy! www.distributed.net Team #1828




smime.p7s
Description: S/MIME cryptographic signature

Re: [HACKERS] MERGE Specification

2008-04-24 Thread Tom Lane

Decibel! [EMAIL PROTECTED] writes:
 That really strikes me as taking the MySQL route. If push comes to  
 shove, I'll take a MERGE with race conditions over no merge at all,  
 but I think it's very important that it does the right thing. Just  
 because the spec doesn't say anything about it doesn't mean it's ok.

Agreed.  It seems to me that in the last set of discussions, we rejected
implementing MERGE precisely because it failed to provide a solution to
the race-condition problem.  I'm not satisfied with a version that
doesn't handle that, because I think that is *exactly* what most people
will try to use it for.  The non-concurrent bulk update case that Simon
is arguing for is the uncommon usage.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] MERGE Specification

2008-04-24 Thread Simon Riggs

On Thu, 2008-04-24 at 12:19 -0400, Tom Lane wrote:
 Decibel! [EMAIL PROTECTED] writes:
  That really strikes me as taking the MySQL route. If push comes to  
  shove, I'll take a MERGE with race conditions over no merge at all,  
  but I think it's very important that it does the right thing. Just  
  because the spec doesn't say anything about it doesn't mean it's ok.
 
 Agreed.  It seems to me that in the last set of discussions, we rejected
 implementing MERGE precisely because it failed to provide a solution to
 the race-condition problem.  I'm not satisfied with a version that
 doesn't handle that, because I think that is *exactly* what most people
 will try to use it for.  The non-concurrent bulk update case that Simon
 is arguing for is the uncommon usage.

If y'all think that, then I will do it that way.

The only protection from the race condition is to do the Insert first. 

With MERGE, we would need to do it like this:

1. If there are any WHEN NOT MATCHED clauses that trigger INSERTs, then
attempt to apply them first, no matter what order they were in with
respect to the WHEN MATCHED clauses. Start loop at step (3) every time.
If there aren't any, start loop straight at step (5). Note that we would
need to check to see that INSERTs had not been removed by Rules.

2. For each row retrieved by outer join, goto either step 3 or 5 as
established before the loop starts.

3. Try to apply the WHEN NOT MATCHED clauses. The ordering of the
clauses with respect to each other will remain exactly as stated. If one
of the clauses activates an INSERT, we start an internal subtransaction
and perform the INSERT action. If it succeeds, we commit the
subtransaction and continue.

4. If the INSERT fails with a uniqueness violation, we shrug. The ERROR
has caused the subtransaction to abort.

5. Process WHEN MATCHED clauses and continue.

Technically, this is a standards violation because of the potentially
out-of-order execution of the when clauses. Though the end result is not
distinguishable from standards compliant behaviour, AFAICS.

Note that in step 3 we *must* use subtransactions if there is more than
1 unique index on a table, otherwise we might succeed with the first
index and fail with the second. Using a subtransaction per row pretty
much rules out an efficient bulk load.

Note also that this results in a version optimised for INSERT. If we end
up doing an UPDATE there will be two dead rows, probably in two separate
blocks. We hope that doesn't matter because of HOT.

There's probably a reasonable argument for having an optional keyword to
make MERGE behave differently for bulk loads, but I'll save that now for
another day. Focus now is on a command that works well for OLTP cases.

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] MERGE Specification

2008-04-24 Thread Chris Browne

[EMAIL PROTECTED] (Decibel!) writes:

 On Apr 22, 2008, at 1:17 PM, Gregory Stark wrote:
 Simon Riggs [EMAIL PROTECTED] writes:

 As I've said elsewhere, we could have it lock each row, its just more
 overhead if we do and not necessary at all for bulk data merging.

 I'll presume we want locking as an option, unless people say
 otherwise.

 It's not so simple. If you look for a row to merge into and don't
 find one
 there's no row to lock. What unique constraints do is hold the lock
 on the
 index page where the entry would have to be added.

 That's the trick that plpgsql cannot implement. That's why users
 are forced to
 loop and retry until they manage to do an update successfully or
 insert
 successfully.

 Yeah, hopefully there's a better way to do this other than row locks.

 But no matter how this is done, I think we need to handle the race
 conditions, and handle them by default. If people *really* know what
 they're doing, they can disable the row locking (perhaps one way to
 do this would be to grab an explicit lock on the table and have merge
 check for that...).

I agree that handling the race conditions by default is preferable.

Consider: An excellent reason to prefer MERGE is if it handles race
conditions that would otherwise require application code be more
carefully and cleverly written to avoid the race conditions.

If MERGE solves it automatically, and eliminates hand-written code,
that's TWO benefits, that quite likely outweigh any performance costs.

I know we've had cases where race conditions [that a well-done MERGE
would probably solve easily] bit us badly, requiring considerable
development effort, and the addition of extra table columns and such.

There are some possibilities of worst case MERGE being pretty
slow; it's likely to be faster than the alternatives we used in its
absence ;-).
-- 
(reverse (concatenate 'string ofni.secnanifxunil @ enworbbc))
http://www3.sympatico.ca/cbbrowne/wp.html
I once went  to a shrink.  He  told me to speak freely.   I did.  The
damn fool tried to charge me $90 an hour.
-- [EMAIL PROTECTED] (Jim Moore Jr)

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] MERGE Specification

2008-04-24 Thread Robert Treat

On Thursday 24 April 2008 12:19, Tom Lane wrote:
 Decibel! [EMAIL PROTECTED] writes:
  That really strikes me as taking the MySQL route. If push comes to
  shove, I'll take a MERGE with race conditions over no merge at all,
  but I think it's very important that it does the right thing. Just
  because the spec doesn't say anything about it doesn't mean it's ok.

 Agreed.  It seems to me that in the last set of discussions, we rejected
 implementing MERGE precisely because it failed to provide a solution to
 the race-condition problem.  I'm not satisfied with a version that
 doesn't handle that, because I think that is *exactly* what most people
 will try to use it for.  The non-concurrent bulk update case that Simon
 is arguing for is the uncommon usage.


Perhaps a better option would be to implement Merge per spec, and then 
implement a replace into command for the oltp scenario.  This way you keep 
the spec behavior for the spec syntax, and have a clearly non-spec command 
for non-spec behavior. 

-- 
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] MERGE Specification

2008-04-24 Thread Tom Lane

Robert Treat [EMAIL PROTECTED] writes:
 Perhaps a better option would be to implement Merge per spec, and then 
 implement a replace into command for the oltp scenario.  This way you keep 
 the spec behavior for the spec syntax, and have a clearly non-spec command 
 for non-spec behavior. 

In that case, it's a fair question to ask just who will use the spec
syntax.  As far as I can tell from years of watching the mailing lists,
there is plenty of demand for a concurrent-safe insert-or-update
behavior, and *exactly zero* demand for the other.  I challenge you to
find even one request for the spec behavior in the mailing list
archives.  (Simon doesn't count.)

I recently came across the expression YAGNI, and think it's probably
pretty relevant to this discussion:
http://en.wikipedia.org/wiki/You_Ain't_Gonna_Need_It

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] MERGE Specification

2008-04-22 Thread Simon Riggs

On Mon, 2008-04-21 at 21:57 -0400, Alvaro Herrera wrote:
 Simon Riggs wrote:
 
  Unrelated to rule processing, you did read the bit about MERGE and race
  conditions? ISTM that MERGE as it stands isn't very useful for anything
  other than large data loads since its going to cause problems if used
  concurrently.
 
 But that's how the committee designed it, yes?

Yes. Not sure if I see your point there, but yes, that's how its been
designed.

Both DB2 and Oracle have additional items to get around the shortcomings
of the command.

The way MERGE works we first test to see if it matches or not, then if
not matched we would activate the NOT MATCHED action, which standard
says must be an insert. The gap between the two actions allows a race
condition to exist. 

We could close the gap by taking a lock on the row when we perform the
is-matched test, but that would be expensive for bulk operations. ISTM
the lock should be optional. Not sure what the default should be. Input
welcome.

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] MERGE Specification

2008-04-22 Thread Martijn van Oosterhout

On Tue, Apr 22, 2008 at 08:24:58AM +0100, Simon Riggs wrote:
 The way MERGE works we first test to see if it matches or not, then if
 not matched we would activate the NOT MATCHED action, which standard
 says must be an insert. The gap between the two actions allows a race
 condition to exist. 
 
 We could close the gap by taking a lock on the row when we perform the
 is-matched test, but that would be expensive for bulk operations. ISTM
 the lock should be optional. Not sure what the default should be. Input
 welcome.

ISTM that if the original select does a SELECT FOR UPDATE then it
should work fine for UPDATEs since any update with overwrite the xmax
field anyway.

What you can't do is prevent multiple inserts. Though if its a unique
index you should be able to do the same trick as normal inserts: create
the row, try to insert into the index and if that fails fall back to
doing an update.

What you really need for this though is a non-fatal _bt_check_unique so
you can recover without having a savepoint for every row.

Have a nice day,
-- 
Martijn van Oosterhout   [EMAIL PROTECTED]   http://svana.org/kleptog/
 Please line up in a tree and maintain the heap invariant while 
 boarding. Thank you for flying nlogn airlines.


signature.asc
Description: Digital signature

Re: [HACKERS] MERGE Specification

2008-04-22 Thread Simon Riggs

On Tue, 2008-04-22 at 10:02 +0200, Martijn van Oosterhout wrote:
 On Tue, Apr 22, 2008 at 08:24:58AM +0100, Simon Riggs wrote:
  The way MERGE works we first test to see if it matches or not, then if
  not matched we would activate the NOT MATCHED action, which standard
  says must be an insert. The gap between the two actions allows a race
  condition to exist. 
  
  We could close the gap by taking a lock on the row when we perform the
  is-matched test, but that would be expensive for bulk operations. ISTM
  the lock should be optional. Not sure what the default should be. Input
  welcome.
 
 ISTM that if the original select does a SELECT FOR UPDATE then it
 should work fine for UPDATEs since any update with overwrite the xmax
 field anyway.

Yes, agreed, that's what I meant by the lock on the row.

Incidentally, this is essentially the same problem that occurs with
SERIALIZABLE updates.

It should be easy enough to put an optional LOCK MATCHED ROW clause
into the MERGE statement, as an extension. The Standard doesn't specify
the lock timing. 

 What you can't do is prevent multiple inserts. Though if its a unique
 index you should be able to do the same trick as normal inserts: create
 the row, try to insert into the index and if that fails fall back to
 doing an update.

The Standard doesn't really allow that. It's either matched or its not. 

MERGE is specifically
1. Match
2. Update or Insert as per step (1), following complex logic

rather than

1. Update
2. if not matched Insert

which is exactly what the MySQL and Teradata upsert statements do, but
only for single row operations, unlike MERGE.

For MERGE, there is no lets try one of these and if not, I'll switch.
You decide which it is going to be and then do it. Which can fail... 

I guess we could just spin through, re-testing the match each time and
re-initiating an action, but I see problems there also, not least of
which is it violates the standard. That may not be that clever, but
there may be reasons we can't see yet, or reasons that would affect
other implementors. Guidance, please, if anybody sees clearly?

 What you really need for this though is a non-fatal _bt_check_unique so
 you can recover without having a savepoint for every row.

Oracle simply fails in the event of a uniqueness violation, even though
it logs other errors. DB2 fails unconditionally if there is even a
single error. The MySQL and Teradata syntax don't seem to offer any
protection from concurrent inserts either. Teradata and DB2 both use
locking, so they would lock the value prior to the update anyway, so the
update, insert issue would not happen for them at least.

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] MERGE Specification

2008-04-22 Thread Simon Riggs

On Mon, 2008-04-21 at 22:27 -0400, Gregory Stark wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
 
  Unrelated to rule processing, you did read the bit about MERGE and race
  conditions? ISTM that MERGE as it stands isn't very useful for anything
  other than large data loads since its going to cause problems if used
  concurrently.
 
 If there are race conditions what advantage does it offer over writing plpgsql
 or client code to do it?

That's an excellent question. I'm not trying to sell you anything here.
MERGE is a SQL Standard command, supported by Oracle, DB2, SQLServer
etc, so there is reason enough to implement it.

There may be also reasons to implement other syntaxes, other behaviours,
which would be non-standard. If people want the latter first/second/not
at all then please speak, its not an either-or situation.

I would expect MERGE to be slightly faster than a well coded PL/pgSQL
function, but there won't be too much in it. It will allow the statement
to be more easily parallelised in the form it currently takes, I would
note.

 I thought the whole advantage of having a built-in command is that it could do
 the kind of locking our unique constraints do to avoid race conditions.

As I've said elsewhere, we could have it lock each row, its just more
overhead if we do and not necessary at all for bulk data merging.

I'll presume we want locking as an option, unless people say otherwise.

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] MERGE Specification

2008-04-22 Thread Gregory Stark

Simon Riggs [EMAIL PROTECTED] writes:

 As I've said elsewhere, we could have it lock each row, its just more
 overhead if we do and not necessary at all for bulk data merging.

 I'll presume we want locking as an option, unless people say otherwise.

It's not so simple. If you look for a row to merge into and don't find one
there's no row to lock. What unique constraints do is hold the lock on the
index page where the entry would have to be added.

That's the trick that plpgsql cannot implement. That's why users are forced to
loop and retry until they manage to do an update successfully or insert
successfully.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's Slony Replication support!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] MERGE Specification

2008-04-22 Thread A.M.



On Apr 22, 2008, at 1:47 PM, Simon Riggs wrote:


On Mon, 2008-04-21 at 22:27 -0400, Gregory Stark wrote:

Simon Riggs [EMAIL PROTECTED] writes:

Unrelated to rule processing, you did read the bit about MERGE and  
race
conditions? ISTM that MERGE as it stands isn't very useful for  
anything
other than large data loads since its going to cause problems if  
used

concurrently.


If there are race conditions what advantage does it offer over  
writing plpgsql

or client code to do it?


That's an excellent question. I'm not trying to sell you anything  
here.

MERGE is a SQL Standard command, supported by Oracle, DB2, SQLServer
etc, so there is reason enough to implement it.

There may be also reasons to implement other syntaxes, other  
behaviours,
which would be non-standard. If people want the latter first/second/ 
not

at all then please speak, its not an either-or situation.

I would expect MERGE to be slightly faster than a well coded PL/pgSQL
function, but there won't be too much in it. It will allow the  
statement

to be more easily parallelised in the form it currently takes, I would
note.

I thought the whole advantage of having a built-in command is that  
it could do
the kind of locking our unique constraints do to avoid race  
conditions.


As I've said elsewhere, we could have it lock each row, its just more
overhead if we do and not necessary at all for bulk data merging.


I was hoping to use MERGE alongside the other standard DML. Its  
purpose is to set the truth regardless of past states.


Why should it be relegated to the bulk-loading basement alongside COPY?

Cheers,
M


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] MERGE Specification

2008-04-22 Thread Decibel!


On Apr 22, 2008, at 1:17 PM, Gregory Stark wrote:

Simon Riggs [EMAIL PROTECTED] writes:


As I've said elsewhere, we could have it lock each row, its just more
overhead if we do and not necessary at all for bulk data merging.

I'll presume we want locking as an option, unless people say  
otherwise.


It's not so simple. If you look for a row to merge into and don't  
find one
there's no row to lock. What unique constraints do is hold the lock  
on the

index page where the entry would have to be added.

That's the trick that plpgsql cannot implement. That's why users  
are forced to
loop and retry until they manage to do an update successfully or  
insert

successfully.


Yeah, hopefully there's a better way to do this other than row locks.

But no matter how this is done, I think we need to handle the race  
conditions, and handle them by default. If people *really* know what  
they're doing, they can disable the row locking (perhaps one way to  
do this would be to grab an explicit lock on the table and have merge  
check for that...).


On a different note, if you intend for the SGML to become the doc  
page for MERGE, I'd really like to see some more complex examples  
showing both delete and more than 2 WHEN cases. Something like the  
multiple actions on single target row test case would work.

--
Decibel!, aka Jim C. Nasby, Database Architect  [EMAIL PROTECTED]
Give your computer some brain candy! www.distributed.net Team #1828




smime.p7s
Description: S/MIME cryptographic signature

Re: [HACKERS] MERGE Specification

2008-04-22 Thread Martijn van Oosterhout

On Tue, Apr 22, 2008 at 02:19:24PM -0500, Decibel! wrote:
 But no matter how this is done, I think we need to handle the race  
 conditions, and handle them by default. If people *really* know what  
 they're doing, they can disable the row locking (perhaps one way to  
 do this would be to grab an explicit lock on the table and have merge  
 check for that...).

I disagree. The spec doesn't require it and MERGE is useful without it.
For a first cut I would say implement as the spec says, race conditions
and all. Later we can think on whether it's worth handling them
directly.

Have a nice day,
-- 
Martijn van Oosterhout   [EMAIL PROTECTED]   http://svana.org/kleptog/
 Please line up in a tree and maintain the heap invariant while 
 boarding. Thank you for flying nlogn airlines.


signature.asc
Description: Digital signature

Re: [HACKERS] MERGE Specification

2008-04-21 Thread Pavel Stehule

Hello Simon,

would you support RETURNING clause?

Regards
Pavel Stehule

On 21/04/2008, Simon Riggs [EMAIL PROTECTED] wrote:
 The following two files specify the behaviour of the MERGE statement and
  how it will work in the world of PostgreSQL. In places, this supercedes
  my recent commentary on MERGE, particularly with regard to triggers.

  Neither of these files is intended for CVS.

  The HTML file was generated from SGML source, though the latter is not
  included here for clarity.

  The test file shows how I expect a successful test run to look when a
  regression test is executed with a working version of final MERGE patch
  applied. It has behavioural comments in it also, to make it slightly
  more readable.

  If anybody has any questions, ask them now please, before I begin
  detailed implementation.


  --
   Simon Riggs
   2ndQuadrant  http://www.2ndQuadrant.com


  --
  Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
  To make changes to your subscription:
  http://www.postgresql.org/mailpref/pgsql-hackers




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] MERGE Specification

On Mon, 2008-04-21 at 22:18 +0200, Pavel Stehule wrote:

 would you support RETURNING clause?

I wouldn't rule it out completely, but not in the first implementation
because
- its more work
- its not in the SQL Standard
- neither Oracle nor DB2 support it either, so its only going to provide
incompatibility
- there are some wrinkles with MERGE that means I don't want to
over-complicate it because it looks to me like it will change in future
versions of the standard
- not sure what the use case for that will be

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] MERGE Specification

2008-04-21 Thread A.M.



On Apr 21, 2008, at 4:08 PM, Simon Riggs wrote:

The following two files specify the behaviour of the MERGE statement  
and
how it will work in the world of PostgreSQL. In places, this  
supercedes

my recent commentary on MERGE, particularly with regard to triggers.

Neither of these files is intended for CVS.

The HTML file was generated from SGML source, though the latter is not
included here for clarity.

The test file shows how I expect a successful test run to look when a
regression test is executed with a working version of final MERGE  
patch

applied. It has behavioural comments in it also, to make it slightly
more readable.

If anybody has any questions, ask them now please, before I begin
detailed implementation.



MERGE will not invoke Rules. Does this imply that MERGE cannot be  
used on views or that the resulting INSERTs or UPDATEs do not work on  
views?


Cheers,
M

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] MERGE Specification

On Mon, 2008-04-21 at 16:38 -0400, A.M. wrote:

 MERGE will not invoke Rules. Does this imply that MERGE cannot be  
 used on views or that the resulting INSERTs or UPDATEs do not work on  
 views?

Yes, that's right. Just like COPY. That seems fine to me because you're
likely to be doing a MERGE immediately after a COPY anyway, so the
restriction just continues.

Rules for Insert, Update and Delete are almost identical to the way
MERGE works anyway, so there's no particular loss of functionality. That
was why I co-opted the ability to DO NOTHING in a WHEN clause from the
way PostgreSQL Rules work.

I'm not taking any explicit decisions to exclude them permanently. I do
think its possible that we could support them and possibly very cheaply,
but I wouldn't make any promises initially.

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] MERGE Specification

2008-04-21 Thread Tom Lane

Simon Riggs [EMAIL PROTECTED] writes:
 On Mon, 2008-04-21 at 16:38 -0400, A.M. wrote:
 MERGE will not invoke Rules. Does this imply that MERGE cannot be  
 used on views or that the resulting INSERTs or UPDATEs do not work on  
 views?

 Yes, that's right. Just like COPY.

I find this to be pretty ugly.  COPY is a special case because
(a) it is a utility statement not a plannable one, and (b) its only
reason to exist is to transfer data as fast as possible, so bypassing
rules isn't an unreasonable restriction.  MERGE has neither of those
properties, and thus arguing that it can or should be like COPY is an
entirely unconvincing proposition.  (In fact, I don't even want to think
about what kind of crock you're going to need in order to get it through
the planner without also going through the rewriter.)

Please think a bit harder.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] MERGE Specification

On Mon, 2008-04-21 at 20:28 -0400, Tom Lane wrote:

 In fact, I don't even want to think
 about what kind of crock you're going to need in order to get it through
 the planner without also going through the rewriter.

Hmmm, I hadn't thought I might be adding work rather than avoiding it.

I'll give it a go.

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] MERGE Specification