Re: [HACKERS] Autonomous Transaction is back

2015-09-11 Thread Merlin Moncure
On Thu, Sep 10, 2015 at 8:39 PM, Noah Misch  wrote:
> On Wed, Sep 09, 2015 at 10:04:01AM -0400, Robert Haas wrote:
>> On Sun, Sep 6, 2015 at 1:56 AM, Noah Misch  wrote:
>> > What design principle(s) have you been using to decide how autonomous
>> > transactions should behave?
>>
>> I have to admit to a complete lack of principle.  I'm quite frightened
>> of what this is going to need from the lock manager, and I'm trying to
>> wriggle out of having to do things there that are going to be nastily
>> hard.  My wriggling isn't going very well, though.
>
> It's an exceptionally-challenging development project, agreed.  So much code
> assumes the 1:1 relationship between backends and top-level transactions.

I guess I'm being obtuse, but can you explain why that assumption must
be revisited?  I don't see why it has to be...I must be missing
something.

merlin




Re: [HACKERS] Autonomous Transaction is back

2015-09-11 Thread Noah Misch
On Fri, Sep 11, 2015 at 02:30:53PM -0500, Merlin Moncure wrote:
> On Thu, Sep 10, 2015 at 8:39 PM, Noah Misch  wrote:
> > It's an exceptionally-challenging development project, agreed.  So much code
> > assumes the 1:1 relationship between backends and top-level transactions.
> 
> I guess I'm being obtuse, but can you explain why that assumption must
> be revisited?

It's not imperative, but the proposal on the table does so.  Robert described
doing so as "a key design goal for this feature"[1].  I have a general picture
of why the proposers chose that, but I will defer to them for any elaboration.

[1] 
http://www.postgresql.org/message-id/ca+tgmoze__kohfuyvzpxvtjsb85xm34kc_t41-oxhtb_111...@mail.gmail.com




Re: [HACKERS] Autonomous Transaction is back

2015-09-10 Thread Noah Misch
On Wed, Sep 09, 2015 at 10:04:01AM -0400, Robert Haas wrote:
> On Sun, Sep 6, 2015 at 1:56 AM, Noah Misch  wrote:
> > What design principle(s) have you been using to decide how autonomous
> > transactions should behave?
> 
> I have to admit to a complete lack of principle.  I'm quite frightened
> of what this is going to need from the lock manager, and I'm trying to
> wriggle out of having to do things there that are going to be nastily
> hard.  My wriggling isn't going very well, though.

It's an exceptionally-challenging development project, agreed.  So much code
assumes the 1:1 relationship between backends and top-level transactions.




Re: [HACKERS] Autonomous Transaction is back

2015-09-09 Thread Robert Haas
On Sun, Sep 6, 2015 at 1:56 AM, Noah Misch  wrote:
> What design principle(s) have you been using to decide how autonomous
> transactions should behave?

I have to admit to a complete lack of principle.  I'm quite frightened
of what this is going to need from the lock manager, and I'm trying to
wriggle out of having to do things there that are going to be nastily
hard.  My wriggling isn't going very well, though.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Autonomous Transaction is back

2015-09-09 Thread Merlin Moncure
On Wed, Sep 9, 2015 at 9:04 AM, Robert Haas  wrote:
> On Sun, Sep 6, 2015 at 1:56 AM, Noah Misch  wrote:
>> What design principle(s) have you been using to decide how autonomous
>> transactions should behave?
>
> I have to admit to a complete lack of principle.  I'm quite frightened
> of what this is going to need from the lock manager, and I'm trying to
> wriggle out of having to do things there that are going to be nastily
> hard.  My wriggling isn't going very well, though.

Hm.  Here is current dblink behavior:

postgres=# create table l (id int);
CREATE TABLE
postgres=# insert into l values(1);
INSERT 0 1
postgres=# update l set id =2 where id = 1;
UPDATE 1
Time: 0.595 ms
postgres=# select dblink('', 'update l set id = 3 where id = 1');
[the dblink call blocks here, waiting on the earlier uncommitted update]
Does the lock manager really need to be extended to address this
case?  pg_locks pretty clearly explains what's happening, via:
postgres=# select locktype, transactionid, pid, granted from pg_locks
where not granted;
   locktype    │ transactionid │  pid  │ granted
───────────────┼───────────────┼───────┼─────────
 transactionid │         88380 │ 20543 │ f

and

postgres=# select locktype, transactionid, pid, granted from pg_locks
where transactionid = 88380;
   locktype    │ transactionid │  pid  │ granted
───────────────┼───────────────┼───────┼─────────
 transactionid │         88380 │ 20543 │ f
 transactionid │         88380 │ 19022 │ t

If pg_locks and/or pg_stat_activity were extended with a 'parent_pid'
column, a userland query could terminate affected backends with a join
against that column wherever a lock is ungranted.  Naturally there's a lot
more to it than that; you'd want to issue an appropriate cancellation
message and various other things.  I suppose I'm probably missing
something, but I'm not clear on why the lock manager needs to be
overhauled to deal with this case when we can just scan the current
structures, assuming we can a) manage the child pid independently of the
parent pid and b) easily figure out who is the parent of whom.
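
For concreteness, a sketch of such a userland query.  This assumes a
hypothetical parent_pid column in pg_stat_activity (no such column exists
today) and, for brevity, only covers transactionid locks:

select pg_terminate_backend(child.pid)
from pg_locks waiter
join pg_locks holder
  on holder.locktype = waiter.locktype
 and holder.transactionid = waiter.transactionid
 and holder.granted
join pg_stat_activity child on child.pid = waiter.pid
where not waiter.granted
  and child.parent_pid = holder.pid;   -- hypothetical column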

merlin




Re: [HACKERS] Autonomous Transaction is back

2015-09-08 Thread Merlin Moncure
On Sun, Sep 6, 2015 at 12:56 AM, Noah Misch  wrote:
> My comments have flowed out of a principle that autonomous transactions shall
> have precisely the same semantics as using another backend via dblink.

That's what I thought, too.  AT is syntactic sugar for the dblink
approach.  Syntax and performance aside, I think the only ways
dblink-style AT could be improved would be to have:

1) better cancel handling, especially cancelling the child when the
parent cancels (this is a major headache with dblink)
2) automatic rejection of attempts by the child to acquire locks held
exclusively by the parent

#2 is what we've been discussing, but I'm pretty confused.  Most
especially I'm not clear on whether parent and children share a pid or
have unique pid.  If pids are unique as in dblink, #2 could be done
via a query assuming you have a way of identifying child transactions
in pg_stat_activity.  The discussion has suggested that the pid is
shared, which I found odd but took on faith.

I'm also wondering if we need syntax to handle synchronous vs
asynchronous execution of the AT (the latter of which is only possible
with a separate PID, I think).
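
For concreteness, today's dblink-style AT looks roughly like this (a
sketch assuming the dblink extension, a loopback connection string, and
placeholder job/audit_log tables):

create extension if not exists dblink;

begin;
insert into job (payload) values ('risky work');  -- outer work, may roll back
select dblink_exec('dbname=postgres',
                   $$insert into audit_log values (now(), 'job attempted')$$);
rollback;  -- the audit row, committed by the other backend, survives

Note that if this backend is cancelled while dblink_exec is running, the
child backend is not cancelled with it, which is point #1 above.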

merlin




Re: [HACKERS] Autonomous Transaction is back

2015-09-05 Thread Noah Misch
On Thu, Sep 03, 2015 at 04:21:55PM -0400, Robert Haas wrote:
> On Sat, Aug 22, 2015 at 2:23 AM, Noah Misch  wrote:
> >> > Can you get away with only looking at tuples though?  For example,
> >> > what about advisory locks?  Table locks?
> >>
> >> Well, that's an interesting question.  Can we get away with regarding
> >> those things as non-conflicting, as between the parent and child
> >> transactions?
> >
> > For system lock types, no.  While one could define advisory locks to
> > work differently, we should assume that today's advisory lockers have
> > expectations like those of system lockers.  An autonomous transaction
> > should not bypass any lock that a transaction of another backend could
> > not bypass.
> 
> Why?
> 
> Suppose you do this:
> 
> BEGIN;
> DECLARE foo CURSOR FOR SELECT * FROM foo;
> BEGIN AUTONOMOUS TRANSACTION;
> ALTER TABLE foo ALTER bar TYPE int;
> 
> This has got to fail for safety reasons, but CheckTableNotInUse() is
> on it.  Suppose you do this:
>
> BEGIN;
> LOCK foo;
> BEGIN AUTONOMOUS TRANSACTION;
> INSERT INTO foo VALUES ('spelunk');
> 
> How will making this fail improve anything?

Core PostgreSQL doesn't care.  This is a user interface design decision to be
made in light of current SQL expectations and future SQL author wishes.  The
LOCK reference page, our contract with users, says nothing to constrain the
choice.  LOCK is exceptional in that we never get much insight into the
caller's expectations.  I think LOCK should follow closely the built-in
commands that take the same locks.  This variation of your examples must fail
in order to avoid a paradox if the first transaction aborts:

BEGIN;
ALTER TABLE foo ALTER bar TYPE frob;
BEGIN AUTONOMOUS TRANSACTION;
INSERT INTO foo VALUES ('spelunk');

If we made that fail and made your second example succeed, that would imply
"LOCK foo" acquires a special kind of AccessExclusiveLock differing from what
ALTER TABLE acquires.  That's incompatible with my sense of the LOCK command's
role in the system.  An ability to procure an option to acquire a lock
without waiting, and to delegate that option to another transaction, would
have applications.  It's a different feature calling for distinct syntax.
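
For concreteness, a dblink rendering of that sequence (a sketch; 'frob'
stands in for some target type, and the connection string is a
placeholder).  The dblink call blocks indefinitely on the
AccessExclusiveLock that ALTER TABLE took; under the dblink-equivalence
principle stated below, the autonomous-transaction version must likewise
not succeed:

BEGIN;
ALTER TABLE foo ALTER bar TYPE frob;   -- takes AccessExclusiveLock on foo
SELECT dblink('dbname=postgres',
              $$INSERT INTO foo VALUES ('spelunk')$$);  -- blocks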


My comments have flowed out of a principle that autonomous transactions shall
have precisely the same semantics as using another backend via dblink.  They
should have less overhead.  They may give different error messages.  They
shall not permit sequences of commands that fail in a dblink implementation of
the same multi-transaction sequence.  I chose this principle because it fits
my intuitive notion of transaction "autonomy" and because everything I've
heard about other implementations suggests that they work in that way.  If
some RDBMS implementation does otherwise, I would find that persuasive.

What design principle(s) have you been using to decide how autonomous
transactions should behave?




Re: [HACKERS] Autonomous Transaction is back

2015-09-03 Thread Robert Haas
On Sat, Aug 22, 2015 at 2:23 AM, Noah Misch  wrote:
>> > Can you get away with only looking at tuples though?  For example,
>> > what about advisory locks?  Table locks?
>>
>> Well, that's an interesting question.  Can we get away with regarding
>> those things as non-conflicting, as between the parent and child
>> transactions?
>
> For system lock types, no.  While one could define advisory locks to work
> differently, we should assume that today's advisory lockers have expectations
> like those of system lockers.  An autonomous transaction should not bypass any
> lock that a transaction of another backend could not bypass.

Why?

Suppose you do this:

BEGIN;
DECLARE foo CURSOR FOR SELECT * FROM foo;
BEGIN AUTONOMOUS TRANSACTION;
ALTER TABLE foo ALTER bar TYPE int;

This has got to fail for safety reasons, but CheckTableNotInUse() is
on it.  Suppose you do this:

BEGIN;
LOCK foo;
BEGIN AUTONOMOUS TRANSACTION;
INSERT INTO foo VALUES ('spelunk');

How will making this fail improve anything?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Autonomous Transaction is back

2015-08-22 Thread Noah Misch
On Fri, Aug 21, 2015 at 10:06:44AM -0400, Robert Haas wrote:
> On Tue, Aug 18, 2015 at 3:06 PM, Merlin Moncure mmonc...@gmail.com wrote:
>> On Tue, Aug 18, 2015 at 8:17 AM, Robert Haas robertmh...@gmail.com wrote:
>>> On Sat, Aug 15, 2015 at 6:47 PM, Noah Misch n...@leadboat.com wrote:
>>>> [1] That's not to say it must use the shmem lock structures and deadlock
>>>> detector.
>>>
>>> This footnote goes to my point.
>>>
>>> So, I agree that this scenario should be an error.  What I don't agree
>>> with is the idea that it should be the deadlock detector's job to
>>> throw that error.

I couldn't gather from your earlier messages that this scenario should get an
error, so I'm glad to have that clarified.

>> Can you get away with only looking at tuples though?  For example,
>> what about advisory locks?  Table locks?
>
> Well, that's an interesting question.  Can we get away with regarding
> those things as non-conflicting, as between the parent and child
> transactions?

For system lock types, no.  While one could define advisory locks to work
differently, we should assume that today's advisory lockers have expectations
like those of system lockers.  An autonomous transaction should not bypass any
lock that a transaction of another backend could not bypass.




Re: [HACKERS] Autonomous Transaction is back

2015-08-21 Thread Robert Haas
On Tue, Aug 18, 2015 at 3:06 PM, Merlin Moncure mmonc...@gmail.com wrote:
> On Tue, Aug 18, 2015 at 8:17 AM, Robert Haas robertmh...@gmail.com wrote:
>> On Sat, Aug 15, 2015 at 6:47 PM, Noah Misch n...@leadboat.com wrote:
>>> CREATE TABLE t (c) AS SELECT 1;
>>> BEGIN;
>>> UPDATE t SET c = 2 WHERE c = 1;
>>> BEGIN_AUTONOMOUS;
>>> UPDATE t SET c = 3 WHERE c = 1;
>>> UPDATE t SET c = 4 WHERE c = 2;
>>> COMMIT_AUTONOMOUS;
>>> ROLLBACK;
>>>
>>> If you replace the autonomous transaction with a savepoint, the c=3 update
>>> finds no rows, and the c=4 update changes one row.  When the outer
>>> transaction aborts, only the original c=1 row remains live.  If you replace
>>> the autonomous transaction with a dblink/pg_background call, the c=3 update
>>> waits indefinitely for c=2 to commit or abort, an undetected deadlock.
>>>
>>> Suppose you make the autonomous transaction see tuples like in the
>>> savepoint case.  The c=3 update finds no rows to update, and the c=4 update
>>> changes one row.  When the outer transaction aborts, you have two live rows
>>> (c=1 and c=4).  Suppose you instead make the autonomous transaction see
>>> tuples like in the dblink case, yet let c=3 ignore the lock and change a
>>> row.  If both the autonomous transaction and the outer transaction were to
>>> commit, then you get two live rows (c=2 and c=3).  Neither of those
>>> outcomes is acceptable, of course.  In today's tuple update rules, c=3 must
>>> deadlock[1].  Other credible tuple update rules may not have this problem,
>>> but nothing jumps to mind.
>>>
>>> [1] That's not to say it must use the shmem lock structures and deadlock
>>> detector.
>>
>> This footnote goes to my point.
>>
>> It seems clear to me that having the autonomous transaction see the
>> effects of the outer uncommitted transaction is a recipe for trouble.
>> If the autonomous transaction updates a row and commits, and the outer
>> transaction later aborts, the resulting state is inconsistent with any
>> serial history.  I'm fairly certain that's going to leave us in an
>> unhappy place.
>>
>> Even more obviously, ending up with two committed row versions that
>> are both updates of a single ancestor version is no good.
>>
>> So, I agree that this scenario should be an error.  What I don't agree
>> with is the idea that it should be the deadlock detector's job to
>> throw that error.  Rather, I think that when we examine the xmax of
>> the tuple we can see - which is the original one, not the one updated
>> by the outer transaction - we should check whether that XID belongs to
>> an outer transaction.
>
> Hm: do you mean 'an' outer transaction (meaning, basically, any in
> progress transaction) or the outer transaction of the AT.  I think you
> mean outer transaction of the AT, which makes a lot of sense and
> should be easy and fast to test.  It's like an implied NOWAIT if the
> locker is the AT and the lockee is the parent.
>
> Can you get away with only looking at tuples though?  For example,
> what about advisory locks?  Table locks?

Well, that's an interesting question.  Can we get away with regarding
those things as non-conflicting, as between the parent and child
transactions?

If we want to insist that they conflict, the easy case is easy: when
acquiring a lock, we can consult the local lock table hash and see if
an outer transaction already holds such a lock.  If so, we error out.
The problem is when somebody else is waiting for the lock we hold.
That gets complicated, because the deadlock detector can't work out
that we're deadlocking with ourselves unless there are two separate
PGPROCs, one for the inner transaction and one for the outer
transaction, and even then it might think it can rearrange the lock
queue so that the outer transaction goes first, which in fact won't
work here.

I'm not exactly sure how to make this work out without involving the
deadlock detector, but I think we should try hard to get there.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Autonomous Transaction is back

2015-08-20 Thread Rajeev rastogi
On 18 August 2015 21:18, Robert Haas Wrote:

> This footnote goes to my point.
>
> It seems clear to me that having the autonomous transaction see the
> effects of the outer uncommitted transaction is a recipe for trouble.
> If the autonomous transaction updates a row and commits, and the outer
> transaction later aborts, the resulting state is inconsistent with any
> serial history.  I'm fairly certain that's going to leave us in an
> unhappy place.
>
> Even more obviously, ending up with two committed row versions that are
> both updates of a single ancestor version is no good.
>
> So, I agree that this scenario should be an error.  What I don't agree
> with is the idea that it should be the deadlock detector's job to throw
> that error.  Rather, I think that when we examine the xmax of the tuple
> we can see - which is the original one, not the one updated by the outer
> transaction - we should check whether that XID belongs to an outer
> transaction.  If it does, we should throw an error instead of trying to
> lock it.  That way (1) the error message will be clear and specific to
> the situation and (2) we don't need a separate PGPROC for each
> autonomous transaction.  The first of those benefits is agreeable; the
> second one is, in my opinion, a key design goal for this feature.

Yes, I agree with this.  I was in favor of throwing an error all along,
without involving the deadlock detector.

Thanks and Regards,
Kumar Rajeev Rastogi




Re: [HACKERS] Autonomous Transaction is back

2015-08-18 Thread Merlin Moncure
On Tue, Aug 18, 2015 at 8:17 AM, Robert Haas robertmh...@gmail.com wrote:
> On Sat, Aug 15, 2015 at 6:47 PM, Noah Misch n...@leadboat.com wrote:
>> CREATE TABLE t (c) AS SELECT 1;
>> BEGIN;
>> UPDATE t SET c = 2 WHERE c = 1;
>> BEGIN_AUTONOMOUS;
>> UPDATE t SET c = 3 WHERE c = 1;
>> UPDATE t SET c = 4 WHERE c = 2;
>> COMMIT_AUTONOMOUS;
>> ROLLBACK;
>>
>> If you replace the autonomous transaction with a savepoint, the c=3 update
>> finds no rows, and the c=4 update changes one row.  When the outer
>> transaction aborts, only the original c=1 row remains live.  If you replace
>> the autonomous transaction with a dblink/pg_background call, the c=3 update
>> waits indefinitely for c=2 to commit or abort, an undetected deadlock.
>>
>> Suppose you make the autonomous transaction see tuples like in the
>> savepoint case.  The c=3 update finds no rows to update, and the c=4 update
>> changes one row.  When the outer transaction aborts, you have two live rows
>> (c=1 and c=4).  Suppose you instead make the autonomous transaction see
>> tuples like in the dblink case, yet let c=3 ignore the lock and change a
>> row.  If both the autonomous transaction and the outer transaction were to
>> commit, then you get two live rows (c=2 and c=3).  Neither of those
>> outcomes is acceptable, of course.  In today's tuple update rules, c=3 must
>> deadlock[1].  Other credible tuple update rules may not have this problem,
>> but nothing jumps to mind.
>>
>> [1] That's not to say it must use the shmem lock structures and deadlock
>> detector.
>
> This footnote goes to my point.
>
> It seems clear to me that having the autonomous transaction see the
> effects of the outer uncommitted transaction is a recipe for trouble.
> If the autonomous transaction updates a row and commits, and the outer
> transaction later aborts, the resulting state is inconsistent with any
> serial history.  I'm fairly certain that's going to leave us in an
> unhappy place.
>
> Even more obviously, ending up with two committed row versions that
> are both updates of a single ancestor version is no good.
>
> So, I agree that this scenario should be an error.  What I don't agree
> with is the idea that it should be the deadlock detector's job to
> throw that error.  Rather, I think that when we examine the xmax of
> the tuple we can see - which is the original one, not the one updated
> by the outer transaction - we should check whether that XID belongs to
> an outer transaction.

Hm: do you mean 'an' outer transaction (meaning, basically, any in
progress transaction) or the outer transaction of the AT.  I think you
mean outer transaction of the AT, which makes a lot of sense and
should be easy and fast to test.  It's like an implied NOWAIT if the
locker is the AT and the lockee is the parent.

Can you get away with only looking at tuples though?  For example,
what about advisory locks?  Table locks?


merlin




Re: [HACKERS] Autonomous Transaction is back

2015-08-18 Thread Robert Haas
On Sat, Aug 15, 2015 at 6:47 PM, Noah Misch n...@leadboat.com wrote:
> CREATE TABLE t (c) AS SELECT 1;
> BEGIN;
> UPDATE t SET c = 2 WHERE c = 1;
> BEGIN_AUTONOMOUS;
> UPDATE t SET c = 3 WHERE c = 1;
> UPDATE t SET c = 4 WHERE c = 2;
> COMMIT_AUTONOMOUS;
> ROLLBACK;
>
> If you replace the autonomous transaction with a savepoint, the c=3 update
> finds no rows, and the c=4 update changes one row.  When the outer transaction
> aborts, only the original c=1 row remains live.  If you replace the autonomous
> transaction with a dblink/pg_background call, the c=3 update waits
> indefinitely for c=2 to commit or abort, an undetected deadlock.
>
> Suppose you make the autonomous transaction see tuples like in the savepoint
> case.  The c=3 update finds no rows to update, and the c=4 update changes one
> row.  When the outer transaction aborts, you have two live rows (c=1 and c=4).
> Suppose you instead make the autonomous transaction see tuples like in the
> dblink case, yet let c=3 ignore the lock and change a row.  If both the
> autonomous transaction and the outer transaction were to commit, then you get
> two live rows (c=2 and c=3).  Neither of those outcomes is acceptable, of
> course.  In today's tuple update rules, c=3 must deadlock[1].  Other credible
> tuple update rules may not have this problem, but nothing jumps to mind.
>
> [1] That's not to say it must use the shmem lock structures and deadlock
> detector.

This footnote goes to my point.

It seems clear to me that having the autonomous transaction see the
effects of the outer uncommitted transaction is a recipe for trouble.
If the autonomous transaction updates a row and commits, and the outer
transaction later aborts, the resulting state is inconsistent with any
serial history.  I'm fairly certain that's going to leave us in an
unhappy place.

Even more obviously, ending up with two committed row versions that
are both updates of a single ancestor version is no good.

So, I agree that this scenario should be an error.  What I don't agree
with is the idea that it should be the deadlock detector's job to
throw that error.  Rather, I think that when we examine the xmax of
the tuple we can see - which is the original one, not the one updated
by the outer transaction - we should check whether that XID belongs to
an outer transaction.  If it does, we should throw an error instead of
trying to lock it.  That way (1) the error message will be clear and
specific to the situation and (2) we don't need a separate PGPROC for
each autonomous transaction.  The first of those benefits is
agreeable; the second one is, in my opinion, a key design goal for
this feature.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Autonomous Transaction is back

2015-08-17 Thread Albe Laurenz
Noah Misch wrote:
>>> If the autonomous transaction can interact with uncommitted
>>> work in a way that other backends could not, crazy things happen when the
>>> autonomous transaction commits and the suspended transaction aborts:
>>>
>>> CREATE TABLE t (c) AS SELECT 1;
>>> BEGIN;
>>> UPDATE t SET c = 2 WHERE c = 1;
>>> BEGIN_AUTONOMOUS;
>>> UPDATE t SET c = 3 WHERE c = 1;
>>> UPDATE t SET c = 4 WHERE c = 2;
>>> COMMIT_AUTONOMOUS;
>>> ROLLBACK;
>>>
>>> If you replace the autonomous transaction with a savepoint, the c=3 update
>>> finds no rows, and the c=4 update changes one row.  When the outer
>>> transaction aborts, only the original c=1 row remains live.  If you replace
>>> the autonomous transaction with a dblink/pg_background call, the c=3 update
>>> waits indefinitely for c=2 to commit or abort, an undetected deadlock.
>
> My starting expectation is that the semantics of an autonomous transaction
> will be exactly those of dblink/pg_background.  (I said that during the
> unconference session.)  The application would need to read data from tables
> before switching to the autonomous section.  Autonomous transactions are then
> a performance and syntactic help, not a source of new semantics.  Does any
> database have autonomous transactions that do otherwise?

Oracle behaves like that, i.e. it deadlocks with your example:

SQL> SELECT * FROM t;

         C
----------
         1

SQL> CREATE PROCEDURE proc2 IS
  2  PRAGMA AUTONOMOUS_TRANSACTION;
  3  BEGIN
  4  UPDATE t SET c = 3 WHERE c = 1;
  5  UPDATE t SET c = 4 WHERE c = 2;
  6  COMMIT;
  7  END;
  8  /

Procedure created.

SQL> CREATE PROCEDURE proc1 IS
  2  BEGIN
  3  UPDATE t SET c = 2 WHERE c = 1;
  4  proc2;
  5  ROLLBACK;
  6  END;
  7  /

Procedure created.

SQL> CALL proc1();
CALL proc1()
     *
ERROR at line 1:
ORA-00060: deadlock detected while waiting for resource
ORA-06512: at "LAURENZ.PROC2", line 4
ORA-06512: at "LAURENZ.PROC1", line 4

Yours,
Laurenz Albe



Re: [HACKERS] Autonomous Transaction is back

2015-08-15 Thread Noah Misch
On Sat, Aug 15, 2015 at 10:20:55PM -0300, Alvaro Herrera wrote:
> Noah Misch wrote:
>
>> In today's scenarios, the later query cannot commit unless the suspended
>> query also commits.  (Changing that is the raison d'être of autonomous
>> transactions.)  If the autonomous transaction can interact with uncommitted
>> work in a way that other backends could not, crazy things happen when the
>> autonomous transaction commits and the suspended transaction aborts:
>>
>> CREATE TABLE t (c) AS SELECT 1;
>> BEGIN;
>> UPDATE t SET c = 2 WHERE c = 1;
>> BEGIN_AUTONOMOUS;
>> UPDATE t SET c = 3 WHERE c = 1;
>> UPDATE t SET c = 4 WHERE c = 2;
>> COMMIT_AUTONOMOUS;
>> ROLLBACK;
>>
>> If you replace the autonomous transaction with a savepoint, the c=3 update
>> finds no rows, and the c=4 update changes one row.  When the outer
>> transaction aborts, only the original c=1 row remains live.  If you replace
>> the autonomous transaction with a dblink/pg_background call, the c=3 update
>> waits indefinitely for c=2 to commit or abort, an undetected deadlock.
>
> Maybe what we need to solve this is to restrict what the autonomous
> transaction can do; for instance, make it so that the autonomous
> transaction can see all rows of the outer transaction as if the outer
> transaction were committed, but trying to update any such row raises an
> error.  As far as I can see, this closes this particular problem.  (We
> likely need additional rules to close all holes, but hopefully you get
> the idea.)
>
> Perhaps there exists a set of rules strong enough to eliminate all
> problematic visibility scenarios, but which still enables behavior
> useful enough to cover the proposed use cases.  The audit scenario is
> covered because the audit trail doesn't need to modify the audited
> tuples themselves, only read them.

My starting expectation is that the semantics of an autonomous transaction
will be exactly those of dblink/pg_background.  (I said that during the
unconference session.)  The application would need to read data from tables
before switching to the autonomous section.  Autonomous transactions are then
a performance and syntactic help, not a source of new semantics.  Does any
database have autonomous transactions that do otherwise?




Re: [HACKERS] Autonomous Transaction is back

2015-08-15 Thread Noah Misch
On Fri, Aug 07, 2015 at 11:26:08AM -0400, Robert Haas wrote:
> On Thu, Aug 6, 2015 at 11:04 PM, Merlin Moncure mmonc...@gmail.com wrote:
>> I don't necessarily disagree with what you're saying, but it's not
>> clear to me what the proposed behavior is.  Since the AT can commit
>> before the outer, ISTM *any* ungranted lock requested by the AT but
>> held by the outer leads to either A: functional deadlock (regardless
>> of implementation details) or B: special behavior.
>
> I don't accept that.  We've already GOT cases where a query can be
> suspended and other queries can be running in the same backend.  You
> can do that via cursors.  Those cases work fine, and the deadlock
> detector doesn't know anything about them.  How is this any different?

In today's scenarios, the later query cannot commit unless the suspended query
also commits.  (Changing that is the raison d'être of autonomous
transactions.)  If the autonomous transaction can interact with uncommitted
work in a way that other backends could not, crazy things happen when the
autonomous transaction commits and the suspended transaction aborts:

CREATE TABLE t (c) AS SELECT 1;
BEGIN;
UPDATE t SET c = 2 WHERE c = 1;
BEGIN_AUTONOMOUS;
UPDATE t SET c = 3 WHERE c = 1;
UPDATE t SET c = 4 WHERE c = 2;
COMMIT_AUTONOMOUS;
ROLLBACK;

If you replace the autonomous transaction with a savepoint, the c=3 update
finds no rows, and the c=4 update changes one row.  When the outer transaction
aborts, only the original c=1 row remains live.  If you replace the autonomous
transaction with a dblink/pg_background call, the c=3 update waits
indefinitely for c=2 to commit or abort, an undetected deadlock.
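
Spelled with dblink, the undetected deadlock looks like this (a sketch
assuming the dblink extension and a loopback connection string); the final
statement never returns, because the remote backend cannot know that its
blocker is the suspended local transaction:

CREATE TABLE t (c) AS SELECT 1;
BEGIN;
UPDATE t SET c = 2 WHERE c = 1;
SELECT dblink('dbname=postgres', 'UPDATE t SET c = 3 WHERE c = 1');  -- hangs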

Suppose you make the autonomous transaction see tuples like in the savepoint
case.  The c=3 update finds no rows to update, and the c=4 update changes one
row.  When the outer transaction aborts, you have two live rows (c=1 and c=4).
Suppose you instead make the autonomous transaction see tuples like in the
dblink case, yet let c=3 ignore the lock and change a row.  If both the
autonomous transaction and the outer transaction were to commit, then you get
two live rows (c=2 and c=3).  Neither of those outcomes is acceptable, of
course.  In today's tuple update rules, c=3 must deadlock[1].  Other credible
tuple update rules may not have this problem, but nothing jumps to mind.


[1] That's not to say it must use the shmem lock structures and deadlock
detector.




Re: [HACKERS] Autonomous Transaction is back

2015-08-15 Thread Alvaro Herrera
Noah Misch wrote:

> In today's scenarios, the later query cannot commit unless the suspended
> query also commits.  (Changing that is the raison d'être of autonomous
> transactions.)  If the autonomous transaction can interact with uncommitted
> work in a way that other backends could not, crazy things happen when the
> autonomous transaction commits and the suspended transaction aborts:
>
> CREATE TABLE t (c) AS SELECT 1;
> BEGIN;
> UPDATE t SET c = 2 WHERE c = 1;
> BEGIN_AUTONOMOUS;
> UPDATE t SET c = 3 WHERE c = 1;
> UPDATE t SET c = 4 WHERE c = 2;
> COMMIT_AUTONOMOUS;
> ROLLBACK;
>
> If you replace the autonomous transaction with a savepoint, the c=3 update
> finds no rows, and the c=4 update changes one row.  When the outer
> transaction aborts, only the original c=1 row remains live.  If you replace
> the autonomous transaction with a dblink/pg_background call, the c=3 update
> waits indefinitely for c=2 to commit or abort, an undetected deadlock.

Maybe what we need to solve this is to restrict what the autonomous
transaction can do; for instance, make it so that the autonomous
transaction can see all rows of the outer transaction as if the outer
transaction were committed, but trying to update any such row raises an
error.  As far as I can see, this closes this particular problem.  (We
likely need additional rules to close all holes, but hopefully you get
the idea.)

Perhaps there exists a set of rules strong enough to eliminate all
problematic visibility scenarios, but which still enables behavior
useful enough to cover the proposed use cases.  The audit scenario is
covered because the audit trail doesn't need to modify the audited
tuples themselves, only read them.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Autonomous Transaction is back

2015-08-07 Thread Robert Haas
On Thu, Aug 6, 2015 at 11:04 PM, Merlin Moncure mmonc...@gmail.com wrote:
> I don't necessarily disagree with what you're saying, but it's not
> clear to me what the proposed behavior is.  Since the AT can commit
> before the outer, ISTM *any* ungranted lock requested by the AT but
> held by the outer leads to either A: functional deadlock (regardless
> of implementation details) or B: special behavior.

I don't accept that.  We've already GOT cases where a query can be
suspended and other queries can be running in the same backend.  You
can do that via cursors.  Those cases work fine, and the deadlock
detector doesn't know anything about them.  How is this any different?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Autonomous Transaction is back

2015-08-06 Thread Robert Haas
On Mon, Aug 3, 2015 at 9:09 AM, Merlin Moncure mmonc...@gmail.com wrote:
> hm.  OK, what's the behavior of:
>
> BEGIN
>   UPDATE foo SET x = x + 1 WHERE foo_id = 1;
>
>   BEGIN WITH AUTONOMOUS TRANSACTION
>     UPDATE foo SET x = x + 1 WHERE foo_id = 1;
>   END;
>
>   RAISE EXCEPTION ...;
> EXCEPTION ...
>
> END;

Sure, so that case might need a little bit of special handling.  That
doesn't mean it's a good idea for heavyweight locks to conflict in
general.  I think you're going to find that implementing the latter is
an extremely unrewarding task, and that the benefits are seriously
negative.  For example, consider:

BEGIN
UPDATE foo SET x = x + 1 WHERE foo_id = 1;
BEGIN WITH AUTONOMOUS TRANSACTION
UPDATE foo SET x = x + 1 WHERE foo_id = 2;
END;
END;

Now, suppose that a concurrent session does LOCK TABLE foo after the
first UPDATE and before the second one.  That's now a soft deadlock.
But the only way the deadlock detector can see that is if the main
transaction and the autonomous transaction have separate PGPROC
entries, which is a design we explicitly rejected because it puts a
tight limit on the number of ATs that can be in progress and the level
to which those ATs can be nested.  But let's say you don't care, so we
go back to that design.  The deadlock detector will have to be taught
that the outer transaction can't help but wait for the inner
transaction, so we teach it that.  Now it can see that the only way to
resolve the deadlock without aborting any transactions is to reorder
the lock request from the autonomous transaction ahead of the
concurrent session that is seeking a full table lock.  So the
autonomous transaction acquires the lock without blocking after all.
You have exactly the same result that you would have had anyway but
with a phenomenal amount of additional code and complexity.

And for what?  In the original example, the way the deadlock is going
to be reported is like this:

ERROR: deadlock detected
DETAIL: Process 12345 waits for ShareLock on transaction 1000; blocked
by process 12345.
Process 12345 waits for ShareLock on transaction 1001; blocked by process 12345.

That is not a model of clarity.  On the other hand, if you just make a
rule that attempting to update or delete a tuple that an outer
transaction has already updated throws a bespoke error, you can do
something like this:

ERROR: tuple to be updated was already modified by a suspended outer transaction

...which has precedent in an existing message in trigger.c.
Similarly, if you try to drop a table that the outer transaction has
locked, the natural thing is for CheckTableNotInUse() to catch that
and report it this way:

ERROR: cannot DROP TABLE foo because it is being used by active
queries in this session

If you work hard enough, you can instead make that generate a deadlock
error message, but you're going to have to work pretty hard, and the
result is worse.

I'd really like to hear some more *specific* scenarios where it's
valuable for locks to conflict between the outer transaction and the
AT.  I grant that tuple updates are a case where the conflict has to
be detected somehow, but I don't accept that the lock manager is the
best way to do that, and I don't accept that there are a large number
of other cases that will need similar handling.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Autonomous Transaction is back

2015-08-06 Thread Merlin Moncure
On Thu, Aug 6, 2015 at 4:15 PM, Robert Haas robertmh...@gmail.com wrote:
> On Mon, Aug 3, 2015 at 9:09 AM, Merlin Moncure mmonc...@gmail.com wrote:
>> hm.  OK, what's the behavior of:
>>
>> BEGIN
>>   UPDATE foo SET x = x + 1 WHERE foo_id = 1;
>>
>>   BEGIN WITH AUTONOMOUS TRANSACTION
>>     UPDATE foo SET x = x + 1 WHERE foo_id = 1;
>>   END;
>>
>>   RAISE EXCEPTION ...;
>> EXCEPTION ...
>>
>> END;
>
> Sure, so that case might need a little bit of special handling.  That
> doesn't mean it's a good idea for heavyweight locks to conflict in
> general.  I think you're going to find that implementing the latter is
> an extremely unrewarding task, and that the benefits are seriously
> negative.  For example, consider:
>
> BEGIN
> UPDATE foo SET x = x + 1 WHERE foo_id = 1;
> BEGIN WITH AUTONOMOUS TRANSACTION
> UPDATE foo SET x = x + 1 WHERE foo_id = 2;
> END;
> END;
>
> Now, suppose that a concurrent session does LOCK TABLE foo after the
> first UPDATE and before the second one.  That's now a soft deadlock.
> But the only way the deadlock detector can see that is if the main
> transaction and the autonomous transaction have separate PGPROC
> entries, which is a design we explicitly rejected because it puts a
> tight limit on the number of ATs that can be in progress and the level
> to which those ATs can be nested.  But let's say you don't care, so we
> go back to that design.  The deadlock detector will have to be taught
> that the outer transaction can't help but wait for the inner
> transaction, so we teach it that.  Now it can see that the only way to
> resolve the deadlock without aborting any transactions is to reorder
> the lock request from the autonomous transaction ahead of the
> concurrent session that is seeking a full table lock.  So the
> autonomous transaction acquires the lock without blocking after all.
> You have exactly the same result that you would have had anyway but
> with a phenomenal amount of additional code and complexity.
>
> And for what?  In the original example, the way the deadlock is going
> to be reported is like this:
>
> ERROR: deadlock detected
> DETAIL: Process 12345 waits for ShareLock on transaction 1000; blocked
> by process 12345.
> Process 12345 waits for ShareLock on transaction 1001; blocked by
> process 12345.
>
> That is not a model of clarity.  On the other hand, if you just make a
> rule that attempting to update or delete a tuple that an outer
> transaction has already updated throws a bespoke error, you can do
> something like this:
>
> ERROR: tuple to be updated was already modified by a suspended outer
> transaction
>
> ...which has precedent in an existing message in trigger.c.
> Similarly, if you try to drop a table that the outer transaction has
> locked, the natural thing is for CheckTableNotInUse() to catch that
> and report it this way:
>
> ERROR: cannot DROP TABLE foo because it is being used by active
> queries in this session
>
> If you work hard enough, you can instead make that generate a deadlock
> error message, but you're going to have to work pretty hard, and the
> result is worse.
>
> I'd really like to hear some more *specific* scenarios where it's
> valuable for locks to conflict between the outer transaction and the
> AT.  I grant that tuple updates are a case where the conflict has to
> be detected somehow, but I don't accept that the lock manager is the
> best way to do that, and I don't accept that there are a large number
> of other cases that will need similar handling.

I don't necessarily disagree with what you're saying, but it's not
clear to me what the proposed behavior is.  Since the AT can commit
before the outer, ISTM *any* ungranted lock requested by the AT but
held by the outer leads to either A: functional deadlock (regardless
of implementation details) or B: special behavior.  Deadlocks would
certainly require some acrobatics to detect and resolve due to the
fact that one party to the lock is not in fact blocked on a lock but
on the outer's execution state.  So maybe the right thing to do is to
simply ignore the problem and hang both transactions until timeout or
cancel; this isn't really much different vs. ghetto dblink style AT
that is done today in my experience.

merlin




Re: [HACKERS] Autonomous Transaction is back

2015-08-04 Thread Rajeev rastogi
On 03 August 2015 18:40, Merlin Moncure [mailto:mmonc...@gmail.com] Wrote:
> On Sun, Aug 2, 2015 at 11:37 PM, Rajeev rastogi
> rajeev.rast...@huawei.com wrote:
>> On 31 July 2015 23:10, Robert Haas Wrote:
>>> I think we're going entirely down the wrong path here.  Why is it ever
>>> useful for a backend's lock requests to conflict with themselves, even
>>> with autonomous transactions?  That seems like an artifact of somebody
>>> else's implementation that we should be happy we don't need to copy.
>>
>> IMHO, since most of the locking is managed at the transaction level, not
>> the backend level, and we consider the main & autonomous transactions to
>> be independent transactions, then practically they may conflict, right?
>> It is also right, as you said, that there are no useful use-cases where
>> an autonomous transaction conflicts with the main (parent) transaction.
>> But we cannot take it for granted, as the user might make a mistake.  So
>> at least we should have some mechanism to handle this rare case, for
>> which as of now I think throwing an error from the autonomous
>> transaction is one solution.  Once the error is thrown from the
>> autonomous transaction, the main transaction may continue as it is (or
>> abort the main transaction also??).
>
> hm.  OK, what's the behavior of:
>
> BEGIN
>   UPDATE foo SET x = x + 1 WHERE foo_id = 1;
>
>   BEGIN WITH AUTONOMOUS TRANSACTION
>     UPDATE foo SET x = x + 1 WHERE foo_id = 1;
>   END;
>
>   RAISE EXCEPTION ...;
> EXCEPTION ...
>
> END;

It should throw an error (or something equivalent), as the second update
will wait for the record lock to be released, which in this case will not
happen until the second update finishes.  So, catch-22.

> Also,
> *) What do the other candidate implementations do?  IMO, compatibility
> should be the underlying design principle.

Oracle throws an error in such a case.  But we can decide on what behavior
we want to keep.

> *) What will the SQL only feature look like?

Similar to PL as mentioned in your example, we can provide the SQL only
feature also.

> *) Is the SPI interface going to be extended to expose AT?

I don't think at this point that there is any need of exposing the SPI
interface for this.

Thanks and Regards,
Kumar Rajeev Rastogi



Re: [HACKERS] Autonomous Transaction is back

2015-08-04 Thread Merlin Moncure
On Tue, Aug 4, 2015 at 4:12 AM, Rajeev rastogi
rajeev.rast...@huawei.com wrote:
> On 03 August 2015 18:40, Merlin Moncure [mailto:mmonc...@gmail.com] Wrote:
>> On Sun, Aug 2, 2015 at 11:37 PM, Rajeev rastogi
>> rajeev.rast...@huawei.com wrote:
>>> On 31 July 2015 23:10, Robert Haas Wrote:
>>>> I think we're going entirely down the wrong path here.  Why is it ever
>>>> useful for a backend's lock requests to conflict with themselves, even
>>>> with autonomous transactions?  That seems like an artifact of somebody
>>>> else's implementation that we should be happy we don't need to copy.
>>>
>>> IMHO, since most of the locking is managed at the transaction level,
>>> not the backend level, and we consider the main & autonomous
>>> transactions to be independent transactions, then practically they may
>>> conflict, right?  It is also right, as you said, that there are no
>>> useful use-cases where an autonomous transaction conflicts with the
>>> main (parent) transaction.  But we cannot take it for granted, as the
>>> user might make a mistake.  So at least we should have some mechanism
>>> to handle this rare case, for which as of now I think throwing an
>>> error from the autonomous transaction is one solution.  Once the error
>>> is thrown from the autonomous transaction, the main transaction may
>>> continue as it is (or abort the main transaction also??).
>>
>> hm.  OK, what's the behavior of:
>>
>> BEGIN
>>   UPDATE foo SET x = x + 1 WHERE foo_id = 1;
>>
>>   BEGIN WITH AUTONOMOUS TRANSACTION
>>     UPDATE foo SET x = x + 1 WHERE foo_id = 1;
>>   END;
>>
>>   RAISE EXCEPTION ...;
>> EXCEPTION ...
>>
>> END;
>
> It should throw an error (or something equivalent), as the second update
> will wait for the record lock to be released, which in this case will
> not happen until the second update finishes.  So, catch-22.

Yeah.  Point being, from my point of view autonomous transactions have
to conflict with the master transaction (or any transaction really).
I agree the right course of action is to error out immediately...what
else could you do?  There isn't even a deadlock in the classic sense
and allowing control to continue would result in indeterminate
behavior FWICT.

>> Also,
>> *) What do the other candidate implementations do?  IMO, compatibility
>> should be the underlying design principle.
>
> Oracle throws an error in such a case.  But we can decide on what
> behavior we want to keep.

gotcha.  makes sense.

>> *) What will the SQL only feature look like?
>
> Similar to PL as mentioned in your example, we can provide the SQL only
> feature also.

>> *) Is the SPI interface going to be extended to expose AT?
>
> I don't think at this point that there is any need of exposing the SPI
> interface for this.

Ok, how does AT work in a non-plpgsql (SQL-only) scenario?  Are you
going to similarly extend BEGIN TRANSACTION?
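
Purely as illustration, a SQL-level shape might look like the sketch
below, echoing the BEGIN/COMMIT AUTONOMOUS pseudo-syntax used upthread;
none of this syntax exists, and accounts/audit_log are placeholder tables:

BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
BEGIN AUTONOMOUS TRANSACTION;             -- hypothetical syntax
INSERT INTO audit_log VALUES (now(), 'withdrawal attempted');
COMMIT AUTONOMOUS TRANSACTION;            -- hypothetical: ends only the AT
ROLLBACK;  -- outer update undone; the audit row, if AT works as proposed,
           -- survives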

merlin




Re: [HACKERS] Autonomous Transaction is back

2015-08-03 Thread Merlin Moncure
On Sun, Aug 2, 2015 at 11:37 PM, Rajeev rastogi
rajeev.rast...@huawei.com wrote:
> On 31 July 2015 23:10, Robert Haas Wrote:
>> I think we're going entirely down the wrong path here.  Why is it ever
>> useful for a backend's lock requests to conflict with themselves, even
>> with autonomous transactions?  That seems like an artifact of somebody
>> else's implementation that we should be happy we don't need to copy.
>
> IMHO, since most of the locking is managed at the transaction level, not
> the backend level, and we consider the main & autonomous transactions to
> be independent transactions, then practically they may conflict, right?
> It is also right, as you said, that there are no useful use-cases where
> an autonomous transaction conflicts with the main (parent) transaction.
> But we cannot take it for granted, as the user might make a mistake.  So
> at least we should have some mechanism to handle this rare case, for
> which as of now I think throwing an error from the autonomous
> transaction is one solution.  Once the error is thrown from the
> autonomous transaction, the main transaction may continue as it is (or
> abort the main transaction also??).

hm.  OK, what's the behavior of:

BEGIN
  UPDATE foo SET x = x + 1 WHERE foo_id = 1;

  BEGIN WITH AUTONOMOUS TRANSACTION
    UPDATE foo SET x = x + 1 WHERE foo_id = 1;
  END;

  RAISE EXCEPTION ...;
EXCEPTION ...

END;

Also,
*) What do the other candidate implementations do?  IMO, compatibility
should be the underlying design principle.

*) What will the SQL only feature look like?

*) Is the SPI interface going to be extended to expose AT?

merlin




Re: [HACKERS] Autonomous Transaction is back

2015-08-02 Thread Rajeev rastogi
On 31 July 2015 23:10, Robert Haas Wrote:
> On Tue, Jul 28, 2015 at 6:01 AM, Craig Ringer cr...@2ndquadrant.com wrote:
>> That should be practical to special-case by maintaining a list of
>> parent transaction (virtual?) transaction IDs. Attempts to wait on a
>> lock held by any of those should fail immediately. There's no point
>> waiting for the deadlock detector since the outer tx can never
>> progress and commit/rollback to release locks, and it might not be
>> able to see the parent/child relationship from outside the backend
>> doing the nested tx anyway.
>
> I think we're going entirely down the wrong path here.  Why is it ever
> useful for a backend's lock requests to conflict with themselves, even
> with autonomous transactions?  That seems like an artifact of somebody
> else's implementation that we should be happy we don't need to copy.

IMHO, since most of the locking is managed at the transaction level, not
the backend level, and we consider the main & autonomous transactions to
be independent transactions, then practically they may conflict, right?
It is also right, as you said, that there are no useful use-cases where
an autonomous transaction conflicts with the main (parent) transaction.
But we cannot take it for granted, as the user might make a mistake.  So
at least we should have some mechanism to handle this rare case, for
which as of now I think throwing an error from the autonomous
transaction is one solution.  Once the error is thrown from the
autonomous transaction, the main transaction may continue as it is (or
abort the main transaction also??).

Any other suggestion to handle this will be really helpful.

Thanks and Regards,
Kumar Rajeev Rastogi






Re: [HACKERS] Autonomous Transaction is back

2015-07-31 Thread Robert Haas
On Tue, Jul 28, 2015 at 6:01 AM, Craig Ringer cr...@2ndquadrant.com wrote:
> That should be practical to special-case by maintaining a list of
> parent transaction (virtual?) transaction IDs. Attempts to wait on a
> lock held by any of those should fail immediately. There's no point
> waiting for the deadlock detector since the outer tx can never
> progress and commit/rollback to release locks, and it might not be
> able to see the parent/child relationship from outside the backend
> doing the nested tx anyway.

I think we're going entirely down the wrong path here.  Why is it ever
useful for a backend's lock requests to conflict with themselves, even
with autonomous transactions?  That seems like an artifact of somebody
else's implementation that we should be happy we don't need to copy.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Autonomous Transaction is back

2015-07-30 Thread Rajeev rastogi
On 28 July 2015 03:21, Josh Berkus Wrote:

> On 07/27/2015 02:47 AM, Rajeev rastogi wrote:
>>> Why have any fixed maximum?
>> Since we are planning to have nested autonomous transactions, it is
>> required to have a limit on this so that resources can be controlled.
>
> Is there a particular reason why this limit wouldn't just be
> max_stack_depth?

We will require allocating some initial resources up front in order to
handle all nested autonomous transactions, so I think it is better to have
a separate configuration parameter.

Thanks and Regards,
Kumar Rajeev Rastogi



Re: [HACKERS] Autonomous Transaction is back

2015-07-28 Thread Craig Ringer
On 23 July 2015 at 13:31, Rajeev rastogi rajeev.rast...@huawei.com wrote:

> 1. The autonomous transaction is treated as a completely different
> transaction from the master transaction.

Personally I think that's a lot more useful than having the inner tx
able to see the outer tx's uncommitted changes.

> 2. It should be allowed to deadlock with the master transaction. We
> need to work out a solution to avoid deadlock.

The deadlock case in autonomous tx's is a bit different.

Assuming you don't intend to allow interleaving, where you can switch
between transactions at will rather than just at begin/commit, the
only way a deadlock can happen is when the outer tx holds a lock that
the inner tx tries to acquire.

That should be practical to special-case by maintaining a list of
parent transaction (virtual?) transaction IDs. Attempts to wait on a
lock held by any of those should fail immediately. There's no point
waiting for the deadlock detector since the outer tx can never
progress and commit/rollback to release locks, and it might not be
able to see the parent/child relationship from outside the backend
doing the nested tx anyway.

There's no need to check the parent list until we actually try to wait
on a lock, though I don't know whether it's practical to delay until
then.

-- 
 Craig Ringer                 http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Autonomous Transaction is back

2015-07-28 Thread Joel Jacobson
On Tue, Jul 28, 2015 at 12:56 AM, Josh Berkus j...@agliodbs.com wrote:

> Ah, ok.  The goal of the project is that the writer of X() *cannot*
> prevent Y() from writing its data (B1) and committing it.
>
> One of the primary use cases for ATX is audit triggers.  If a function
> writer could override ATX and prevent the audit triggers from
> committing, then that use case would be violated.
>
> Can you explain what use case you have where simply telling the staff
> "if you use ATX without clearing it, you'll be fired" is not sufficient?
> Possibly there's something we failed to account for in the unconference
> discussion.


I fully understand and agree why you want to prevent X() from letting Y()
commit, if the use-case is e.g. auditing.

I'll try to explain where I'm coming from by providing a bit of background
and context.

One of the greatest strengths of writing an entire application using only
sql and plpgsql functions is that you don't have to worry about side effects
when calling functions: you are always in total control and can roll back
all your writes at your stack depth and deeper down the stack, and the
caller can likewise be certain that if the function it called threw an
exception, all of its work is rolled back.

However, if e.g. making use of plperlu functions, it's possible those
functions might have written files to disk or made network connections to
the outside world, i.e. it's possible they have caused side-effects that
naturally cannot be rolled back by postgres.

Currently, in our codebase at Trustly, we already have quite a few plperlu
functions. In each one of them we need to think carefully about
side-effects; it's usually fine, since most of them are immutable.

But if we had to worry about all plpgsql functions being able to write
things without consent from the caller, then we would have a completely
different situation, with increased complexity and risk of failures.

"if you use ATX without clearing it, you'll be fired"
We trust each other, so we don't have that kind of problem.
But still, even when trusting your own and others' code, I would say quite
often you make use of a function for which there are small details you have
forgotten about or never knew in the first place, and if that little detail
is something written in an ATX, that could be a problem if you, the caller,
don't want whatever was written to be written.

Don't get me wrong, ATX is something I would absolutely love, since then
you could, for instance in a function doing password validation, update the
FailedLoginAttempts column in an ATX and then still raise an exception to
roll back the operation and return an error to the client.
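
A sketch of that pattern, assuming the proposed (not yet existing) syntax
inside plpgsql, a hypothetical Users table, and a hypothetical
check_password() helper:

CREATE FUNCTION validate_password(_userid int, _pw text) RETURNS void
LANGUAGE plpgsql AS $$
BEGIN
    IF NOT check_password(_userid, _pw) THEN   -- hypothetical helper
        BEGIN AUTONOMOUS TRANSACTION;          -- proposed syntax
        UPDATE Users SET FailedLoginAttempts = FailedLoginAttempts + 1
         WHERE UserID = _userid;
        COMMIT;                                -- the counter update persists
        RAISE EXCEPTION 'invalid password';    -- everything else rolls back
    END IF;
END;
$$;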

However, the need for ATXs is, at least for us, a special need you won't
have in most functions, and since the risk and complexity increase with it,
I would prefer if it could be enabled/disabled by default globally and
explicitly enabled/disabled per function.

If the global default is disabled, or if it's disabled for a specific
function, like for X() in your example, and if it's enabled for Y(), then
when X() tries to call Y() you should get an error even before Y() is
executed.

That way we can still do auditing: X() couldn't execute Y(), since Y() was
declared as AT, and that would be the same thing as if X() didn't have the
line of code that executes Y() at all, something X() is in control of,
since whether X() calls Y() or not is ultimately X()'s decision.

If we declare entire functions as AT, then we only have to check, before
executing the function, whether AT is allowed in the current context, as
determined by the global default or by whether the caller function is
declared AT or NOT AT (a sketch of what such declarations might look like
follows the use cases below).

Use cases:

1. X[AT] -> Y[AT]
OK, since caller X() is declared AT, i.e. allows AT in itself and in callees.

2. X[AT] -> Y[NOT AT]
OK, since caller X() is declared AT, i.e. allows AT in itself and in
callees, and since Y() is NOT AT, i.e. not making use of AT and not allowing
it in callees, that is not in violation of anything.

3: X[NOT AT] -> Y[AT]
Invalid, since caller X() is declared NOT AT, i.e. disallows AT in itself
and in callees, and since Y() is declared AT, it cannot be executed.

4: X[NOT AT] -> Y[NOT AT]
OK, since caller X() is declared NOT AT, i.e. disallows AT in itself and in
callees, and since Y() is also declared NOT AT, it can be executed.
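
A sketch of how such declarations might look, using an entirely hypothetical
AUTONOMOUS / NOT AUTONOMOUS function attribute (no such syntax exists or has
been agreed on; this is only to illustrate the idea):

CREATE FUNCTION y() RETURNS void
LANGUAGE plpgsql
AUTONOMOUS                        -- hypothetical attribute: Y() may use ATX
AS $$
BEGIN
    BEGIN AUTONOMOUS TRANSACTION; -- proposed syntax
    INSERT INTO audit_log VALUES (now());  -- hypothetical table
    COMMIT;
END;
$$;

CREATE FUNCTION x() RETURNS void
LANGUAGE plpgsql
NOT AUTONOMOUS                    -- hypothetical attribute: case 3 above
AS $$
BEGIN
    PERFORM y();                  -- would raise an error before y() runs
END;
$$;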


Re: [HACKERS] Autonomous Transaction is back

2015-07-28 Thread Rajeev rastogi
On 28 July 2015 15:31, Craig Ringer wrote:

 2. It should be allowed to deadlock with the master transaction. We
 need to work out a solution to avoid deadlock.

The deadlock case in autonomous tx's is a bit different.

Assuming you don't intend to allow interleaving, where you can switch
between transactions at will rather than just at begin/commit, the only
way a deadlock can happen is when the outer tx holds a lock that the
inner tx tries to acquire.

That should be practical to special-case by maintaining a list of parent
transaction (virtual?) transaction IDs. Attempts to wait on a lock held
by any of those should fail immediately. There's no point waiting for
the deadlock detector since the outer tx can never progress and
commit/rollback to release locks, and it might not be able to see the
parent/child relationship from outside the backend doing the nested tx
anyway.

Thanks, that sounds like a good idea. I shall evaluate it.

Thanks and Regards,
Kumar Rajeev Rastogi



Re: [HACKERS] Autonomous Transaction is back

2015-07-28 Thread Ashutosh Bapat
 Ah, you're missing how commits in ATX are expected to work.  Let me
 illustrate:

 X (
Data write A1
call Y(
 Start ATX
 Data write B1
 Commit ATX
)
Data write A2
Exception
 )

 In this workflow, B1 would be committed and persistent. Neither A1 nor
 A2 would be committed, or visible to other users.  Depending on what
 implementation we end up with, A1 might not even be visible to Y().


A1 should never be visible to Y(), else we will end up with
inconsistencies. E.g.

A1 is a primary key and B1 writes a foreign key referencing A1. Commit ATX
will not complain, since it sees A1, but if X rolls back, we may have B1
referencing nothing.
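
Concretely, a minimal sketch of that hazard, assuming the proposed syntax
and two hypothetical tables:

CREATE TABLE parent (id int PRIMARY KEY);
CREATE TABLE child (parent_id int REFERENCES parent (id));

BEGIN;
INSERT INTO parent VALUES (1);   -- A1, uncommitted

BEGIN AUTONOMOUS TRANSACTION;    -- proposed syntax
INSERT INTO child VALUES (1);    -- B1: if it could see A1, the FK check
COMMIT;                          -- would pass and B1 would be committed

ROLLBACK;                        -- A1 is gone; the committed B1 now
                                 -- references a row that never existed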

Am I missing something?

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


Re: [HACKERS] Autonomous Transaction is back

2015-07-27 Thread Joel Jacobson
On Fri, Jul 24, 2015 at 9:39 PM, Merlin Moncure mmonc...@gmail.com wrote:

 On Thu, Jul 23, 2015 at 1:49 PM, Josh Berkus j...@agliodbs.com wrote:
  Batch Jobs: large data-manipulation tasks which need to be broken up
  into segments, with each segment committing separately.  Example:
  updating 1 million records in batches of 1000.

 Autonomous transactions are not a good fit for this case; stored
 procedures are a better way to go for any scenario where you don't
 want to be in a snapshot (for example, suppose you want to change
 isolation level on the fly).


Hm, you mean we need real stored procedures in PostgreSQL and not just
functions?

If not, I think it would be sufficient to add Autonomous Transaction
support to the type of functions we already have in pg, to allow writing a
batch job function which would commit after X number of modified rows,
instead of having to write a script in an external language such as Perl to
call the function in a while-loop and commit between each function call.
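
A sketch of such a batch function, again assuming the proposed (not yet
existing) syntax and a hypothetical big_table with id and processed columns:

CREATE FUNCTION process_in_batches() RETURNS void
LANGUAGE plpgsql AS $$
DECLARE
    n bigint;
BEGIN
    LOOP
        BEGIN AUTONOMOUS TRANSACTION;       -- proposed syntax
        UPDATE big_table SET processed = true
         WHERE id IN (SELECT id FROM big_table
                       WHERE NOT processed LIMIT 1000);
        GET DIAGNOSTICS n = ROW_COUNT;
        COMMIT;                             -- each batch commits on its own
        EXIT WHEN n = 0;
    END LOOP;
END;
$$;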

However, we should also add a way for the caller to protect against an
Autonomous Transaction in a function called by the caller. Imagine you're
the author of function X(), and within X() you make use of some other
function Y() written by some other author, and within your function X()
it's very important that either all of your work or none of it gets
committed. Then you need to make sure none of the changes you made before
calling Y() gets committed, and thus we need a way to prevent Y() from
starting and committing an Autonomous Transaction; otherwise we would
increase the risk and complexity of working with functions and plpgsql in
PostgreSQL, as you would then need to be sure none of the functions you use
within a function will start and commit an ATX.


Re: [HACKERS] Autonomous Transaction is back

2015-07-27 Thread Josh Berkus
On 07/27/2015 02:41 PM, Joel Jacobson wrote:
 However, we should also add a way for the caller to protect against an
 Autonomous Transaction in a function called by the caller. Imagine you're
 the author of function X(), and within X() you make use of some other
 function Y() written by some other author, and within your function X()
 it's very important that either all of your work or none of it gets
 committed. Then you need to make sure none of the changes you made before
 calling Y() gets committed, and thus we need a way to prevent Y() from
 starting and committing an Autonomous Transaction; otherwise we would
 increase the risk and complexity of working with functions and plpgsql in
 PostgreSQL, as you would then need to be sure none of the functions you
 use within a function will start and commit an ATX.

Ah, you're missing how commits in ATX are expected to work.  Let me
illustrate:

X (
   Data write A1
   call Y(
Start ATX
Data write B1
Commit ATX
   )
   Data write A2
   Exception
)

In this workflow, B1 would be committed and persistent. Neither A1 nor
A2 would be committed, or visible to other users.  Depending on what
implementation we end up with, A1 might not even be visible to Y().

So that solves your use case without any need to block ATXs in called
functions.  However, it leads to some interesting cases involving
self-deadlocks; see the original post on this thread.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] Autonomous Transaction is back

2015-07-27 Thread Josh Berkus
On 07/27/2015 02:47 AM, Rajeev rastogi wrote:
 Why have any fixed maximum?
 Since we are planning to have nested autonomous transactions, it is 
 required to have a limit so that resources can be controlled.

Is there a particular reason why this limit wouldn't just be
max_stack_depth?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] Autonomous Transaction is back

2015-07-27 Thread Merlin Moncure
On Mon, Jul 27, 2015 at 4:41 PM, Joel Jacobson j...@trustly.com wrote:
 On Fri, Jul 24, 2015 at 9:39 PM, Merlin Moncure mmonc...@gmail.com wrote:

 On Thu, Jul 23, 2015 at 1:49 PM, Josh Berkus j...@agliodbs.com wrote:
  Batch Jobs: large data-manipulation tasks which need to be broken up
  into segments, with each segment committing separately.  Example:
  updating 1 million records in batches of 1000.

 Autonomous transactions are not a good fit for this case; stored
 procedures are a better way to go for any scenario where you don't
 want to be in a snapshot (for example, suppose you want to change
 isolation level on the fly).


 Hm, you mean we need real stored procedures in PostgreSQL and not just
 functions?

Yes, exactly.

Autonomous transactions aren't really set up for cases where the
function runs for a very long time or indefinitely.  This is the
'advancing xmin' problem as Josh puts it but I think the problem is
much bigger than that.  Anyways, this is mostly irrelevant to
autonomous transactions as long as the design isn't extended to try
and cover that case.

Is the Autonomous Transaction feature only going to be exposed through pl/pgsql?

merlin




Re: [HACKERS] Autonomous Transaction is back

2015-07-27 Thread Jim Nasby

On 7/27/15 5:56 PM, Josh Berkus wrote:

Can you explain what use case you have where simply telling the staff
"if you use ATX without clearing it, you'll be fired" is not sufficient?
  Possibly there's something we failed to account for in the unconference
discussion.


That there's no way to enforce that, short of hand-auditing code?
There are already enough things that are difficult/impossible to enforce;
I'd rather not add another one.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [HACKERS] Autonomous Transaction is back

2015-07-27 Thread Joel Jacobson
On Mon, Jul 27, 2015 at 11:49 PM, Josh Berkus j...@agliodbs.com wrote:

 Ah, you're missing how commits in ATX are expected to work.  Let me
 illustrate:

 X (
Data write A1
call Y(
 Start ATX
 Data write B1
 Commit ATX
)
Data write A2
Exception
 )

 In this workflow, B1 would be committed and persistent. Neither A1 nor
 A2 would be committed, or visible to other users.  Depending on what
 implementation we end up with, A1 might not even be visible to Y().

 So that solves your use case without any need to block ATXs in called
 functions.  However, it leads to some interesting cases involving
 self-deadlocks; see the original post on this thread.


I don't follow. In your example above, if I'm X(), how do I ensure Y()
won't have committed anything at all when I later, at Exception, decide to
roll back everything from Data write A1 to Data write A2, including any
writes made by Y() (in the example, Data write B1)?

I understand the Exception will take care of rolling back my (X's) writes,
but that's not sufficient if you want to make sure you roll back
*everything*, including any writes made by functions you call.

Right now, when writing a function, if you raise an exception, you can be
sure all writes you have made will be rolled back. Your caller function
might catch the exception and decide to carry on and commit work made
before your function was called, but at least you can be confident your
writes won't be committed as long as you don't catch the exception you
raised in your own function. If I understand it correctly, that would
change with the addition of Autonomous Transactions, unless we are given a
way to prevent a function we call from starting and committing an
Autonomous Transaction. Wrong? If so, then please show how to prevent Y()
from committing the Data write B1 in your example; I don't get it.


Re: [HACKERS] Autonomous Transaction is back

2015-07-27 Thread Josh Berkus
On 07/27/2015 03:12 PM, Joel Jacobson wrote:
 On Mon, Jul 27, 2015 at 11:49 PM, Josh Berkus j...@agliodbs.com
 mailto:j...@agliodbs.com wrote:
 
 Ah, you're missing how commits in ATX are expected to work.  Let me
 illustrate:
 
 X (
Data write A1
call Y(
 Start ATX
 Data write B1
 Commit ATX
)
Data write A2
Exception
 )
 
 In this workflow, B1 would be committed and persistent. Neither A1 nor
 A2 would be committed, or visible to other users.  Depending on what
 implementation we end up with, A1 might not even be visible to Y().
 
 So that solves your use case without any need to block ATXs in called
 functions.  However, it leads to some interesting cases involving
 self-deadlocks; see the original post on this thread.
 
 
 I don't follow. In your example above, if I'm X(), how do I ensure Y()
 won't have committed anything at all when I later, at Exception, decide
 to roll back everything from Data write A1 to Data write A2, including
 any writes made by Y() (in the example, Data write B1)?

Ah, ok.  The goal of the project is that the writer of X() *cannot*
prevent Y() from writing its data (B1) and committing it.

One of the primary use cases for ATX is audit triggers.  If a function
writer could override ATX and prevent the audit triggers from
committing, then that use case would be violated.
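
For reference, a sketch of such an audit trigger, assuming the proposed
(not yet existing) syntax and a hypothetical audit_log table:

CREATE FUNCTION audit_row() RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
    BEGIN AUTONOMOUS TRANSACTION;            -- proposed syntax
    INSERT INTO audit_log (tbl, op, logged_at)
    VALUES (TG_TABLE_NAME, TG_OP, now());
    COMMIT;                                  -- survives even if the outer
    RETURN NEW;                              -- transaction later aborts
END;                                         -- (for an INSERT/UPDATE trigger)
$$;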

Can you explain what use case you have where simply telling the staff
"if you use ATX without clearing it, you'll be fired" is not sufficient?
 Possibly there's something we failed to account for in the unconference
discussion.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] Autonomous Transaction is back

2015-07-27 Thread Jim Nasby

On 7/27/15 5:12 PM, Joel Jacobson wrote:

Right now, when writing a function, if you raise an exception, you can be
sure all writes you have made will be rolled back. Your caller function
might catch the exception and decide to carry on and commit work made
before your function was called, but at least you can be confident your
writes won't be committed as long as you don't catch the exception you
raised in your own function. If I understand it correctly, that would
change with the addition of Autonomous Transactions, unless we are given a
way to prevent a function we call from starting and committing an
Autonomous Transaction. Wrong? If so, then please show how to prevent Y()
from committing the Data write B1 in your example; I don't get it.


What's being described here doesn't make sense in either use case ([1] & 
[2]), but I do understand the concern about what 3rd party software is 
doing. It would be nice to have the ability to disallow and/or disable 
autonomous transactions, but I don't see a practical way of doing that 
other than introducing a new GUC. I'm not sure if it's worth that effort.


[1] the batch process use case: batches that still hold their own 
transaction open don't gain anything.


[2] the audit logging case. If you didn't care about auditing 
surviving regardless of a rollback then you wouldn't go to the extra 
work of an autonomous transaction to begin with.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [HACKERS] Autonomous Transaction is back

2015-07-27 Thread Jim Nasby

On 7/27/15 6:40 PM, Jim Nasby wrote:

On 7/27/15 5:12 PM, Joel Jacobson wrote:

Right now, when writing a function, if you raise an exception, you can be
sure all writes you have made will be rolled back. Your caller function
might catch the exception and decide to carry on and commit work made
before your function was called, but at least you can be confident your
writes won't be committed as long as you don't catch the exception you
raised in your own function. If I understand it correctly, that would
change with the addition of Autonomous Transactions, unless we are given a
way to prevent a function we call from starting and committing an
Autonomous Transaction. Wrong? If so, then please show how to prevent Y()
from committing the Data write B1 in your example; I don't get it.


What's being described here doesn't make sense in either use case ([1] &
[2]), but I do understand the concern about what 3rd party software is
doing. It would be nice to have the ability to disallow and/or disable
autonomous transactions, but I don't see a practical way of doing that
other than introducing a new GUC. I'm not sure if it's worth that effort.


It just occurred to me that another option would be to have an event 
trigger for beginning an autonomous transaction.



[1] the batch process use case: batches that still hold their own
transaction open don't gain anything.

[2] the audit logging case. If you didn't care about auditing
surviving regardless of a rollback then you wouldn't go to the extra
work of an autonomous transaction to begin with.



--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [HACKERS] Autonomous Transaction is back

2015-07-27 Thread Pavel Stehule
2015-07-27 23:59 GMT+02:00 Merlin Moncure mmonc...@gmail.com:

 On Mon, Jul 27, 2015 at 4:41 PM, Joel Jacobson j...@trustly.com wrote:
  On Fri, Jul 24, 2015 at 9:39 PM, Merlin Moncure mmonc...@gmail.com
 wrote:
 
  On Thu, Jul 23, 2015 at 1:49 PM, Josh Berkus j...@agliodbs.com wrote:
   Batch Jobs: large data-manipulation tasks which need to be broken up
   into segments, with each segment committing separately.  Example:
   updating 1 million records in batches of 1000.
 
  Autonomous transactions are not a good fit for this case; stored
  procedures are a better way to go for any scenario where you don't
  want to be in a snapshot (for example, suppose you want to change
  isolation level on the fly).
 
 
  Hm, you mean we need real stored procedures in PostgreSQL and not just
  functions?

 Yes, exactly.

 Autonomous transactions aren't really set up for cases where the
 function runs for a very long time or indefinitely.  This is the
 'advancing xmin' problem as Josh puts it but I think the problem is
 much bigger than that.  Anyways, this is mostly irrelevant to
 autonomous transactions as long as the design isn't extended to try
 and cover that case.

 Is the Autonomous Transaction feature only going to be exposed through
 pl/pgsql?


I hope not.

The integration with plpgsql can be a secondary question. In this case I
would prefer tying it to a block statement, without the possibility of an
explicit COMMIT. At minimum in functions.

something like

BEGIN
  BEGIN AUTONOMOUS
   ...
  END;
END;

This is consistent with current subtransaction support, and disallows some
corner cases like a forgotten COMMIT.

Regards

Pavel


 merlin





Re: [HACKERS] Autonomous Transaction is back

2015-07-27 Thread Rajeev rastogi
On 23 July 2015 21:04, Robert Haas wrote:
On Thu, Jul 23, 2015 at 1:31 AM, Rajeev rastogi rajeev.rast...@huawei.com 
wrote:
 2. It should be allowed to deadlock with the master transaction. We
 need to work out a solution to avoid deadlock.

This sentence seems to contradict itself.  I thought the consensus was that 
the transaction should NOT conflict with the master transaction.

Since we are saying the transaction is autonomous from the parent
transaction, we cannot guarantee that it does not take any conflicting lock
unless the application is designed to avoid that.
But yes, we should have a mechanism to deal with the possible deadlock.

 3. It can support multiple levels of nesting, based on configuration
 (max perhaps 70).

Why have any fixed maximum?

Since we are planning to have nested autonomous transactions, it is required 
to have a limit so that resources can be controlled.

 2. The above commands can be issued either inside the procedure, to put 
 a few of the procedure's statements inside an autonomous transaction, or 
 even in stand-alone query execution.

I think inside a procedure the autonomous transaction will need to be 
lexically scoped.  You won't be able to do this, for example:

BEGIN AUTONOMOUS TRANSACTION;
FOR x IN SELECT ... LOOP
COMMIT;
BEGIN AUTONOMOUS TRANSACTION;
END LOOP;

I am not sure how we will be able to control this. IMHO the user should be 
able to control this, especially since the restriction does not have any 
meaning from the user's perspective.
Please let me know if I am missing something here.

Rather you'd have to do something like this:

FOR x IN SELECT .. LOOP
   BEGIN WITH AUTONOMOUS TRANSACTION
   do stuff
   END;
END LOOP;


Thanks and Regards,
Kumar Rajeev Rastogi




Re: [HACKERS] Autonomous Transaction is back

2015-07-24 Thread Merlin Moncure
On Thu, Jul 23, 2015 at 1:49 PM, Josh Berkus j...@agliodbs.com wrote:
 Batch Jobs: large data-manipulation tasks which need to be broken up
 into segments, with each segment committing separately.  Example:
 updating 1 million records in batches of 1000.

Autonomous transactions are not a good fit for this case; stored
procedures are a better way to go for any scenario where you don't
want to be in a snapshot (for example, suppose you want to change
isolation level on the fly).

merlin




Re: [HACKERS] Autonomous Transaction is back

2015-07-23 Thread Ashutosh Bapat
On Thu, Jul 23, 2015 at 11:01 AM, Rajeev rastogi rajeev.rast...@huawei.com
wrote:

  After a few failed attempts to propose Autonomous Transactions earlier, I
 along with Simon Riggs would like to propose it again, but with a completely
 different approach.



 We also had discussion about this feature in last PGCon2015 Unconference
 Day, those who missed this discussion, please refer


 https://wiki.postgresql.org/wiki/AutonomousTransactionsUnconference2015





 Before jumping into the design and code proposal for this feature, me
 along with Simon Riggs wanted to propose its behavior and usage to keep
 everyone in the same boat.

 So we have summarized the behavior and usage of the Autonomous Transaction
 based on the discussion with community members in last PGCon2015
 Unconference Day:



 *Behavior of Autonomous Transaction*:

  1. The autonomous transaction is treated as a completely different
 transaction from the master transaction.

  2. It should be allowed to deadlock with the master transaction. We
 need to work out a solution to avoid deadlock.


For an autonomous transaction not to conflict with the master, either it
shouldn't take conflicting resources or when it does so, they should be
shared by the master transaction. The first needs to be implemented in an
application; the latter will need careful design of how to share such
resources (hierarchical resource sharing?) and how to release the resources
when the ATX ends. Given that the transactions are autonomous as per point
1, they should be allowed to conflict and the deadlock detection would take
care of such deadlocks.


  3. It can support multiple levels of nesting, based on configuration
 (max perhaps 70).



This might just be an implementation artifact in other RDBMSes, and may not
necessarily apply to the implementation done here. So, we should not list
it as necessary behaviour of ATX. Just as an example (no way suggesting
that the design should be this way), if we implement autonomous
transactions using background worker processes or additional backends like
prepared transactions, we will have a limit on the number of simultaneously
open ATXs. This may translate to a nesting limit if we implement point 4,
but not if we implement parallel autonomous transactions.


  4. The outer (i.e. main or upper autonomous) transaction is to be
 suspended while the inner autonomous transaction is running.

  5. The outer transaction should not see data of the inner one till the
 inner is committed (a serializable upper transaction should not see it
 even after the inner transaction commits).


To me, only 1 and 5 seem to be the real behaviour we should define, maybe
4 as well, but that restricts parallelism. Parallel autonomous transactions
will have their own problems to solve, like sharing code. But the design
shouldn't restrict them from being a future possibility, I guess.




 *How to Use Autonomous Transaction*:

 1. We can issue an explicit command to start an Autonomous Transaction, as
 below:

 BEGIN AUTONOMOUS TRANSACTION  (Don't worry about keywords
 at this point.)

 Do your work.

 COMMIT/ROLLBACK   (Will commit/rollback the autonomous
 transaction and return to the main transaction or upper autonomous
 transaction).

 2. The above commands can be issued either inside a procedure, to put a few
 of the procedure's statements inside an autonomous transaction, or even in
 stand-alone query execution.

 3. We can make the whole procedure itself autonomous, which will be similar
 to starting an autonomous transaction at the beginning of the procedure and
 committing/rolling back at the end of the procedure.

 There was another discussion on Unconference Day about whether to implement
 COMMIT/ROLLBACK inside the procedure or autonomous transactions. Our
 opinion is that COMMIT/ROLLBACK inside a procedure is somewhat different
 from an Autonomous Transaction: in the first case, once we commit inside
 the procedure, it commits everything done before the call of the procedure
 as well. This is the behavior of Oracle.

 So in that case the user is required to be very careful not to do any
 operation before the call of the procedure that is not yet intended to be
 committed inside the procedure.

 So we prefer to implement Autonomous Transactions, which will not only be
 compatible with Oracle but also give really strong required features.

 I have not put the use-cases here as we already agree about its strong
 use-cases.

 Requesting everyone's opinion on this, based on which we can proceed to
 enhance/tune/re-write our design.



 *Thanks and Regards,*

 *Kumar Rajeev Rastogi *






-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


Re: [HACKERS] Autonomous Transaction is back

2015-07-23 Thread Robert Haas
On Thu, Jul 23, 2015 at 2:49 PM, Josh Berkus j...@agliodbs.com wrote:
 On 07/23/2015 11:39 AM, Robert Haas wrote:
 On Thu, Jul 23, 2015 at 2:33 PM, Josh Berkus j...@agliodbs.com wrote:
 Requesting everyone's opinion on this, based on which we can
 proceed to enhance/tune/re-write our design.

 So, one of the things which came up during the discussion was advancing
 XMIN, which is not important to the audit logging use case, but is very
 important for the batch job use case.  What have you concluded regarding
 this item?

 Could you explain more specifically what you are talking about here?

 Yeah, my notes are kinda incoherent, no?

 There's two core use-cases for Autonomous Transactions (hereafter ATX):

 * audit logging
 * batch jobs

 Audit Logging: triggers or other statements which should leave a record
 even when a transaction aborts.  While audit logging is the main example
 of this use case, any kind of logging of failures is applicable.

 Batch Jobs: large data-manipulation tasks which need to be broken up
 into segments, with each segment committing separately.  Example:
 updating 1 million records in batches of 1000.

 Unlike the Audit Logging case, Batch Jobs generally want XMIN to advance
 so that updated/imported/deleted rows can be vacuumed or HOT updated.
 Thus the need to allow XMIN to advance.

 One of the things we kind of concluded from our discussion was that the
 two core use-cases are probably different features:

 Audit Logging:
 * requires 2-way data interaction with outer transaction
 * no parallelism
 * XMIN does not need to advance
 * master transaction should still commit/fail
 * needs to support nesting

 Batch Jobs:
 * 1-way data interaction sufficient (master -> child)
 * parallelism desired
 * XMIN should advance
 * master process could be transactionless
 * does not need to support nesting

 Of these two, the Audit Logging case is the more important one to
 implement because there is no real workaround for it.  Batch Jobs can,
 and are, handled by external scripting, and having ATX for them is more
 of a convenience than anything else.

You're still not really explaining what you mean by "xmin should
advance".  If the parent transaction holds a snapshot, or for as long
as it does, xmin can't be advanced safely.  If it doesn't, you'll be
fine.  I suppose the situation you're worried about is where we
execute a stored procedure that repeatedly spawns autonomous
transactions.  Since the parent transaction will always have a
snapshot, you won't advance xmin until the entire stored procedure
finishes.

That's a problem, but I think it is rather unfair to say that it has
anything to do with autonomous transactions.  Run a procedure without
needing to hold a snapshot is a completely separate feature request
from allow autonomous transactions, and it's probably neither easy
nor uncontroversial.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Autonomous Transaction is back

2015-07-23 Thread Josh Berkus
On 07/23/2015 12:40 PM, Robert Haas wrote:
 That's a problem, but I think it is rather unfair to say that it has
 anything to do with autonomous transactions.  "Run a procedure without
 needing to hold a snapshot" is a completely separate feature request
 from "allow autonomous transactions", and it's probably neither easy
 nor uncontroversial.

That's more-or-less a summary of what I just posted.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] Autonomous Transaction is back

2015-07-23 Thread Robert Haas
On Thu, Jul 23, 2015 at 1:31 AM, Rajeev rastogi
rajeev.rast...@huawei.com wrote:
 2. It should be allowed to deadlock with the master transaction. We
 need to work out a solution to avoid deadlock.

This sentence seems to contradict itself.  I thought the consensus was
that the transaction should NOT conflict with the master transaction.

 3. It can support multiple levels of nesting, based on configuration
 (max perhaps 70).

Why have any fixed maximum?

 2. The above commands can be issued either inside a procedure, to put a few
 of the procedure's statements inside an autonomous transaction, or even in
 stand-alone query execution.

I think inside a procedure the autonomous transaction will need to be
lexically scoped.  You won't be able to do this, for example:

BEGIN AUTONOMOUS TRANSACTION;
FOR x IN SELECT ... LOOP
COMMIT;
BEGIN AUTONOMOUS TRANSACTION;
END LOOP;

Rather you'd have to do something like this:

FOR x IN SELECT .. LOOP
   BEGIN WITH AUTONOMOUS TRANSACTION
   do stuff
   END;
END LOOP;

In that way it would work like an EXCEPTION block, which can be
attached to a begin block to establish a subtransaction.  There's no
way in PL/pgsql to persistently enter a subtransaction; but you can
have one that applies to a particular lexical scope (the BEGIN block
to which EXCEPTION is attached).
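
For comparison, this is the existing EXCEPTION mechanism referred to above,
which already works today (t is a hypothetical table):

DO $$
BEGIN
    BEGIN                                   -- attaching EXCEPTION below makes
        INSERT INTO t VALUES (1);           -- this block a subtransaction
        PERFORM 1/0;                        -- raises division_by_zero
    EXCEPTION WHEN others THEN
        RAISE NOTICE 'caught: %', SQLERRM;  -- the INSERT is rolled back, but
    END;                                    -- the outer transaction continues
END;
$$;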

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Autonomous Transaction is back

2015-07-23 Thread Josh Berkus

 Requesting everyone's opinion on this, based on which we can proceed
 to enhance/tune/re-write our design.

So, one of the things which came up during the discussion was advancing
XMIN, which is not important to the audit logging use case, but is very
important for the batch job use case.  What have you concluded regarding
this item?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] Autonomous Transaction is back

2015-07-23 Thread Robert Haas
On Thu, Jul 23, 2015 at 2:33 PM, Josh Berkus j...@agliodbs.com wrote:
 Requesting everyone's opinion on this, based on which we can
 proceed to enhance/tune/re-write our design.

 So, one of the things which came up during the discussion was advancing
 XMIN, which is not important to the audit logging use case, but is very
 important for the batch job use case.  What have you concluded regarding
 this item?

Could you explain more specifically what you are talking about here?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Autonomous Transaction is back

2015-07-23 Thread Josh Berkus
On 07/23/2015 11:39 AM, Robert Haas wrote:
 On Thu, Jul 23, 2015 at 2:33 PM, Josh Berkus j...@agliodbs.com wrote:
 Requesting everyone's opinion on this, based on which we can
 proceed to enhance/tune/re-write our design.

 So, one of the things which came up during the discussion was advancing
 XMIN, which is not important to the audit logging use case, but is very
 important for the batch job use case.  What have you concluded regarding
 this item?
 
 Could you explain more specifically what you are talking about here?
 

Yeah, my notes are kinda incoherent, no?

There's two core use-cases for Autonomous Transactions (hereafter ATX):

* audit logging
* batch jobs

Audit Logging: triggers or other statements which should leave a record
even when a transaction aborts.  While audit logging is the main example
of this use case, any kind of logging of failures is applicable.

Batch Jobs: large data-manipulation tasks which need to be broken up
into segments, with each segment committing separately.  Example:
updating 1 million records in batches of 1000.

Unlike the Audit Logging case, Batch Jobs generally want XMIN to advance
so that updated/imported/deleted rows can be vacuumed or HOT updated.
Thus the need to allow XMIN to advance.

One of the things we kind of concluded from our discussion was that the
two core use-cases are probably different features:

Audit Logging:
* requires 2-way data interaction with outer transaction
* no parallelism
* XMIN does not need to advance
* master transaction should still commit/fail
* needs to support nesting

Batch Jobs:
* 1-way data interaction sufficient (master -> child)
* parallelism desired
* XMIN should advance
* master process could be transactionless
* does not need to support nesting

Of these two, the Audit Logging case is the more important one to
implement because there is no real workaround for it.  Batch Jobs can,
and are, handled by external scripting, and having ATX for them is more
of a convenience than anything else.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




[HACKERS] Autonomous Transaction is back

2015-07-22 Thread Rajeev rastogi
After a few failed attempts to propose Autonomous Transactions earlier, I 
along with Simon Riggs would like to propose it again, but with a completely 
different approach.

We also had a discussion about this feature on the last PGCon2015 
Unconference Day; those who missed this discussion, please refer to

https://wiki.postgresql.org/wiki/AutonomousTransactionsUnconference2015


Before jumping into the design and code proposal for this feature, I, along 
with Simon Riggs, wanted to propose its behavior and usage to keep everyone 
in the same boat.
So we have summarized the behavior and usage of the Autonomous Transaction 
based on the discussion with community members on the last PGCon2015 
Unconference Day:

Behavior of Autonomous Transaction:
1. The autonomous transaction is treated as a completely different 
transaction from the master transaction.
2. It should be allowed to deadlock with the master transaction. We need 
to work out a solution to avoid deadlock.
3. It can support multiple levels of nesting, based on configuration 
(max perhaps 70).
4. The outer (i.e. main or upper autonomous) transaction is to be suspended 
while the inner autonomous transaction is running.
5. The outer transaction should not see data of the inner one till the inner 
is committed (a serializable upper transaction should not see it even after 
the inner transaction commits).

How to Use Autonomous Transaction:
1. We can issue an explicit command to start an Autonomous Transaction, as 
below:
BEGIN AUTONOMOUS TRANSACTION  (Don't worry about keywords at 
this point.)
Do your work.
COMMIT/ROLLBACK   (Will commit/rollback the autonomous 
transaction and return to the main transaction or upper autonomous 
transaction).

2. The above commands can be issued either inside a procedure, to put a few 
of the procedure's statements inside an autonomous transaction, or even in 
stand-alone query execution.
3. We can make the whole procedure itself autonomous, which will be similar 
to starting an autonomous transaction at the beginning of the procedure and 
committing/rolling back at the end of the procedure.

There was another discussion on Unconference Day about whether to implement 
COMMIT/ROLLBACK inside the procedure or autonomous transactions. Our opinion 
is that COMMIT/ROLLBACK inside a procedure is somewhat different from an 
Autonomous Transaction: in the first case, once we commit inside the 
procedure, it commits everything done before the call of the procedure as 
well. This is the behavior of Oracle.
So in that case the user is required to be very careful not to do any 
operation before the call of the procedure that is not yet intended to be 
committed inside the procedure.

So we prefer to implement Autonomous Transactions, which will not only be 
compatible with Oracle but also give really strong required features.

I have not put the use-cases here as we already agree about its strong 
use-cases.

Requesting everyone's opinion on this, based on which we can proceed to 
enhance/tune/re-write our design.

Thanks and Regards,
Kumar Rajeev Rastogi