Re: [HACKERS] Autonomous Transaction is back
On Thu, Sep 10, 2015 at 8:39 PM, Noah Misch wrote:
> On Wed, Sep 09, 2015 at 10:04:01AM -0400, Robert Haas wrote:
>> On Sun, Sep 6, 2015 at 1:56 AM, Noah Misch wrote:
>> > What design principle(s) have you been using to decide how autonomous
>> > transactions should behave?
>>
>> I have to admit to a complete lack of principle.  I'm quite frightened
>> of what this is going to need from the lock manager, and I'm trying to
>> wriggle out of having to do things there that are going to be nastily
>> hard.  My wriggling isn't going very well, though.
>
> It's an exceptionally-challenging development project, agreed.  So much code
> assumes the 1:1 relationship between backends and top-level transactions.

I guess I'm being obtuse, but can you explain why that assumption must be
revisited?  I don't see why it has to be... I must be missing something.

merlin

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Autonomous Transaction is back
On Fri, Sep 11, 2015 at 02:30:53PM -0500, Merlin Moncure wrote:
> On Thu, Sep 10, 2015 at 8:39 PM, Noah Misch wrote:
> > It's an exceptionally-challenging development project, agreed.  So much code
> > assumes the 1:1 relationship between backends and top-level transactions.
>
> I guess I'm being obtuse, but can you explain why that assumption must
> be revisited?

It's not imperative, but the proposal on the table does so.  Robert described
doing so as "a key design goal for this feature"[1].  I have a general picture
of why the proposers chose that, but I will defer to them for any elaboration.

[1] http://www.postgresql.org/message-id/ca+tgmoze__kohfuyvzpxvtjsb85xm34kc_t41-oxhtb_111...@mail.gmail.com
Re: [HACKERS] Autonomous Transaction is back
On Wed, Sep 09, 2015 at 10:04:01AM -0400, Robert Haas wrote:
> On Sun, Sep 6, 2015 at 1:56 AM, Noah Misch wrote:
> > What design principle(s) have you been using to decide how autonomous
> > transactions should behave?
>
> I have to admit to a complete lack of principle.  I'm quite frightened
> of what this is going to need from the lock manager, and I'm trying to
> wriggle out of having to do things there that are going to be nastily
> hard.  My wriggling isn't going very well, though.

It's an exceptionally-challenging development project, agreed.  So much code
assumes the 1:1 relationship between backends and top-level transactions.
Re: [HACKERS] Autonomous Transaction is back
On Sun, Sep 6, 2015 at 1:56 AM, Noah Misch wrote:
> What design principle(s) have you been using to decide how autonomous
> transactions should behave?

I have to admit to a complete lack of principle.  I'm quite frightened
of what this is going to need from the lock manager, and I'm trying to
wriggle out of having to do things there that are going to be nastily
hard.  My wriggling isn't going very well, though.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Autonomous Transaction is back
On Wed, Sep 9, 2015 at 9:04 AM, Robert Haas wrote:
> On Sun, Sep 6, 2015 at 1:56 AM, Noah Misch wrote:
>> What design principle(s) have you been using to decide how autonomous
>> transactions should behave?
>
> I have to admit to a complete lack of principle.  I'm quite frightened
> of what this is going to need from the lock manager, and I'm trying to
> wriggle out of having to do things there that are going to be nastily
> hard.  My wriggling isn't going very well, though.

Hm.  Here is current dblink behavior:

postgres=# create table l (id int);
CREATE TABLE
postgres=# insert into l values(1);
INSERT 0 1
postgres=# update l set id = 2 where id = 1;
UPDATE 1
Time: 0.595 ms
postgres=# select dblink('', 'update l set id = 3 where id = 1');

Does the lock manager really need to be extended to address this case?
pg_locks pretty clearly explains what's happening, via:

postgres=# select locktype, transactionid, pid, granted from pg_locks where not granted;
   locktype    │ transactionid │  pid  │ granted
───────────────┼───────────────┼───────┼─────────
 transactionid │         88380 │ 20543 │ f

and

postgres=# select locktype, transactionid, pid, granted from pg_locks where transactionid = 88380;
   locktype    │ transactionid │  pid  │ granted
───────────────┼───────────────┼───────┼─────────
 transactionid │         88380 │ 20543 │ f
 transactionid │         88380 │ 19022 │ t

If pg_locks and/or pg_stat_activity were extended with a 'parent_pid'
column, a userland query could terminate affected backends with a join
against that column wherever a lock is ungranted.  Naturally there's a lot
more to it than that; you'd want to issue an appropriate cancellation
message and various other things.  I suppose I'm probably missing
something, but I'm not clear on why the lock manager needs to be
overhauled to deal with this case when we can just scan the current lock
structures, assuming we can a) manage the child pid independently of the
parent pid and b) easily figure out who is the parent of whom.

merlin
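For illustration, here is a toy sketch of the userland cleanup Merlin
describes.  The `parent_pid` column is the hypothetical extension under
discussion, not an existing PostgreSQL column, and the pg_locks rows are
simulated as plain dicts rather than queried from a live server:

```python
# Simulated pg_locks rows; parent_pid (tracked per backend below) is the
# hypothetical column under discussion.  pid 20543 is an autonomous child
# of 19022 and waits on a transactionid lock its own parent holds.
pg_locks = [
    {"locktype": "transactionid", "transactionid": 88380, "pid": 20543, "granted": False},
    {"locktype": "transactionid", "transactionid": 88380, "pid": 19022, "granted": True},
]
backends = {20543: {"parent_pid": 19022}, 19022: {"parent_pid": None}}

def self_deadlocked_children(locks, backends):
    """Return pids of children blocked on a lock granted to their own parent."""
    victims = set()
    for waiter in locks:
        if waiter["granted"]:
            continue
        parent = backends[waiter["pid"]]["parent_pid"]
        for holder in locks:
            if (holder["granted"]
                    and holder["pid"] == parent
                    and holder["locktype"] == waiter["locktype"]
                    and holder["transactionid"] == waiter["transactionid"]):
                victims.add(waiter["pid"])  # candidate for pg_terminate_backend()
    return victims

print(self_deadlocked_children(pg_locks, backends))  # {20543}
```

In SQL form this would be a self-join of pg_locks on the ungranted/granted
rows, joined to the hypothetical parent_pid column, feeding
pg_terminate_backend(); the Python above just makes the matching rule
explicit.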
Re: [HACKERS] Autonomous Transaction is back
On Sun, Sep 6, 2015 at 12:56 AM, Noah Misch wrote:
> My comments have flowed out of a principle that autonomous transactions
> shall have precisely the same semantics as using another backend via
> dblink.

That's what I thought, too.  AT is syntax sugar for the dblink approach.
Syntax and performance aside, I think the only way dblink-style AT could
be improved would be to have:

1) better cancel handling, especially cancel child when parent cancels
   (this is a major headache with dblink)
2) automatic rejection of acquire attempts by the child on exclusive
   assets held by the parent

#2 is what we've been discussing, but I'm pretty confused.  Most
especially I'm not clear on whether parent and children share a pid or
have unique pids.  If pids are unique, as in dblink, #2 could be done via
a query, assuming you have a way of identifying child transactions in
pg_stat_activity.  The discussion has suggested that the pid is shared,
which I found odd but took on faith.

I'm also wondering if we need syntax to handle synchronous vs asynchronous
execution of the AT (the latter of which is only possible with a separate
PID, I think).

merlin
Re: [HACKERS] Autonomous Transaction is back
On Thu, Sep 03, 2015 at 04:21:55PM -0400, Robert Haas wrote:
> On Sat, Aug 22, 2015 at 2:23 AM, Noah Misch wrote:
> >> > Can you get away with only looking at tuples though?  For example,
> >> > what about advisory locks?  Table locks?
> >>
> >> Well, that's an interesting question.  Can we get away with regarding
> >> those things as non-conflicting, as between the parent and child
> >> transactions?
> >
> > For system lock types, no.  While one could define advisory locks to work
> > differently, we should assume that today's advisory lockers have
> > expectations like those of system lockers.  An autonomous transaction
> > should not bypass any lock that a transaction of another backend could
> > not bypass.
>
> Why?
>
> Suppose you do this:
>
> BEGIN;
> DECLARE CURSOR foo FOR SELECT * FROM foo;
> BEGIN AUTONOMOUS TRANSACTION;
> ALTER TABLE foo ALTER bar TYPE int;
>
> This has got to fail for safety reasons, but CheckTableNotInUse() is
> on it.  Suppose you do this:
>
> BEGIN;
> LOCK foo;
> BEGIN AUTONOMOUS TRANSACTION;
> INSERT INTO foo VALUES ('spelunk');
>
> How will making this fail improve anything?

Core PostgreSQL doesn't care.  This is a user interface design decision to
be made in light of current SQL expectations and future SQL author wishes.
The LOCK reference page, our contract with users, says nothing to constrain
the choice.  LOCK is exceptional in that we never get much insight into the
caller's expectations.  I think LOCK should follow closely the built-in
commands that take the same locks.  This variation of your examples must
fail in order to avoid a paradox if the first transaction aborts:

BEGIN;
ALTER TABLE foo ALTER bar TYPE frob;
BEGIN AUTONOMOUS TRANSACTION;
INSERT INTO foo VALUES ('spelunk');

If we made that fail and made your second example succeed, that would imply
"LOCK foo" acquires a special kind of AccessExclusiveLock differing from
what ALTER TABLE acquires.  That's incompatible with my sense of the LOCK
command's role in the system.

An ability to procure an option to acquire, without waiting, a lock and
delegate that option to another transaction would have applications.  It's
a different feature calling for distinct syntax.

My comments have flowed out of a principle that autonomous transactions
shall have precisely the same semantics as using another backend via
dblink.  They should have less overhead.  They may give different error
messages.  They shall not permit sequences of commands that fail in a
dblink implementation of the same multi-transaction sequence.  I chose this
principle because it fits my intuitive notion of transaction "autonomy" and
because everything I've heard about other implementations suggests that
they work in that way.  If some RDBMS implementation does otherwise, I
would find that persuasive.

What design principle(s) have you been using to decide how autonomous
transactions should behave?
Re: [HACKERS] Autonomous Transaction is back
On Sat, Aug 22, 2015 at 2:23 AM, Noah Misch wrote:
>> > Can you get away with only looking at tuples though?  For example,
>> > what about advisory locks?  Table locks?
>>
>> Well, that's an interesting question.  Can we get away with regarding
>> those things as non-conflicting, as between the parent and child
>> transactions?
>
> For system lock types, no.  While one could define advisory locks to work
> differently, we should assume that today's advisory lockers have expectations
> like those of system lockers.  An autonomous transaction should not bypass any
> lock that a transaction of another backend could not bypass.

Why?

Suppose you do this:

BEGIN;
DECLARE CURSOR foo FOR SELECT * FROM foo;
BEGIN AUTONOMOUS TRANSACTION;
ALTER TABLE foo ALTER bar TYPE int;

This has got to fail for safety reasons, but CheckTableNotInUse() is
on it.  Suppose you do this:

BEGIN;
LOCK foo;
BEGIN AUTONOMOUS TRANSACTION;
INSERT INTO foo VALUES ('spelunk');

How will making this fail improve anything?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Autonomous Transaction is back
On Fri, Aug 21, 2015 at 10:06:44AM -0400, Robert Haas wrote:
> On Tue, Aug 18, 2015 at 3:06 PM, Merlin Moncure <mmonc...@gmail.com> wrote:
> > On Tue, Aug 18, 2015 at 8:17 AM, Robert Haas <robertmh...@gmail.com> wrote:
> > > On Sat, Aug 15, 2015 at 6:47 PM, Noah Misch <n...@leadboat.com> wrote:
> > > > [1] That's not to say it must use the shmem lock structures and
> > > > deadlock detector.
> > >
> > > This footnote goes to my point.  So, I agree that this scenario should
> > > be an error.  What I don't agree with is the idea that it should be the
> > > deadlock detector's job to throw that error.

I couldn't gather from your earlier messages that this scenario should get
an error, so I'm glad to have that clarified.

> > Can you get away with only looking at tuples though?  For example, what
> > about advisory locks?  Table locks?
>
> Well, that's an interesting question.  Can we get away with regarding
> those things as non-conflicting, as between the parent and child
> transactions?

For system lock types, no.  While one could define advisory locks to work
differently, we should assume that today's advisory lockers have
expectations like those of system lockers.  An autonomous transaction
should not bypass any lock that a transaction of another backend could not
bypass.
Re: [HACKERS] Autonomous Transaction is back
On Tue, Aug 18, 2015 at 3:06 PM, Merlin Moncure <mmonc...@gmail.com> wrote:
> On Tue, Aug 18, 2015 at 8:17 AM, Robert Haas <robertmh...@gmail.com> wrote:
>> On Sat, Aug 15, 2015 at 6:47 PM, Noah Misch <n...@leadboat.com> wrote:
>>> CREATE TABLE t (c) AS SELECT 1;
>>> BEGIN;
>>> UPDATE t SET c = 2 WHERE c = 1;
>>> BEGIN_AUTONOMOUS;
>>> UPDATE t SET c = 3 WHERE c = 1;
>>> UPDATE t SET c = 4 WHERE c = 2;
>>> COMMIT_AUTONOMOUS;
>>> ROLLBACK;
>>>
>>> If you replace the autonomous transaction with a savepoint, the c=3
>>> update finds no rows, and the c=4 update changes one row.  When the
>>> outer transaction aborts, only the original c=1 row remains live.  If
>>> you replace the autonomous transaction with a dblink/pg_background
>>> call, the c=3 update waits indefinitely for c=2 to commit or abort, an
>>> undetected deadlock.
>>>
>>> Suppose you make the autonomous transaction see tuples like in the
>>> savepoint case.  The c=3 update finds no rows to update, and the c=4
>>> update changes one row.  When the outer transaction aborts, you have
>>> two live rows (c=1 and c=4).  Suppose you instead make the autonomous
>>> transaction see tuples like in the dblink case, yet let c=3 ignore the
>>> lock and change a row.  If both the autonomous transaction and the
>>> outer transaction were to commit, then you get two live rows (c=2 and
>>> c=3).  Neither of those outcomes is acceptable, of course.  In today's
>>> tuple update rules, c=3 must deadlock[1].  Other credible tuple update
>>> rules may not have this problem, but nothing jumps to mind.
>>>
>>> [1] That's not to say it must use the shmem lock structures and
>>> deadlock detector.
>>
>> This footnote goes to my point.  It seems clear to me that having the
>> autonomous transaction see the effects of the outer uncommitted
>> transaction is a recipe for trouble.  If the autonomous transaction
>> updates a row and commits, and the outer transaction later aborts, the
>> resulting state is inconsistent with any serial history.  I'm fairly
>> certain that's going to leave us in an unhappy place.  Even more
>> obviously, ending up with two committed row versions that are both
>> updates of a single ancestor version is no good.
>>
>> So, I agree that this scenario should be an error.  What I don't agree
>> with is the idea that it should be the deadlock detector's job to throw
>> that error.  Rather, I think that when we examine the xmax of the tuple
>> we can see - which is the original one, not the one updated by the
>> outer transaction - we should check whether that XID belongs to an
>> outer transaction.
>
> Hm: do you mean 'an' outer transaction (meaning, basically, any in
> progress transaction) or the outer transaction of the AT?  I think you
> mean outer transaction of the AT, which makes a lot of sense and should
> be easy and fast to test.  It's like an implied NOWAIT if the locker is
> the AT and the lockee is the parent.
>
> Can you get away with only looking at tuples though?  For example, what
> about advisory locks?  Table locks?

Well, that's an interesting question.  Can we get away with regarding
those things as non-conflicting, as between the parent and child
transactions?

If we want to insist that they conflict, the easy case is easy: when
acquiring a lock, we can consult the local lock table hash and see if an
outer transaction already holds such a lock.  If so, we error out.  The
problem is when somebody else is waiting for the lock we hold.  That gets
complicated, because the deadlock detector can't work out that we're
deadlocking with ourselves unless there are two separate PGPROCs, one for
the inner transaction and one for the outer transaction, and even then it
might think it can rearrange the lock queue so that the outer transaction
goes first, which in fact won't work here.  I'm not exactly sure how to
make this work out without involving the deadlock detector, but I think we
should try hard to get there.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
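The "easy case" Robert describes, checking the backend-local lock table at
acquisition time, can be sketched as a toy model.  All names, the
exception, and the miniature conflict matrix below are illustrative, not
PostgreSQL's actual lock.c API:

```python
# Toy model: before an autonomous (child) transaction acquires a
# heavyweight lock, consult the backend-local lock table and error out if
# an outer transaction in the same stack already holds a conflicting lock.
CONFLICTS = {  # minimal, symmetric conflict matrix for two modes
    ("AccessExclusiveLock", "AccessShareLock"): True,
    ("AccessShareLock", "AccessExclusiveLock"): True,
    ("AccessExclusiveLock", "AccessExclusiveLock"): True,
    ("AccessShareLock", "AccessShareLock"): False,
}

class LockHeldByOuterTransaction(Exception):
    pass

def acquire(local_lock_table, txn_stack, target, mode):
    """txn_stack lists outer transactions first, the current txn last."""
    current = txn_stack[-1]
    for outer in txn_stack[:-1]:
        held = local_lock_table.get((outer, target))
        if held and CONFLICTS[(held, mode)]:
            raise LockHeldByOuterTransaction(
                f"{mode} on {target!r} conflicts with {held} "
                f"held by outer txn {outer}")
    local_lock_table[(current, target)] = mode

locks = {}
acquire(locks, ["outer"], "foo", "AccessExclusiveLock")  # outer: LOCK foo;
try:
    # child: INSERT INTO foo ... needs a lock the parent already holds
    acquire(locks, ["outer", "auto1"], "foo", "AccessShareLock")
except LockHeldByOuterTransaction as e:
    print("ERROR:", e)
```

The hard half of Robert's message, somebody *else* queued behind the lock
we hold, is precisely what this local-only check cannot see; that is the
part he suggests keeping out of the deadlock detector.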
Re: [HACKERS] Autonomous Transaction is back
On 18 August 2015 21:18, Robert Haas wrote:
> This footnote goes to my point.  It seems clear to me that having the
> autonomous transaction see the effects of the outer uncommitted
> transaction is a recipe for trouble.  If the autonomous transaction
> updates a row and commits, and the outer transaction later aborts, the
> resulting state is inconsistent with any serial history.  I'm fairly
> certain that's going to leave us in an unhappy place.  Even more
> obviously, ending up with two committed row versions that are both
> updates of a single ancestor version is no good.
>
> So, I agree that this scenario should be an error.  What I don't agree
> with is the idea that it should be the deadlock detector's job to throw
> that error.  Rather, I think that when we examine the xmax of the tuple
> we can see - which is the original one, not the one updated by the outer
> transaction - we should check whether that XID belongs to an outer
> transaction.  If it does, we should throw an error instead of trying to
> lock it.  That way (1) the error message will be clear and specific to
> the situation and (2) we don't need a separate PGPROC for each
> autonomous transaction.  The first of those benefits is agreeable; the
> second one is, in my opinion, a key design goal for this feature.

Yes, I agree with this.  I was in favor of raising an error all along,
without involving the deadlock detector.

Thanks and Regards,
Kumar Rajeev Rastogi
Re: [HACKERS] Autonomous Transaction is back
On Tue, Aug 18, 2015 at 8:17 AM, Robert Haas <robertmh...@gmail.com> wrote:
> On Sat, Aug 15, 2015 at 6:47 PM, Noah Misch <n...@leadboat.com> wrote:
>> CREATE TABLE t (c) AS SELECT 1;
>> BEGIN;
>> UPDATE t SET c = 2 WHERE c = 1;
>> BEGIN_AUTONOMOUS;
>> UPDATE t SET c = 3 WHERE c = 1;
>> UPDATE t SET c = 4 WHERE c = 2;
>> COMMIT_AUTONOMOUS;
>> ROLLBACK;
>>
>> If you replace the autonomous transaction with a savepoint, the c=3
>> update finds no rows, and the c=4 update changes one row.  When the
>> outer transaction aborts, only the original c=1 row remains live.  If
>> you replace the autonomous transaction with a dblink/pg_background
>> call, the c=3 update waits indefinitely for c=2 to commit or abort, an
>> undetected deadlock.
>>
>> Suppose you make the autonomous transaction see tuples like in the
>> savepoint case.  The c=3 update finds no rows to update, and the c=4
>> update changes one row.  When the outer transaction aborts, you have
>> two live rows (c=1 and c=4).  Suppose you instead make the autonomous
>> transaction see tuples like in the dblink case, yet let c=3 ignore the
>> lock and change a row.  If both the autonomous transaction and the
>> outer transaction were to commit, then you get two live rows (c=2 and
>> c=3).  Neither of those outcomes is acceptable, of course.  In today's
>> tuple update rules, c=3 must deadlock[1].  Other credible tuple update
>> rules may not have this problem, but nothing jumps to mind.
>>
>> [1] That's not to say it must use the shmem lock structures and
>> deadlock detector.
>
> This footnote goes to my point.  It seems clear to me that having the
> autonomous transaction see the effects of the outer uncommitted
> transaction is a recipe for trouble.  If the autonomous transaction
> updates a row and commits, and the outer transaction later aborts, the
> resulting state is inconsistent with any serial history.  I'm fairly
> certain that's going to leave us in an unhappy place.  Even more
> obviously, ending up with two committed row versions that are both
> updates of a single ancestor version is no good.
>
> So, I agree that this scenario should be an error.  What I don't agree
> with is the idea that it should be the deadlock detector's job to throw
> that error.  Rather, I think that when we examine the xmax of the tuple
> we can see - which is the original one, not the one updated by the outer
> transaction - we should check whether that XID belongs to an outer
> transaction.

Hm: do you mean 'an' outer transaction (meaning, basically, any in
progress transaction) or the outer transaction of the AT?  I think you
mean outer transaction of the AT, which makes a lot of sense and should be
easy and fast to test.  It's like an implied NOWAIT if the locker is the
AT and the lockee is the parent.

Can you get away with only looking at tuples though?  For example, what
about advisory locks?  Table locks?

merlin
Re: [HACKERS] Autonomous Transaction is back
On Sat, Aug 15, 2015 at 6:47 PM, Noah Misch <n...@leadboat.com> wrote:
> CREATE TABLE t (c) AS SELECT 1;
> BEGIN;
> UPDATE t SET c = 2 WHERE c = 1;
> BEGIN_AUTONOMOUS;
> UPDATE t SET c = 3 WHERE c = 1;
> UPDATE t SET c = 4 WHERE c = 2;
> COMMIT_AUTONOMOUS;
> ROLLBACK;
>
> If you replace the autonomous transaction with a savepoint, the c=3
> update finds no rows, and the c=4 update changes one row.  When the
> outer transaction aborts, only the original c=1 row remains live.  If
> you replace the autonomous transaction with a dblink/pg_background call,
> the c=3 update waits indefinitely for c=2 to commit or abort, an
> undetected deadlock.
>
> Suppose you make the autonomous transaction see tuples like in the
> savepoint case.  The c=3 update finds no rows to update, and the c=4
> update changes one row.  When the outer transaction aborts, you have two
> live rows (c=1 and c=4).  Suppose you instead make the autonomous
> transaction see tuples like in the dblink case, yet let c=3 ignore the
> lock and change a row.  If both the autonomous transaction and the outer
> transaction were to commit, then you get two live rows (c=2 and c=3).
> Neither of those outcomes is acceptable, of course.  In today's tuple
> update rules, c=3 must deadlock[1].  Other credible tuple update rules
> may not have this problem, but nothing jumps to mind.
>
> [1] That's not to say it must use the shmem lock structures and deadlock
> detector.

This footnote goes to my point.  It seems clear to me that having the
autonomous transaction see the effects of the outer uncommitted
transaction is a recipe for trouble.  If the autonomous transaction
updates a row and commits, and the outer transaction later aborts, the
resulting state is inconsistent with any serial history.  I'm fairly
certain that's going to leave us in an unhappy place.  Even more
obviously, ending up with two committed row versions that are both
updates of a single ancestor version is no good.

So, I agree that this scenario should be an error.  What I don't agree
with is the idea that it should be the deadlock detector's job to throw
that error.  Rather, I think that when we examine the xmax of the tuple we
can see - which is the original one, not the one updated by the outer
transaction - we should check whether that XID belongs to an outer
transaction.  If it does, we should throw an error instead of trying to
lock it.  That way (1) the error message will be clear and specific to the
situation and (2) we don't need a separate PGPROC for each autonomous
transaction.  The first of those benefits is agreeable; the second one is,
in my opinion, a key design goal for this feature.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
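The xmax check Robert proposes can be sketched as a toy model: on finding
a tuple whose xmax is an in-progress transaction, first test whether that
XID belongs to a suspended outer transaction of the *same* backend and
error out immediately instead of waiting.  The XIDs, exception name, and
data structures below are invented for illustration, not PostgreSQL
internals:

```python
# Toy model of the proposed check.  A wait on an outer transaction of the
# same backend can never be satisfied (the outer txn is suspended until
# the autonomous one finishes), so it must become an immediate error.
class WouldSelfDeadlock(Exception):
    pass

def lock_tuple(tuple_xmax, txn_stack, wait_for):
    """txn_stack lists the backend's outer XIDs first, the autonomous
    transaction's XID last.  tuple_xmax is None if the tuple is unlocked."""
    if tuple_xmax is None:
        return "locked"
    if tuple_xmax in txn_stack[:-1]:
        raise WouldSelfDeadlock(
            f"tuple is locked by suspended outer transaction {tuple_xmax}")
    wait_for(tuple_xmax)  # normal path: block on some other backend's xact
    return "locked"

# Outer txn 100 updated the row (xmax = 100); autonomous txn 101 tries next.
try:
    lock_tuple(100, [100, 101], wait_for=lambda xid: None)
except WouldSelfDeadlock as e:
    print("ERROR:", e)
```

This is the shape of benefit (1) in Robert's message: the error fires at
tuple-lock time with a specific message, and no second PGPROC or deadlock
detector pass is involved.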
Re: [HACKERS] Autonomous Transaction is back
Noah Misch wrote:
> If the autonomous transaction can interact with uncommitted work in a
> way that other backends could not, crazy things happen when the
> autonomous transaction commits and the suspended transaction aborts:
>
> CREATE TABLE t (c) AS SELECT 1;
> BEGIN;
> UPDATE t SET c = 2 WHERE c = 1;
> BEGIN_AUTONOMOUS;
> UPDATE t SET c = 3 WHERE c = 1;
> UPDATE t SET c = 4 WHERE c = 2;
> COMMIT_AUTONOMOUS;
> ROLLBACK;
>
> If you replace the autonomous transaction with a savepoint, the c=3
> update finds no rows, and the c=4 update changes one row.  When the
> outer transaction aborts, only the original c=1 row remains live.  If
> you replace the autonomous transaction with a dblink/pg_background call,
> the c=3 update waits indefinitely for c=2 to commit or abort, an
> undetected deadlock.
>
> My starting expectation is that the semantics of an autonomous
> transaction will be exactly those of dblink/pg_background.  (I said that
> during the unconference session.)  The application would need to read
> data from tables before switching to the autonomous section.  Autonomous
> transactions are then a performance and syntactic help, not a source of
> new semantics.  Does any database have autonomous transactions that do
> otherwise?

Oracle behaves like that, i.e. it deadlocks with your example:

SQL> SELECT * FROM t;

         C
----------
         1

SQL> CREATE PROCEDURE proc2 IS
  2  PRAGMA AUTONOMOUS_TRANSACTION;
  3  BEGIN
  4     UPDATE t SET c = 3 WHERE c = 1;
  5     UPDATE t SET c = 4 WHERE c = 2;
  6     COMMIT;
  7  END;
  8  /

Procedure created.

SQL> CREATE PROCEDURE proc1 IS
  2  BEGIN
  3     UPDATE t SET c = 2 WHERE c = 1;
  4     proc2;
  5     ROLLBACK;
  6  END;
  7  /

Procedure created.

SQL> CALL proc1();
CALL proc1()
     *
ERROR at line 1:
ORA-00060: deadlock detected while waiting for resource
ORA-06512: at "LAURENZ.PROC2", line 4
ORA-06512: at "LAURENZ.PROC1", line 4

Yours,
Laurenz Albe
Re: [HACKERS] Autonomous Transaction is back
On Sat, Aug 15, 2015 at 10:20:55PM -0300, Alvaro Herrera wrote:
> Noah Misch wrote:
> > In today's scenarios, the later query cannot commit unless the
> > suspended query also commits.  (Changing that is the raison d'être of
> > autonomous transactions.)  If the autonomous transaction can interact
> > with uncommitted work in a way that other backends could not, crazy
> > things happen when the autonomous transaction commits and the
> > suspended transaction aborts:
> >
> > CREATE TABLE t (c) AS SELECT 1;
> > BEGIN;
> > UPDATE t SET c = 2 WHERE c = 1;
> > BEGIN_AUTONOMOUS;
> > UPDATE t SET c = 3 WHERE c = 1;
> > UPDATE t SET c = 4 WHERE c = 2;
> > COMMIT_AUTONOMOUS;
> > ROLLBACK;
> >
> > If you replace the autonomous transaction with a savepoint, the c=3
> > update finds no rows, and the c=4 update changes one row.  When the
> > outer transaction aborts, only the original c=1 row remains live.  If
> > you replace the autonomous transaction with a dblink/pg_background
> > call, the c=3 update waits indefinitely for c=2 to commit or abort, an
> > undetected deadlock.
>
> Maybe what we need to solve this is to restrict what the autonomous
> transaction can do; for instance, make it so that the autonomous
> transaction can see all rows of the outer transaction as if the outer
> transaction were committed, but trying to update any such row raises an
> error.  As far as I can see, this closes this particular problem.  (We
> likely need additional rules to close all holes, but hopefully you get
> the idea.)  Perhaps there exists a set of rules strong enough to
> eliminate all problematic visibility scenarios, but which still enables
> behavior useful enough to cover the proposed use cases.  The audit
> scenario is covered because the audit trail doesn't need to modify the
> audited tuples themselves, only read them.

My starting expectation is that the semantics of an autonomous transaction
will be exactly those of dblink/pg_background.  (I said that during the
unconference session.)  The application would need to read data from
tables before switching to the autonomous section.  Autonomous
transactions are then a performance and syntactic help, not a source of
new semantics.  Does any database have autonomous transactions that do
otherwise?
Re: [HACKERS] Autonomous Transaction is back
On Fri, Aug 07, 2015 at 11:26:08AM -0400, Robert Haas wrote:
> On Thu, Aug 6, 2015 at 11:04 PM, Merlin Moncure <mmonc...@gmail.com> wrote:
> > I don't necessarily disagree with what you're saying, but it's not
> > clear to me what the proposed behavior is.  Since the AT can commit
> > before the outer, ISTM *any* ungranted lock requested by the AT but
> > held by the outer leads to either A: functional deadlock (regardless
> > of implementation details) or B: special behavior.
>
> I don't accept that.  We've already GOT cases where a query can be
> suspended and other queries can be running in the same backend.  You can
> do that via cursors.  Those cases work fine, and the deadlock detector
> doesn't know anything about them.  How is this any different?

In today's scenarios, the later query cannot commit unless the suspended
query also commits.  (Changing that is the raison d'être of autonomous
transactions.)  If the autonomous transaction can interact with
uncommitted work in a way that other backends could not, crazy things
happen when the autonomous transaction commits and the suspended
transaction aborts:

CREATE TABLE t (c) AS SELECT 1;
BEGIN;
UPDATE t SET c = 2 WHERE c = 1;
BEGIN_AUTONOMOUS;
UPDATE t SET c = 3 WHERE c = 1;
UPDATE t SET c = 4 WHERE c = 2;
COMMIT_AUTONOMOUS;
ROLLBACK;

If you replace the autonomous transaction with a savepoint, the c=3 update
finds no rows, and the c=4 update changes one row.  When the outer
transaction aborts, only the original c=1 row remains live.  If you
replace the autonomous transaction with a dblink/pg_background call, the
c=3 update waits indefinitely for c=2 to commit or abort, an undetected
deadlock.

Suppose you make the autonomous transaction see tuples like in the
savepoint case.  The c=3 update finds no rows to update, and the c=4
update changes one row.  When the outer transaction aborts, you have two
live rows (c=1 and c=4).  Suppose you instead make the autonomous
transaction see tuples like in the dblink case, yet let c=3 ignore the
lock and change a row.  If both the autonomous transaction and the outer
transaction were to commit, then you get two live rows (c=2 and c=3).
Neither of those outcomes is acceptable, of course.  In today's tuple
update rules, c=3 must deadlock[1].  Other credible tuple update rules may
not have this problem, but nothing jumps to mind.

[1] That's not to say it must use the shmem lock structures and deadlock
detector.
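To make the two rejected outcomes concrete, the scenario above can be
replayed in a tiny MVCC toy, with each row version reduced to a
(value, xmin, xmax) record.  This is purely illustrative; it models
nothing of PostgreSQL's actual visibility code:

```python
# Replay Noah's scenario under two candidate visibility rules and collect
# the rows left live after COMMIT_AUTONOMOUS plus the outer txn's fate.
def run(at_sees_outer_uncommitted, outer_commits):
    OUTER, AT = "outer", "at"
    committed = {"init"}
    versions = [{"c": 1, "xmin": "init", "xmax": None}]

    def visible(v, xact):
        def sees(xid):  # which transactions' effects this xact can see
            return (xid in committed or xid == xact or
                    (xact == AT and at_sees_outer_uncommitted and xid == OUTER))
        return sees(v["xmin"]) and not (v["xmax"] is not None and sees(v["xmax"]))

    def update(xact, old_c, new_c):
        for v in list(versions):
            if visible(v, xact) and v["c"] == old_c:
                v["xmax"] = xact  # in the "ignore the lock" variant this
                # overwrites the outer xact's lock instead of waiting on it
                versions.append({"c": new_c, "xmin": xact, "xmax": None})

    update(OUTER, 1, 2)   # BEGIN; UPDATE t SET c = 2 WHERE c = 1;
    update(AT, 1, 3)      # BEGIN_AUTONOMOUS; UPDATE t SET c = 3 WHERE c = 1;
    update(AT, 2, 4)      # UPDATE t SET c = 4 WHERE c = 2;
    committed.add(AT)     # COMMIT_AUTONOMOUS;
    if outer_commits:
        committed.add(OUTER)  # otherwise the outer txn does ROLLBACK;
    return {v["c"] for v in versions
            if v["xmin"] in committed
            and (v["xmax"] is None or v["xmax"] not in committed)}

# Savepoint-style visibility, outer aborts: live rows c=1 and c=4.
print(run(at_sees_outer_uncommitted=True, outer_commits=False))
# Dblink-style visibility but c=3 ignores the lock, both commit: c=2 and c=3.
print(run(at_sees_outer_uncommitted=False, outer_commits=True))
```

Both results match the outcomes Noah derives by hand, which is the point:
each candidate rule yields a row set inconsistent with any serial history,
leaving the deadlock (or an equivalent error) as the only correct result.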
Re: [HACKERS] Autonomous Transaction is back
Noah Misch wrote:
> In today's scenarios, the later query cannot commit unless the suspended
> query also commits.  (Changing that is the raison d'être of autonomous
> transactions.)  If the autonomous transaction can interact with
> uncommitted work in a way that other backends could not, crazy things
> happen when the autonomous transaction commits and the suspended
> transaction aborts:
>
> CREATE TABLE t (c) AS SELECT 1;
> BEGIN;
> UPDATE t SET c = 2 WHERE c = 1;
> BEGIN_AUTONOMOUS;
> UPDATE t SET c = 3 WHERE c = 1;
> UPDATE t SET c = 4 WHERE c = 2;
> COMMIT_AUTONOMOUS;
> ROLLBACK;
>
> If you replace the autonomous transaction with a savepoint, the c=3
> update finds no rows, and the c=4 update changes one row.  When the
> outer transaction aborts, only the original c=1 row remains live.  If
> you replace the autonomous transaction with a dblink/pg_background call,
> the c=3 update waits indefinitely for c=2 to commit or abort, an
> undetected deadlock.

Maybe what we need to solve this is to restrict what the autonomous
transaction can do; for instance, make it so that the autonomous
transaction can see all rows of the outer transaction as if the outer
transaction were committed, but trying to update any such row raises an
error.  As far as I can see, this closes this particular problem.  (We
likely need additional rules to close all holes, but hopefully you get the
idea.)  Perhaps there exists a set of rules strong enough to eliminate all
problematic visibility scenarios, but which still enables behavior useful
enough to cover the proposed use cases.  The audit scenario is covered
because the audit trail doesn't need to modify the audited tuples
themselves, only read them.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Autonomous Transaction is back
On Thu, Aug 6, 2015 at 11:04 PM, Merlin Moncure mmonc...@gmail.com wrote: I don't necessarily disagree with what you're saying, but it's not clear to me what the proposed behavior is. Since the AT can commit before the outer, ISTM *any* ungranted lock requested by the AT but held by the outer leads to either A: functional deadlock (regardless of implementation details) or B: special behavior. I don't accept that. We've already GOT cases where a query can be suspended and other queries can be running in the same backend. You can do that via cursors. Those cases work fine, and the deadlock detector doesn't know anything about them. How is this any different? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
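The cursor case Robert mentions is reproducible today: a suspended query and a fresh query coexist in one backend, and the deadlock detector knows nothing about it. A minimal session, using only system catalogs so it runs anywhere:

```sql
BEGIN;
DECLARE c CURSOR FOR SELECT relname FROM pg_class;
FETCH 10 FROM c;               -- the cursor's query is now suspended mid-execution
SELECT count(*) FROM pg_am;    -- a second query runs in the same backend meanwhile
FETCH 10 FROM c;               -- the suspended query resumes where it left off
COMMIT;
```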
Re: [HACKERS] Autonomous Transaction is back
On Mon, Aug 3, 2015 at 9:09 AM, Merlin Moncure mmonc...@gmail.com wrote: hm. OK, what's the behavior of: BEGIN UPDATE foo SET x = x + 1 WHERE foo_id = 1; BEGIN WITH AUTONOMOUS TRANSACTION UPDATE foo SET x = x + 1 WHERE foo_id = 1; END; RAISE EXCEPTION ...; EXCEPTION ... END; Sure, so that case might need a little bit of special handling. That doesn't mean it's a good idea for heavyweight locks to conflict in general. I think you're going to find that implementing the latter is an extremely unrewarding task, and that the benefits are seriously negative. For example, consider: BEGIN UPDATE foo SET x = x + 1 WHERE foo_id = 1; BEGIN WITH AUTONOMOUS TRANSACTION UPDATE foo SET x = x + 1 WHERE foo_id = 2; END; END; Now, suppose that a concurrent session does LOCK TABLE foo after the first UPDATE and before the second one. That's now a soft deadlock. But the only way the deadlock detector can see that is if the main transaction and the autonomous transaction have separate PGPROC entries, which is a design we explicitly rejected because it puts a tight limit on the number of ATs that can be in progress and the level to which those ATs can be nested. But let's say you don't care, so we go back to that design. The deadlock detector will have to be taught that the outer transaction can't help but wait for the inner transaction, so we teach it that. Now it can see that the only way to resolve the deadlock without aborting any transactions is to reorder the lock request from the autonomous transaction ahead of the concurrent session that is seeking a full table lock. So the autonomous transaction acquires the lock without blocking after all. You have exactly the same result that you would have had anyway but with a phenomenal amount of additional code and complexity. And for what? In the original example, the way the deadlock is going to be reported is like this: ERROR: deadlock detected DETAIL: Process 12345 waits for ShareLock on transaction 1000; blocked by process 12345. 
Process 12345 waits for ShareLock on transaction 1001; blocked by process 12345. That is not a model of clarity. On the other hand, if you just make a rule that attempting to update or delete a tuple that an outer transaction has already updated throws a bespoke error, you can do something like this: ERROR: tuple to be updated was already modified by a suspended outer transaction ...which has precedent in an existing message in trigger.c. Similarly, if you try to drop a table that the outer transaction has locked, the natural thing is for CheckTableNotInUse() to catch that and report it this way: ERROR: cannot DROP TABLE foo because it is being used by active queries in this session If you work hard enough, you can instead make that generate a deadlock error message, but you're going to have to work pretty hard, and the result is worse. I'd really like to hear some more *specific* scenarios where it's valuable for locks to conflict between the outer transaction and the AT. I grant that tuple updates are a case where the conflict has to be detected somehow, but I don't accept that the lock manager is the best way to do that, and I don't accept that there are a large number of other cases that will need similar handling. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
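Robert's LOCK TABLE interleaving, written out as a timeline. The `BEGIN WITH AUTONOMOUS TRANSACTION` block is the proposed, not-yet-existing syntax, and `foo` is the example table from the thread:

```sql
-- Session A
BEGIN;
UPDATE foo SET x = x + 1 WHERE foo_id = 1;     -- takes RowExclusiveLock on foo

-- Session B (concurrent)
LOCK TABLE foo;                                -- ACCESS EXCLUSIVE; queues behind session A

-- Session A again
BEGIN WITH AUTONOMOUS TRANSACTION              -- hypothetical syntax
  UPDATE foo SET x = x + 1 WHERE foo_id = 2;   -- its lock request queues behind session B:
END;                                           -- a soft deadlock, resolvable only by
                                               -- reordering the AT's request ahead of B
```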
Re: [HACKERS] Autonomous Transaction is back
On Thu, Aug 6, 2015 at 4:15 PM, Robert Haas robertmh...@gmail.com wrote: On Mon, Aug 3, 2015 at 9:09 AM, Merlin Moncure mmonc...@gmail.com wrote: hm. OK, what's the behavior of: BEGIN UPDATE foo SET x = x + 1 WHERE foo_id = 1; BEGIN WITH AUTONOMOUS TRANSACTION UPDATE foo SET x = x + 1 WHERE foo_id = 1; END; RAISE EXCEPTION ...; EXCEPTION ... END; Sure, so that case might need a little bit of special handling. That doesn't mean it's a good idea for heavyweight locks to conflict in general. I think you're going to find that implementing the latter is an extremely unrewarding task, and that the benefits are seriously negative. For example, consider: BEGIN UPDATE foo SET x = x + 1 WHERE foo_id = 1; BEGIN WITH AUTONOMOUS TRANSACTION UPDATE foo SET x = x + 1 WHERE foo_id = 2; END; END; Now, suppose that a concurrent session does LOCK TABLE foo after the first UPDATE and before the second one. That's now a soft deadlock. But the only way the deadlock detector can see that is if the main transaction and the autonomous transaction have separate PGPROC entries, which is a design we explicitly rejected because it puts a tight limit on the number of ATs that can be in progress and the level to which those ATs can be nested. But let's say you don't care, so we go back to that design. The deadlock detector will have to be taught that the outer transaction can't help but wait for the inner transaction, so we teach it that. Now it can see that the only way to resolve the deadlock without aborting any transactions is to reorder the lock request from the autonomous transaction ahead of the concurrent session that is seeking a full table lock. So the autonomous transaction acquires the lock without blocking after all. You have exactly the same result that you would have had anyway but with a phenomenal amount of additional code and complexity. And for what? 
In the original example, the way the deadlock is going to be reported is like this: ERROR: deadlock detected DETAIL: Process 12345 waits for ShareLock on transaction 1000; blocked by process 12345. Process 12345 waits for ShareLock on transaction 1001; blocked by process 12345. That is not a model of clarity. On the other hand, if you just make a rule that attempting to update or delete a tuple that an outer transaction has already updated throws a bespoke error, you can do something like this: ERROR: tuple to be updated was already modified by a suspended outer transaction ...which has precedent in an existing message in trigger.c. Similarly, if you try to drop a table that the outer transaction has locked, the natural thing is for CheckTableNotInUse() to catch that and report it this way: ERROR: cannot DROP TABLE foo because it is being used by active queries in this session If you work hard enough, you can instead make that generate a deadlock error message, but you're going to have to work pretty hard, and the result is worse. I'd really like to hear some more *specific* scenarios where it's valuable for locks to conflict between the outer transaction and the AT. I grant that tuple updates are a case where the conflict has to be detected somehow, but I don't accept that the lock manager is the best way to do that, and I don't accept that there are a large number of other cases that will need similar handling. I don't necessarily disagree with what you're saying, but it's not clear to me what the proposed behavior is. Since the AT can commit before the outer, ISTM *any* ungranted lock requested by the AT but held by the outer leads to either A: functional deadlock (regardless of implementation details) or B: special behavior. Deadlocks would certainly require some acrobatics to detect and resolve due to the fact that one party to the lock is not in fact blocked on a lock but on the outer's execution state. 
So maybe the right thing to do is to simply ignore the problem and hang both transactions until timeout or cancel; this isn't really much different vs. ghetto dblink style AT that is done today in my experience. merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
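The "ghetto dblink style AT" pattern Merlin refers to looks roughly like this. It assumes the dblink extension and an `audit` table (both illustrative); the second connection commits independently of the caller, and, as noted upthread, if it ever waits on a lock held by the calling session, that wait is an undetected deadlock:

```sql
CREATE EXTENSION IF NOT EXISTS dblink;

CREATE FUNCTION audit_autonomous(msg text) RETURNS void AS $$
BEGIN
  -- open a second connection back into the same database
  PERFORM dblink_connect('at_conn', 'dbname=' || current_database());
  -- this INSERT commits in the remote session even if the caller later rolls back
  PERFORM dblink_exec('at_conn',
      format('INSERT INTO audit(message) VALUES (%L)', msg));
  PERFORM dblink_disconnect('at_conn');
END;
$$ LANGUAGE plpgsql;
```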
Re: [HACKERS] Autonomous Transaction is back
On 03 August 2015 18:40, Merlin Moncure [mailto:mmonc...@gmail.com] wrote: On Sun, Aug 2, 2015 at 11:37 PM, Rajeev rastogi rajeev.rast...@huawei.com wrote: On 31 July 2015 23:10, Robert Haas wrote: I think we're going entirely down the wrong path here. Why is it ever useful for a backend's lock requests to conflict with themselves, even with autonomous transactions? That seems like an artifact of somebody else's implementation that we should be happy we don't need to copy.

IMHO, since most of the locking is managed at the transaction level, not the backend level, and we consider the autonomous transaction to be an independent transaction, they may practically conflict. It is also right, as you said, that there are no useful use cases as such where an autonomous transaction conflicts with its main (parent) transaction. But we cannot take that for granted, as a user might make a mistake. So at least we should have some mechanism to handle this rare case, for which as of now I think throwing an error from the autonomous transaction is one solution. Once the error is thrown from the autonomous transaction, the main transaction may continue as is (or should the main transaction also be aborted?).

hm. OK, what's the behavior of:

BEGIN
  UPDATE foo SET x = x + 1 WHERE foo_id = 1;
  BEGIN WITH AUTONOMOUS TRANSACTION
    UPDATE foo SET x = x + 1 WHERE foo_id = 1;
  END;
  RAISE EXCEPTION ...;
EXCEPTION ... END;

It should throw an error (or something equivalent), as the second update will wait for the record lock to be released, which in this case will not happen until the second update finishes. So, catch-22.

Also,
*) What do the other candidate implementations do? IMO, compatibility should be the underlying design principle.

Oracle throws an error in such a case. But we can decide what behavior we want to keep.

*) What will the SQL-only feature look like?

Similar to PL; as mentioned in your example, we can provide the SQL-only feature also.

*) Is the SPI interface going to be extended to expose AT?

I don't think at this point that there is any need to expose the SPI interface for this.

Thanks and Regards,
Kumar Rajeev Rastogi
Re: [HACKERS] Autonomous Transaction is back
On Tue, Aug 4, 2015 at 4:12 AM, Rajeev rastogi rajeev.rast...@huawei.com wrote: On 03 August 2015 18:40, Merlin Moncure [mailto:mmonc...@gmail.com] Wrote: On Sun, Aug 2, 2015 at 11:37 PM, Rajeev rastogi rajeev.rast...@huawei.com wrote: On 31 July 2015 23:10, Robert Haas Wrote: I think we're going entirely down the wrong path here. Why is it ever useful for a backend's lock requests to conflict with themselves, even with autonomous transactions? That seems like an artifact of somebody else's implementation that we should be happy we don't need to copy. IMHO, since most of the locking are managed at transaction level not backend level and we consider main autonomous transaction to be independent transaction, then practically they may conflict right. It is also right as you said that there is no as such useful use-cases where autonomous transaction conflicts with main (parent) transaction. But we cannot take it for granted as user might make a mistake. So at- least we should have some mechanism to handle this rare case, for which as of now I think throwing error from autonomous transaction as one of the solution. Once error thrown from autonomous transaction, main transaction may continue as it is (or abort main transaction also??). hm. OK, what's the behavior of: BEGIN UPDATE foo SET x = x + 1 WHERE foo_id = 1; BEGIN WITH AUTONOMOUS TRANSACTION UPDATE foo SET x = x + 1 WHERE foo_id = 1; END; RAISE EXCEPTION ...; EXCEPTION ... END; It should throw an error (or something equivalent) as the second update will wait for record lock to get released, which in this case will not happen till second update finishes. So catch 22. Yeah. Point being, from my point of view autonomous transactions have to conflict with the master transaction (or any transaction really). I agree the right course of action is to error out immediately...what else could you do? 
There isn't even a deadlock in the classic sense, and allowing control to continue would result in indeterminate behavior FWICT.

Also,
*) What do the other candidate implementations do? IMO, compatibility should be the underlying design principle.

Oracle throws an error in such a case. But we can decide what behavior we want to keep.

gotcha. makes sense.

*) What will the SQL-only feature look like?

Similar to PL; as mentioned in your example, we can provide the SQL-only feature also.

*) Is the SPI interface going to be extended to expose AT?

I don't think at this point that there is any need to expose the SPI interface for this.

Ok, how do ATs work in a non-plpgsql (SQL only) scenario? Are you going to similarly extend BEGIN TRANSACTION?

merlin
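If BEGIN TRANSACTION were extended, the SQL-only form might look like the sketch below. This syntax is entirely hypothetical, and the `accounts`/`audit` tables are illustrative; it only makes Merlin's question concrete:

```sql
BEGIN;
UPDATE accounts SET failed_logins = failed_logins + 1 WHERE id = 42;

BEGIN AUTONOMOUS TRANSACTION;   -- hypothetical extension of BEGIN
INSERT INTO audit(message) VALUES ('login failure for account 42');
COMMIT;                         -- commits the audit row independently

ROLLBACK;                       -- the accounts update is discarded; the audit row survives
```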
Re: [HACKERS] Autonomous Transaction is back
On Sun, Aug 2, 2015 at 11:37 PM, Rajeev rastogi rajeev.rast...@huawei.com wrote: On 31 July 2015 23:10, Robert Haas Wrote: I think we're going entirely down the wrong path here. Why is it ever useful for a backend's lock requests to conflict with themselves, even with autonomous transactions? That seems like an artifact of somebody else's implementation that we should be happy we don't need to copy. IMHO, since most of the locking are managed at transaction level not backend level and we consider main autonomous transaction to be independent transaction, then practically they may conflict right. It is also right as you said that there is no as such useful use-cases where autonomous transaction conflicts with main (parent) transaction. But we cannot take it for granted as user might make a mistake. So at-least we should have some mechanism to handle this rare case, for which as of now I think throwing error from autonomous transaction as one of the solution. Once error thrown from autonomous transaction, main transaction may continue as it is (or abort main transaction also??). hm. OK, what's the behavior of: BEGIN UPDATE foo SET x = x + 1 WHERE foo_id = 1; BEGIN WITH AUTONOMOUS TRANSACTION UPDATE foo SET x = x + 1 WHERE foo_id = 1; END; RAISE EXCEPTION ...; EXCEPTION ... END; Also, *) What do the other candidate implementations do? IMO, compatibility should be the underlying design principle. *) What will the SQL only feature look like? *) Is the SPI interface going to be extended to expose AT? merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Autonomous Transaction is back
On 31 July 2015 23:10, Robert Haas wrote: On Tue, Jul 28, 2015 at 6:01 AM, Craig Ringer cr...@2ndquadrant.com wrote: That should be practical to special-case by maintaining a list of parent transaction (virtual?) transaction IDs. Attempts to wait on a lock held by any of those should fail immediately. There's no point waiting for the deadlock detector since the outer tx can never progress and commit/rollback to release locks, and it might not be able to see the parent/child relationship from outside the backend doing the nested tx anyway.

I think we're going entirely down the wrong path here. Why is it ever useful for a backend's lock requests to conflict with themselves, even with autonomous transactions? That seems like an artifact of somebody else's implementation that we should be happy we don't need to copy.

IMHO, since most of the locking is managed at the transaction level, not the backend level, and we consider the autonomous transaction to be an independent transaction, they may practically conflict. It is also right, as you said, that there are no useful use cases as such where an autonomous transaction conflicts with its main (parent) transaction. But we cannot take that for granted, as a user might make a mistake. So at least we should have some mechanism to handle this rare case, for which as of now I think throwing an error from the autonomous transaction is one solution. Once the error is thrown from the autonomous transaction, the main transaction may continue as is (or should the main transaction also be aborted?). Any other suggestion to handle this will be really helpful.

Thanks and Regards,
Kumar Rajeev Rastogi
Re: [HACKERS] Autonomous Transaction is back
On Tue, Jul 28, 2015 at 6:01 AM, Craig Ringer cr...@2ndquadrant.com wrote: That should be practical to special-case by maintaining a list of parent transaction (virtual?) transaction IDs. Attempts to wait on a lock held by any of those should fail immediately. There's no point waiting for the deadlock detector since the outer tx can never progress and commit/rollback to release locks, and it might not be able to see the parent/child relationship from outside the backend doing the nested tx anyway. I think we're going entirely down the wrong path here. Why is it ever useful for a backend's lock requests to conflict with themselves, even with autonomous transactions? That seems like an artifact of somebody else's implementation that we should be happy we don't need to copy. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Autonomous Transaction is back
On 28 July 2015 03:21, Josh Berkus wrote: On 07/27/2015 02:47 AM, Rajeev rastogi wrote: Why have any fixed maximum?

Since we are planning to have nested autonomous transactions, it is required to have a limit on this so that resources can be controlled.

Is there a particular reason why this limit wouldn't just be max_stack_depth?

We will need to allocate some initial resources in order to handle all nested autonomous transactions. So I think it is better to have a separate configuration parameter.

Thanks and Regards,
Kumar Rajeev Rastogi
Re: [HACKERS] Autonomous Transaction is back
On 23 July 2015 at 13:31, Rajeev rastogi rajeev.rast...@huawei.com wrote: 1.The autonomous transaction treated as a completely different transaction from the master transaction. Personally I think that's a lot more useful than having the inner tx able to see the outer tx's uncommitted changes. 2.It should be allowed to deadlock with master transaction. We need to work-out a solution to avoid deadlock. The deadlock case in autonomous tx's is a bit different. Assuming you don't intend to allow interleaving, where you can switch between transactions at will rather than just at begin/commit, the only way a deadlock can happen is when the outer tx holds a lock that the inner tx tries to acquire. That should be practical to special-case by maintaining a list of parent transaction (virtual?) transaction IDs. Attempts to wait on a lock held by any of those should fail immediately. There's no point waiting for the deadlock detector since the outer tx can never progress and commit/rollback to release locks, and it might not be able to see the parent/child relationship from outside the backend doing the nested tx anyway. There's no need to check the parent list until we actually try to wait on a lock, though I don't know whether it's practical to delay until then. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Autonomous Transaction is back
On Tue, Jul 28, 2015 at 12:56 AM, Josh Berkus j...@agliodbs.com wrote: Ah, ok. The goal of the project is that the writer of X() *cannot* prevent Y() from writing its data (B1) and committing it. One of the primary use cases for ATX is audit triggers. If a function writer could override ATX and prevent the audit triggers from committing, then that use case would be violated. Can you explain what use case you have where simply telling the staff "if you use ATX without clearing it, you'll be fired" is not sufficient? Possibly there's something we failed to account for in the unconference discussion.

I fully understand and agree why you want to prevent X() from letting Y() commit, if the use case is e.g. auditing. I'll try to explain where I'm coming from by providing a bit of background and context. One of the greatest strengths of writing an entire application using only SQL and plpgsql functions is that you don't have to worry about side effects when calling functions, since you are always in total control: you can roll back all your writes at your stack depth and deeper down the stack, and the caller can likewise be certain that if the function it called threw an exception, all of its work is rolled back. However, if e.g. making use of plperlu functions, it's possible those functions have written files to disk or made network connections to the outside world, i.e. they may have caused side effects that naturally cannot be rolled back by Postgres. Currently, in our codebase at Trustly, we already have quite a few plperlu functions. In each one of them we need to think carefully about side effects; it's usually fine, since most of them are immutable. But if we had to worry about all plpgsql functions being able to write things without consent from the caller, then we would have a completely different situation, with increased complexity and risk of failures.
"if you use ATX without clearing it, you'll be fired"

We trust each other, so we don't have that kind of problem. But still, even when trusting your own and others' code, quite often you make use of a function with small details you have forgotten about or never knew in the first place, and if that little detail were something written in an ATX, that could be a problem if you, the caller, didn't want whatever was written to be written.

Don't get me wrong, ATX is something I would absolutely love, since then you could, for instance, in a function doing password validation, update the FailedLoginAttempts column in an ATX and then still raise an exception to roll back the operation and return an error to the client. However, the need for ATXs is, at least for us, a special need you won't have in most functions, and since the risk and complexity increase with it, I would prefer that it can be enabled/disabled by default globally and explicitly enabled/disabled per function. If the global default is disabled, or if it's disabled for a specific function like X() in your example, and it's enabled for Y(), then when X() tries to call Y() you should get an error even before Y() is executed. That way we can still do auditing: X() couldn't execute Y() since Y() was declared as AT, and that would be the same as X() not having the line of code that executes Y() at all, something X() is in power of, as whether X() calls Y() or not is ultimately X()'s decision. If we declare entire functions as AT, then we only have to check before executing the function whether AT is allowed in the context, determined by the global default or by whether the caller function is defined AT or NOT AT.

Use cases:
1. X[AT] -> Y[AT]: OK, since caller X() is declared AT, i.e. allows AT in itself and in callees.
2. X[AT] -> Y[NOT AT]: OK, since caller X() is declared AT, and Y() is NOT AT, i.e. not making use of AT and not allowing it in callees, which is not in violation of anything.
3. X[NOT AT] -> Y[AT]: Invalid, since caller X() is declared NOT AT, i.e. disallows AT in itself and in callees, and Y() is declared AT, so it cannot be executed.
4. X[NOT AT] -> Y[NOT AT]: OK, since caller X() is declared NOT AT, and Y() is also declared NOT AT, so it can be executed.
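The per-function declaration Joel describes could be sketched as below. The `AUTONOMOUS` / `NOT AUTONOMOUS` clause is purely hypothetical syntax, invented here to illustrate case 3:

```sql
-- hypothetical per-function declarations (no such clause exists)
CREATE FUNCTION y() RETURNS void
  LANGUAGE plpgsql AUTONOMOUS            -- Y may start and commit an ATX
  AS $$ BEGIN /* ... ATX work ... */ END $$;

CREATE FUNCTION x() RETURNS void
  LANGUAGE plpgsql NOT AUTONOMOUS        -- X forbids ATX in itself and in callees
  AS $$ BEGIN PERFORM y(); END $$;

SELECT x();  -- case 3: would fail before y() executes, since y() is declared AUTONOMOUS
```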
Re: [HACKERS] Autonomous Transaction is back
On 28 July 2015 15:31, Craig Ringer Wrote: 2.It should be allowed to deadlock with master transaction. We need to work-out a solution to avoid deadlock. The deadlock case in autonomous tx's is a bit different. Assuming you don't intend to allow interleaving, where you can switch between transactions at will rather than just at begin/commit, the only way a deadlock can happen is when the outer tx holds a lock that the inner tx tries to acquire. That should be practical to special-case by maintaining a list of parent transaction (virtual?) transaction IDs. Attempts to wait on a lock held by any of those should fail immediately. There's no point waiting for the deadlock detector since the outer tx can never progress and commit/rollback to release locks, and it might not be able to see the parent/child relationship from outside the backend doing the nested tx anyway. Thanks, sounds to be a good idea. I shall evaluate the same. Thanks and Regards, Kumar Rajeev Rastogi -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Autonomous Transaction is back
Ah, you're missing how commits in ATX are expected to work. Let me illustrate:

X (
  Data write A1
  call Y (
    Start ATX
    Data write B1
    Commit ATX
  )
  Data write A2
  Exception
)

In this workflow, B1 would be committed and persistent. Neither A1 nor A2 would be committed, or visible to other users. Depending on what implementation we end up with, A1 might not even be visible to Y().

A1 should never be visible to Y(), else we will end up with inconsistencies. E.g. A1 is a primary key and B1 writes a foreign key referencing A1. Commit ATX will not complain, as it sees A1, but in case X rolls back, we may have B1 referencing nothing. Am I missing something?

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
Re: [HACKERS] Autonomous Transaction is back
On Fri, Jul 24, 2015 at 9:39 PM, Merlin Moncure mmonc...@gmail.com wrote: On Thu, Jul 23, 2015 at 1:49 PM, Josh Berkus j...@agliodbs.com wrote: Batch Jobs: large data-manipulation tasks which need to be broken up into segments, with each segment committing separately. Example: updating 1 million records in batches of 1000.

Autonomous transactions are not a good fit for this case; stored procedures are a better way to go for any scenario where you don't want to be in a snapshot (for example, suppose you want to change the isolation level on the fly).

Hm, you mean we need real stored procedures in PostgreSQL and not just functions? If not, I think it would be sufficient to add Autonomous Transaction support to the type of functions we already have in pg, to allow writing a batch-job function which would commit after X modified rows, instead of having to write a script in an external language such as Perl to call the function in a while loop and commit between each function call.

However, we should also add a way for the caller to protect against an Autonomous Transaction in a function called by the caller. Imagine you're the author of function X(), and within X() you make use of some other function Y() written by some other author, and within your function X() it's very important that either all of your work or none at all gets committed. Then you need to make sure none of the changes you made before calling Y() get committed, and thus we need a way to prevent Y() from starting and committing an Autonomous Transaction. Otherwise we would increase the risk and complexity of working with functions and plpgsql in PostgreSQL, as you would then need to be sure none of the functions you use within a function will start and commit an ATX.
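The batch-commit pattern Merlin alludes to can be sketched under the assumption of a procedure-like object that may issue COMMIT mid-run (hypothetical at the time of this thread; `big_table` and its columns are likewise illustrative):

```sql
CREATE PROCEDURE update_in_batches()
LANGUAGE plpgsql
AS $$
DECLARE
  nrows int;
BEGIN
  LOOP
    UPDATE big_table SET processed = true
    WHERE id IN (SELECT id FROM big_table WHERE NOT processed LIMIT 1000);
    GET DIAGNOSTICS nrows = ROW_COUNT;
    EXIT WHEN nrows = 0;
    COMMIT;  -- each batch commits separately; the snapshot and locks are released
  END LOOP;
END;
$$;

CALL update_in_batches();
```

This is exactly what a function cannot do: a function always runs inside one snapshot and one transaction, which is why the batch-job case points at procedures rather than autonomous transactions.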
Re: [HACKERS] Autonomous Transaction is back
On 07/27/2015 02:41 PM, Joel Jacobson wrote: However, we should also add a way for the caller to protect against an Autonomous Transaction in a function called by the caller. Imagine if you're the author of function X() and within X() make use of some other function Y() which has been written by some other author, and within your function X(), it's very important either all of your work or none at all gets committed, then you need to make sure none of the changes you made before calling Y() gets committed, and thus we need a way to prevent Y() from starting and committing an Autonomous Transaction, otherwise we would increase the risk and complexity of working with functions and plpgsql in PostgreSQL as you would then need to be sure none of the functions you are using within a function will start and commit an ATX.

Ah, you're missing how commits in ATX are expected to work. Let me illustrate:

X (
  Data write A1
  call Y (
    Start ATX
    Data write B1
    Commit ATX
  )
  Data write A2
  Exception
)

In this workflow, B1 would be committed and persistent. Neither A1 nor A2 would be committed, or visible to other users. Depending on what implementation we end up with, A1 might not even be visible to Y(). So that solves your use case without any need to block ATXs in called functions. However, it leads to some interesting cases involving self-deadlocks; see the original post on this thread.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
Re: [HACKERS] Autonomous Transaction is back
On 07/27/2015 02:47 AM, Rajeev rastogi wrote: Why have any fixed maximum? Since we are planning to have nested autonomous transaction, so it is required to have limit on this so that resources can be controlled. Is there a particular reason why this limit wouldn't just be max_stack_depth? -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Autonomous Transaction is back
On Mon, Jul 27, 2015 at 4:41 PM, Joel Jacobson j...@trustly.com wrote: On Fri, Jul 24, 2015 at 9:39 PM, Merlin Moncure mmonc...@gmail.com wrote: On Thu, Jul 23, 2015 at 1:49 PM, Josh Berkus j...@agliodbs.com wrote: Batch Jobs: large data-manipulation tasks which need to be broken up into segments, with each segment committing separately. Example: updating 1 million records in batches of 1000. Autonomous transactions are not a good fit for this case; stored procedures are a better way to go for any scenario where you don't want to be in a snapshot (for example, suppose you want to change the isolation level on the fly). Hm, you mean we need real stored procedures in PostgreSQL and not just functions?

Yes, exactly. Autonomous transactions aren't really set up for cases where the function runs for a very long time or indefinitely. This is the 'advancing xmin' problem, as Josh puts it, but I think the problem is much bigger than that. Anyways, this is mostly irrelevant to autonomous transactions as long as the design isn't extended to try to cover that case. Is the Autonomous Transaction feature only going to be exposed through pl/pgsql?

merlin
Re: [HACKERS] Autonomous Transaction is back
On 7/27/15 5:56 PM, Josh Berkus wrote:
> Can you explain what use case you have where simply telling the staff "if you use ATX without clearing it, you'll be fired" is not sufficient? Possibly there's something we failed to account for in the unconference discussion.

That there's no way to enforce that, short of hand-auditing code? There are already enough things that are difficult or impossible to enforce; I'd rather not add another one.

-- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [HACKERS] Autonomous Transaction is back
On Mon, Jul 27, 2015 at 11:49 PM, Josh Berkus j...@agliodbs.com wrote:
> Ah, you're missing how commits in ATX are expected to work. Let me illustrate:
>
>     X (
>         Data write A1
>         call Y(
>             Start ATX
>             Data write B1
>             Commit ATX
>         )
>         Data write A2
>         Exception
>     )
>
> In this workflow, B1 would be committed and persistent. Neither A1 nor A2 would be committed, or visible to other users. Depending on what implementation we end up with, A1 might not even be visible to Y(). So that solves your use case without any need to block ATXs in called functions. However, it leads to some interesting cases involving self-deadlocks; see the original post on this thread.

I don't follow. In your example above, if I'm X(), how do I ensure Y() won't have committed anything at all when I later, at Exception, decide to roll back everything from Data write A1 to Data write A2, including any writes made by Y() (in the example, Data write B1)? I understand the Exception will take care of rolling back my (X's) writes, but that's not sufficient if you want to make sure you roll back *everything*, including any writes made by functions you call.

Right now, when writing a function, if you raise an exception, you can be sure all writes you have made will be rolled back. Your caller might catch the exception and decide to carry on and commit work made before your function was called, but at least you can be confident your own writes won't be committed as long as you don't catch the exception you raised in your own function. If I understand it correctly, that would change with the addition of Autonomous Transactions, unless we are given a way to prevent a function we call from starting and committing an Autonomous Transaction. Wrong? If so, then please show how to prevent Y() from committing the Data write B1 in your example; I don't get it.
Re: [HACKERS] Autonomous Transaction is back
On 07/27/2015 03:12 PM, Joel Jacobson wrote:
> On Mon, Jul 27, 2015 at 11:49 PM, Josh Berkus j...@agliodbs.com wrote:
>> Ah, you're missing how commits in ATX are expected to work. Let me illustrate:
>>
>>     X (
>>         Data write A1
>>         call Y(
>>             Start ATX
>>             Data write B1
>>             Commit ATX
>>         )
>>         Data write A2
>>         Exception
>>     )
>>
>> In this workflow, B1 would be committed and persistent. Neither A1 nor A2 would be committed, or visible to other users. Depending on what implementation we end up with, A1 might not even be visible to Y(). So that solves your use case without any need to block ATXs in called functions. However, it leads to some interesting cases involving self-deadlocks; see the original post on this thread.
>
> I don't follow. In your example above, if I'm X(), how do I ensure Y() won't have committed anything at all when I later, at Exception, decide to roll back everything from Data write A1 to Data write A2, including any writes made by Y() (in the example, Data write B1)?

Ah, ok. The goal of the project is that the writer of X() *cannot* prevent Y() from writing its data (B1) and committing it. One of the primary use cases for ATX is audit triggers. If a function writer could override ATX and prevent the audit triggers from committing, then that use case would be violated.

Can you explain what use case you have where simply telling the staff "if you use ATX without clearing it, you'll be fired" is not sufficient? Possibly there's something we failed to account for in the unconference discussion.

-- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [HACKERS] Autonomous Transaction is back
On 7/27/15 5:12 PM, Joel Jacobson wrote:
> Right now, when writing a function, if you raise an exception, you can be sure all writes you have made will be rolled back. Your caller might catch the exception and decide to carry on and commit work made before your function was called, but at least you can be confident your own writes won't be committed as long as you don't catch the exception you raised in your own function. If I understand it correctly, that would change with the addition of Autonomous Transactions, unless we are given a way to prevent a function we call from starting and committing an Autonomous Transaction. Wrong? If so, then please show how to prevent Y() from committing the Data write B1 in your example; I don't get it.

What's being described here doesn't make sense in either use case ([1], [2]), but I do understand the concern about what 3rd-party software is doing. It would be nice to have the ability to disallow and/or disable autonomous transactions, but I don't see a practical way of doing that other than introducing a new GUC. I'm not sure if it's worth that effort.

[1] The batch process use case: batches that still hold their own transaction open don't gain anything.
[2] The audit logging case: if you didn't care about auditing surviving regardless of a rollback, you wouldn't go to the extra work of an autonomous transaction to begin with.

-- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [HACKERS] Autonomous Transaction is back
On 7/27/15 6:40 PM, Jim Nasby wrote:
> On 7/27/15 5:12 PM, Joel Jacobson wrote:
>> Right now, when writing a function, if you raise an exception, you can be sure all writes you have made will be rolled back. Your caller might catch the exception and decide to carry on and commit work made before your function was called, but at least you can be confident your own writes won't be committed as long as you don't catch the exception you raised in your own function. If I understand it correctly, that would change with the addition of Autonomous Transactions, unless we are given a way to prevent a function we call from starting and committing an Autonomous Transaction. Wrong? If so, then please show how to prevent Y() from committing the Data write B1 in your example; I don't get it.
>
> What's being described here doesn't make sense in either use case ([1], [2]), but I do understand the concern about what 3rd-party software is doing. It would be nice to have the ability to disallow and/or disable autonomous transactions, but I don't see a practical way of doing that other than introducing a new GUC. I'm not sure if it's worth that effort.

It just occurred to me that another option would be to have an event trigger for the beginning of an autonomous transaction.

> [1] The batch process use case: batches that still hold their own transaction open don't gain anything.
> [2] The audit logging case: if you didn't care about auditing surviving regardless of a rollback, you wouldn't go to the extra work of an autonomous transaction to begin with.

-- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Data in Trouble? Get it in Treble! http://BlueTreble.com
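[Editor's sketch] Jim's event-trigger idea might look like the following. PostgreSQL's CREATE EVENT TRIGGER is real, but the `atx_start` event is a pure assumption for illustration; no such event exists in any release:

```sql
-- Hypothetical: sketches how a DBA might veto autonomous transactions
-- if an 'atx_start' event trigger were ever added.
CREATE FUNCTION forbid_atx() RETURNS event_trigger AS $$
BEGIN
    RAISE EXCEPTION 'autonomous transactions are disabled on this database';
END;
$$ LANGUAGE plpgsql;

CREATE EVENT TRIGGER no_atx
    ON atx_start                    -- assumed event name, not implemented
    EXECUTE PROCEDURE forbid_atx();
```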
Re: [HACKERS] Autonomous Transaction is back
2015-07-27 23:59 GMT+02:00 Merlin Moncure mmonc...@gmail.com:
> On Mon, Jul 27, 2015 at 4:41 PM, Joel Jacobson j...@trustly.com wrote:
>> On Fri, Jul 24, 2015 at 9:39 PM, Merlin Moncure mmonc...@gmail.com wrote:
>>> On Thu, Jul 23, 2015 at 1:49 PM, Josh Berkus j...@agliodbs.com wrote:
>>>> Batch Jobs: large data-manipulation tasks which need to be broken up into segments, with each segment committing separately. Example: updating 1 million records in batches of 1000.
>>> Autonomous transactions are not a good fit for this case; stored procedures are a better way to go for any scenario where you don't want to be in a snapshot (for example, suppose you want to change isolation level on the fly).
>> Hm, you mean we need real stored procedures in PostgreSQL and not just functions?
> Yes, exactly. Autonomous transactions aren't really set up for cases where the function runs for a very long time or indefinitely. This is the 'advancing xmin' problem as Josh puts it, but I think the problem is much bigger than that. Anyway, this is mostly irrelevant to autonomous transactions as long as the design isn't extended to try to cover that case.
>
> Is the Autonomous Transaction feature only going to be exposed through pl/pgsql?

I hope not. The integration with plpgsql can be a secondary question. In this case I prefer tying it to a block statement, without the possibility of an explicit COMMIT, minimally in functions. Something like:

    BEGIN
        BEGIN AUTONOMOUS
            ...
        END;
    END;

This is consistent with current subtransaction support, and disallows some corner cases like a forgotten COMMIT.

Regards Pavel
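[Editor's sketch] Pavel's block-scoped variant, spelled out in a fuller (hypothetical) function. `BEGIN AUTONOMOUS` is his proposed syntax and does not exist in any PostgreSQL release; the table and function names are invented:

```sql
-- Sketch only: the autonomous work commits when its block exits
-- normally, so there is no COMMIT statement to forget.
CREATE FUNCTION log_attempt(who text) RETURNS void AS $$
BEGIN
    BEGIN AUTONOMOUS                      -- proposed syntax, not real
        INSERT INTO login_audit(username, at) VALUES (who, now());
    END;  -- implicit commit of the autonomous block here
    -- work below still belongs to the caller's transaction
    UPDATE accounts SET last_seen = now() WHERE username = who;
END;
$$ LANGUAGE plpgsql;
```

The design choice here mirrors plpgsql's existing BEGIN ... EXCEPTION blocks: the transaction boundary is lexical, which rules out corner cases like starting an autonomous transaction and never ending it.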
Re: [HACKERS] Autonomous Transaction is back
On 23 July 2015 21:04, Robert Haas wrote:
> On Thu, Jul 23, 2015 at 1:31 AM, Rajeev rastogi rajeev.rast...@huawei.com wrote:
>> 2. It should be allowed to deadlock with master transaction. We need to work out a solution to avoid deadlock.
> This sentence seems to contradict itself. I thought the consensus was that the transaction should NOT conflict with the master transaction.

Since we are saying the transaction is autonomous to the parent transaction, we cannot guarantee that it does not take any conflicting lock unless the application is designed to avoid that. But yes, we should have a mechanism to deal with the possible deadlock.

>> 3. It can support multiple levels of nesting based on the configuration (maybe max as 70).
> Why have any fixed maximum?

Since we are planning to have nested autonomous transactions, a limit is required so that resources can be controlled.

>> 2. The above commands can be issued either inside a procedure, to run a few statements of the procedure inside an autonomous transaction, or even in stand-alone query execution.
> I think inside a procedure the autonomous transaction will need to be lexically scoped. You won't be able to do this, for example:
>
>     BEGIN AUTONOMOUS TRANSACTION;
>     FOR x IN SELECT ... LOOP
>         COMMIT;
>         BEGIN AUTONOMOUS TRANSACTION;
>     END LOOP;

I am not sure how we will be able to control this. IMHO the user should be able to control this, especially since it does not have any meaning from the user's perspective. Please let me know if I am missing something here.

> Rather you'd have to do something like this:
>
>     FOR x IN SELECT .. LOOP
>         BEGIN WITH AUTONOMOUS TRANSACTION
>             do stuff
>         END;
>     END LOOP;

Thanks and Regards, Kumar Rajeev Rastogi
Re: [HACKERS] Autonomous Transaction is back
On Thu, Jul 23, 2015 at 1:49 PM, Josh Berkus j...@agliodbs.com wrote:
> Batch Jobs: large data-manipulation tasks which need to be broken up into segments, with each segment committing separately. Example: updating 1 million records in batches of 1000.

Autonomous transactions are not a good fit for this case; stored procedures are a better way to go for any scenario where you don't want to be in a snapshot (for example, suppose you want to change isolation level on the fly).

merlin
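[Editor's sketch] The batch-update pattern under discussion is typically driven from the client today, one transaction per batch; a rough illustration (the table and column names are invented):

```sql
-- Client-side batching: each UPDATE runs in its own transaction, so
-- completed batches stay committed and xmin can advance between them.
-- An external script (e.g. psql in a shell loop) repeats this until
-- the UPDATE touches zero rows.
BEGIN;
UPDATE big_table
   SET processed = true
 WHERE id IN (SELECT id
                FROM big_table
               WHERE NOT processed
               LIMIT 1000);   -- one batch of 1000
COMMIT;
```

This is the "external scripting" workaround the thread contrasts with true stored procedures: the looping and committing live outside the server because a function cannot commit mid-execution.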
Re: [HACKERS] Autonomous Transaction is back
On Thu, Jul 23, 2015 at 11:01 AM, Rajeev rastogi rajeev.rast...@huawei.com wrote:
> After a few failed attempts to propose Autonomous Transactions earlier, I, along with Simon Riggs, would like to propose again, but with a completely different approach. We also had a discussion about this feature on the last PGCon 2015 Unconference Day; those who missed this discussion, please refer to https://wiki.postgresql.org/wiki/AutonomousTransactionsUnconference2015
>
> Before jumping into the design and code proposal for this feature, Simon Riggs and I wanted to propose its behavior and usage to keep everyone in the same boat. So we have summarized the behavior and usage of the Autonomous Transaction based on the discussion with community members on the last PGCon 2015 Unconference Day:
>
> *Behavior of Autonomous Transaction*:
> 1. The autonomous transaction is treated as a completely different transaction from the master transaction.
> 2. It should be allowed to deadlock with the master transaction. We need to work out a solution to avoid deadlock.

For an autonomous transaction not to conflict with the master, either it shouldn't take conflicting resources, or when it does so, they should be shared with the master transaction. The first needs to be implemented in the application; the latter will need careful design of how to share such resources (hierarchical resource sharing?) and how to release them when the ATX ends. Given that the transactions are autonomous as per point 1, they should be allowed to conflict, and deadlock detection would take care of such deadlocks.

> 3. It can support multiple levels of nesting based on the configuration (maybe max as 70).

This might just be an implementation artifact in other RDBMSes, and may not necessarily apply to the implementation done here, so we should not list it as necessary behaviour of ATX. Just as an example (in no way suggesting that the design should be this way): if we implement autonomous transactions using background worker processes, or additional backends as with prepared transactions, we will have a limit on the number of simultaneously open ATXs. This may translate to a nesting limit if we implement 4, but not if we implement parallel autonomous transactions.

> 4. The outer (i.e. main or upper autonomous) transaction is to be suspended while the inner autonomous transaction is running.
> 5. The outer transaction should not see data of the inner until the inner is committed (a serializable upper transaction should not see it even after the inner transaction commits).

To me only 1 and 5 seem to be the real behaviour we should define, maybe 4, but that restricts parallelism. Parallel autonomous transactions will have their own problems to solve, like sharing code, but the design shouldn't restrict them from being a future possibility, I guess.

> *How to Use Autonomous Transaction*:
> 1. We can issue an explicit command to start an Autonomous Transaction, as below:
>     BEGIN AUTONOMOUS TRANSACTION (don't worry about keywords at this point)
>     Do your work.
>     COMMIT/ROLLBACK (will commit/rollback the autonomous transaction and return to the main transaction or upper autonomous transaction)
> 2. The above commands can be issued either inside a procedure, to run a few statements of the procedure inside an autonomous transaction, or even in stand-alone query execution.
> 3. We can make the whole procedure itself autonomous, which is similar to starting an autonomous transaction at the beginning of the procedure and committing/rolling back at the end of the procedure.
>
> There was another discussion on the Unconference Day about whether to implement COMMIT/ROLLBACK inside procedures or autonomous transactions. Our opinion is that COMMIT/ROLLBACK inside a procedure is somewhat different from an Autonomous Transaction: in the first case, once we commit inside the procedure, it commits everything done before the call of the procedure. This is the behavior of Oracle. So in this case the user is required to be very careful not to do any operation, before the call of the procedure, that is not yet intended to be committed inside the procedure. So we prefer to implement Autonomous Transactions, which will not only be compatible with Oracle but also give a really strong, much-needed feature.
>
> I have not put the use-cases here as we already agree about its strong use-cases. Requesting everyone's opinion, based on which we can proceed to enhance/tune/re-write our design.
>
> *Thanks and Regards,*
> *Kumar Rajeev Rastogi*

-- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
Re: [HACKERS] Autonomous Transaction is back
On Thu, Jul 23, 2015 at 2:49 PM, Josh Berkus j...@agliodbs.com wrote:
> On 07/23/2015 11:39 AM, Robert Haas wrote:
>> On Thu, Jul 23, 2015 at 2:33 PM, Josh Berkus j...@agliodbs.com wrote:
>>>> Requesting everyone's opinion, based on which we can proceed to enhance/tune/re-write our design.
>>> So, one of the things which came up during the discussion was advancing XMIN, which is not important to the audit logging use case, but is very important for the batch job use case. What have you concluded regarding this item?
>> Could you explain more specifically what you are talking about here?
>
> Yeah, my notes are kinda incoherent, no?
>
> There are two core use-cases for Autonomous Transactions (hereafter ATX):
> * audit logging
> * batch jobs
>
> Audit Logging: triggers or other statements which should leave a record even when a transaction aborts. While audit logging is the main example of this use case, any kind of logging of failures is applicable.
>
> Batch Jobs: large data-manipulation tasks which need to be broken up into segments, with each segment committing separately. Example: updating 1 million records in batches of 1000. Unlike the Audit Logging case, Batch Jobs generally want XMIN to advance so that updated/imported/deleted rows can be vacuumed or HOT-updated. Thus the need to allow XMIN to advance.
>
> One of the things we kind of concluded from our discussion was that the two core use-cases are probably different features:
>
> Audit Logging:
> * requires 2-way data interaction with outer transaction
> * no parallelism
> * XMIN does not need to advance
> * master transaction should still commit/fail
> * needs to support nesting
>
> Batch Jobs:
> * 1-way data interaction sufficient (master -> child)
> * parallelism desired
> * XMIN should advance
> * master process could be transactionless
> * does not need to support nesting
>
> Of these two, the Audit Logging case is the more important one to implement because there is no real workaround for it. Batch Jobs can be, and are, handled by external scripting, and having ATX for them is more of a convenience than anything else.

You're still not really explaining what you mean by "XMIN should advance". If the parent transaction holds a snapshot, or for as long as it does, xmin can't be advanced safely. If it doesn't, you'll be fine.

I suppose the situation you're worried about is where we execute a stored procedure that repeatedly spawns autonomous transactions. Since the parent transaction will always have a snapshot, you won't advance xmin until the entire stored procedure finishes. That's a problem, but I think it is rather unfair to say that it has anything to do with autonomous transactions. "Run a procedure without needing to hold a snapshot" is a completely separate feature request from "allow autonomous transactions", and it's probably neither easy nor uncontroversial.

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Autonomous Transaction is back
On 07/23/2015 12:40 PM, Robert Haas wrote:
> That's a problem, but I think it is rather unfair to say that it has anything to do with autonomous transactions. "Run a procedure without needing to hold a snapshot" is a completely separate feature request from "allow autonomous transactions", and it's probably neither easy nor uncontroversial.

That's more or less a summary of what I just posted.

-- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [HACKERS] Autonomous Transaction is back
On Thu, Jul 23, 2015 at 1:31 AM, Rajeev rastogi rajeev.rast...@huawei.com wrote:
> 2. It should be allowed to deadlock with master transaction. We need to work out a solution to avoid deadlock.

This sentence seems to contradict itself. I thought the consensus was that the transaction should NOT conflict with the master transaction.

> 3. It can support multiple levels of nesting based on the configuration (maybe max as 70).

Why have any fixed maximum?

> 2. The above commands can be issued either inside a procedure, to run a few statements of the procedure inside an autonomous transaction, or even in stand-alone query execution.

I think inside a procedure the autonomous transaction will need to be lexically scoped. You won't be able to do this, for example:

    BEGIN AUTONOMOUS TRANSACTION;
    FOR x IN SELECT ... LOOP
        COMMIT;
        BEGIN AUTONOMOUS TRANSACTION;
    END LOOP;

Rather you'd have to do something like this:

    FOR x IN SELECT .. LOOP
        BEGIN WITH AUTONOMOUS TRANSACTION
            do stuff
        END;
    END LOOP;

In that way it would work like an EXCEPTION block, which can be attached to a BEGIN block to establish a subtransaction. There's no way in PL/pgSQL to persistently enter a subtransaction; but you can have one that applies to a particular lexical scope (the BEGIN block to which the EXCEPTION clause is attached).

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
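[Editor's sketch] The EXCEPTION-block analogy Robert draws is ordinary, working PL/pgSQL today; the lexical scope of the BEGIN ... EXCEPTION ... END block bounds a subtransaction just as the proposed block would bound an autonomous transaction (table names here are illustrative):

```sql
-- Real PL/pgSQL: each iteration's BEGIN...EXCEPTION block is a
-- subtransaction. An error rolls back only that block's work and the
-- loop continues. The proposed autonomous block would be scoped the
-- same way, but would commit independently rather than merely being
-- able to roll back independently.
CREATE FUNCTION import_rows() RETURNS void AS $$
DECLARE
    r record;
BEGIN
    FOR r IN SELECT * FROM staging LOOP
        BEGIN                              -- subtransaction starts here
            INSERT INTO target VALUES (r.id, r.payload);
        EXCEPTION WHEN unique_violation THEN
            -- only this row's insert is rolled back
            RAISE NOTICE 'skipping duplicate id %', r.id;
        END;                               -- subtransaction ends here
    END LOOP;
END;
$$ LANGUAGE plpgsql;
```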
Re: [HACKERS] Autonomous Transaction is back
> Requesting everyone's opinion, based on which we can proceed to enhance/tune/re-write our design.

So, one of the things which came up during the discussion was advancing XMIN, which is not important to the audit logging use case, but is very important for the batch job use case. What have you concluded regarding this item?

-- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [HACKERS] Autonomous Transaction is back
On Thu, Jul 23, 2015 at 2:33 PM, Josh Berkus j...@agliodbs.com wrote:
>> Requesting everyone's opinion, based on which we can proceed to enhance/tune/re-write our design.
> So, one of the things which came up during the discussion was advancing XMIN, which is not important to the audit logging use case, but is very important for the batch job use case. What have you concluded regarding this item?

Could you explain more specifically what you are talking about here?

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Autonomous Transaction is back
On 07/23/2015 11:39 AM, Robert Haas wrote:
> On Thu, Jul 23, 2015 at 2:33 PM, Josh Berkus j...@agliodbs.com wrote:
>>> Requesting everyone's opinion, based on which we can proceed to enhance/tune/re-write our design.
>> So, one of the things which came up during the discussion was advancing XMIN, which is not important to the audit logging use case, but is very important for the batch job use case. What have you concluded regarding this item?
> Could you explain more specifically what you are talking about here?

Yeah, my notes are kinda incoherent, no?

There are two core use-cases for Autonomous Transactions (hereafter ATX):
* audit logging
* batch jobs

Audit Logging: triggers or other statements which should leave a record even when a transaction aborts. While audit logging is the main example of this use case, any kind of logging of failures is applicable.

Batch Jobs: large data-manipulation tasks which need to be broken up into segments, with each segment committing separately. Example: updating 1 million records in batches of 1000. Unlike the Audit Logging case, Batch Jobs generally want XMIN to advance so that updated/imported/deleted rows can be vacuumed or HOT-updated. Thus the need to allow XMIN to advance.

One of the things we kind of concluded from our discussion was that the two core use-cases are probably different features:

Audit Logging:
* requires 2-way data interaction with outer transaction
* no parallelism
* XMIN does not need to advance
* master transaction should still commit/fail
* needs to support nesting

Batch Jobs:
* 1-way data interaction sufficient (master -> child)
* parallelism desired
* XMIN should advance
* master process could be transactionless
* does not need to support nesting

Of these two, the Audit Logging case is the more important one to implement because there is no real workaround for it. Batch Jobs can be, and are, handled by external scripting, and having ATX for them is more of a convenience than anything else.

-- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
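[Editor's sketch] The audit-logging use case could, under the proposal, look roughly like the trigger below. `BEGIN AUTONOMOUS TRANSACTION` is the proposed (never-implemented) syntax, and the table and trigger names are invented:

```sql
-- Hypothetical audit trigger using the proposed syntax. The INSERT into
-- audit_log commits in its own autonomous transaction, so the audit row
-- survives even if the surrounding transaction later aborts.
CREATE FUNCTION audit_change() RETURNS trigger AS $$
BEGIN
    BEGIN AUTONOMOUS TRANSACTION;   -- proposed syntax, not implemented
    INSERT INTO audit_log(tbl, op, at, by)
        VALUES (TG_TABLE_NAME, TG_OP, now(), current_user);
    COMMIT;                         -- durable regardless of the caller
    RETURN NULL;                    -- return value ignored for AFTER triggers
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER audit_accounts
    AFTER INSERT OR UPDATE OR DELETE ON accounts
    FOR EACH ROW EXECUTE PROCEDURE audit_change();
```

This is also why, as discussed up-thread, the caller must not be able to veto the autonomous commit: a function writer who could suppress it could make changes without leaving an audit record.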
[HACKERS] Autonomous Transaction is back
After a few failed attempts to propose Autonomous Transactions earlier, I, along with Simon Riggs, would like to propose again, but with a completely different approach. We also had a discussion about this feature on the last PGCon 2015 Unconference Day; those who missed this discussion, please refer to https://wiki.postgresql.org/wiki/AutonomousTransactionsUnconference2015

Before jumping into the design and code proposal for this feature, Simon Riggs and I wanted to propose its behavior and usage to keep everyone in the same boat. So we have summarized the behavior and usage of the Autonomous Transaction based on the discussion with community members on the last PGCon 2015 Unconference Day:

Behavior of Autonomous Transaction:
1. The autonomous transaction is treated as a completely different transaction from the master transaction.
2. It should be allowed to deadlock with the master transaction. We need to work out a solution to avoid deadlock.
3. It can support multiple levels of nesting based on the configuration (maybe max as 70).
4. The outer (i.e. main or upper autonomous) transaction is to be suspended while the inner autonomous transaction is running.
5. The outer transaction should not see data of the inner until the inner is committed (a serializable upper transaction should not see it even after the inner transaction commits).

How to Use Autonomous Transaction:
1. We can issue an explicit command to start an Autonomous Transaction, as below:
    BEGIN AUTONOMOUS TRANSACTION (don't worry about keywords at this point)
    Do your work.
    COMMIT/ROLLBACK (will commit/rollback the autonomous transaction and return to the main transaction or upper autonomous transaction)
2. The above commands can be issued either inside a procedure, to run a few statements of the procedure inside an autonomous transaction, or even in stand-alone query execution.
3. We can make the whole procedure itself autonomous, which is similar to starting an autonomous transaction at the beginning of the procedure and committing/rolling back at the end of the procedure.

There was another discussion on the Unconference Day about whether to implement COMMIT/ROLLBACK inside procedures or autonomous transactions. Our opinion is that COMMIT/ROLLBACK inside a procedure is somewhat different from an Autonomous Transaction: in the first case, once we commit inside the procedure, it commits everything done before the call of the procedure. This is the behavior of Oracle. So in this case the user is required to be very careful not to do any operation, before the call of the procedure, that is not yet intended to be committed inside the procedure. So we prefer to implement Autonomous Transactions, which will not only be compatible with Oracle but also give a really strong, much-needed feature.

I have not put the use-cases here as we already agree about its strong use-cases. Requesting everyone's opinion, based on which we can proceed to enhance/tune/re-write our design.

Thanks and Regards, Kumar Rajeev Rastogi
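[Editor's sketch] A stand-alone session using the proposed commands might look like the following; everything here is the proposal's hypothetical syntax, shown only to make the suspend-and-return behaviour concrete (table names invented):

```sql
-- Hypothetical session with the proposed syntax (not valid in any
-- PostgreSQL release). The autonomous transaction commits on its own;
-- the surrounding transaction then resumes and can still roll back.
BEGIN;                              -- main transaction
INSERT INTO t VALUES ('A1');

BEGIN AUTONOMOUS TRANSACTION;       -- suspend main, start autonomous
INSERT INTO audit_log VALUES ('attempted change');
COMMIT;                             -- commits only the autonomous work;
                                    -- control returns to the main transaction

INSERT INTO t VALUES ('A2');
ROLLBACK;                           -- discards A1 and A2; the audit row stays
```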