Re: [HACKERS] stored procedures - use cases?
Hi, On 05/10/2011 02:55 PM, Robert Haas wrote: On Mon, May 9, 2011 at 11:58 PM, Pavel Stehule pavel.steh...@gmail.com wrote: no - you are little bit confused :). CALL and function execution shares nothing. There is significant differences between function and procedure. Function is called only from executor - from some plan, and you have to know a structure of result before run. The execution of CALL is much simple - you just execute code - without plan and waiting for any result - if there is. I think the distinction between function and procedure is misleading here. Some envision stored *procedures* to be able to return values, result sets and possibly even *multiple* result sets. The main features seem to be (1) explicit transaction control and/or execution of commands like VACUUM that can't be invoked from within a transaction, I think that's the main point of stored procedures. (2) autonomous transactions To me autonomous transactions seem orthogonal. Those can be used to implement (1) above, but might have other uses for regular transactions as well. (The point I'm taking home here is that you might want to control not only one concurrent transaction, but several from a stored procedure. So far, I assumed only one.) and (3) returning multiple result sets. But I don't think anybody would be desperately unhappy if it magically became possible to do those things from regular functions, unlikely as that may seem. That point definitely is on my wish-list for UDFs already. I didn't think of this as having to do with stored procedures, either. Regards Markus -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] stored procedures - use cases?
On Mon, May 9, 2011 at 11:58 PM, Pavel Stehule pavel.steh...@gmail.com wrote: no - you are little bit confused :). CALL and function execution shares nothing. There is significant differences between function and procedure. Function is called only from executor - from some plan, and you have to know a structure of result before run. The execution of CALL is much simple - you just execute code - without plan and waiting for any result - if there is. Now I'm a little confused, or you are. Surely any SQL has to be planned and executed, regardless of whether it appears in a function, a stored procedure, or anywhere else. Non-SQL statements within a stored procedure don't need to go through the planner and executor, but that's true in PL/python or PL/pgsql or whatever today. I think people are using the term stored procedures to refer to approximately whatever it is that they're unhappy that functions don't allow, and that's leading to a lot of people talking across each other. The main features seem to be (1) explicit transaction control and/or execution of commands like VACUUM that can't be invoked from within a transaction, (2) autonomous transactions, and (3) returning multiple result sets. But I don't think anybody would be desperately unhappy if it magically became possible to do those things from regular functions, unlikely as that may seem. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] stored procedures - use cases?
2011/5/10 Robert Haas robertmh...@gmail.com: On Mon, May 9, 2011 at 11:58 PM, Pavel Stehule pavel.steh...@gmail.com wrote: no - you are little bit confused :). CALL and function execution shares nothing. There is significant differences between function and procedure. Function is called only from executor - from some plan, and you have to know a structure of result before run. The execution of CALL is much simple - you just execute code - without plan and waiting for any result - if there is. Now I'm a little confused, or you are. Surely any SQL has to be planned and executed, regardless of whether it appears in a function, a stored procedure, or anywhere else. Non-SQL statements within a stored procedure don't need to go through the planner and executor, but that's true in PL/python or PL/pgsql or whatever today. CALL statement is util command than SQL. It has to execute some NON SQL code. You can thinking about CALL statement like synonymum for SELECT, but it isn't correct (it is my opinion) The stored procedures was prior stored functions (more corectly UDF - user defined functions). These old time stored procedures was simply - it was client code moved on server. Usually these procedures was executed in different process or different thread. Inside procedures was full client's side functionality and there wasn't a network overhead. CALL statement is +/- remote call. It isn't SQL statement. I think people are using the term stored procedures to refer to approximately whatever it is that they're unhappy that functions don't allow, and that's leading to a lot of people talking across each other. The main features seem to be (1) explicit transaction control and/or execution of commands like VACUUM that can't be invoked from within a transaction, (2) autonomous transactions, and (3) returning multiple result sets. But I don't think anybody would be desperately unhappy if it magically became possible to do those things from regular functions, unlikely as that may seem. yes. @2 Autonomous transaction doesn't need stored procedures. Autonomous transaction can be isolated by function's flag, by some special PL/pgSQL statement: like BEGIN EXECUTE AUTONOMOUS '' END; @3 is possible now too - but not too much user friendly. Point 3 is strange. Oracle doesn't support it. Support in DB2 is little bit strange. And it is well supported by MySQL, MSSQL, maybe Informix, Sybase. Pavel Regards Pavel -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] stored procedures - use cases?
Kevin Grittner wrote: Peter Eisentraut pete...@gmx.net wrote: what would be the actual use cases of any of these features? Let's collect some, so we can think of ways to make them work. The two things which leap to mind for me are: (1) All the \d commands in psql should be implemented in SPs so that they are available from any client, through calling one SP equivalent to one \d command. The \d commands would be changed to call the SPs for releases recent enough to support this. Eventually psql would be free of worrying about which release contained which columns in which system tables, because it would just be passing the parameters in and displaying whatever results came back. I have used products which implemented something like this, and found it quite useful. Uh, why does this require stored procedures? Seems our existing function capabilities are even better suited to this. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] stored procedures - use cases?
Tom Lane wrote: Peter Eisentraut pete...@gmx.net writes: On m??n, 2011-04-25 at 14:35 -0500, Kevin Grittner wrote: (1) All the \d commands in psql should be implemented in SPs so that they are available from any client, through calling one SP equivalent to one \d command. You don't need stored procedures with special transaction behavior for this. No, but what you *would* need is the ability to return multiple result sets from one call. Even then, you could not exactly duplicate the current output of \d; but you could duplicate the functionality. Oh, good point. Thanks. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] stored procedures - use cases?
On 05/09/2011 08:20 PM, Bruce Momjian wrote: Tom Lane wrote: Peter Eisentrautpete...@gmx.net writes: On mån, 2011-04-25 at 14:35 -0500, Kevin Grittner wrote: (1) All the \d commands in psql should be implemented in SPs so that they are available from any client, through calling one SP equivalent to one \d command. You don't need stored procedures with special transaction behavior for this. No, but what you *would* need is the ability to return multiple result sets from one call. Even then, you could not exactly duplicate the current output of \d; but you could duplicate the functionality. Oh, good point. Thanks. Multiple resultsets in one call would be a good thing, though, no? cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] stored procedures - use cases?
On Mon, May 9, 2011 at 9:21 PM, Andrew Dunstan and...@dunslane.net wrote: On 05/09/2011 08:20 PM, Bruce Momjian wrote: Tom Lane wrote: Peter Eisentrautpete...@gmx.net writes: On mån, 2011-04-25 at 14:35 -0500, Kevin Grittner wrote: (1) All the \d commands in psql should be implemented in SPs so that they are available from any client, through calling one SP equivalent to one \d command. You don't need stored procedures with special transaction behavior for this. No, but what you *would* need is the ability to return multiple result sets from one call. Even then, you could not exactly duplicate the current output of \d; but you could duplicate the functionality. Oh, good point. Thanks. Multiple resultsets in one call would be a good thing, though, no? cheers I *thought* the purpose of having stored procedures was to allow a substrate supporting running multiple transactions, so it could do things like: - Managing vacuums - Managing transactions - Replacing some of the need for dblink. - Being an in-DB piece that could manage LISTENs It seems to be getting bikeshedded into something with more functional argument functionality than stored functions. I think we could have a perfectly successful implementation of stored procedures that supports ZERO ability to pass arguments in or out. That's quite likely to represent a good start. -- When confronted by a difficult problem, solve it by reducing it to the question, How would the Lone Ranger handle this? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] stored procedures - use cases?
Christopher Browne wrote: Multiple resultsets in one call would be a good thing, though, no? cheers I *thought* the purpose of having stored procedures was to allow a substrate supporting running multiple transactions, so it could do things like: - Managing vacuums - Managing transactions - Replacing some of the need for dblink. - Being an in-DB piece that could manage LISTENs It seems to be getting bikeshedded into something with more functional argument functionality than stored functions. I think we could have a perfectly successful implementation of stored procedures that supports ZERO ability to pass arguments in or out. That's quite likely to represent a good start. I am kind of confused too, particularly with the CALL syntax. I thought our function call usage was superior in every way to CALL, so why implement CALL? I assume for SQL-standards compliance, right? Does multiple result sets require CALL? I assume autonomous transactions don't require CALL. Are we assuming no one is going to want a function that allows multiple result sets or autonomous transactions? That seems unlikely. I would think CALL is independent of those features. Maybe we need those features to support SQL-standard CALL, and we will just add those features to functions too. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] stored procedures - use cases?
2011/5/10 Bruce Momjian br...@momjian.us: Christopher Browne wrote: Multiple resultsets in one call would be a good thing, though, no? cheers I *thought* the purpose of having stored procedures was to allow a substrate supporting running multiple transactions, so it could do things like: - Managing vacuums - Managing transactions - Replacing some of the need for dblink. - Being an in-DB piece that could manage LISTENs It seems to be getting bikeshedded into something with more functional argument functionality than stored functions. I think we could have a perfectly successful implementation of stored procedures that supports ZERO ability to pass arguments in or out. That's quite likely to represent a good start. I am kind of confused too, particularly with the CALL syntax. I thought our function call usage was superior in every way to CALL, so why implement CALL? I assume for SQL-standards compliance, right? Does multiple result sets require CALL? I assume autonomous transactions don't require CALL. no - you are little bit confused :). CALL and function execution shares nothing. There is significant differences between function and procedure. Function is called only from executor - from some plan, and you have to know a structure of result before run. The execution of CALL is much simple - you just execute code - without plan and waiting for any result - if there is. Are we assuming no one is going to want a function that allows multiple result sets or autonomous transactions? That seems unlikely. I would think CALL is independent of those features. Maybe we need those features to support SQL-standard CALL, and we will just add those features to functions too. We can use a SETOF cursors for returning a multiple result sets now. But there are a few complications: a) The client should to wait for finish of all sets from multiple result sets - minimally in PL/pgSQL b) client 'psql' doesn't support a unpacking result when result is multiple result set c) The using cursors isn't too comfortable - in comparation to MS SQL or MySQL Regards Pavel -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] stored procedures - use cases?
On Apr 26, 2011, at 6:08 PM, Kevin Grittner wrote: Josh Berkus j...@agliodbs.com wrote: -- doing a backfill operation for 10GB of computed data, taking 8 hours, where I don't want to hold a transaction open for 8 hours since this is a high-volume OLTP database. Been there, done that. Definitely not a rare use case. We do that so often we've actually written a framework around it and are working on a daemon that will deal with any backfills that have been registered in the system. If we could control transactions that daemon could be entirely in the database... but since we can't, we have to write it in another language outside the database and switch back and forth between the two worlds. -- Jim C. Nasby, Database Architect j...@nasby.net 512.569.9461 (cell) http://jim.nasby.net -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] stored procedures - use cases?
On Tue, Apr 26, 2011 at 10:12 PM, Greg Stark gsst...@mit.edu wrote: On Tue, Apr 26, 2011 at 11:55 PM, Josh Berkus j...@agliodbs.com wrote: Here's where I wanted autonomous transactions just last week, and didn't have them so I had to use a python script outside the database: -- doing a CREATE INDEX CONCURRENTLY for 150 partitions on a partitioned table. -- doing a backfill operation for 10GB of computed data, taking 8 hours, where I don't want to hold a transaction open for 8 hours since this is a high-volume OLTP database. These don't seem like compelling use cases at all to me. You said you had to fall back to using a python script outside the database, but what disadvantage does that have? Why is moving your application logic into the database an improvement? Honestly in every case where I've had to move code that had been in a function to the application I've found there were tons of benefits. Everything from being able to better control the behaviour, to being able to parallelize the processing over multiple connections, being able to run parts of it at different times, being able to see the progress and control it from another session, being able to manage the code in version control, the list just goes on. Trying to move all the code into the database just makes life harder. my experience has been the opposite. merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] stored procedures - use cases?
These don't seem like compelling use cases at all to me. You said you had to fall back to using a python script outside the database, but what disadvantage does that have? Why is moving your application logic into the database an improvement? Since both were part of a code rollout, it complicated our deployment process considerably and took a deployment which could have been push-button automatic and forced us to do it by manually logging into the shell on the database server. Trying to move all the code into the database just makes life harder. I might make *your* life harder. It makes *mine* easier. If you pursue your argument a little further, Greg, why do we have functions at all? We could do it all in the application. Autonomous transactions have value on their own. But it's not so that you can run create index ocncurrently or vacuum or whatever. Why not? Why are you so intent on making my life harder? They're useful so that a single session can do things like log errors even when a transaction rolls back. That's *also* an excellent use case. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] stored procedures - use cases?
On Wed, Apr 27, 2011 at 6:48 PM, Josh Berkus j...@agliodbs.com wrote: If you pursue your argument a little further, Greg, why do we have functions at all? We could do it all in the application. Autonomous transactions have value on their own. But it's not so that you can run create index ocncurrently or vacuum or whatever. Why not? Why are you so intent on making my life harder? Because we want to be able to manipulate data in queries in data-type-specific ways. For example we want to do aggregations on the result of a function or index scans across a user data type, etc. If all the functions do is implement application logic then you end up having half your application logic in the application and half in the database and it's hard to keep them in sync. To take the argument in the opposite extreme would you suggest we should have html formatting functions in the database so that people can have their entire web server just be print $dbh-('select web_page(url)') ? They're useful so that a single session can do things like log errors even when a transaction rolls back. That's *also* an excellent use case. What makes it an excellent use case is that it's basically impossible to do without autonomous transactions. You can hack it with dblink but it's much less clean and much higher overhead. -- greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] stored procedures - use cases?
Greg, Because we want to be able to manipulate data in queries in data-type-specific ways. For example we want to do aggregations on the result of a function or index scans across a user data type, etc. I don't see how this is different from wanting to capture error output, which would face the same issues. You seem to be wanting to make a hard feature easier by telling me that I don't actually want the things I want. Wanna make it even easier? Then Stored Procedures are just functions without a return value. That's a 40-line patch. Done! If all the functions do is implement application logic then you end up having half your application logic in the application and half in the database and it's hard to keep them in sync. You build your applications your way, and I'll build mine my way. I'll just ask you not to try to dictate to me how I should build applications. Especially, since, based on the responses on this thread, a LOT of people would like to have multitransaction control inside a stored procedure script. I suspect that your experience of application development has been rather narrow. To take the argument in the opposite extreme would you suggest we should have html formatting functions in the database so that people can have their entire web server just be print $dbh-('select web_page(url)') ? Actually, you can already sort of do that using XSLT. So I don't necessary think that's a prohibitive idea, depending on implementation. After all, many of the new non-relational databases implement exactly this. They're useful so that a single session can do things like log errors even when a transaction rolls back. That's *also* an excellent use case. What makes it an excellent use case is that it's basically impossible to do without autonomous transactions. You can hack it with dblink but it's much less clean and much higher overhead. You could do it by using application code. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] stored procedures - use cases?
On Apr 27, 2011, at 3:28 PM, Josh Berkus wrote: Actually, you can already sort of do that using XSLT. So I don't necessary think that's a prohibitive idea, depending on implementation. After all, many of the new non-relational databases implement exactly this. The proposed JSON data type and construction functions (once there's agreement on an implementation) will allow this, too. Just serve JSON. Boom, instant REST server. David -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] stored procedures - use cases?
On mån, 2011-04-25 at 14:35 -0500, Kevin Grittner wrote: (1) All the \d commands in psql should be implemented in SPs so that they are available from any client, through calling one SP equivalent to one \d command. You don't need stored procedures with special transaction behavior for this. In fact, you probably shouldn't use them even if you had them, because you surely want a consistent view of, say, a table. (2) In certain types of loads -- in particular converting data from old systems into the database for a new system -- you need to load several tables in parallel, with queries among the tables which are being loaded. The ability to batch many DML statements into one transaction is important, to avoid excessive COMMIT overhead and related disk output; however, the ability to ANALYZE tables periodically is equally important, to prevent each access to an initially-empty table from being done as a table scan after it has millions of rows. VACUUM might become equally important if there are counts or totals being accumulated in some tables, or status columns are being updated, as rows are added to other tables. I'm not sure I really follow this. If your aim is to batch DML statements and avoid COMMIT overhead, why would you want to use stored procedures that possibly span multiple transactions? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] stored procedures - use cases?
Peter Eisentraut pete...@gmx.net writes: On mån, 2011-04-25 at 14:35 -0500, Kevin Grittner wrote: (1) All the \d commands in psql should be implemented in SPs so that they are available from any client, through calling one SP equivalent to one \d command. You don't need stored procedures with special transaction behavior for this. No, but what you *would* need is the ability to return multiple result sets from one call. Even then, you could not exactly duplicate the current output of \d; but you could duplicate the functionality. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] stored procedures - use cases?
Peter Eisentraut pete...@gmx.net wrote: On mån, 2011-04-25 at 14:35 -0500, Kevin Grittner wrote: (1) All the \d commands in psql should be implemented in SPs so that they are available from any client, through calling one SP equivalent to one \d command. You don't need stored procedures with special transaction behavior for this. In fact, you probably shouldn't use them even if you had them, because you surely want a consistent view of, say, a table. Agreed. I was just outlining use cases here, not trying to make a case for something in particular with each one. (2) In certain types of loads -- in particular converting data from old systems into the database for a new system -- you need to load several tables in parallel, with queries among the tables which are being loaded. The ability to batch many DML statements into one transaction is important, to avoid excessive COMMIT overhead and related disk output; however, the ability to ANALYZE tables periodically is equally important, to prevent each access to an initially-empty table from being done as a table scan after it has millions of rows. VACUUM might become equally important if there are counts or totals being accumulated in some tables, or status columns are being updated, as rows are added to other tables. I'm not sure I really follow this. If your aim is to batch DML statements and avoid COMMIT overhead, why would you want to use stored procedures that possibly span multiple transactions? The point is that if such a conversion is run in a situation where table access is always done on a plan based on empty tables, it starts to get pretty slow after a while. You need to commit, analyze, and start a new transaction for the queries to make new plans which run well. This obviously isn't an issue when you're blasting entire tables in through COPY commands without needing to reference other data being concurrently loaded. So, rough pseudo-code where this is done in a client app with autovacuum disabled would look something like: open input stream of non-normalized data open database connection while not EOF on input start transaction for 5 top level inputs (break on EOF) parse apart messy data, load into multiple tables (logic involves queries against tables being loaded) (some updates besides straight inserts) (print exceptions for questionable or undigestable data) end for commit transaction vacuum analyze end while In database products with stored procedures it has usually been faster to use an SP in the target database than to use a client program. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] stored procedures - use cases?
Another point, as there appear to be diverging camps about supertransactional stored procedures vs. autonomous transactions, what would be the actual use cases of any of these features? Let's collect some, so we can think of ways to make them work. Here's where I wanted autonomous transactions just last week, and didn't have them so I had to use a python script outside the database: -- doing a CREATE INDEX CONCURRENTLY for 150 partitions on a partitioned table. -- doing a backfill operation for 10GB of computed data, taking 8 hours, where I don't want to hold a transaction open for 8 hours since this is a high-volume OLTP database. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] stored procedures - use cases?
Tom Lane t...@sss.pgh.pa.us wrote: No, but what you *would* need is the ability to return multiple result sets from one call. At least. Even then, you could not exactly duplicate the current output of \d; but you could duplicate the functionality. I would think that psql could duplicate the output pretty closely, especially if the output of the stored procedure was a stream of intermingled result sets and messages (as from ereport). This is what many products provide. They usually show messages with a class '00' SQLSTATE just as plain text lines, and decorate the more severe levels with appropriate additional information. A while back I included a link to show what Sybase returns from their sp_help SP for various object types: http://infocenter.sybase.com/help/index.jsp?topic=/com.sybase.infocenter.dc36273.1550/html/sprocs/X85190.htm Note the lines like: Object does not have any indexes. This came from the server as a SQLSTATE '0' message. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] stored procedures - use cases?
Josh Berkus j...@agliodbs.com wrote: -- doing a backfill operation for 10GB of computed data, taking 8 hours, where I don't want to hold a transaction open for 8 hours since this is a high-volume OLTP database. Been there, done that. Definitely not a rare use case. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] stored procedures - use cases?
On Tue, Apr 26, 2011 at 11:55 PM, Josh Berkus j...@agliodbs.com wrote: Here's where I wanted autonomous transactions just last week, and didn't have them so I had to use a python script outside the database: -- doing a CREATE INDEX CONCURRENTLY for 150 partitions on a partitioned table. -- doing a backfill operation for 10GB of computed data, taking 8 hours, where I don't want to hold a transaction open for 8 hours since this is a high-volume OLTP database. These don't seem like compelling use cases at all to me. You said you had to fall back to using a python script outside the database, but what disadvantage does that have? Why is moving your application logic into the database an improvement? Honestly in every case where I've had to move code that had been in a function to the application I've found there were tons of benefits. Everything from being able to better control the behaviour, to being able to parallelize the processing over multiple connections, being able to run parts of it at different times, being able to see the progress and control it from another session, being able to manage the code in version control, the list just goes on. Trying to move all the code into the database just makes life harder. Autonomous transactions have value on their own. But it's not so that you can run create index ocncurrently or vacuum or whatever. They're useful so that a single session can do things like log errors even when a transaction rolls back. Actually that's the only example I can think of but it's a pretty good use case on its own and I'm sure it's not entirely unique. -- greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] stored procedures - use cases?
On Mon, Apr 25, 2011 at 12:07 PM, Peter Eisentraut pete...@gmx.net wrote: Another point, as there appear to be diverging camps about supertransactional stored procedures vs. autonomous transactions, what would be the actual use cases of any of these features? Let's collect some, so we can think of ways to make them work. Some number of moons ago it would have been highly desirable to be able to create daemon worker processes out of UDFs. In practice, many such daemons want to do their own snapshot management (that is to say, acquire new ones...) and there's no nice way to do that by extending postgres. Instead, you cargo cult onto what autovacuum does and release your own postgres binary use SPI from outside a snapshot. Although it would be better still to have a worker pool type mechanic (see the async discussion happening recently), being able to have contribs or modules where one could run: SELECT do_the_thing(); And block indefinitely doing cross-snapshot work would be pretty useful, I feel. As a thought exercise, could one create: SELECT autovacuum(tuning, parameters, one, through, n); as a C UDF without bizarro snapshot mangling? (I believe we did play some tricks to escape the snapshot even in this case, but they weren't very lucid in the code, if memory serves). In any case, I've encountered at least a few situations where I'd like to be able to opt-out of getting one and exactly one snapshot in the daemon/worker case. -- fdr -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] stored procedures - use cases?
On tor, 2011-04-21 at 18:24 +0300, Peter Eisentraut wrote: So the topic of real stored procedures came up again. Meaning a function-like object that executes outside of a regular transaction, with the ability to start and stop SQL transactions itself. I would like to collect some specs on this feature. So does anyone have links to documentation of existing implementations, or their own spec writeup? A lot of people appear to have a very clear idea of this concept in their own head, so let's start collecting those. Another point, as there appear to be diverging camps about supertransactional stored procedures vs. autonomous transactions, what would be the actual use cases of any of these features? Let's collect some, so we can think of ways to make them work. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] stored procedures - use cases?
Peter Eisentraut pete...@gmx.net wrote: what would be the actual use cases of any of these features? Let's collect some, so we can think of ways to make them work. The two things which leap to mind for me are: (1) All the \d commands in psql should be implemented in SPs so that they are available from any client, through calling one SP equivalent to one \d command. The \d commands would be changed to call the SPs for releases recent enough to support this. Eventually psql would be free of worrying about which release contained which columns in which system tables, because it would just be passing the parameters in and displaying whatever results came back. I have used products which implemented something like this, and found it quite useful. (2) In certain types of loads -- in particular converting data from old systems into the database for a new system -- you need to load several tables in parallel, with queries among the tables which are being loaded. The ability to batch many DML statements into one transaction is important, to avoid excessive COMMIT overhead and related disk output; however, the ability to ANALYZE tables periodically is equally important, to prevent each access to an initially-empty table from being done as a table scan after it has millions of rows. VACUUM might become equally important if there are counts or totals being accumulated in some tables, or status columns are being updated, as rows are added to other tables. I've often had to do something like this during conversions. This could be handled in an external program (I've often done it in Java), but performance might be better if a stored procedure in PostgreSQL was able to keep SQL/MED streams of data open while committing and performing this maintenance every so many rows. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] stored procedures - use cases?
Peter Eisentraut wrote: Another point, as there appear to be diverging camps about supertransactional stored procedures vs. autonomous transactions, what would be the actual use cases of any of these features? Looping over hundreds of identical schema executing DDL statements on each. We can't do this in a single transaction because it consumes all of shared memory with locks. -- todd -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] stored procedures - use cases?
On Mon, Apr 25, 2011 at 2:07 PM, Peter Eisentraut pete...@gmx.net wrote: On tor, 2011-04-21 at 18:24 +0300, Peter Eisentraut wrote: So the topic of real stored procedures came up again. Meaning a function-like object that executes outside of a regular transaction, with the ability to start and stop SQL transactions itself. I would like to collect some specs on this feature. So does anyone have links to documentation of existing implementations, or their own spec writeup? A lot of people appear to have a very clear idea of this concept in their own head, so let's start collecting those. Another point, as there appear to be diverging camps about supertransactional stored procedures vs. autonomous transactions, what would be the actual use cases of any of these features? Let's collect some, so we can think of ways to make them work. My answer is this: plpgsql with its first class SQL expressions, direct access to the postgres type system, and other nifty features has proven for me to be superior to all other languages in terms of defect rate, output progress for input work, and other metrics one might apply by a significant margin. By adding super-transactional (I prefer the phrasing, 'explicit control of transaction state') features you can eliminate all kinds of cases where you might otherwise be forced to coding on the client side. Lots of people prefer not to do this (or recoil in horror at the mere suggestion of doing so), and that's fine, but I don't like being prohibited from being able to do so by technical constraint. Explicit transaction controls remove those constraints. Anyone who really 'gets' plpgsql programming knows exactly what I'm talking about and has bumped into those constraints. Autonomous transactions, basically a formalization of the dblink style techniques of running SQL in a parallel transaction state, are also useful, but for different reasons. You can extend them pretty far to do most of things explicit transactions give you (like creating lots of tables or running 'CLUSTER') although I find having to force users to maintain separate transaction states just to do so to be a bit of a kludge, and the outermost function still has to terminate within a limited timeframe. merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] stored procedures - use cases?
Peter Eisentraut wrote: Another point, as there appear to be diverging camps about supertransactional stored procedures vs. autonomous transactions, what would be the actual use cases of any of these features? Let's collect some, so we can think of ways to make them work. An analogy I like to use for a very capable DBMS is that of an operating system, and each autonomous transaction is like a distinct process/thread in this system. The DBMS is like a virtual machine in which processes/autonomous transactions run. Like with an operating system, a process/auto-transaction can be started by another one, or by the OS/DBMS (or a root process/auto), and once running all processes are mutually independent to a large extent, in that each has its own separatable privileges or state or view of the database, the database being an analogy to the file system. A process/auto-transaction can be started by a DBMS client, analogous to a user, but it doesn't have to be. The message passing feature that Pg has, listen/notify, is like inter-process communication between these processes/autos. A stored procedure always runs within the context of one process/auto, and a regular transaction or savepoint or whatever is specific to a process/auto. Has anyone else thought of the DBMS as operating system analogy? I don't recall specifically reading this anywhere, but expect the thought may be common. -- Darren Duncan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers