Re: [HACKERS] stored procedures - use cases?

2011-05-12 Thread Markus Wanner
Hi,

On 05/10/2011 02:55 PM, Robert Haas wrote:
 On Mon, May 9, 2011 at 11:58 PM, Pavel Stehule pavel.steh...@gmail.com 
 wrote:
 no - you are little bit confused :). CALL and function execution
 shares nothing. There is significant differences between function and
 procedure. Function is called only from executor - from some plan, and
 you have to know a structure of result before run. The execution of
 CALL is much simple - you just execute code - without plan and waiting
 for any result - if there is.

I think the distinction between function and procedure is misleading
here.  Some envision stored *procedures* to be able to return values,
result sets and possibly even *multiple* result sets.

 The main features seem to be (1) explicit transaction control
 and/or execution of commands like VACUUM that can't be invoked from
 within a transaction,

I think that's the main point of stored procedures.

 (2) autonomous transactions

To me autonomous transactions seem orthogonal.  Those can be used to
implement (1) above, but might have other uses for regular transactions
as well.

(The point I'm taking home here is that you might want to control not
only one concurrent transaction, but several from a stored procedure.
 So far, I assumed only one.)

 and (3) returning
 multiple result sets.  But I don't think anybody would be desperately
 unhappy if it magically became possible to do those things from
 regular functions, unlikely as that may seem.

That point definitely is on my wish-list for UDFs already.  I didn't
think of this as having to do with stored procedures, either.

Regards

Markus

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] stored procedures - use cases?

2011-05-10 Thread Robert Haas
On Mon, May 9, 2011 at 11:58 PM, Pavel Stehule pavel.steh...@gmail.com wrote:
 no - you are little bit confused :). CALL and function execution
 shares nothing. There is significant differences between function and
 procedure. Function is called only from executor - from some plan, and
 you have to know a structure of result before run. The execution of
 CALL is much simple - you just execute code - without plan and waiting
 for any result - if there is.

Now I'm a little confused, or you are.  Surely any SQL has to be
planned and executed, regardless of whether it appears in a function,
a stored procedure, or anywhere else.  Non-SQL statements within a
stored procedure don't need to go through the planner and executor,
but that's true in PL/python or PL/pgsql or whatever today.

I think people are using the term stored procedures to refer to
approximately whatever it is that they're unhappy that functions don't
allow, and that's leading to a lot of people talking across each
other.  The main features seem to be (1) explicit transaction control
and/or execution of commands like VACUUM that can't be invoked from
within a transaction, (2) autonomous transactions, and (3) returning
multiple result sets.  But I don't think anybody would be desperately
unhappy if it magically became possible to do those things from
regular functions, unlikely as that may seem.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] stored procedures - use cases?

2011-05-10 Thread Pavel Stehule
2011/5/10 Robert Haas robertmh...@gmail.com:
 On Mon, May 9, 2011 at 11:58 PM, Pavel Stehule pavel.steh...@gmail.com 
 wrote:
 no - you are little bit confused :). CALL and function execution
 shares nothing. There is significant differences between function and
 procedure. Function is called only from executor - from some plan, and
 you have to know a structure of result before run. The execution of
 CALL is much simple - you just execute code - without plan and waiting
 for any result - if there is.

 Now I'm a little confused, or you are.  Surely any SQL has to be
 planned and executed, regardless of whether it appears in a function,
 a stored procedure, or anywhere else.  Non-SQL statements within a
 stored procedure don't need to go through the planner and executor,
 but that's true in PL/python or PL/pgsql or whatever today.


CALL statement is util command than SQL. It has to execute some NON SQL code.

You can thinking about CALL statement like synonymum for SELECT, but
it isn't correct (it is my opinion)

The stored procedures was prior stored functions (more corectly UDF
- user defined functions). These old time stored procedures was
simply - it was client code moved on server. Usually these procedures
was executed in different process or different thread. Inside
procedures was full client's side functionality and there wasn't a
network overhead. CALL statement is +/- remote call. It isn't SQL
statement.

 I think people are using the term stored procedures to refer to
 approximately whatever it is that they're unhappy that functions don't
 allow, and that's leading to a lot of people talking across each
 other.  The main features seem to be (1) explicit transaction control
 and/or execution of commands like VACUUM that can't be invoked from
 within a transaction, (2) autonomous transactions, and (3) returning
 multiple result sets.  But I don't think anybody would be desperately
 unhappy if it magically became possible to do those things from
 regular functions, unlikely as that may seem.


yes.

@2 Autonomous transaction doesn't need stored procedures. Autonomous
transaction can be isolated by function's flag, by some special
PL/pgSQL statement:

like

BEGIN
  EXECUTE AUTONOMOUS ''
END;

@3 is possible now too - but not too much user friendly. Point 3 is
strange. Oracle doesn't support it. Support in DB2 is little bit
strange. And it is well supported by MySQL, MSSQL, maybe Informix,
Sybase.

Pavel

Regards

Pavel

 --
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] stored procedures - use cases?

2011-05-09 Thread Bruce Momjian
Kevin Grittner wrote:
 Peter Eisentraut pete...@gmx.net wrote:
  
  what would be the actual use cases of any of these features? 
  Let's collect some, so we can think of ways to make them work.
  
 The two things which leap to mind for me are:
  
 (1)  All the \d commands in psql should be implemented in SPs so
 that they are available from any client, through calling one SP
 equivalent to one \d command.  The \d commands would be changed to
 call the SPs for releases recent enough to support this.  Eventually
 psql would be free of worrying about which release contained which
 columns in which system tables, because it would just be passing the
 parameters in and displaying whatever results came back.
  
 I have used products which implemented something like this, and
 found it quite useful.

Uh, why does this require stored procedures?  Seems our existing
function capabilities are even better suited to this.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] stored procedures - use cases?

2011-05-09 Thread Bruce Momjian
Tom Lane wrote:
 Peter Eisentraut pete...@gmx.net writes:
  On m??n, 2011-04-25 at 14:35 -0500, Kevin Grittner wrote:
  (1)  All the \d commands in psql should be implemented in SPs so
  that they are available from any client, through calling one SP
  equivalent to one \d command.
 
  You don't need stored procedures with special transaction behavior for
  this.
 
 No, but what you *would* need is the ability to return multiple result
 sets from one call.  Even then, you could not exactly duplicate the
 current output of \d; but you could duplicate the functionality.

Oh, good point.  Thanks.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] stored procedures - use cases?

2011-05-09 Thread Andrew Dunstan



On 05/09/2011 08:20 PM, Bruce Momjian wrote:

Tom Lane wrote:

Peter Eisentrautpete...@gmx.net  writes:

On mån, 2011-04-25 at 14:35 -0500, Kevin Grittner wrote:

(1)  All the \d commands in psql should be implemented in SPs so
that they are available from any client, through calling one SP
equivalent to one \d command.

You don't need stored procedures with special transaction behavior for
this.

No, but what you *would* need is the ability to return multiple result
sets from one call.  Even then, you could not exactly duplicate the
current output of \d; but you could duplicate the functionality.

Oh, good point.  Thanks.


Multiple resultsets in one call would be a good thing, though, no?

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] stored procedures - use cases?

2011-05-09 Thread Christopher Browne
On Mon, May 9, 2011 at 9:21 PM, Andrew Dunstan and...@dunslane.net wrote:


 On 05/09/2011 08:20 PM, Bruce Momjian wrote:

 Tom Lane wrote:

 Peter Eisentrautpete...@gmx.net  writes:

 On mån, 2011-04-25 at 14:35 -0500, Kevin Grittner wrote:

 (1)  All the \d commands in psql should be implemented in SPs so
 that they are available from any client, through calling one SP
 equivalent to one \d command.

 You don't need stored procedures with special transaction behavior for
 this.

 No, but what you *would* need is the ability to return multiple result
 sets from one call.  Even then, you could not exactly duplicate the
 current output of \d; but you could duplicate the functionality.

 Oh, good point.  Thanks.

 Multiple resultsets in one call would be a good thing, though, no?

 cheers

I *thought* the purpose of having stored procedures was to allow a
substrate supporting running multiple transactions, so it could do
things like:
- Managing vacuums
- Managing transactions
- Replacing some of the need for dblink.
- Being an in-DB piece that could manage LISTENs

It seems to be getting bikeshedded into something with more
functional argument functionality than stored functions.

I think we could have a perfectly successful implementation of stored
procedures that supports ZERO ability to pass arguments in or out.
That's quite likely to represent a good start.
-- 
When confronted by a difficult problem, solve it by reducing it to the
question, How would the Lone Ranger handle this?

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] stored procedures - use cases?

2011-05-09 Thread Bruce Momjian
Christopher Browne wrote:
  Multiple resultsets in one call would be a good thing, though, no?
 
  cheers
 
 I *thought* the purpose of having stored procedures was to allow a
 substrate supporting running multiple transactions, so it could do
 things like:
 - Managing vacuums
 - Managing transactions
 - Replacing some of the need for dblink.
 - Being an in-DB piece that could manage LISTENs
 
 It seems to be getting bikeshedded into something with more
 functional argument functionality than stored functions.
 
 I think we could have a perfectly successful implementation of stored
 procedures that supports ZERO ability to pass arguments in or out.
 That's quite likely to represent a good start.

I am kind of confused too, particularly with the CALL syntax.  I thought
our function call usage was superior in every way to CALL, so why
implement CALL?  I assume for SQL-standards compliance, right?  Does
multiple result sets require CALL?  I assume autonomous transactions
don't require CALL.

Are we assuming no one is going to want a function that allows multiple
result sets or autonomous transactions?  That seems unlikely.  I would
think CALL is independent of those features.  Maybe we need those
features to support SQL-standard CALL, and we will just add those
features to functions too.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] stored procedures - use cases?

2011-05-09 Thread Pavel Stehule
2011/5/10 Bruce Momjian br...@momjian.us:
 Christopher Browne wrote:
  Multiple resultsets in one call would be a good thing, though, no?
 
  cheers

 I *thought* the purpose of having stored procedures was to allow a
 substrate supporting running multiple transactions, so it could do
 things like:
 - Managing vacuums
 - Managing transactions
 - Replacing some of the need for dblink.
 - Being an in-DB piece that could manage LISTENs

 It seems to be getting bikeshedded into something with more
 functional argument functionality than stored functions.

 I think we could have a perfectly successful implementation of stored
 procedures that supports ZERO ability to pass arguments in or out.
 That's quite likely to represent a good start.

 I am kind of confused too, particularly with the CALL syntax.  I thought
 our function call usage was superior in every way to CALL, so why
 implement CALL?  I assume for SQL-standards compliance, right?  Does
 multiple result sets require CALL?  I assume autonomous transactions
 don't require CALL.


no - you are little bit confused :). CALL and function execution
shares nothing. There is significant differences between function and
procedure. Function is called only from executor - from some plan, and
you have to know a structure of result before run. The execution of
CALL is much simple - you just execute code - without plan and waiting
for any result - if there is.

 Are we assuming no one is going to want a function that allows multiple
 result sets or autonomous transactions?  That seems unlikely.  I would
 think CALL is independent of those features.  Maybe we need those
 features to support SQL-standard CALL, and we will just add those
 features to functions too.


We can use a SETOF cursors for returning a multiple result sets now.
But there are a few complications:

a) The client should to wait for finish of all sets from multiple
result sets - minimally in PL/pgSQL
b) client 'psql' doesn't support a unpacking result when result is
multiple result set
c) The using cursors isn't too comfortable - in comparation to MS SQL or MySQL

Regards

Pavel




 --
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +

 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] stored procedures - use cases?

2011-04-29 Thread Jim Nasby
On Apr 26, 2011, at 6:08 PM, Kevin Grittner wrote:
 Josh Berkus j...@agliodbs.com wrote:
 -- doing a backfill operation for 10GB of computed data, taking 8
 hours, where I don't want to hold a transaction open for 8 hours
 since this is a high-volume OLTP database.
 
 Been there, done that.  Definitely not a rare use case.

We do that so often we've actually written a framework around it and are 
working on a daemon that will deal with any backfills that have been registered 
in the system. If we could control transactions that daemon could be entirely 
in the database... but since we can't, we have to write it in another language 
outside the database and switch back and forth between the two worlds.
--
Jim C. Nasby, Database Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] stored procedures - use cases?

2011-04-27 Thread Merlin Moncure
On Tue, Apr 26, 2011 at 10:12 PM, Greg Stark gsst...@mit.edu wrote:
 On Tue, Apr 26, 2011 at 11:55 PM, Josh Berkus j...@agliodbs.com wrote:
 Here's where I wanted autonomous transactions just last week, and didn't
 have them so I had to use a python script outside the database:

 -- doing a CREATE INDEX CONCURRENTLY for 150 partitions on a partitioned
 table.

 -- doing a backfill operation for 10GB of computed data, taking 8 hours,
 where I don't want to hold a transaction open for 8 hours since this is
 a high-volume OLTP database.

 These don't seem like compelling use cases at all to me. You said you
 had to fall back to using a python script outside the database, but
 what disadvantage does that have? Why is moving your application logic
 into the database an improvement?

 Honestly in every case where I've had to move code that had been in a
 function to the application I've found there were tons of benefits.
 Everything from being able to better control the behaviour, to being
 able to parallelize the processing over multiple connections, being
 able to run parts of it at different times, being able to see the
 progress and control it from another session, being able to manage the
 code in version control, the list just goes on. Trying to move all the
 code into the database just makes life harder.

my experience has been the opposite.

merlin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] stored procedures - use cases?

2011-04-27 Thread Josh Berkus

 These don't seem like compelling use cases at all to me. You said you
 had to fall back to using a python script outside the database, but
 what disadvantage does that have? Why is moving your application logic
 into the database an improvement?

Since both were part of a code rollout, it complicated our deployment
process considerably and took a deployment which could have been
push-button automatic and forced us to do it by manually logging into
the shell on the database server.

  Trying to move all the
 code into the database just makes life harder.

I might make *your* life harder.  It makes *mine* easier.

If you pursue your argument a little further, Greg, why do we have
functions at all?  We could do it all in the application.

 Autonomous transactions have value on their own. But it's not so that
 you can run create index ocncurrently or vacuum or whatever. 

Why not?  Why are you so intent on making my life harder?

 They're
 useful so that a single session can do things like log errors even
 when a transaction rolls back.

That's *also* an excellent use case.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] stored procedures - use cases?

2011-04-27 Thread Greg Stark
On Wed, Apr 27, 2011 at 6:48 PM, Josh Berkus j...@agliodbs.com wrote:
 If you pursue your argument a little further, Greg, why do we have
 functions at all?  We could do it all in the application.

 Autonomous transactions have value on their own. But it's not so that
 you can run create index ocncurrently or vacuum or whatever.

 Why not?  Why are you so intent on making my life harder?

Because we want to be able to manipulate data in queries in
data-type-specific ways. For example we want to do aggregations on the
result of a function or index scans across a user data type, etc. If
all the functions do is implement application logic then you end up
having half your application logic in the application and half in the
database and it's hard to keep them in sync.

To take the argument in the opposite extreme would you suggest we
should have html formatting functions in the database so that people
can have their entire web server just be print $dbh-('select
web_page(url)') ?

 They're
 useful so that a single session can do things like log errors even
 when a transaction rolls back.

 That's *also* an excellent use case.

What makes it an excellent use case is that it's basically impossible
to do without autonomous transactions. You can hack it with dblink but
it's much less clean and much higher overhead.

-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] stored procedures - use cases?

2011-04-27 Thread Josh Berkus
Greg,

 Because we want to be able to manipulate data in queries in
 data-type-specific ways. For example we want to do aggregations on the
 result of a function or index scans across a user data type, etc. 

I don't see how this is different from wanting to capture error output,
which would face the same issues.  You seem to be wanting to make a hard
feature easier by telling me that I don't actually want the things I
want.  Wanna make it even easier?  Then Stored Procedures are just
functions without a return value.  That's a 40-line patch.  Done!

 If
 all the functions do is implement application logic then you end up
 having half your application logic in the application and half in the
 database and it's hard to keep them in sync.

You build your applications your way, and I'll build mine my way.  I'll
just ask you not to try to dictate to me how I should build
applications.  Especially, since, based on the responses on this thread,
a LOT of people would like to have multitransaction control inside a
stored procedure script.  I suspect that your experience of application
development has been rather narrow.

 To take the argument in the opposite extreme would you suggest we
 should have html formatting functions in the database so that people
 can have their entire web server just be print $dbh-('select
 web_page(url)') ?

Actually, you can already sort of do that using XSLT.   So I don't
necessary think that's a prohibitive idea, depending on implementation.
 After all, many of the new non-relational databases implement exactly this.

 They're
 useful so that a single session can do things like log errors even
 when a transaction rolls back.

 That's *also* an excellent use case.
 
 What makes it an excellent use case is that it's basically impossible
 to do without autonomous transactions. You can hack it with dblink but
 it's much less clean and much higher overhead.

You could do it by using application code.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] stored procedures - use cases?

2011-04-27 Thread David E. Wheeler
On Apr 27, 2011, at 3:28 PM, Josh Berkus wrote:

 Actually, you can already sort of do that using XSLT.   So I don't
 necessary think that's a prohibitive idea, depending on implementation.
 After all, many of the new non-relational databases implement exactly this.

The proposed JSON data type and construction functions (once there's agreement 
on an implementation) will allow this, too. Just serve JSON. Boom, instant REST 
server.

David
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] stored procedures - use cases?

2011-04-26 Thread Peter Eisentraut
On mån, 2011-04-25 at 14:35 -0500, Kevin Grittner wrote:
 (1)  All the \d commands in psql should be implemented in SPs so
 that they are available from any client, through calling one SP
 equivalent to one \d command.

You don't need stored procedures with special transaction behavior for
this.  In fact, you probably shouldn't use them even if you had them,
because you surely want a consistent view of, say, a table.
 
 (2)  In certain types of loads -- in particular converting data from
 old systems into the database for a new system -- you need to load
 several tables in parallel, with queries among the tables which are
 being loaded.  The ability to batch many DML statements into one
 transaction is important, to avoid excessive COMMIT overhead and
 related disk output; however, the ability to ANALYZE tables
 periodically is equally important, to prevent each access to an
 initially-empty table from being done as a table scan after it has
 millions of rows.  VACUUM might become equally important if there
 are counts or totals being accumulated in some tables, or status
 columns are being updated, as rows are added to other tables.

I'm not sure I really follow this.  If your aim is to batch DML
statements and avoid COMMIT overhead, why would you want to use stored
procedures that possibly span multiple transactions?



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] stored procedures - use cases?

2011-04-26 Thread Tom Lane
Peter Eisentraut pete...@gmx.net writes:
 On mån, 2011-04-25 at 14:35 -0500, Kevin Grittner wrote:
 (1)  All the \d commands in psql should be implemented in SPs so
 that they are available from any client, through calling one SP
 equivalent to one \d command.

 You don't need stored procedures with special transaction behavior for
 this.

No, but what you *would* need is the ability to return multiple result
sets from one call.  Even then, you could not exactly duplicate the
current output of \d; but you could duplicate the functionality.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] stored procedures - use cases?

2011-04-26 Thread Kevin Grittner
Peter Eisentraut pete...@gmx.net wrote:
 On mån, 2011-04-25 at 14:35 -0500, Kevin Grittner wrote:
 (1)  All the \d commands in psql should be implemented in SPs so
 that they are available from any client, through calling one SP
 equivalent to one \d command.
 
 You don't need stored procedures with special transaction behavior
 for this.  In fact, you probably shouldn't use them even if you
 had them, because you surely want a consistent view of, say, a
 table.
 
Agreed.  I was just outlining use cases here, not trying to make a
case for something in particular with each one.
 
 (2)  In certain types of loads -- in particular converting data
 from old systems into the database for a new system -- you need
 to load several tables in parallel, with queries among the tables
 which are being loaded.  The ability to batch many DML statements
 into one transaction is important, to avoid excessive COMMIT
 overhead and related disk output; however, the ability to ANALYZE
 tables periodically is equally important, to prevent each access
 to an initially-empty table from being done as a table scan after
 it has millions of rows.  VACUUM might become equally important
 if there are counts or totals being accumulated in some tables,
 or status columns are being updated, as rows are added to other
 tables.
 
 I'm not sure I really follow this.  If your aim is to batch DML
 statements and avoid COMMIT overhead, why would you want to use
 stored procedures that possibly span multiple transactions?
 
The point is that if such a conversion is run in a situation where
table access is always done on a plan based on empty tables, it
starts to get pretty slow after a while.  You need to commit,
analyze, and start a new transaction for the queries to make new
plans which run well.  This obviously isn't an issue when you're
blasting entire tables in through COPY commands without needing to
reference other data being concurrently loaded.
 
So, rough pseudo-code where this is done in a client app with
autovacuum disabled would look something like:
 
  open input stream of non-normalized data
  open database connection
  while not EOF on input
start transaction
for 5 top level inputs (break on EOF)
  parse apart messy data, load into multiple tables
(logic involves queries against tables being loaded)
(some updates besides straight inserts)
(print exceptions for questionable or undigestable data)
end for
commit transaction
vacuum analyze
  end while
 
In database products with stored procedures it has usually been
faster to use an SP in the target database than to use a client
program.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] stored procedures - use cases?

2011-04-26 Thread Josh Berkus

 Another point, as there appear to be diverging camps about
 supertransactional stored procedures vs. autonomous transactions, what
 would be the actual use cases of any of these features?  Let's collect
 some, so we can think of ways to make them work.

Here's where I wanted autonomous transactions just last week, and didn't
have them so I had to use a python script outside the database:

-- doing a CREATE INDEX CONCURRENTLY for 150 partitions on a partitioned
table.

-- doing a backfill operation for 10GB of computed data, taking 8 hours,
where I don't want to hold a transaction open for 8 hours since this is
a high-volume OLTP database.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] stored procedures - use cases?

2011-04-26 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote:
 
 No, but what you *would* need is the ability to return multiple
 result sets from one call.
 
At least.
 
 Even then, you could not exactly duplicate the current output of
 \d; but you could duplicate the functionality.
 
I would think that psql could duplicate the output pretty closely, 
especially if the output of the stored procedure was a stream of
intermingled result sets and messages (as from ereport).  This is
what many products provide.  They usually show messages with a class
'00' SQLSTATE just as plain text lines, and decorate the more severe
levels with appropriate additional information.
 
A while back I included a link to show what Sybase returns from
their sp_help SP for various object types:
 
http://infocenter.sybase.com/help/index.jsp?topic=/com.sybase.infocenter.dc36273.1550/html/sprocs/X85190.htm
 
Note the lines like:
 
 Object does not have any indexes.
 
This came from the server as a SQLSTATE '0' message.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] stored procedures - use cases?

2011-04-26 Thread Kevin Grittner
Josh Berkus j...@agliodbs.com wrote:
 
 -- doing a backfill operation for 10GB of computed data, taking 8
 hours, where I don't want to hold a transaction open for 8 hours
 since this is a high-volume OLTP database.
 
Been there, done that.  Definitely not a rare use case.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] stored procedures - use cases?

2011-04-26 Thread Greg Stark
On Tue, Apr 26, 2011 at 11:55 PM, Josh Berkus j...@agliodbs.com wrote:
 Here's where I wanted autonomous transactions just last week, and didn't
 have them so I had to use a python script outside the database:

 -- doing a CREATE INDEX CONCURRENTLY for 150 partitions on a partitioned
 table.

 -- doing a backfill operation for 10GB of computed data, taking 8 hours,
 where I don't want to hold a transaction open for 8 hours since this is
 a high-volume OLTP database.

These don't seem like compelling use cases at all to me. You said you
had to fall back to using a python script outside the database, but
what disadvantage does that have? Why is moving your application logic
into the database an improvement?

Honestly in every case where I've had to move code that had been in a
function to the application I've found there were tons of benefits.
Everything from being able to better control the behaviour, to being
able to parallelize the processing over multiple connections, being
able to run parts of it at different times, being able to see the
progress and control it from another session, being able to manage the
code in version control, the list just goes on. Trying to move all the
code into the database just makes life harder.

Autonomous transactions have value on their own. But it's not so that
you can run create index ocncurrently or vacuum or whatever. They're
useful so that a single session can do things like log errors even
when a transaction rolls back. Actually that's the only example I can
think of but it's a pretty good use case on its own and I'm sure it's
not entirely unique.

-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] stored procedures - use cases?

2011-04-26 Thread Daniel Farina
On Mon, Apr 25, 2011 at 12:07 PM, Peter Eisentraut pete...@gmx.net wrote:
 Another point, as there appear to be diverging camps about
 supertransactional stored procedures vs. autonomous transactions, what
 would be the actual use cases of any of these features?  Let's collect
 some, so we can think of ways to make them work.

Some number of moons ago it would have been highly desirable to be
able to create daemon worker processes out of UDFs.  In practice, many
such daemons want to do their own snapshot management (that is to say,
acquire new ones...) and there's no nice way to do that by extending
postgres.  Instead, you cargo cult onto what autovacuum does and
release your own postgres binary use SPI from outside a snapshot.
Although it would be better still to have a worker pool type mechanic
(see the async discussion happening recently), being able to have
contribs or modules where one could run:

SELECT do_the_thing();

And block indefinitely doing cross-snapshot work would be pretty
useful, I feel.  As a thought exercise, could one create:

SELECT autovacuum(tuning, parameters, one, through, n); as a C UDF
without bizarro snapshot mangling? (I believe we did play some tricks
to escape the snapshot even in this case, but they weren't very lucid
in the code, if memory serves).

In any case, I've encountered at least a few situations where I'd like
to be able to opt-out of getting one and exactly one snapshot in the
daemon/worker case.

-- 
fdr

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] stored procedures - use cases?

2011-04-25 Thread Peter Eisentraut
On tor, 2011-04-21 at 18:24 +0300, Peter Eisentraut wrote:
 So the topic of real stored procedures came up again.  Meaning a
 function-like object that executes outside of a regular transaction,
 with the ability to start and stop SQL transactions itself.
 
 I would like to collect some specs on this feature.  So does anyone have
 links to documentation of existing implementations, or their own spec
 writeup?  A lot of people appear to have a very clear idea of this
 concept in their own head, so let's start collecting those.

Another point, as there appear to be diverging camps about
supertransactional stored procedures vs. autonomous transactions, what
would be the actual use cases of any of these features?  Let's collect
some, so we can think of ways to make them work.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] stored procedures - use cases?

2011-04-25 Thread Kevin Grittner
Peter Eisentraut pete...@gmx.net wrote:
 
 what would be the actual use cases of any of these features? 
 Let's collect some, so we can think of ways to make them work.
 
The two things which leap to mind for me are:
 
(1)  All the \d commands in psql should be implemented in SPs so
that they are available from any client, through calling one SP
equivalent to one \d command.  The \d commands would be changed to
call the SPs for releases recent enough to support this.  Eventually
psql would be free of worrying about which release contained which
columns in which system tables, because it would just be passing the
parameters in and displaying whatever results came back.
 
I have used products which implemented something like this, and
found it quite useful.
 
(2)  In certain types of loads -- in particular converting data from
old systems into the database for a new system -- you need to load
several tables in parallel, with queries among the tables which are
being loaded.  The ability to batch many DML statements into one
transaction is important, to avoid excessive COMMIT overhead and
related disk output; however, the ability to ANALYZE tables
periodically is equally important, to prevent each access to an
initially-empty table from being done as a table scan after it has
millions of rows.  VACUUM might become equally important if there
are counts or totals being accumulated in some tables, or status
columns are being updated, as rows are added to other tables.
 
I've often had to do something like this during conversions.  This
could be handled in an external program (I've often done it in
Java), but performance might be better if a stored procedure in
PostgreSQL was able to keep SQL/MED streams of data open while
committing and performing this maintenance every so many rows.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] stored procedures - use cases?

2011-04-25 Thread Todd A. Cook

Peter Eisentraut wrote:


Another point, as there appear to be diverging camps about
supertransactional stored procedures vs. autonomous transactions, what
would be the actual use cases of any of these features?


Looping over hundreds of identical schema executing DDL statements on
each.  We can't do this in a single transaction because it consumes
all of shared memory with locks.

-- todd

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] stored procedures - use cases?

2011-04-25 Thread Merlin Moncure
On Mon, Apr 25, 2011 at 2:07 PM, Peter Eisentraut pete...@gmx.net wrote:
 On tor, 2011-04-21 at 18:24 +0300, Peter Eisentraut wrote:
 So the topic of real stored procedures came up again.  Meaning a
 function-like object that executes outside of a regular transaction,
 with the ability to start and stop SQL transactions itself.

 I would like to collect some specs on this feature.  So does anyone have
 links to documentation of existing implementations, or their own spec
 writeup?  A lot of people appear to have a very clear idea of this
 concept in their own head, so let's start collecting those.

 Another point, as there appear to be diverging camps about
 supertransactional stored procedures vs. autonomous transactions, what
 would be the actual use cases of any of these features?  Let's collect
 some, so we can think of ways to make them work.

My answer is this:
plpgsql with its first class SQL expressions, direct access to the
postgres type system, and other nifty features has proven for me to be
superior to all other languages in terms of defect rate, output
progress for input work, and other metrics one might apply by a
significant margin.  By adding super-transactional (I prefer the
phrasing, 'explicit control of transaction state') features you can
eliminate all kinds of cases where you might otherwise be forced to
coding on the client side.  Lots of people prefer not to do this (or
recoil in horror at the mere suggestion of doing so), and that's fine,
but I don't like being prohibited from being able to do so by
technical constraint.  Explicit transaction controls remove those
constraints.  Anyone who really 'gets' plpgsql programming knows
exactly what I'm talking about and has bumped into those constraints.

Autonomous transactions, basically a formalization of the dblink style
techniques of running SQL in a parallel transaction state, are also
useful, but for different reasons.  You can extend them pretty far to
do most of things explicit transactions give you (like creating lots
of tables or running 'CLUSTER') although I find having to force users
to maintain separate transaction states just to do so to be a bit of a
kludge, and the outermost function still has to terminate within a
limited timeframe.

merlin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] stored procedures - use cases?

2011-04-25 Thread Darren Duncan

Peter Eisentraut wrote:

Another point, as there appear to be diverging camps about
supertransactional stored procedures vs. autonomous transactions, what
would be the actual use cases of any of these features?  Let's collect
some, so we can think of ways to make them work.


An analogy I like to use for a very capable DBMS is that of an operating system, 
and each autonomous transaction is like a distinct process/thread in this 
system.  The DBMS is like a virtual machine in which processes/autonomous 
transactions run.


Like with an operating system, a process/auto-transaction can be started by 
another one, or by the OS/DBMS (or a root process/auto), and once running all 
processes are mutually independent to a large extent, in that each has its own 
separatable privileges or state or view of the database, the database being an 
analogy to the file system.


A process/auto-transaction can be started by a DBMS client, analogous to a user, 
but it doesn't have to be.  The message passing feature that Pg has, 
listen/notify, is like inter-process communication between these processes/autos.


A stored procedure always runs within the context of one process/auto, and a 
regular transaction or savepoint or whatever is specific to a process/auto.


Has anyone else thought of the DBMS as operating system analogy?  I don't recall 
specifically reading this anywhere, but expect the thought may be common.


-- Darren Duncan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers