Gavin, Neil,

> Following is a proposal to implement what SQL2003 calls 'SQL-Invoked
> Procedures' and what most people refer to as stored procedures. Fujitsu
> will be funding Neil Conway and I to work on this feature.

Which, by the way, is way keen.

My comments are based on having professionally written several hundred 
thousand lines of procedural code for PostgreSQL, SQL Server, and Oracle.  I 
believe that your interpretation of the spec is correct, but that the 
proposal omits several things not covered by the spec, yet implemented by 
other RDBMSes, which make stored procedures *useful*.  I feel strongly that 
these things will make a large difference to people thinking of migrating to 
PostgreSQL from other DBMSes, and want to make sure that Neil's 
implementation makes them easier, not harder, to implement later.

> Procedures are nearly identical to functions. 

IMHO, this is largely because the spec regards a great deal of SP 
functionality as "implementation-defined", and is thus kept as vague as 
possible.  In practice, other DBMSes which have both SPs and Functions treat 
them *very* differently.

> 3) Procedures can be run in the same savepoint level as the caller when
> OLD SAVEPOINT LEVEL is specified at creation time. According to SQL2003,
> functions must be run on a new savepoint level. From my understanding, we
> do not do this currently.
>
> Work will focus on 1 and 2 until we have the concept of savepoint levels
> with functions. Its possible that we will implement this too if there is
> demand.

One of the things which differentiates SPs on other DBs from PostgreSQL 
Functions is transactionality.    In SQL Server and Oracle, SPs are not 
automatically a transaction; instead, they contain transactions within them.   
This is vitally important to DBAs who want to use SPs to automate database 
maintenance, loads, transformations, and other activities which require 
checkpointing within the course of a program.   

For example, I run a nightly data transformation for one client which requires 
16 steps with VACUUMs, ANALYZEs and error-handling between them.    
Currently, the only way I can implement this for PostgreSQL is to have an 
external program (Perl, in my case) manage this and call each step as a 
separate function.    It would be far easier to manage if I could put all of 
the steps, including the VACUUMs, inside one long-running SP, but the required 
transaction container prevents this.
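To illustrate, what I'd like to be able to write is a single SP that manages 
its own transactions.  Something like the following sketch, where the COMMIT 
inside the procedure body and the mid-procedure VACUUM are hypothetical -- 
they don't work today -- and the table names are made up:

CREATE PROCEDURE nightly_transform()
LANGUAGE plpgsql AS $$
BEGIN
    -- step 1: load
    INSERT INTO warehouse.fact_orders
        SELECT * FROM staging.orders_raw;
    COMMIT;    -- hypothetical: ends step 1's transaction

    -- legal only between transactions
    VACUUM ANALYZE warehouse.fact_orders;

    -- step 2 starts in a fresh transaction
    UPDATE warehouse.fact_orders SET loaded = true;
    COMMIT;
END;
$$;

If step 2 fails, step 1's work and the VACUUM are already safely committed -- 
which is exactly the checkpointing behavior the external Perl driver gives me 
now.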

> If we go down the route of saying that procedures are a type of function,
> we have the option of allowing users access to OUT and INOUT in functions.
> This would make procedures simply a subset of functions. What do people
> think?

Well, to be frank, my first thought is: why bother?  If you're just going to 
implement some syntactic sugar on top of the current Function feature, why 
bother at all?

Given the opportunity, I would far prefer to set us on a road that would allow 
us to replicate -- and exceed -- the functionality of Oracle's PL/SQL.   This 
proposal does not do that; in fact, if someone were to start implementing 
such functionality later they might find this code a stumbling block.

> There will be cases when we need to identify whether a routine is a
> function or a procedure. This could be done two ways. We could say that
> any proc in pg_proc which returns void is a procedure or we could store
> this in some 'protype' column. Thoughts?

Well, see my thoughts above on differentiating SPs from Functions.    I 
certainly don't think we should be using the same table.

A second point, which I brought up with you on IRC, is to eliminate 
overloading and allow named parameter calls on SPs.   This is extremely 
useful functionality in T-SQL and PL/SQL; in fact, I'd say that it's 
essential for any operation that wants to create an SP-centric middleware as 
only named parameter calls allow developers to add parameters to existing 
procedures without breaking existing calls.

For anyone who doesn't know what I'm talking about, it's this form:

CREATE PROCEDURE do_some_work ( alpha INT, beta INT, gamma TEXT )
etc.

Where you can:
CALL do_some_work( alpha = 5, beta = 7 ) 
... and then gamma picks up its default, if any, or even:
CALL do_some_work( gamma = 'Lord Fontleroy', beta = 7, alpha = 1 )

The idea being that for SPs, schema.name is unique regardless of the 
parameters.    Even if implementing named parameter calls is beyond the 
current spec, I will argue strongly in favor of eliminating overloading for 
SPs.   Overloading serves no purpose for them and prohibits the whole concept 
of default values.
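To make the conflict concrete, with hypothetical default-value syntax:

CREATE PROCEDURE do_some_work ( alpha INT, beta INT, gamma TEXT DEFAULT 'none' )

-- all of these resolve to the ONE procedure:
CALL do_some_work( 5, 7 )                               -- gamma = 'none'
CALL do_some_work( alpha = 5, beta = 7 )                -- gamma = 'none'
CALL do_some_work( gamma = 'x', beta = 7, alpha = 1 )

But if overloading were also allowed, a second procedure do_some_work ( alpha 
INT, beta INT ) would make the first CALL ambiguous -- which is why the two 
features can't coexist.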

> Other databases, and SQL2003, support a few different implementations what
> could be called variables. In SQL2003, there are a few types of
> 'variables':

I completely follow your thinking about variables, and find it makes sense.

> SET VARIABLE <varname> = <val>
> SHOW VARIABLE <varname>

The problem with this is name collisions with the GUC -- especially as it now 
allows add-ins to create their own GUC variables.   However intuitive SET and 
SHOW are, they will lead to problems.   Maybe SETVAR and SHOWVAR?  Or using 
the PL/pgSQL assignment operator for the first, and DISPLAY for the second, 
e.g.:

num_logos := 917;
DISPLAY num_logos;

Also, you do realize that this is going to lead to requests for SELECT .... 
INTO on the SQL command line, don't you?

> The other option is that we do it at the protocol level and modify libpq
> (and psql) to add support. This would allow us something like:

This may be a better approach.   I've personally never been comfortable with 
the use of variables outside of SPs and packages; it seems orthogonal to the 
declarative nature of SQL.  However, this is an aesthetic thing and not 
really based on practical considerations.

The more practical consideration is, where will OUT and INOUT parameters be 
used?   Do we want them returned to the SQL session or directly to the 
calling client?   I would think that practicality would argue in favor of the 
latter; I can't see needing variables in SQL except for testing, and having 
them in psql will allow me that.
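Concretely, what I'd hope for -- hypothetical syntax and procedure name -- is 
that a CALL from psql hands the OUT parameters straight back as a result row, 
with no session variables in between:

CREATE PROCEDURE get_table_stats ( IN tbl TEXT,
                                   OUT row_count BIGINT,
                                   OUT dead_tuples BIGINT )
etc.

CALL get_table_stats( tbl = 'orders' )
-- psql displays row_count and dead_tuples as a one-row result set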

> That might need some massaging but you get the idea. The commands to psql
> translate to lower level protocol level commands which are: create
> variable, set (ie, modify the value of) variable and describe variable.
> Neil thinks we should have create and set in one step. This is great, since
> it covers most cases, but I'm not sure how we have uninitialised values.
> Perhaps we set to NULL?

Yes.

> The only other question (that I can think of now) with respect to
> variables is how they are affected by transactions. My gut feel is
> that they should have transactional semantics.  For example:

I agree strongly with this, especially since we'll be using Savepoints inside 
the SPs.   Having one's data mods roll back but not the variable values would 
be confusing and lead to a *lot* of debugging.
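In other words, using the hypothetical SET VARIABLE syntax from above:

CREATE VARIABLE counter INT
SET VARIABLE counter = 1

BEGIN;
UPDATE accounts SET balance = 0;   -- data mod
SET VARIABLE counter = 2;          -- variable mod
ROLLBACK;

-- both roll back together: balance is unchanged AND counter = 1

Non-transactional variables would leave counter = 2 here, silently out of 
sync with the data.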

> We can use the same permissions as for functions.

Agreed.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco
