On Tue, Oct 16, 2012 at 2:12 PM, Stafford, David x78061
<david.staff...@broadridge.com> wrote:
> On Mon, 15 Oct 2012 10:21:18 -0700, Robert Haas <robertmh...@gmail.com> wrote:
>> Yeah, I think the functionality that we need is pretty much there
>> already today.  What we need to do is to get the syntax to a point
>> where people can write the code they want to write without getting
>> tangled up by it.
>>
>> I think the invention of DO was a big step in the right direction
>> ...
>> With DO, you can write the logic you want
>> as an SQL statement, it's just a clunky and awkward SQL statement.  In
>> my view the goal ought to be to refine that mechanism to remove the
>> clunkiness and awkwardness, rather than to invent something completely
>> new.
>
> As someone who has worked with a number of databases now, none of them really 
> get this DDL integration completely right.  What I'd like to see is 1) a 
> predicate to easily test things about the schema (does this table, column, 
> index, schema, etc. exist?  does it have the right type?) and 2) a way to 
> conditionally execute DDL (and DML, which should fall right out, but it isn't 
> really what this discussion is covering).  I would propose extending the 
> current EXISTS / NOT EXISTS predicate as follows:
>
> [NOT] EXISTS TABLE tab [ ( col [type] [ , col [type]]... ) ]
> [NOT] EXISTS COLUMN tab.col [type]
> [NOT] EXISTS INDEX tab.idx [ ( col [ , col]... ) ]
> [NOT] EXISTS CONSTRAINT tab.cstrt -- not sure what else might be useful here
> [NOT] EXISTS ( select ) -- this is the current EXISTS predicate, of course
> [NOT] EXISTS ANY ROWS FROM tab [join] [WHERE predicate] [GROUP BY col [ , 
> col]...] -- exactly the same as
>                                                                               
>      -- (select 1 FROM etc.)
>                                                                               
>      -- only because I like
>                                                                               
>      -- it better
> (the latter [which by no means am I nuts over; it's just that when extending 
> EXISTS I can't stop neatening it up to my personal preferences] could be 
> extended with [NOT] EXISTS MORE THAN n ROWS FROM and [NOT] EXISTS EXACTLY n 
> ROWS FROM.)
>
> There is a new SQL statement: IF predicate true-statement [ELSE 
> false-statement].
>
> To actually execute this new IF statement, the executor would need an IF node 
> that evaluates the predicate (with ANDs and ORs, just like all SQL 
> predicates) and then executes the rest only if the predicate came out TRUE 
> (or NOT FALSE; I forget which is usually used, and the difference with NULL 
> could be useful, as long as it matches other predicates).  This moves one 
> more bit of procedural logic into the executor.
>
> Another wrinkle is that the dependent statement might not compile, due to 
> missing tables or whatnot.  Actually executing it while it doesn't compile is 
> an error, but we want to defer that error until we actually decide we need to 
> execute it.  Also, it's probably good to try compiling it again at that 
> point.  So my thought would be to try planning the dependent statement(s); if 
> they compile, hook them to the IF node; if not, hook a DEFERRED node to the 
> IF node.  The DEFERRED node has the parse tree (or raw string, whatever makes 
> sense) of the statement; on execution it tries again to plan that statement; 
> if it succeeds, run it; if not, error out.
>
> I'd also add a SEQUENCE node to the executor.  It just runs its children in 
> order (could be n-ary, or if fixed arity nodes are what is in the 
> planner/executor today, could be binary, first left, then right, and right 
> could be another SEQUENCE).  The DEFERRED node means that a CREATE statement 
> could precede use of what is created in the same sequence and all could get 
> planned (with some deferral to execution time) in advance and run in one 
> lump.  This implements DO at the executor level.
>
> The biggest concepts left from plpgsql are looping and variables.  Most 
> variables could be modeled as a single row value; SQL already can update a 
> row, so the planning of assignments and calculations of scalars (and arrays, 
> I think) already fits into things the planner knows about.  Table variables 
> (which I don't know that plpgsql supports, but someday it should) are less 
> defined.  Adding plpgsql's loops to the executor would let whole functions 
> run under one trip through the executor.  This is beyond just improving the 
> DDL support for scripts.
>
> I have written a number of database upgrade scripts.  Over time we've made 
> them less fragile, by checking for the existence of tables, indexes, and most 
> recently, columns.  The usual sequence is:
> 1) check the existence of an index; check that the first few columns are 
> correct; if not, drop the index
> 2) repeat for other indexes that have changed definition over time
> 3) check the existence of the table; create with current layout if it is 
> missing
> 4) check for the presence of a column; if missing, alter the table to add it 
> (of course, we can only add new columns at the end, and occasionally delete a 
> column)
> 5) repeat for more columns
> 6) check the existence of an index; if missing, create it
> 7) repeat for all the indexes
> This is doable in most databases, but pretty messy.  You need to join with 
> infoschema tables, or system tables, or use clunky functions to check for 
> existence; checking types is usually pretty horrid.  Consequently, we only 
> check a few things and trust that the schema is only in a few different 
> states.  A true schema comparator and upgrade solver would be great, but I 
> don't know anyone who has written such a thing.  The extended EXISTS 
> predicate that could check tables and indexes would declutter a lot of our 
> upgrade scripts.  That's the use case for me.

I agree.  I think something like this would be great.  But figuring
out how to make it happen is, of course, the trick.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to