On Tue, Oct 16, 2012 at 2:12 PM, Stafford, David x78061 <david.staff...@broadridge.com> wrote: > On Mon, 15 Oct 2012 10:21:18 -0700, Robert Haas <robertmh...@gmail.com> wrote: >> Yeah, I think the functionality that we need is pretty much there >> already today. What we need to do is to get the syntax to a point >> where people can write the code they want to write without getting >> tangled up by it. >> >> I think the invention of DO was a big step in the right direction >> ... >> With DO, you can write the logic you want >> as an SQL statement, it's just a clunky and awkward SQL statement. In >> my view the goal ought to be to refine that mechanism to remove the >> clunkiness and awkwardness, rather than to invent something completely >> new. > > As someone who has worked with a number of databases now, none of them really > get this DDL integration completely right. What I'd like to see is 1) a > predicate to easily test things about the schema (does this table, column, > index, schema, etc. exist? does it have the right type?) and 2) a way to > conditionally execute DDL (and DML, which should fall right out, but it isn't > really what this discussion is covering). I would propose extending the > current EXISTS / NOT EXISTS predicate as follows: > > [NOT] EXISTS TABLE tab [ ( col [type] [ , col [type]]... ) ] > [NOT] EXISTS COLUMN tab.col [type] > [NOT] EXISTS INDEX tab.idx [ ( col [ , col]... ) ] > [NOT] EXISTS CONSTRAINT tab.cstrt -- not sure what else might be useful here > [NOT] EXISTS ( select ) -- this is the current EXISTS predicate, of course > [NOT] EXISTS ANY ROWS FROM tab [join] [WHERE predicate] [GROUP BY col [ , > col]...] -- exactly the same as > > -- (select 1 FROM etc.) > > -- only because I like > > -- it better > (the latter [which by no means am I nuts over; it's just that when extending > EXISTS I can't stop neatening it up to my personal preferences] could be > extended with [NOT] EXISTS MORE THAN n ROWS FROM and [NOT] EXISTS EXACTLY n > ROWS FROM.) > > There is a new SQL statement: IF predicate true-statement [ELSE > false-statement]. > > To actually execute this new IF statement, the executor would need an IF node > that evaluates the predicate (with ANDs and ORs, just like all SQL > predicates) and then executes the rest only if the predicate came out TRUE > (or NOT FALSE; I forget which is usually used, and the difference with NULL > could be useful, as long as it matches other predicates). This moves one > more bit of procedural logic into the executor. > > Another wrinkle is that the dependent statement might not compile, due to > missing tables or whatnot. Actually executing it while it doesn't compile is > an error, but we want to defer that error until we actually decide we need to > execute it. Also, it's probably good to try compiling it again at that > point. So my thought would be to try planning the dependent statement(s); if > they compile, hook them to the IF node; if not, hook a DEFERRED node to the > IF node. The DEFERRED node has the parse tree (or raw string, whatever makes > sense) of the statement; on execution it tries again to plan that statement; > if it succeeds, run it; if not, error out. > > I'd also add a SEQUENCE node to the executor. It just runs its children in > order (could be n-ary, or if fixed arity nodes are what is in the > planner/executor today, could be binary, first left, then right, and right > could be another SEQUENCE). The DEFERRED node means that a CREATE statement > could precede use of what is created in the same sequence and all could get > planned (with some deferral to execution time) in advance and run in one > lump. This implements DO at the executor level. > > The biggest concepts left from plpgsql are looping and variables. Most > variables could be modeled as a single row value; SQL already can update a > row, so the planning of assignments and calculations of scalars (and arrays, > I think) already fits into things the planner knows about. Table variables > (which I don't know that plpgsql supports, but someday it should) are less > defined. Adding plpgsql's loops to the executor would let whole functions > run under one trip through the executor. This is beyond just improving the > DDL support for scripts. > > I have written a number of database upgrade scripts. Over time we've made > them less fragile, by checking for the existence of tables, indexes, and most > recently, columns. The usual sequence is: > 1) check the existence of an index; check that the first few columns are > correct; if not, drop the index > 2) repeat for other indexes that have changed definition over time > 3) check the existence of the table; create with current layout if it is > missing > 4) check for the presence of a column; if missing, alter the table to add it > (of course, we can only add new columns at the end, and occasionally delete a > column) > 5) repeat for more columns > 6) check the existence of an index; if missing, create it > 7) repeat for all the indexes > This is doable in most databases, but pretty messy. You need to join with > infoschema tables, or system tables, or use clunky functions to check for > existence; checking types is usually pretty horrid. Consequently, we only > check a few things and trust that the schema is only in a few different > states. A true schema comparator and upgrade solver would be great, but I > don't know anyone who has written such a thing. The extended EXISTS > predicate that could check tables and indexes would declutter a lot of our > upgrade scripts. That's the use case for me.
I agree. I think something like this would be great. But figuring out how to make it happen is, of course, the trick. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers