Re: [HACKERS] what's up with IDENTIFIER_LOOKUP_EXPR?

Tom Lane Thu, 04 May 2017 15:05:56 -0700

Robert Haas <robertmh...@gmail.com> writes:
> The PLPGSQL_DTYPE_* constants are another thing that's not really
> documented.


Yeah :-(.  Complain to Jan sometime.

> You've mentioned that we should get rid of
> PLPGSQL_DTYPE_ROW in favor of, uh, whatever's better than that, but
> it's not clear to me what that really means, why one way is better
> than the other way, or what is involved.  I'm not really clear on what
> a PLpgSQL_datum_type is in general, or what any of the specific types
> actually mean.  I'll guess that PLPGSQL_DTYPE_VAR is a variable, but
> beyond that...

Well, from memory ...

PLpgSQL_datum is really a symbol table entry.  The conflict against what
we mean by "Datum" elsewhere is pretty unfortunate.

VAR - variable of scalar type (well, any non-composite type ---
arrays and ranges are this kind of PLpgSQL_datum too, for instance).

REC - variable of composite type, stored as a HeapTuple.

RECFIELD - reference to a field of a REC variable.  The datum includes
the field name and a link to the datum for the parent REC variable.
Notice this implies a runtime lookup of the field name whenever we're
accessing the datum, which sucks for performance, but it makes life a
lot easier when you think about the possibility of the composite type
changing underneath you.

ARRAYELEM - reference to an element of an array.  The datum includes
a subscript expression and a link to the datum for the parent array
variable (which can be a VAR, and I think a RECFIELD too).

ROW - this is where it gets fun.  A ROW is effectively a variable
of a possibly-anonymous composite type, and it is defined by a list
(in its own datum) of links to PLpgSQL_datums representing the
individual columns.  Typically the member datums would be VARs
but that's not the only possibility.
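
To make the "symbol table entry" idea a bit more concrete, here is a rough
sketch of how such tagged entries could be laid out.  All the names here
(SketchDtype, SketchDatum, and so on) are made up for illustration and the
structs are heavily simplified; this is not a copy of what plpgsql.h
actually declares.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the PLPGSQL_DTYPE_* tags described above. */
typedef enum
{
    SK_DTYPE_VAR,
    SK_DTYPE_REC,
    SK_DTYPE_RECFIELD,
    SK_DTYPE_ARRAYELEM,
    SK_DTYPE_ROW
} SketchDtype;

/* Common header: every symbol table entry starts with a tag and a slot number. */
typedef struct
{
    SketchDtype dtype;
    int         dno;
} SketchDatum;

/* RECFIELD: a field name plus a link (by slot number) to the parent REC. */
typedef struct
{
    SketchDatum base;
    const char *fieldname;
    int         recparentno;
} SketchRecField;

/* ROW: a list of links (slot numbers) to the member datums. */
typedef struct
{
    SketchDatum base;
    int         nfields;
    const int  *varnos;
} SketchRow;

/* Dispatch on the tag, the way code walking the symbol table would. */
const char *
sketch_dtype_name(const SketchDatum *datum)
{
    assert(datum != NULL);
    switch (datum->dtype)
    {
        case SK_DTYPE_VAR:       return "VAR";
        case SK_DTYPE_REC:       return "REC";
        case SK_DTYPE_RECFIELD:  return "RECFIELD";
        case SK_DTYPE_ARRAYELEM: return "ARRAYELEM";
        case SK_DTYPE_ROW:       return "ROW";
    }
    return "unknown";
}
```

The point of the common header is that code can hold a generic pointer to
any entry and decide what to do by looking at the tag first.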

As I mentioned earlier, the case that ROW is actually well adapted
for is multiple targets in INTO and similar constructs.  For example,
if you have

        SELECT ...blah blah... INTO a,b,c

then the target of the PLpgSQL_stmt_execsql is represented as a single
ROW datum whose members are the datums for a, b, and c.  That's totally
determined by the text of the function and can't change under us.
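
As a toy illustration of that INTO a,b,c case: the ROW is nothing more than
a fixed list of slot numbers, and storing a result row means walking that
list.  The names (IntoVar, IntoRow, into_row_assign) are hypothetical, ints
stand in for real values, and rejecting a column-count mismatch with -1 is
just this sketch's choice, not a statement about what plpgsql does.

```c
#include <assert.h>
#include <stddef.h>

#define INTO_MAX_FIELDS 8

/* A scalar variable slot; an int stands in for a real value. */
typedef struct
{
    const char *name;
    int         value;
    int         isnull;
} IntoVar;

/* ROW-like target: slot numbers of the member variables, fixed at compile time. */
typedef struct
{
    int nfields;
    int varnos[INTO_MAX_FIELDS];
} IntoRow;

/*
 * Store one result row into the ROW's member variables.
 * Returns the number of columns stored, or -1 (storing nothing) if the
 * column count does not match the number of targets.
 */
int
into_row_assign(IntoVar *datums, const IntoRow *row,
                const int *cols, int ncols)
{
    assert(datums != NULL && row != NULL && cols != NULL);

    if (ncols != row->nfields)
        return -1;

    for (int i = 0; i < row->nfields; i++)
    {
        IntoVar *target = &datums[row->varnos[i]];

        target->value = cols[i];
        target->isnull = 0;
    }
    return row->nfields;
}
```

Since the member list comes straight from the function text, nothing in it
depends on catalog state.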

However ... somebody thought it'd be cute to use the ROW infrastructure
for variables of named composite types, too.  So if you have

        DECLARE foo some_composite_type;

then the name "foo" refers to a ROW datum, and the plpgsql compiler
generates additional anonymous VAR datums, one for each declared column
in some_composite_type, which become the members of the ROW datum.
The runtime representation is actually that each field value is stored
separately in its datum, as though it were an independent VAR.  Field
references "foo.col1" are not compiled into RECFIELD datums; we just look
up the appropriate member datum during compile and make the expression
tree point to that datum directly.
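
The difference between the two ways of resolving "foo.col1" can be sketched
roughly like this.  Again the names (FieldRow, FieldRec, row_resolve_field,
rec_fetch_field) are invented for the example and ints stand in for real
values; the only thing it is meant to show is where the name lookup happens.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* ROW-style: field names are mapped to member slot numbers once, at compile time. */
typedef struct
{
    int          nfields;
    const char **fieldnames;
    const int   *varnos;
} FieldRow;

/* REC-style: the current tuple carries its own column names and values. */
typedef struct
{
    int          natts;
    const char **attnames;
    const int   *values;
} FieldRec;

/*
 * Compile-time step for a ROW: turn a field name into the slot number of
 * the member datum.  The expression then refers to that slot directly and
 * the name is never looked at again.  Returns -1 if there is no such field.
 */
int
row_resolve_field(const FieldRow *row, const char *fieldname)
{
    for (int i = 0; i < row->nfields; i++)
    {
        if (strcmp(row->fieldnames[i], fieldname) == 0)
            return row->varnos[i];
    }
    return -1;
}

/*
 * Run-time step for a RECFIELD: search the tuple currently stored in the
 * parent REC on every access.  Sets *found to 1 or 0.
 */
int
rec_fetch_field(const FieldRec *rec, const char *fieldname, int *found)
{
    for (int i = 0; i < rec->natts; i++)
    {
        if (strcmp(rec->attnames[i], fieldname) == 0)
        {
            *found = 1;
            return rec->values[i];
        }
    }
    *found = 0;
    return 0;
}
```

The second function pays for a search each time, but it always sees
whatever columns the stored tuple has right now.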

So, this representation is great for speed of access and modification
of individual fields of the composite variable.  It sucks when you
want to assign to the composite as a whole or retrieve its value as
a whole, because you have to deconstruct or reconstruct a tuple to
do that.  (The REC/RECFIELD approach has approximately the opposite
strengths and weaknesses.)  Also, dealing with changes in the named
composite type is a complete fail, because we've built its structure
into the function's symbol table at parse time.
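
For the whole-value case, a minimal sketch of the asymmetry (hypothetical
names, ints in place of real values, fixed-size arrays to keep it short):
the ROW-style read has to gather every member into a fresh tuple, while the
REC-style read just hands back what is already stored.

```c
#include <assert.h>
#include <stddef.h>

#define WHOLE_MAX_FIELDS 8

/* A stand-in for a tuple: a count plus that many int values. */
typedef struct
{
    int nfields;
    int values[WHOLE_MAX_FIELDS];
} WholeTuple;

/* ROW-style variable: slot numbers of the separately stored members. */
typedef struct
{
    int nfields;
    int varnos[WHOLE_MAX_FIELDS];
} WholeRow;

/* ROW-style read: reconstruct a tuple by copying each member slot. */
WholeTuple
whole_row_fetch(const int *slots, const WholeRow *row)
{
    WholeTuple result;

    assert(row->nfields <= WHOLE_MAX_FIELDS);
    result.nfields = row->nfields;
    for (int i = 0; i < WHOLE_MAX_FIELDS; i++)
        result.values[i] = (i < row->nfields) ? slots[row->varnos[i]] : 0;
    return result;
}

/* REC-style read: the value is already stored whole, so no copying of fields. */
const WholeTuple *
whole_rec_fetch(const WholeTuple *stored)
{
    return stored;
}
```

Note also that the WholeRow member list is frozen when it is built, which is
the toy version of the problem with a changed composite type.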

I forget why there's a dtype for EXPR.

                        regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
