Re: [HACKERS] Adjusting the API of pull_var_clause()

2016-03-10 Thread Ashutosh Bapat
Now, I'm pretty sure that the last time we touched pull_var_clause's
> API, we intentionally set it up to force every call site to be visited
> when new behaviors were added.  But right at the moment that's looking
> like it was a bad call.
>
> An alternative API design could look like
>
> #define PVC_INCLUDE_AGGREGATES   0x0001 /* include Aggrefs in output list
> */
> #define PVC_RECURSE_AGGREGATES   0x0002 /* recurse into Aggref arguments */
> #define PVC_INCLUDE_PLACEHOLDERS 0x0004 /* include PlaceHolderVars in
> output list */
> #define PVC_RECURSE_PLACEHOLDERS 0x0008 /* recurse into PlaceHolderVar
> arguments */
>
>
extern List *pull_var_clause(Node *node, int flags);
>
> with calls along the line of
>
> pull_var_clause(node, PVC_INCLUDE_AGGREGATES | PVC_RECURSE_PLACEHOLDERS);
>
> the default behavior if you specify no flag being "error if node type
> is seen".
>
> The attraction of this approach is that if we add another behavior
> to pull_var_clause, while we'd still likely need to run around and
> check every call site, it wouldn't be positively guaranteed that
> we'd need to edit every darn one of them.
>

That can be a problem for extension or any code outside the PG repository
that uses pull_var_clause(). Right now they would notice it because
compilation will fail. Those extensions wouldn't know that a new option has
been added and will be presented the result with default option. That may
not be bad for window nodes but I am sure that, that would be the case for
other nodes.

The name of the function suggests that should get all the Var nodes from a
given expression. Throwing error when an unexpected node is encountered,
seems to be a side effect. So RECURSE should be the default behaviour and
not REJECT.

I am not sure whether REJECT behaviour is something of a sanity check and
not the real thing. How many calls which specify REJECT_AGGREGATE, really
expect an aggregate to be in the expression passed. Probably not. If they
really case, shouldn't there be a separate API for checking just that. In
fact, we may want to change the  API to indicate where to stop recursing
e.g. at aggregate nodes or placeholder nodes or window nodes. Since we are
thinking of changing the API, we may consider this change as well.
Although, I think this would have been OK when pull_var_clause was being
written afresh. Now, that we have this API, I am not sure whether the
effort is worth the result.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


[HACKERS] Adjusting the API of pull_var_clause()

2016-03-10 Thread Tom Lane
Over in the "Optimizer questions" thread, it's become apparent that
we need to fix pull_var_clause() to offer multiple behaviors for
WindowFunc nodes that are parallel to the ones it has for Aggrefs
(viz, reject, recurse, or include in result).  This should've been
done when window functions were introduced, likely; but we've escaped
the need for it so far because the planner hasn't done any real
analysis of post-WindowAgg targetlists.

The straightforward way to do this would be to add another enum type
similar to PVCAggregateBehavior and a fourth argument to pull_var_clause,
plus tedious updates of all twenty-or-so existing call sites, almost all
of which should choose PVC_REJECT_WINDOWFUNCS because they'd not expect
to get invoked on expressions that could contain WindowFuncs.

Now, I'm pretty sure that the last time we touched pull_var_clause's
API, we intentionally set it up to force every call site to be visited
when new behaviors were added.  But right at the moment that's looking
like it was a bad call.

An alternative API design could look like

#define PVC_INCLUDE_AGGREGATES   0x0001 /* include Aggrefs in output list */
#define PVC_RECURSE_AGGREGATES   0x0002 /* recurse into Aggref arguments */
#define PVC_INCLUDE_PLACEHOLDERS 0x0004 /* include PlaceHolderVars in output 
list */
#define PVC_RECURSE_PLACEHOLDERS 0x0008 /* recurse into PlaceHolderVar 
arguments */

extern List *pull_var_clause(Node *node, int flags);

with calls along the line of

pull_var_clause(node, PVC_INCLUDE_AGGREGATES | PVC_RECURSE_PLACEHOLDERS);

the default behavior if you specify no flag being "error if node type
is seen".

The attraction of this approach is that if we add another behavior
to pull_var_clause, while we'd still likely need to run around and
check every call site, it wouldn't be positively guaranteed that
we'd need to edit every darn one of them.

This might all be moot of course.  Either way, we'll have to touch every
call site today; and there is nothing on the horizon suggesting that we'll
need to make another change in pull_var_clause in the foreseeable future.

I'm undecided which way to fix it.  Anybody else have an opinion?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers