Re: Structuring SubQueries as Functions

Julian Hyde Thu, 01 Dec 2022 12:01:17 -0800

I do agree that a correlated sub-query is a function call. If you write your 
queries using CROSS APPLY this becomes clear.
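
To make the analogy concrete, here is a minimal Python sketch. The data and 
the names depts, emps and max_sal are made up for illustration; none of this is 
Calcite API. The sub-query body becomes a function of the correlated value, and 
the outer query calls it once per row, which is roughly what CROSS APPLY spells 
out in SQL.

```python
# Toy tables (hypothetical data, for illustration only).
depts = [
    {"deptno": 10, "name": "Sales"},
    {"deptno": 20, "name": "Eng"},
    {"deptno": 30, "name": "Ops"},
]
emps = [
    {"deptno": 10, "sal": 100},
    {"deptno": 10, "sal": 300},
    {"deptno": 20, "sal": 250},
]


def max_sal(deptno):
    """Sub-query body: SELECT max(sal) FROM emps WHERE emps.deptno = :deptno.

    The correlated variable is an ordinary parameter. Returns None (SQL NULL)
    when no employee matches.
    """
    return max((e["sal"] for e in emps if e["deptno"] == deptno), default=None)


def correlated_query():
    """Outer query: one call to the sub-query per outer row."""
    return [(d["name"], max_sal(d["deptno"])) for d in depts]


result = correlated_query()
```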

Decorrelation is very useful. For some execution engines, especially the highly 
parallel/distributed ones, stopping and restarting subqueries requires a lot of 
communication. So Calcite supports decorrelation, and it is Calcite’s preferred 
execution strategy. But there are definitely engines, and queries, that are 
better executed in correlated form.
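
As a rough illustration of the trade-off (a Python sketch with made-up data, 
not how Calcite implements it): the decorrelated form evaluates the sub-query 
once for all keys and then joins, instead of restarting it for every outer row.

```python
# Toy tables (hypothetical data, for illustration only).
depts = [(10, "Sales"), (20, "Eng"), (30, "Ops")]  # (deptno, name)
emps = [(10, 100), (10, 300), (20, 250)]           # (deptno, sal)


def correlated(depts, emps):
    """Restart the sub-query for every outer row: one scan of emps per dept."""
    out = []
    for deptno, name in depts:
        sals = [sal for d, sal in emps if d == deptno]
        out.append((name, max(sals) if sals else None))
    return out


def decorrelated(depts, emps):
    """Aggregate once (GROUP BY deptno), then left-join on deptno."""
    max_by_dept = {}
    for deptno, sal in emps:  # single scan of emps
        prev = max_by_dept.get(deptno)
        max_by_dept[deptno] = sal if prev is None else max(prev, sal)
    # Left join: departments with no employees get None (SQL NULL).
    return [(name, max_by_dept.get(deptno)) for deptno, name in depts]
```

Both forms return the same rows; what differs is how often the inner relation 
is scanned, which is where the communication cost shows up in a distributed 
engine.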

By the way, the Froid project [1] takes this idea to the limit, and applies 
decorrelation techniques to function calls (creating ‘magic sets’ of all 
possible arguments).

Calcite’s decorrelation code is old and brittle. But if I recall correctly, you 
don’t have to do decorrelation in SqlToRelConverter; you can defer, and do the 
decorrelation using planner rules. 

Julian

[1] https://dl.acm.org/doi/10.1145/3186728.3164140 

> On Dec 1, 2022, at 11:09 AM, James Starr wrote:
> 
> Currently, sub-query correlated variables have a brittle contract with
> their containing RelNode.  Simple rules, such as the ones that transpose
> filters and projects, are unaware of this contract, and it would be
> difficult to retrofit all such rules to be sub-query aware.
> 
> A correlated sub-query is logically a function call whose
> parameters are the values used for the correlated inputs.  If the
> SubQuery object was structured such that the inputs that are used as
> correlated variables were explicit sub nodes of the sub-query object,
> then most rules and utilities, such as the trimmer, would just work as
> expected.  SqlToRel could also be simplified since there would only be
> one place to add the CorrelationId, as opposed to 3.
> 
> James
