[
https://issues.apache.org/jira/browse/CALCITE-7608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18090451#comment-18090451
]
Mihai Budiu commented on CALCITE-7608:
--------------------------------------
This operator is much weaker than LINQ's SelectMany, because it is not
higher-order, and it doesn't take a closure returning an iterator. In this
respect it is much closer to a beefed-up Uncollect than to a true SelectMany.
It really is a generalization of Uncollect, and [~zwh] suggested naming it
accordingly.
I am pretty confident that this will only affect the code paths that touch
Uncollect today. We won't be able to remove Uncollect for a while, because
other back-ends will not expect this new operator, but it should be deprecated
in the longer term.
The decorrelator could work the way you describe. There is some danger that
the Uncollect+Correlate combo will get separated by other optimizations (e.g.,
a projection in between?), which may make it difficult to substitute the
pattern. Look at the pattern being substituted; it's actually quite complicated:
{code:java}
LogicalProject (outerProject)
  LogicalCorrelate(cor=[$cor0], joinType=[inner|left|...])
    left (any RelNode)
    Uncollect
      LogicalProject($cor0.f_i, $cor0.f_j, ...) (innerProject)
        LogicalValues
{code}
This is rewritten as:
{code:java}
LogicalProject (outerProject, remapped to SM output indices)
  LogicalSelectMany(collectionFields=[f_i, f_j, ...])
    left
{code}
The original plan is quite unwieldy.
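
To make the equivalence behind this rewrite concrete, here is a rough sketch in plain Java (java.util.stream only, not Calcite code). The Row and Out records and both method names are hypothetical, invented for illustration, and the sketch assumes a single collection field with inner-join semantics. It only models what the two plan shapes compute, not how Calcite would implement either.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

public class SelectManySketch {
  /** Hypothetical input row: one scalar field and one collection field. */
  record Row(String name, List<Integer> items) {}

  /** Hypothetical output row: the preserved scalar plus one unnested element. */
  record Out(String name, int item) {}

  /** Models Uncollect: turns one collection value into a list of rows. */
  static List<Integer> uncollect(List<Integer> collection) {
    return new ArrayList<>(collection);
  }

  /** Models the Correlate + Uncollect shape with inner-join semantics. */
  static List<Out> correlateUncollect(List<Row> left) {
    List<Out> result = new ArrayList<>();
    for (Row row : left) {                      // Correlate binds $cor0 to each left row
      for (Integer item : uncollect(row.items())) { // right side reads $cor0.items
        result.add(new Out(row.name(), item));  // join the left row with each unnested row
      }
    }
    return result;
  }

  /** Models the single-operator shape: unnest one field, preserve the others. */
  static List<Out> selectMany(List<Row> input) {
    return input.stream()
        .flatMap(row -> row.items().stream().map(item -> new Out(row.name(), item)))
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    List<Row> rows = List.of(
        new Row("a", List.of(1, 2)),
        new Row("b", List.of()),
        new Row("c", List.of(3)));
    System.out.println(selectMany(rows));
    // Both shapes should produce the same rows in the same order.
    System.out.println(selectMany(rows).equals(correlateUncollect(rows)));
  }
}
```

Under these inner-join assumptions, row "b" with an empty collection produces no output in either shape. A left join would instead have to keep it with a null element, which this sketch does not model.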
> Introduce a SelectMany operator
> -------------------------------
>
> Key: CALCITE-7608
> URL: https://issues.apache.org/jira/browse/CALCITE-7608
> Project: Calcite
> Issue Type: Improvement
> Components: core
> Affects Versions: 1.42.0
> Reporter: Mihai Budiu
> Assignee: Mihai Budiu
> Priority: Minor
> Labels: pull-request-available
>
> Today UNNEST is implemented using the Uncollect operator. We propose adding
> an alternative LogicalSelectMany operator, which generalizes Uncollect.
> (Notice that the Enumerable API already has a SelectMany.) The main difference
> between Uncollect and SelectMany is that Uncollect unnests all the fields of
> its input relation, whereas LogicalSelectMany would only unnest SOME of the
> fields of the input collection, preserving the other ones in each output row.
> This distinction is very important, because:
> * LogicalSelectMany can be directly and efficiently implemented using the
> Enumerable SelectMany
> * UNNEST used in a cross-join is implemented using an Uncollect and a
> LogicalCorrelate. However, the same UNNEST can be represented using just one
> LogicalSelectMany node
> * Neither the old nor the new decorrelator can actually eliminate
> LogicalCorrelate nodes that are paired with Uncollect. Using
> LogicalSelectMany we can decorrelate many more plans.