[
https://issues.apache.org/jira/browse/CALCITE-3789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17074288#comment-17074288
]
Will Yu commented on CALCITE-3789:
----------------------------------
[~julianhyde] Makes sense.
To convert SqlNode to RelNode, we need to
* keep the STRUCT type instead of unboxing it in _Uncollect_
* add aliases for all to-be-unnested columns when building RelDataType
So I added a new field to _Uncollect_ that holds all column aliases, and use it to
determine whether to unbox the STRUCT type and, if not, which aliases to use.
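To make the intended behaviour concrete, here is a rough sketch in Python rather than Calcite code. The type model, the function name `uncollect_row_type`, the parameter `item_aliases` and the placeholder column names are all hypothetical and only illustrate the rule described above: with no aliases, STRUCT elements are unboxed into their fields; with aliases, each element type is kept whole and named by its alias.

```python
from typing import List, Tuple, Union

# Toy type model (hypothetical, not Calcite's RelDataType):
# a scalar is a type name, a STRUCT is a list of (field name, type) pairs.
Struct = List[Tuple[str, str]]
ElementType = Union[str, Struct]


def uncollect_row_type(
    element_types: List[ElementType], item_aliases: List[str]
) -> List[Tuple[str, ElementType]]:
    """Derive the output columns of an Uncollect-like operator."""
    if not item_aliases:
        # No aliases given: unbox STRUCT elements into their fields.
        columns: List[Tuple[str, ElementType]] = []
        for i, element_type in enumerate(element_types):
            if isinstance(element_type, list):
                columns.extend(element_type)
            else:
                # Placeholder name, chosen for this sketch only.
                columns.append((f"col{i}", element_type))
        return columns

    if len(item_aliases) != len(element_types):
        raise ValueError("expected one alias per unnested column")
    # Aliases given: keep each element type as-is (no unboxing).
    return list(zip(item_aliases, element_types))
```

For a single STRUCT element, `uncollect_row_type([struct], [])` returns the struct's own fields, while `uncollect_row_type([struct], ["s"])` returns one STRUCT-typed column named `s`.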
My questions are:
* Besides collection types, it seems that a MAP can also be unnested, into two
columns (key & value). How should we handle the MAP type in this case?
* In SqlToRelConverterTest, I turned off _decorrelate_ because the decorrelated
type is not the same as the type after validation. After an initial investigation,
it seems that the row type of Uncollect is not fully flattened, but the index
is calculated on a fully flattened base
(RelStructuredTypeFlattener.postFlattenSize). Is _decorrelate_ a required
step for this ticket?
* More generally, I am not sure whether we should put the SqlToRelConverter changes
and the validation changes together, or split them into separate tickets & PRs.
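To illustrate the mismatch suspected in the second question, here is a small Python sketch. It is a toy model only: the list-of-pairs row type and the helper names are hypothetical, and it does not reproduce what RelStructuredTypeFlattener actually does. It only shows that a field's position in a row type with a nested STRUCT differs from its position once the STRUCT is flattened.

```python
# Toy row type (hypothetical): a list of (name, type) pairs,
# where a nested STRUCT is itself such a list.
row_type = [
    ("a", "INTEGER"),
    ("s", [("x", "INTEGER"), ("y", "VARCHAR")]),
    ("b", "INTEGER"),
]


def flattened_size(fields):
    """Count leaf fields after recursively flattening nested STRUCTs."""
    return sum(
        flattened_size(t) if isinstance(t, list) else 1 for _, t in fields
    )


def top_level_index(fields, name):
    """Index of a field in the unflattened row type."""
    for i, (field_name, _) in enumerate(fields):
        if field_name == name:
            return i
    raise KeyError(name)


def flattened_index(fields, name):
    """Offset of a top-level field once all preceding STRUCTs are flattened."""
    offset = 0
    for field_name, field_type in fields:
        if field_name == name:
            return offset
        offset += flattened_size([(field_name, field_type)])
    raise KeyError(name)
```

In this model `b` sits at index 2 of the unflattened row type but at offset 3 of the flattened one, so an index computed on the flattened base would point past the intended field when applied to the unflattened type.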
Thanks!
> Support validation of UNNEST multiple array columns like Presto
> ---------------------------------------------------------------
>
> Key: CALCITE-3789
> URL: https://issues.apache.org/jira/browse/CALCITE-3789
> Project: Calcite
> Issue Type: New Feature
> Components: core
> Affects Versions: 1.21.0
> Reporter: Will Yu
> Assignee: Will Yu
> Priority: Minor
> Labels: pull-request-available
> Time Spent: 9h
> Remaining Estimate: 0h
>
> In Presto, users are able to UNNEST multiple array columns and CROSS JOIN
> with the original table. As shown in the [Presto
> doc|https://prestodb.io/docs/current/sql/select.html]:
> {code:sql}
> SELECT numbers, animals, n, a
> FROM (
> VALUES
> (ARRAY[2, 5], ARRAY['dog', 'cat', 'bird']),
> (ARRAY[7, 8, 9], ARRAY['cow', 'pig'])
> ) AS x (numbers, animals)
> CROSS JOIN UNNEST(numbers, animals) AS t (n, a)
> {code}
> yields:
>  numbers   | animals          | n    | a
> -----------+------------------+------+------
>  [2, 5]    | [dog, cat, bird] | 2    | dog
>  [2, 5]    | [dog, cat, bird] | 5    | cat
>  [2, 5]    | [dog, cat, bird] | NULL | bird
>  [7, 8, 9] | [cow, pig]       | 7    | cow
>  [7, 8, 9] | [cow, pig]       | 8    | pig
>  [7, 8, 9] | [cow, pig]       | 9    | NULL
> It seems Calcite does not have a feature that supports these semantics. In
> Calcite, for the above SQL, _n_ and _a_ will be identified as aliases of
> subfields of numbers.
> The plan is to introduce a new Presto conformance and enable validation
> of such queries.
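
For reference, the semantics shown in the quoted example (arrays zipped position by position, shorter arrays padded with NULL, each result joined back to its source row) can be sketched in a few lines of Python. This is an illustration of the expected result only, not Presto or Calcite code; the function name `cross_join_unnest` is made up for the sketch and `None` stands in for SQL NULL.

```python
from itertools import zip_longest


def cross_join_unnest(rows):
    """Extend each row with the zipped elements of its array columns.

    Shorter arrays are padded with None (standing in for SQL NULL).
    """
    result = []
    for row in rows:
        for elements in zip_longest(*row, fillvalue=None):
            result.append(tuple(row) + elements)
    return result


rows = [
    ([2, 5], ["dog", "cat", "bird"]),
    ([7, 8, 9], ["cow", "pig"]),
]
for out_row in cross_join_unnest(rows):
    print(out_row)
```

Run on the two input rows above, this produces six output rows that line up with the table in the issue description.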
--
This message was sent by Atlassian Jira
(v8.3.4#803005)