[jira] [Commented] (CALCITE-3789) Support validation of UNNEST multiple array columns like Presto

Will Yu (Jira) Sun, 05 Apr 2020 20:54:25 -0700


    [ 
https://issues.apache.org/jira/browse/CALCITE-3789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17076022#comment-17076022
 ]


Will Yu commented on CALCITE-3789:
----------------------------------

Let me add more details here on SqlNode to RelNode. 

With this feature, the SQL
{code}
select d.deptno, e, k.empno\n"
+ "from dept_nested_expanded as d CROSS JOIN\n"
+ " UNNEST(d.admins, d.employees) as t(e, k)
{code}

will be validated as 
{code}
select d.deptno, t.e, t.k.empno\n"
+ "from dept_nested_expanded as d CROSS JOIN\n"
+ " UNNEST(d.admins, d.employees) as t(e, k)
{code}

To make sure Calcite could find type of t.e or t.e.empo, we need to 
* keep the original type (admin as varchar, employee as STRUCT) as the row type 
of Uncollect, which is the input of the final Project. 
* Properly alias it.

That's why I change the Uncollect class to hold the list of aliases for each 
array columns. This is also quite aligned with the expected semantics of this 
feature.

> Support validation of UNNEST multiple array columns like Presto
> ---------------------------------------------------------------
>
>                 Key: CALCITE-3789
>                 URL: https://issues.apache.org/jira/browse/CALCITE-3789
>             Project: Calcite
>          Issue Type: New Feature
>          Components: core
>    Affects Versions: 1.21.0
>            Reporter: Will Yu
>            Assignee: Will Yu
>            Priority: Minor
>              Labels: pull-request-available
>          Time Spent: 10h
>  Remaining Estimate: 0h
>
> In Presto, users are able to UNNEST multiple array columns and CROSS JOIN 
> with the original table . As shown in the [Presto 
> doc|https://prestodb.io/docs/current/sql/select.html]:
> {code:sql}
> SELECT numbers, animals, n, a
> FROM (
>   VALUES
>     (ARRAY[2, 5], ARRAY['dog', 'cat', 'bird']),
>     (ARRAY[7, 8, 9], ARRAY['cow', 'pig'])
> ) AS x (numbers, animals)
> CROSS JOIN UNNEST(numbers, animals) AS t (n, a)
> {code}
> yields:
>   numbers  |     animals      |  n   |  a
> -----------+------------------+------+------
>  [2, 5]    | [dog, cat, bird] |    2 | dog
>  [2, 5]    | [dog, cat, bird] |    5 | cat
>  [2, 5]    | [dog, cat, bird] | NULL | bird
>  [7, 8, 9] | [cow, pig]       |    7 | cow
>  [7, 8, 9] | [cow, pig]       |    8 | pig
>  [7, 8, 9] | [cow, pig]       |    9 | NULL
> It seems Calcite does not have such a feature to support this semantics. In 
> Calcite and for above SQL, _n_ and _a_ will be identified as alias of 
> subfields of numbers.
> The plan will be to introduce a new Presto conformance and enable validation 
> of such SQLs



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Commented] (CALCITE-3789) Support validation of UNNEST multiple array columns like Presto

Reply via email to