[ 
https://issues.apache.org/jira/browse/CALCITE-3787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17036607#comment-17036607
 ] 

Will Yu commented on CALCITE-3787:
----------------------------------

[~julianhyde] Yea, agree on the current behavior (cartesian product) is 
expected.

 

 If *_UNNEST(a_array, b_array)_* behaves like Presto behavior, which UNNEST two 
arrays together under condition(row count == array index), a rule to merge 
*_UNNEST(a_array) CROSS JOIN UNNEST(b_array)_* to *_UNNEST(a_array, b_array)_* 
will not be the expected behavior.

 

To support such semantics (*_UNNEST(a_array, b_array)_* ), we need to a way to 
alias elements of a_array and b_array. One way is to adopt something like 
*_UNNEST(a_array, b_array) as t(a_element, b_element)_*, but it does not fit 
into current Calcite SQL because of the different ways of aliasing. In Calcite, 
_UNNEST(a_array) as t_ will alias _*a_array*_ as _*t*_, and *_UNNEST(a_array) 
as t(a)_* implies element type of *a_array* is a structured type with one 
property which is aliased as _*a*_

 

**And confirm that Uncollect supports multiple fields

{code:java}
@Test public void testUnnestArrayPlan() {
 final String sql = "select d.deptno\n"
 + "from dept_nested as d CROSS JOIN\n"
 + " UNNEST(d.employees, d.employees) e2";
 sql(sql).with(getExtendedTester()).ok();
}
{code}
will yield
{code}

LogicalProject(DEPTNO=[$0])
  LogicalCorrelate(correlation=[$cor0], joinType=[inner], requiredColumns=[{6}])
    LogicalProject(DEPTNO=[$0], NAME=[$1], TYPE=[$2.TYPE], DESC=[$2.DESC], 
A=[$2.OTHERS.A], B=[$2.OTHERS.B], EMPLOYEES=[$3])
      LogicalTableScan(table=[[CATALOG, SALES, DEPT_NESTED]])
    Uncollect
      LogicalProject(EMPLOYEES=[$cor0.EMPLOYEES_6], 
EMPLOYEES1=[$cor0.EMPLOYEES_6])
        LogicalValues(tuples=[[{ 0 }]])
{code}

But there could be some bug because if I do
{code:java}
@Test public void testUnnestArrayPlan() {
 final String sql = "select d.deptno, e2.*\n"
 + "from dept_nested as d CROSS JOIN\n"
 + " UNNEST(d.employees, d.employees) e2";
 sql(sql).with(getExtendedTester()).ok();
}
{code}
Calcite will throw exceptions about ambiguous column name. So anyway it seems 
like we need to find an aliasing strategy to alias array columns in UNNEST. 
 

> Revisit Sql To Rel for a chained UNNEST
> ---------------------------------------
>
>                 Key: CALCITE-3787
>                 URL: https://issues.apache.org/jira/browse/CALCITE-3787
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>            Reporter: Will Yu
>            Priority: Trivial
>
> Need to revisit the behavior of a chained UNNEST to unnest two array columns 
> at the same time.
> Currently it seems to yield a *cartesian product*:
> {code:java}
> @Test public void testUnnestArrayPlan() {
>     final String sql = "select d.deptno, e2.empno, e3.detail\n"
>         + "from dept_nested as d,\n"
>         + " UNNEST(d.employees) e2, UNNEST(d.employees) e3";
>     sql(sql).with(getExtendedTester()).ok();
>   }
> {code}
> which yield:
> {code:java}
> LogicalProject(DEPTNO=[$0], EMPNO=[$7], DETAIL=[ROW($12)])
>   LogicalCorrelate(correlation=[$cor1], joinType=[inner], 
> requiredColumns=[{6}])
>     LogicalCorrelate(correlation=[$cor0], joinType=[inner], 
> requiredColumns=[{6}])
>       LogicalProject(DEPTNO=[$0], NAME=[$1], TYPE=[$2.TYPE], DESC=[$2.DESC], 
> A=[$2.OTHERS.A], B=[$2.OTHERS.B], EMPLOYEES=[$3])
>         LogicalTableScan(table=[[CATALOG, SALES, DEPT_NESTED]])
>       Uncollect
>         LogicalProject(EMPLOYEES=[$cor0.EMPLOYEES_6])
>           LogicalValues(tuples=[[{ 0 }]])
>     Uncollect
>       LogicalProject(EMPLOYEES=[$cor1.EMPLOYEES_6])
>         LogicalValues(tuples=[[{ 0 }]])
> {code}
> Another option is to achieve something similar to Presto UNNEST(array_1, 
> array_2) as t(a1, a2)
> {code:sql}
> SELECT numbers, animals, n, a
>  FROM (
>  VALUES
>  (ARRAY[2, 5], ARRAY['dog', 'cat', 'bird']),
>  (ARRAY[7, 8, 9], ARRAY['cow', 'pig'])
>  ) AS x (numbers, animals)
>  CROSS JOIN UNNEST(numbers, animals) AS t (n, a){code}
> {code}
>  numbers | animals | n | a
>  -----------++------------------------+------
>  [2, 5] | [dog, cat, bird] | 2 | dog
>  [2, 5] | [dog, cat, bird] | 5 | cat
>  [2, 5] | [dog, cat, bird] | NULL | bird
>  [7, 8, 9] | [cow, pig] | 7 | cow
>  [7, 8, 9] | [cow, pig] | 8 | pig
>  [7, 8, 9] | [cow, pig] | 9 | NULL
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to