[
https://issues.apache.org/jira/browse/CALCITE-3787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17036607#comment-17036607
]
Will Yu commented on CALCITE-3787:
----------------------------------
[~julianhyde] Yea, agree on the current behavior (cartesian product) is
expected.
If *_UNNEST(a_array, b_array)_* behaves like Presto behavior, which UNNEST two
arrays together under condition(row count == array index), a rule to merge
*_UNNEST(a_array) CROSS JOIN UNNEST(b_array)_* to *_UNNEST(a_array, b_array)_*
will not be the expected behavior.
To support such semantics (*_UNNEST(a_array, b_array)_* ), we need to a way to
alias elements of a_array and b_array. One way is to adopt something like
*_UNNEST(a_array, b_array) as t(a_element, b_element)_*, but it does not fit
into current Calcite SQL because of the different ways of aliasing. In Calcite,
_UNNEST(a_array) as t_ will alias _*a_array*_ as _*t*_, and *_UNNEST(a_array)
as t(a)_* implies element type of *a_array* is a structured type with one
property which is aliased as _*a*_
**And confirm that Uncollect supports multiple fields
{code:java}
@Test public void testUnnestArrayPlan() {
final String sql = "select d.deptno\n"
+ "from dept_nested as d CROSS JOIN\n"
+ " UNNEST(d.employees, d.employees) e2";
sql(sql).with(getExtendedTester()).ok();
}
{code}
will yield
{code}
LogicalProject(DEPTNO=[$0])
LogicalCorrelate(correlation=[$cor0], joinType=[inner], requiredColumns=[{6}])
LogicalProject(DEPTNO=[$0], NAME=[$1], TYPE=[$2.TYPE], DESC=[$2.DESC],
A=[$2.OTHERS.A], B=[$2.OTHERS.B], EMPLOYEES=[$3])
LogicalTableScan(table=[[CATALOG, SALES, DEPT_NESTED]])
Uncollect
LogicalProject(EMPLOYEES=[$cor0.EMPLOYEES_6],
EMPLOYEES1=[$cor0.EMPLOYEES_6])
LogicalValues(tuples=[[{ 0 }]])
{code}
But there could be some bug because if I do
{code:java}
@Test public void testUnnestArrayPlan() {
final String sql = "select d.deptno, e2.*\n"
+ "from dept_nested as d CROSS JOIN\n"
+ " UNNEST(d.employees, d.employees) e2";
sql(sql).with(getExtendedTester()).ok();
}
{code}
Calcite will throw exceptions about ambiguous column name. So anyway it seems
like we need to find an aliasing strategy to alias array columns in UNNEST.
> Revisit Sql To Rel for a chained UNNEST
> ---------------------------------------
>
> Key: CALCITE-3787
> URL: https://issues.apache.org/jira/browse/CALCITE-3787
> Project: Calcite
> Issue Type: Bug
> Components: core
> Reporter: Will Yu
> Priority: Trivial
>
> Need to revisit the behavior of a chained UNNEST to unnest two array columns
> at the same time.
> Currently it seems to yield a *cartesian product*:
> {code:java}
> @Test public void testUnnestArrayPlan() {
> final String sql = "select d.deptno, e2.empno, e3.detail\n"
> + "from dept_nested as d,\n"
> + " UNNEST(d.employees) e2, UNNEST(d.employees) e3";
> sql(sql).with(getExtendedTester()).ok();
> }
> {code}
> which yield:
> {code:java}
> LogicalProject(DEPTNO=[$0], EMPNO=[$7], DETAIL=[ROW($12)])
> LogicalCorrelate(correlation=[$cor1], joinType=[inner],
> requiredColumns=[{6}])
> LogicalCorrelate(correlation=[$cor0], joinType=[inner],
> requiredColumns=[{6}])
> LogicalProject(DEPTNO=[$0], NAME=[$1], TYPE=[$2.TYPE], DESC=[$2.DESC],
> A=[$2.OTHERS.A], B=[$2.OTHERS.B], EMPLOYEES=[$3])
> LogicalTableScan(table=[[CATALOG, SALES, DEPT_NESTED]])
> Uncollect
> LogicalProject(EMPLOYEES=[$cor0.EMPLOYEES_6])
> LogicalValues(tuples=[[{ 0 }]])
> Uncollect
> LogicalProject(EMPLOYEES=[$cor1.EMPLOYEES_6])
> LogicalValues(tuples=[[{ 0 }]])
> {code}
> Another option is to achieve something similar to Presto UNNEST(array_1,
> array_2) as t(a1, a2)
> {code:sql}
> SELECT numbers, animals, n, a
> FROM (
> VALUES
> (ARRAY[2, 5], ARRAY['dog', 'cat', 'bird']),
> (ARRAY[7, 8, 9], ARRAY['cow', 'pig'])
> ) AS x (numbers, animals)
> CROSS JOIN UNNEST(numbers, animals) AS t (n, a){code}
> {code}
> numbers | animals | n | a
> -----------++------------------------+------
> [2, 5] | [dog, cat, bird] | 2 | dog
> [2, 5] | [dog, cat, bird] | 5 | cat
> [2, 5] | [dog, cat, bird] | NULL | bird
> [7, 8, 9] | [cow, pig] | 7 | cow
> [7, 8, 9] | [cow, pig] | 8 | pig
> [7, 8, 9] | [cow, pig] | 9 | NULL
> {code}
--
This message was sent by Atlassian Jira
(v8.3.4#803005)