Jungtaek Lim created STORM-2115:
-----------------------------------

             Summary: [Storm SQL] 'IN' with subquery creates implicit aggregate 
calls that have no name
                 Key: STORM-2115
                 URL: https://issues.apache.org/jira/browse/STORM-2115
             Project: Apache Storm
          Issue Type: Bug
          Components: storm-sql
            Reporter: Jungtaek Lim


The query "SELECT ID FROM FOO WHERE ID NOT IN (SELECT 1 AS ID FROM FOO)" fails 
with a duplicated field 'null' error.

Here is the logical plan from Calcite.
{code}
LogicalFilter(condition=[NOT(CASE(=($3, 0), false, IS NOT NULL($7), true, IS NULL($5), null, <($4, $3), null, false))]): rowcount = 1.0, cumulative cost = {10.375 rows, 16.0 cpu, 0.0 io}, id = 24
  LogicalJoin(condition=[=($5, $6)], joinType=[left]): rowcount = 1.0, cumulative cost = {9.375 rows, 15.0 cpu, 0.0 io}, id = 23
    LogicalProject($f0=[$0], $f1=[$1], $f2=[$2], $f3=[$3], $f4=[$4], $f5=[$0]): rowcount = 1.0, cumulative cost = {5.25 rows, 11.0 cpu, 0.0 io}, id = 18
      LogicalJoin(condition=[true], joinType=[inner]): rowcount = 1.0, cumulative cost = {4.25 rows, 5.0 cpu, 0.0 io}, id = 17
        EnumerableTableScan(table=[[FOO]]): rowcount = 1.0, cumulative cost = {0.0 rows, 1.0 cpu, 0.0 io}, id = 12
        LogicalAggregate(group=[{}], agg#0=[COUNT()], agg#1=[COUNT($0)]): rowcount = 1.0, cumulative cost = {3.25 rows, 4.0 cpu, 0.0 io}, id = 16
          LogicalProject($f0=[$0], $f1=[true]): rowcount = 1.0, cumulative cost = {2.0 rows, 4.0 cpu, 0.0 io}, id = 15
            LogicalProject(ID=[1]): rowcount = 1.0, cumulative cost = {1.0 rows, 2.0 cpu, 0.0 io}, id = 14
              EnumerableTableScan(table=[[FOO]]): rowcount = 1.0, cumulative cost = {0.0 rows, 1.0 cpu, 0.0 io}, id = 13
    LogicalAggregate(group=[{0}], agg#0=[MIN($1)]): rowcount = 1.0, cumulative cost = {3.125 rows, 4.0 cpu, 0.0 io}, id = 22
      LogicalProject($f0=[$0], $f1=[true]): rowcount = 1.0, cumulative cost = {2.0 rows, 4.0 cpu, 0.0 io}, id = 21
        LogicalProject(ID=[1]): rowcount = 1.0, cumulative cost = {1.0 rows, 2.0 cpu, 0.0 io}, id = 20
          EnumerableTableScan(table=[[FOO]]): rowcount = 1.0, cumulative cost = {0.0 rows, 1.0 cpu, 0.0 io}, id = 19
{code}

In this case AggregateCall.name can be null, so the Trident tuple can end up 
with several fields that all have 'null' as their name.

We should take the field names from the RowType of the LogicalAggregate 
instead. However, such a name can be the same as an upstream output field 
name, which causes a different duplication.

One way to resolve this is to assign temporary field names while aggregating, 
and then replace them with the field names from the RowType of the 
LogicalAggregate.
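
The two-step renaming described above could be sketched roughly as follows. 
This is a minimal, standalone illustration and not the storm-sql 
implementation: the class AggregateFieldNaming, the "__agg_tmp_" prefix, and 
the final names "$f0"/"$f1" are all hypothetical, and no Calcite or Trident 
API is used.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, for illustration only.
final class AggregateFieldNaming {
    // Assumed prefix; any scheme that cannot clash with real fields works.
    static final String TEMP_PREFIX = "__agg_tmp_";

    // Step 1: one unique temporary name per aggregate call. The (possibly
    // null) AggregateCall name is never used, and candidates that already
    // exist upstream are skipped.
    static List<String> temporaryNames(int aggCallCount, Set<String> upstreamFields) {
        List<String> temps = new ArrayList<>();
        int counter = 0;
        while (temps.size() < aggCallCount) {
            String candidate = TEMP_PREFIX + counter++;
            if (!upstreamFields.contains(candidate)) {
                temps.add(candidate);
            }
        }
        return temps;
    }

    // Step 2: map each temporary name to the final name taken, by position,
    // from the row type of the aggregate node.
    static Map<String, String> renameMap(List<String> temps, List<String> rowTypeNames) {
        if (temps.size() != rowTypeNames.size()) {
            throw new IllegalArgumentException("field count mismatch");
        }
        Map<String, String> renames = new LinkedHashMap<>();
        for (int i = 0; i < temps.size(); i++) {
            renames.put(temps.get(i), rowTypeNames.get(i));
        }
        return renames;
    }

    public static void main(String[] args) {
        // Two aggregate calls whose names are both null, as in the plan above.
        List<String> aggCallNames = Arrays.asList((String) null, (String) null);
        Set<String> upstream = new HashSet<>(Arrays.asList("ID"));

        List<String> temps = temporaryNames(aggCallNames.size(), upstream);
        // "$f0" and "$f1" stand in for the names from the aggregate's RowType.
        Map<String, String> renames = renameMap(temps, Arrays.asList("$f0", "$f1"));

        System.out.println(temps);
        System.out.println(renames);
    }
}
```

Because the temporary names are generated rather than derived from 
AggregateCall.name, null names never reach the tuple, and the final rename 
only happens after aggregation, when the upstream fields are no longer part 
of the output.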

This will be easier to implement once STORM-2072 is merged.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)