[ 
https://issues.apache.org/jira/browse/STORM-2115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jungtaek Lim resolved STORM-2115.
---------------------------------
       Resolution: Fixed
    Fix Version/s: 1.1.0
                   2.0.0

Resolved as part of STORM-2125.

> [Storm SQL] 'IN' with subquery making implicit aggregate calls which is 
> having 'null' as name
> ---------------------------------------------------------------------------------------------
>
>                 Key: STORM-2115
>                 URL: https://issues.apache.org/jira/browse/STORM-2115
>             Project: Apache Storm
>          Issue Type: Bug
>          Components: storm-sql
>            Reporter: Jungtaek Lim
>             Fix For: 2.0.0, 1.1.0
>
>
> "SELECT ID FROM FOO WHERE ID NOT IN (SELECT 1 AS ID FROM FOO)" fails with a 
> duplicated field 'null' error.
> Here is the logical plan from Calcite.
> {code}
> LogicalFilter(condition=[NOT(CASE(=($3, 0), false, IS NOT NULL($7), true, IS NULL($5), null, <($4, $3), null, false))]): rowcount = 1.0, cumulative cost = {10.375 rows, 16.0 cpu, 0.0 io}, id = 24
>   LogicalJoin(condition=[=($5, $6)], joinType=[left]): rowcount = 1.0, cumulative cost = {9.375 rows, 15.0 cpu, 0.0 io}, id = 23
>     LogicalProject($f0=[$0], $f1=[$1], $f2=[$2], $f3=[$3], $f4=[$4], $f5=[$0]): rowcount = 1.0, cumulative cost = {5.25 rows, 11.0 cpu, 0.0 io}, id = 18
>       LogicalJoin(condition=[true], joinType=[inner]): rowcount = 1.0, cumulative cost = {4.25 rows, 5.0 cpu, 0.0 io}, id = 17
>         EnumerableTableScan(table=[[FOO]]): rowcount = 1.0, cumulative cost = {0.0 rows, 1.0 cpu, 0.0 io}, id = 12
>         LogicalAggregate(group=[{}], agg#0=[COUNT()], agg#1=[COUNT($0)]): rowcount = 1.0, cumulative cost = {3.25 rows, 4.0 cpu, 0.0 io}, id = 16
>           LogicalProject($f0=[$0], $f1=[true]): rowcount = 1.0, cumulative cost = {2.0 rows, 4.0 cpu, 0.0 io}, id = 15
>             LogicalProject(ID=[1]): rowcount = 1.0, cumulative cost = {1.0 rows, 2.0 cpu, 0.0 io}, id = 14
>               EnumerableTableScan(table=[[FOO]]): rowcount = 1.0, cumulative cost = {0.0 rows, 1.0 cpu, 0.0 io}, id = 13
>     LogicalAggregate(group=[{0}], agg#0=[MIN($1)]): rowcount = 1.0, cumulative cost = {3.125 rows, 4.0 cpu, 0.0 io}, id = 22
>       LogicalProject($f0=[$0], $f1=[true]): rowcount = 1.0, cumulative cost = {2.0 rows, 4.0 cpu, 0.0 io}, id = 21
>         LogicalProject(ID=[1]): rowcount = 1.0, cumulative cost = {1.0 rows, 2.0 cpu, 0.0 io}, id = 20
>           EnumerableTableScan(table=[[FOO]]): rowcount = 1.0, cumulative cost = {0.0 rows, 1.0 cpu, 0.0 io}, id = 19
> {code}
> In this case AggregateCall.name can be null, so the trident tuple can end up 
> with duplicated fields that all have 'null' as their name.
> We should refer to the RowType of LogicalAggregate, but another issue is that 
> its field names can be the same as the upstream's output field names, which 
> causes another duplication.
> One way to resolve this is to assign temporary field names while aggregating, 
> and finally replace them with the field names in the RowType of 
> LogicalAggregate.
> This will be easier to achieve once STORM-2072 is merged.
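
For illustration only, the "temporary names, then rename" idea from the description could be sketched roughly as below. This is not the actual STORM-2125 change: the class TempFieldNames, its methods, and the "$agg_tmp_" prefix are hypothetical, and the field names in main are assumed values modelled on the plan above. It uses plain Java collections rather than Calcite or Trident types.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of storm-sql.
final class TempFieldNames {
    // Hypothetical prefix for intermediate aggregate output fields.
    static final String TEMP_PREFIX = "$agg_tmp_";

    // Step 1: give every aggregate call a temporary name that cannot collide
    // with an upstream field, regardless of AggregateCall.name being null.
    static List<String> assignTemporaryNames(List<String> upstreamFields, int aggCallCount) {
        Set<String> taken = new HashSet<>(upstreamFields);
        List<String> tempNames = new ArrayList<>();
        int counter = 0;
        for (int i = 0; i < aggCallCount; i++) {
            String candidate;
            do {
                candidate = TEMP_PREFIX + counter++;
            } while (!taken.add(candidate)); // skip names that already exist
            tempNames.add(candidate);
        }
        return tempNames;
    }

    // Step 2: once aggregation is done, map each temporary name (by position)
    // to the field name taken from the aggregate's row type.
    static Map<String, String> renameToRowType(List<String> tempNames, List<String> rowTypeNames) {
        if (tempNames.size() != rowTypeNames.size()) {
            throw new IllegalArgumentException("field count mismatch");
        }
        Map<String, String> renames = new LinkedHashMap<>();
        for (int i = 0; i < tempNames.size(); i++) {
            renames.put(tempNames.get(i), rowTypeNames.get(i));
        }
        return renames;
    }

    public static void main(String[] args) {
        // Assumed values: upstream fields and row type names that collide.
        List<String> upstream = Arrays.asList("$f0", "$f1");
        List<String> rowType = Arrays.asList("$f0", "$f1");
        List<String> temp = assignTemporaryNames(upstream, 2);
        System.out.println(renameToRowType(temp, rowType));
    }
}
```

While the aggregation runs, the tuple only ever sees the upstream names plus the temporary ones, so no two fields share a name; the row type names are applied in a final projection that drops the upstream fields.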



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
