[ 
https://issues.apache.org/jira/browse/FLINK-12263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16939104#comment-16939104
 ] 

Jingsong Lee commented on FLINK-12263:
--------------------------------------

SINGLE_VALUE is needed for queries such as "select * from A where a = (select a 
from B)", where the subquery is not guaranteed to return a single row. So we 
cannot remove it.
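
A rough sketch of why, assuming the planner turns the scalar subquery into a 
join against a single-row aggregate. The exact plan may differ, and 
SINGLE_VALUE is written here as if it were callable from SQL purely for 
illustration:

{code:sql}
-- Original query: nothing in the logical plan guarantees that B has one row.
SELECT * FROM A WHERE a = (SELECT a FROM B)

-- Conceptual rewrite (illustrative only, not necessarily valid user-facing SQL).
-- SINGLE_VALUE passes the value through and fails if more than one row arrives.
SELECT A.*
FROM A JOIN (SELECT SINGLE_VALUE(a) AS a FROM B) t ON A.a = t.a
{code}

In the example from the issue description the subquery is a global SUM, which 
always yields exactly one row, so the check is redundant there. Here it is the 
only thing that catches a B with several rows.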

> Remove SINGLE_VALUE aggregate function from physical plan
> ---------------------------------------------------------
>
>                 Key: FLINK-12263
>                 URL: https://issues.apache.org/jira/browse/FLINK-12263
>             Project: Flink
>          Issue Type: New Feature
>          Components: Table SQL / Planner
>            Reporter: Jark Wu
>            Priority: Major
>
>  SINGLE_VALUE is an aggregate function which accepts only one row, and throws 
> an exception when it receives more than one row.
>  
> For example: 
> {code:sql}
> SELECT a2, SUM(a1) FROM A GROUP BY a2 HAVING SUM(a1) > (SELECT SUM(a1) * 0.1 
> FROM A)
> {code}
> produces a physical plan that contains SINGLE_VALUE:
> {code}
> +- NestedLoopJoin(joinType=[InnerJoin], where=[>(EXPR$1, $f0)], select=[a2, 
> EXPR$1, $f0], build=[right], singleRowJoin=[true])
>    :- HashAggregate(isMerge=[true], groupBy=[a2], select=[a2, 
> Final_SUM(sum$0) AS EXPR$1])
>    :  +- Exchange(distribution=[hash[a2]])
>    :     +- LocalHashAggregate(groupBy=[a2], select=[a2, Partial_SUM(a1) AS 
> sum$0])
>    :        +- TableSourceScan(table=[[A, source: [TestTableSource(a1, 
> a2)]]], fields=[a1, a2])
>    +- Exchange(distribution=[broadcast])
>       +- HashAggregate(isMerge=[true], select=[Final_SINGLE_VALUE(value$0, 
> count$1) AS $f0])
>          +- Exchange(distribution=[single])
>             +- LocalHashAggregate(select=[Partial_SINGLE_VALUE(EXPR$0) AS 
> (value$0, count$1)])
>                +- Calc(select=[*($f0, 0.1) AS EXPR$0])
>                   +- HashAggregate(isMerge=[true], select=[Final_SUM(sum$0) 
> AS $f0])
>                      +- Exchange(distribution=[single])
>                         +- LocalHashAggregate(select=[Partial_SUM(a1) AS 
> sum$0])
>                            +- Calc(select=[a1])
>                               +- TableSourceScan(table=[[A, source: 
> [TestTableSource(a1, a2)]]], fields=[a1, a2])
> {code}
> But SINGLE_VALUE is a bit weird in the physical plan, because the logical plan 
> can already make sure there is only one input row. Moreover, it introduces 
> additional overhead.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
