[
https://issues.apache.org/jira/browse/SPARK-9257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Reynold Xin updated SPARK-9257:
-------------------------------
Assignee: Yin Huai
> Fix the false negative of Aggregate2Sort and FinalAndCompleteAggregate2Sort's
> missingInput
> ------------------------------------------------------------------------------------------
>
> Key: SPARK-9257
> URL: https://issues.apache.org/jira/browse/SPARK-9257
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Reporter: Yin Huai
> Assignee: Yin Huai
> Priority: Minor
>
> {code}
> sqlContext.sql(
> """
> |SELECT sum(value)
> |FROM agg1
> |GROUP BY key
> """.stripMargin).explain()
> == Physical Plan ==
> Aggregate2Sort Some(List(key#510)), [key#510], [(sum(CAST(value#511,
> LongType))2,mode=Final,isDistinct=false)], [sum(CAST(value#511,
> LongType))#1435L], [sum(CAST(value#511, LongType))#1435L AS _c0#1426L]
> ExternalSort [key#510 ASC], false
> Exchange hashpartitioning(key#510)
> Aggregate2Sort None, [key#510], [(sum(CAST(value#511,
> LongType))2,mode=Partial,isDistinct=false)], [currentSum#1433L],
> [key#510,currentSum#1433L]
> ExternalSort [key#510 ASC], false
> PhysicalRDD [key#510,value#511], MapPartitionsRDD[97] at apply at
> Transformer.scala:22
> sqlContext.sql(
> """
> |SELECT sum(distinct value)
> |FROM agg1
> |GROUP BY key
> """.stripMargin).explain()
> == Physical Plan ==
> !FinalAndCompleteAggregate2Sort [key#510,CAST(value#511, LongType)#1446L],
> [key#510], [(sum(CAST(value#511,
> LongType)#1446L)2,mode=Complete,isDistinct=false)], [sum(CAST(value#511,
> LongType))#1445L], [sum(CAST(value#511, LongType))#1445L AS _c0#1438L]
> Aggregate2Sort Some(List(key#510)), [key#510,CAST(value#511,
> LongType)#1446L], [key#510,CAST(value#511, LongType)#1446L]
> ExternalSort [key#510 ASC,CAST(value#511, LongType)#1446L ASC], false
> Exchange hashpartitioning(key#510)
> !Aggregate2Sort None, [key#510,CAST(value#511, LongType) AS
> CAST(value#511, LongType)#1446L], [key#510,CAST(value#511, LongType)#1446L]
> ExternalSort [key#510 ASC,CAST(value#511, LongType) AS CAST(value#511,
> LongType)#1446L ASC], false
> PhysicalRDD [key#510,value#511], MapPartitionsRDD[102] at apply at
> Transformer.scala:22
> {code}
> In the examples shown above, you can see a {{!}} at the beginning of an
> operator's {{simpleString}}, which indicates that its {{missingInput}} is
> not empty. Actually, this is a false negative and we need to fix it.
> Also, it would be good to make these two operators' {{simpleString}} more
> reader-friendly, so that people can tell what the grouping expressions are,
> what the aggregate functions are, and what the mode of each aggregate
> function is.
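
For readers unfamiliar with the {{!}} marker, here is a minimal sketch of how such a prefix can be derived from an operator's references and its children's output. It is a toy model under stated assumptions, not Spark's actual code: the Operator case class, its fields, and the string-based attribute names are hypothetical stand-ins. The issue does not say why missingInput is wrongly non-empty for these two operators, so the sketch only shows the mechanism, not the bug.

```scala
object MissingInputSketch {
  // Hypothetical stand-in for a physical operator; NOT Spark's SparkPlan.
  // Attributes are modelled as plain strings such as "key#510".
  final case class Operator(
      name: String,
      references: Set[String], // attributes read by the operator's expressions
      output: Set[String],     // attributes the operator produces
      children: List[Operator] = Nil) {

    // Everything the children make available to this operator.
    def inputSet: Set[String] = children.flatMap(_.output).toSet

    // Referenced attributes that no child produces.
    def missingInput: Set[String] = references -- inputSet

    // Prefix the name with "!" when missingInput is not empty.
    def simpleString: String =
      (if (missingInput.nonEmpty) "!" else "") + name
  }

  val scan: Operator =
    Operator("PhysicalRDD", Set.empty, Set("key#510", "value#511"))

  val sort: Operator =
    Operator("ExternalSort", Set("key#510"), scan.output, List(scan))

  // Every referenced attribute is produced by the child: no marker.
  val resolved: Operator = Operator(
    "Aggregate2Sort",
    Set("key#510", "value#511"),
    Set("key#510", "currentSum#1433L"),
    List(sort))

  // Illustration only: this operator also counts an attribute that its
  // child does not produce among its references, so it gets the marker.
  val flagged: Operator = Operator(
    "Aggregate2Sort",
    Set("key#510", "value#511", "CAST(value#511, LongType)#1446L"),
    Set("key#510", "CAST(value#511, LongType)#1446L"),
    List(sort))

  def main(args: Array[String]): Unit = {
    println(resolved.simpleString)
    println(flagged.simpleString)
  }
}
```

In this model, an operator is flagged whenever any attribute in its references is absent from its children's output, whether or not the operator really depends on that attribute as input.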