[ https://issues.apache.org/jira/browse/SPARK-11027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Apache Spark reassigned SPARK-11027:
------------------------------------

    Assignee:     (was: Apache Spark)

> Better group distinct columns in query compilation
> --------------------------------------------------
>
>                 Key: SPARK-11027
>                 URL: https://issues.apache.org/jira/browse/SPARK-11027
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>            Reporter: Yin Huai
>
> In AggregationQuerySuite, we have a test
> {code}
> checkAnswer(
>   sqlContext.sql(
>     """
>       |SELECT sum(distinct value1), kEY - 100, count(distinct value1)
>       |FROM agg2
>       |GROUP BY Key - 100
>     """.stripMargin),
>   Row(40, -99, 2) :: Row(0, -98, 2) :: Row(null, -97, 0) :: Row(30, null, 3) :: Nil)
> {code}
> We will treat it as having two distinct columns because sum causes a cast on
> value1. Maybe we can ignore the cast when we group distinct columns. So, it
> will not be treated as having two distinct columns.
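
The idea in the description (ignore the cast when grouping distinct columns) could be sketched roughly as follows. This is only an illustration on a toy expression tree: Col, Cast, DistinctAgg, stripCasts and groupDistinct are hypothetical names, not Spark's Catalyst classes or the eventual fix, and the "bigint" target type of the cast is an assumption.

```scala
import scala.annotation.tailrec

// Toy expression model (hypothetical; not Spark's Catalyst API).
sealed trait Expr
case class Col(name: String) extends Expr
case class Cast(child: Expr, dataType: String) extends Expr

// A distinct aggregate call such as sum(distinct value1).
case class DistinctAgg(func: String, arg: Expr)

object DistinctGrouping {
  // Peel off any Cast wrappers sitting on top of the expression.
  @tailrec
  def stripCasts(e: Expr): Expr = e match {
    case Cast(child, _) => stripCasts(child)
    case other          => other
  }

  // Group distinct aggregates by their argument, ignoring casts.
  def groupDistinct(aggs: Seq[DistinctAgg]): Map[Expr, Seq[DistinctAgg]] =
    aggs.groupBy(a => stripCasts(a.arg))

  def main(args: Array[String]): Unit = {
    // sum(distinct value1) is modeled with a cast (assumed to be bigint);
    // count(distinct value1) uses the column directly.
    val aggs = Seq(
      DistinctAgg("sum", Cast(Col("value1"), "bigint")),
      DistinctAgg("count", Col("value1")))

    // Grouping on the raw argument sees two different distinct columns.
    println(aggs.groupBy(_.arg).size)
    // Grouping with casts stripped sees a single distinct column.
    println(groupDistinct(aggs).size)
  }
}
```

One caveat with this sketch: dropping the cast is only safe when the cast cannot change which values are distinct (for example, a widening integer cast). A lossy cast could merge values that are distinct in the original column, so a real implementation would presumably need to check for that.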