GitHub user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/19872#discussion_r162047382
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala ---
@@ -333,16 +339,19 @@ abstract class SparkStrategies extends QueryPlanner[SparkPlan] {
      */
   object Aggregation extends Strategy {
     def apply(plan: LogicalPlan): Seq[SparkPlan] = plan match {
-      case PhysicalAggregation(
-          groupingExpressions, aggregateExpressions, resultExpressions, child) =>
+      case PhysicalAggregation(groupingExpressions, aggExpressions, resultExpressions, child)
+          if aggExpressions.forall(expr => expr.isInstanceOf[AggregateExpression]) =>
+
+        val aggregateExpressions = aggExpressions.map(expr =>
+          expr.asInstanceOf[AggregateExpression])

         val (functionsWithDistinct, functionsWithoutDistinct) =
           aggregateExpressions.partition(_.isDistinct)
         if (functionsWithDistinct.map(_.aggregateFunction.children).distinct.length > 1) {
           // This is a sanity check. We should not reach here when we have multiple distinct
           // column sets. Our MultipleDistinctRewriter should take care this case.
           sys.error("You hit a query analyzer bug. Please report your query to " +
-            "Spark user mailing list.")
+              "Spark user mailing list.")
--- End diff ---
I can't believe I am nitpicking this again, but let's maybe revert this change back ...
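
For anyone skimming the archive, here is a minimal, self-contained sketch of the guard-then-cast pattern this diff introduces (the `Expr` hierarchy below is hypothetical, not Spark's actual expression classes): the match arm only fires when every element passes the `isInstanceOf` guard, which is what makes the `asInstanceOf` downcast inside the arm safe.

```scala
sealed trait Expr
case class AggregateExpression(name: String, isDistinct: Boolean) extends Expr
case class Literal(value: Int) extends Expr

def plan(exprs: Seq[Expr]): String = exprs match {
  // Guard: match only when every expression is an AggregateExpression,
  // mirroring the `if aggExpressions.forall(...)` guard in the diff.
  case es if es.forall(_.isInstanceOf[AggregateExpression]) =>
    // The downcast is safe because the guard above checked each element.
    val aggs = es.map(_.asInstanceOf[AggregateExpression])
    val (withDistinct, withoutDistinct) = aggs.partition(_.isDistinct)
    s"planning ${withDistinct.size} distinct and ${withoutDistinct.size} regular aggregates"
  // Guard failed: fall through, as a planner would try the next strategy.
  case _ => "not all aggregate expressions; strategy does not apply"
}
```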
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]