cloud-fan commented on code in PR #49248:
URL: https://github.com/apache/spark/pull/49248#discussion_r1893560686


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala:
##########
@@ -6183,12 +6183,30 @@ class AstBuilder extends DataTypeAstBuilder
           a.aggregateExpressions.foreach(visit)
           // Prepend grouping keys to the list of aggregate functions, since 
operator pipe AGGREGATE
           // clause returns the GROUP BY expressions followed by the list of 
aggregate functions.
-          val namedGroupingExpressions: Seq[NamedExpression] =
-            a.groupingExpressions.map {
-              case n: NamedExpression => n
-              case e: Expression => UnresolvedAlias(e, None)
-            }
-          a.copy(aggregateExpressions = namedGroupingExpressions ++ 
a.aggregateExpressions)
+          val newGroupingExpressions = ArrayBuffer.empty[Expression]
+          val newAggregateExpressions = ArrayBuffer.empty[NamedExpression]
+          var numGroupByOrdinalReferences = 0
+          a.groupingExpressions.foreach {
+            case n: NamedExpression =>
+              newGroupingExpressions += n
+              newAggregateExpressions += n
+            // If the grouping expression is an integer literal, create 
[[UnresolvedOrdinal]] and
+            // [[UnresolvedPipeAggregateOrdinal]] expressions to represent it 
in the final grouping
+            // and aggregate expressions, respectively. This will let the
+            // [[ResolveOrdinalInOrderByAndGroupBy]] rule detect the ordinal 
in the aggregate list
+            // and replace it with the corresponding attribute from the child 
operator.
+            case Literal(v: Int, IntegerType) if conf.groupByOrdinal =>
+              numGroupByOrdinalReferences += 1
+              newGroupingExpressions += 
UnresolvedOrdinal(numGroupByOrdinalReferences)

Review Comment:
   What if we have mixed group by col and group by ordinal? e.g.
   ```
   a plan with output [c1, c2, c3, c4, value]
   |> aggregate sum(value) group by c1, c2, c3, 4
   ```
   This should be identical to `group by c1, c2, c3, c4` but with the current 
implementation it becomes `group by c1, c2, c3, c1`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to