gaoyajun02 commented on a change in pull request #33574:
URL: https://github.com/apache/spark/pull/33574#discussion_r679110020



##########
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
##########
@@ -580,15 +580,15 @@ class Analyzer(override val catalogManager: 
CatalogManager)
         aggregations: Seq[NamedExpression],
         groupByAliases: Seq[Alias],
         groupingAttrs: Seq[Expression],
-        gid: Attribute): Seq[NamedExpression] = aggregations.map {
+        gid: Attribute): Seq[NamedExpression] = aggregations.map { agg =>
       // collect all the found AggregateExpression, so we can check an 
expression is part of
       // any AggregateExpression or not.
       val aggsBuffer = ArrayBuffer[Expression]()
       // Returns whether the expression belongs to any expressions in 
`aggsBuffer` or not.
       def isPartOfAggregation(e: Expression): Boolean = {
         aggsBuffer.exists(a => a.find(_ eq e).isDefined)
       }
-      replaceGroupingFunc(_, groupByExprs, gid).transformDown {
+      replaceGroupingFunc(agg, groupByExprs, gid).transformDown {

Review comment:
       If map is underscored, aggsBuffer is outside the scope of the map 
function at runtime and it will save the results of all elements.
   If the anonymous function is not used, the aggsBuffer will only be recreated 
each time inside the map function loop.
   
   I suspect that Scala syntactic sugar in the conversion of the code made 
changes to cause, I also debugged this code many times before I found this 
difference, here is a simplified code to test separately.
   
   ```
       def part(e: Int): Int = e
       def testMap(seq: Seq[Int]): Seq[Int] = {
         seq.map {
           val buf = ArrayBuffer[Int]()
           part(_) match {
             case e: Int if e < 1 =>
               val r = e + 1
               println(s"add to buf: " + r)
               buf += r
               r
             case e: Int if buf.contains(e) =>
               println("already in buf")
               0
             case e =>
               println("not in buf")
               e
           }
         }
       }
   
       testMap(Seq(0, 1))
   
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to