cloud-fan commented on a change in pull request #27627:
URL: https://github.com/apache/spark/pull/27627#discussion_r416398914



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Sum.scala
##########
@@ -62,38 +62,113 @@ case class Sum(child: Expression) extends DeclarativeAggregate with ImplicitCast
 
   private lazy val sum = AttributeReference("sum", sumDataType)()
 
+  private lazy val isEmptyOrNulls = AttributeReference("isEmptyOrNulls", BooleanType, false)()
+
   private lazy val zero = Literal.default(sumDataType)
 
-  override lazy val aggBufferAttributes = sum :: Nil
+  override lazy val aggBufferAttributes = sum :: isEmptyOrNulls :: Nil
 
   override lazy val initialValues: Seq[Expression] = Seq(
-    /* sum = */ Literal.create(null, sumDataType)
+    /* sum = */  zero,
+    /* isEmptyOrNulls = */ Literal.create(true, BooleanType)
   )
 
+  /**
+   * For decimal types and when child is nullable:
+   * isEmptyOrNulls flag is a boolean to represent if there are no rows or if all rows that
+   * have been seen are null.  This will be used to identify if the end result of sum in
+   * evaluateExpression should be null or not.
+   *
+   * Update of the isEmptyOrNulls flag:
+   * If this flag is false, then keep it as is.
+   * If this flag is true, then check if the incoming value is null and if it is null, keep it
+   * as true else update it to false.
+   * Once this flag is switched to false, it will remain false.
+   *
+   * The update of the sum is as follows:
+   * If sum is null, then we have a case of overflow, so keep sum as is.
+   * If sum is not null, and the incoming value is not null, then perform the addition along
+   * with the overflow checking. Note, that if overflow occurs, then sum will be null here.

Review comment:
       Is this overflow check really necessary? We can let the sum overflow, and it will become null when we write it out to shuffle files.
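
For readers following the thread, the buffer-update rules in the quoted doc comment can be modeled roughly in plain Scala, outside Spark. This is only a sketch under stated assumptions, not the PR's code: `Buf`, `update`, `result`, `checkOverflow`, and the fixed `MaxPrecision` are hypothetical names, `Option[BigDecimal]` stands in for a nullable decimal, and the handling of a null input (leave the sum unchanged) is assumed because the quoted comment is cut off before that case.

```scala
object SumBufferSketch {
  // Hypothetical limit standing in for the precision of the sum's decimal type.
  val MaxPrecision = 5

  // sum = None models a null sum, which the doc comment treats as "overflow happened".
  final case class Buf(sum: Option[BigDecimal], isEmptyOrNulls: Boolean)

  // Mirrors the new initialValues in the diff: sum = zero, isEmptyOrNulls = true.
  val initial: Buf = Buf(Some(BigDecimal(0)), isEmptyOrNulls = true)

  // Returns None when the value needs more digits than MaxPrecision allows.
  def checkOverflow(v: BigDecimal): Option[BigDecimal] =
    if (v.precision > MaxPrecision) None else Some(v)

  def update(buf: Buf, input: Option[BigDecimal]): Buf = {
    val newSum = (buf.sum, input) match {
      case (None, _)          => None                 // already overflowed: keep sum as is
      case (Some(s), Some(v)) => checkOverflow(s + v) // add, then check for overflow
      case (s, None)          => s                    // assumption: a null input leaves sum unchanged
    }
    // The flag stays true only while every value seen so far is null;
    // once it becomes false it can never become true again.
    Buf(newSum, buf.isEmptyOrNulls && input.isEmpty)
  }

  // Assumed final evaluation: null for empty or all-null input, otherwise the sum
  // (which is itself null after an overflow).
  def result(buf: Buf): Option[BigDecimal] =
    if (buf.isEmptyOrNulls) None else buf.sum
}
```

In this model, folding `update` over the inputs keeps the two null cases apart: an all-null input ends with `isEmptyOrNulls = true`, while an overflow ends with `sum = None` and `isEmptyOrNulls = false`. That distinction is what the extra buffer attribute adds. Whether the eager overflow check inside `update` is needed at all is the question raised in the review comment above.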






