WangGuangxin commented on a change in pull request #26899: [SPARK-28332][SQL] 
Reserve init value -1 only when do min max statistics in SQLMetrics
URL: https://github.com/apache/spark/pull/26899#discussion_r358044877
 
 

 ##########
 File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/ui/SQLAppStatusListener.scala
 ##########
 @@ -173,7 +173,12 @@ class SQLAppStatusListener(
       event.taskMetrics.externalAccums.flatMap { a =>
         // This call may fail if the accumulator is gc'ed, so account for that.
         try {
-          Some(a.toInfo(Some(a.value), None))
+          val accumValue = if (a.isInstanceOf[SQLMetric]) {
 
 Review comment:
   As explained in https://issues.apache.org/jira/browse/SPARK-11013
   
   > when we call RDD API inside SparkPlan, we are very likely to reference the 
SparkPlan in the closure and thus serialize and transfer a SparkPlan tree to 
executor side. When we deserialize it, the accumulators in child SparkPlan are 
also deserialized and registered, and always report zero value.
   > This is not a problem currently because we only have one operation to 
aggregate the accumulators: add. However, if we wanna support more complex 
metric like min, the extra zero values will lead to wrong result.
   
   Using -1 as the init value here is meant to distinguish whether a SQLMetric 
has been initialized or not, so that we can filter out those 
irrelevant/uninitialized metrics when doing min/max statistics. 
   If we used 0 as the init value, we could not tell whether a metric is 
uninitialized or whether its actual value is simply zero.
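   The filtering idea above can be sketched as follows. This is a minimal, 
hypothetical illustration, not Spark's actual SQLMetrics code; the object and 
method names are invented for the example. The sentinel -1 marks accumulators 
that were deserialized on the executor but never updated, so they can be 
excluded from min/max aggregation:

```scala
// Hypothetical sketch of sentinel-based filtering for min/max metric stats.
// -1 plays the role of SQLMetric's uninitialized value; names are illustrative.
object MetricStats {
  val Uninitialized: Long = -1L // sentinel init value, never a real metric value

  // Only metrics that were actually updated participate in the min.
  def minOf(values: Seq[Long]): Option[Long] = {
    val valid = values.filter(_ != Uninitialized)
    if (valid.isEmpty) None else Some(valid.min)
  }
}
```

   With 0 as the init value, `minOf(Seq(0L, 5L, 3L))` could not distinguish a 
stray deserialized accumulator from a task that genuinely measured zero, and 
the min would be silently wrong; with -1 the stray values are filtered out.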

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]

