[GitHub] [spark] LuciferYang commented on a change in pull request #30026: [SPARK-32978][SQL] Make sure the number of dynamic part metric is correct

GitBox Fri, 16 Oct 2020 01:06:24 -0700


LuciferYang commented on a change in pull request #30026:
URL: https://github.com/apache/spark/pull/30026#discussion_r506148242




##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/BasicWriteStatsTracker.scala
##########
@@ -139,20 +142,22 @@ class BasicWriteJobStatsTracker(
 
   override def processStats(stats: Seq[WriteTaskStats]): Unit = {
     val sparkContext = SparkContext.getActive.get
-    var numPartitions: Long = 0L
+    var partitionsSet: mutable.Set[InternalRow] = mutable.HashSet.empty
     var numFiles: Long = 0L
     var totalNumBytes: Long = 0L
     var totalNumOutput: Long = 0L
 
     val basicStats = stats.map(_.asInstanceOf[BasicWriteTaskStats])
 
     basicStats.foreach { summary =>
-      numPartitions += summary.numPartitions
+      partitionsSet ++= summary.partitions

Review comment:
       Ditto: `partitionsSet.addAll(summary.partitions)` is also available only
in Scala 2.13.
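
       A minimal sketch of the cross-version form, assuming plain strings as
hypothetical stand-ins for the `InternalRow` partition values and a
hypothetical `TaskSummary` in place of `BasicWriteTaskStats`. `++=` compiles
on both Scala 2.12 and 2.13, and the set keeps a partition value only once
even if several tasks report it:

```scala
import scala.collection.mutable

object PartitionSetSketch {
  // Hypothetical stand-in for the per-task stats; not the Spark class.
  final case class TaskSummary(partitions: Seq[String])

  // Counts distinct partition values across all task summaries.
  def countDistinctPartitions(stats: Seq[TaskSummary]): Int = {
    val partitionsSet: mutable.Set[String] = mutable.HashSet.empty[String]
    stats.foreach { summary =>
      // `++=` exists on 2.12 and 2.13; `addAll` was only added in 2.13.
      partitionsSet ++= summary.partitions
    }
    partitionsSet.size
  }
}
```

Two tasks that both wrote to the same partition contribute one entry to the
set, which a plain sum of per-task counts would count twice.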

##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/BasicWriteStatsTracker.scala
##########
@@ -76,7 +79,7 @@ class BasicWriteTaskStatsTracker(hadoopConf: Configuration)
 
 
   override def newPartition(partitionValues: InternalRow): Unit = {
-    numPartitions += 1
+    partitions = partitions :+ partitionValues

Review comment:
       `partitions.appended(partitionValues)` is available only in Scala 2.13.
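
       A small sketch of the form that works on both versions, again assuming
strings as hypothetical stand-ins for `InternalRow`. The `:+` operator is
defined on `Seq` in Scala 2.12 and 2.13 alike, whereas the named method
`appended` exists only from 2.13:

```scala
object AppendSketch {
  // Returns a new Seq with partitionValues added at the end.
  // `:+` compiles on Scala 2.12 and 2.13; `appended` is 2.13-only.
  def appendPartition(partitions: Seq[String], partitionValues: String): Seq[String] =
    partitions :+ partitionValues
}
```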







