manuzhang commented on a change in pull request #26339: [SPARK-27194][SPARK-29302][SQL] Fix the issue that for dynamic partition overwrite a task would conflict with its speculative task
URL: https://github.com/apache/spark/pull/26339#discussion_r407819527
 
 

 ##########
 File path: core/src/main/scala/org/apache/spark/internal/io/HadoopMapReduceCommitProtocol.scala
 ##########
 @@ -83,13 +83,38 @@ class HadoopMapReduceCommitProtocol(
    * e.g. a=1/b=2. Files under these partitions will be saved into staging directory and moved to
    * destination directory at the end, if `dynamicPartitionOverwrite` is true.
    */
-  @transient private var partitionPaths: mutable.Set[String] = null
+  @transient private[spark] var partitionPaths: mutable.Set[String] = null
 
   /**
    * The staging directory of this write job. Spark uses it to deal with files with absolute output
    * path, or writing data into partitioned directory with dynamicPartitionOverwrite=true.
    */
-  private def stagingDir = new Path(path, ".spark-staging-" + jobId)
+  private[spark] def stagingDir = new Path(path, ".spark-staging-" + jobId)
+
+  /**
+   * Tracks the staging task files with dynamicPartitionOverwrite=true.
+   */
+  @transient private[spark] var dynamicStagingTaskFiles: mutable.Set[Path] = null
 Review comment:
   I think this can be an immutable `Set`.
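
   A minimal sketch of what the suggestion might look like: a field that is only ever grown can hold an immutable `Set` behind a `var` instead of a `mutable.Set`, since `+=` on such a `var` simply rebinds it to a new set. The class and member names below are illustrative, not the PR's actual code (`String` stands in for `org.apache.hadoop.fs.Path` to keep the sketch self-contained).

   ```scala
   import scala.collection.mutable

   class StagingFileTracker {
     // Current shape in the PR: a mutable collection mutated in place.
     private val mutableFiles: mutable.Set[String] = mutable.Set.empty

     // Suggested shape: immutable Set behind a `var`.
     // `immutableFiles += p` desugars to `immutableFiles = immutableFiles + p`.
     private var immutableFiles: Set[String] = Set.empty

     def addMutable(p: String): Unit = mutableFiles += p
     def addImmutable(p: String): Unit = immutableFiles += p

     // Snapshots for callers; the immutable one can be shared without copying.
     def mutableSnapshot: Set[String] = mutableFiles.toSet
     def immutableSnapshot: Set[String] = immutableFiles
   }
   ```

   One practical upside of the immutable variant is that handing the set to another thread or caller needs no defensive copy, which matters for a field made `private[spark]` as in this diff.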

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
