turboFei commented on a change in pull request #26086: [SPARK-29302] Make the
file name of a task for dynamic partition overwrite be unique
URL: https://github.com/apache/spark/pull/26086#discussion_r340380432
##########
File path:
core/src/main/scala/org/apache/spark/internal/io/HadoopMapReduceCommitProtocol.scala
##########
@@ -142,7 +142,12 @@ class HadoopMapReduceCommitProtocol(
// Note that %05d does not truncate the split number, so if we have more than 100000 tasks,
// the file name is fine and won't overflow.
val split = taskContext.getTaskAttemptID.getTaskID.getId
- f"part-$split%05d-$jobId$ext"
+ val attemptId = taskContext.getTaskAttemptID.getId
+ if (dynamicPartitionOverwrite) {
+ f"part-$split%05d-$attemptId%05d-$jobId$ext"
Review comment:
@viirya I only fixed this issue for the dynamic partition overwrite case.
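For readers following the thread, here is a minimal standalone sketch of the naming scheme under discussion. It is not the code in HadoopMapReduceCommitProtocol: the object name `FileNameSketch` and the helper `fileName` are hypothetical, the split and attempt IDs are passed in as plain integers instead of being read from `taskContext`, and the `else` branch is an assumption (the hunk above is cut off before it) that the non-dynamic case keeps the old format.

```scala
object FileNameSketch {
  // Hypothetical helper mirroring the diff above; not Spark's actual API.
  // split and attemptId stand in for the values the real code reads from
  // taskContext.getTaskAttemptID.
  def fileName(
      split: Int,
      attemptId: Int,
      jobId: String,
      ext: String,
      dynamicPartitionOverwrite: Boolean): String = {
    if (dynamicPartitionOverwrite) {
      // Include the attempt ID so that two attempts of the same task
      // produce different file names.
      f"part-$split%05d-$attemptId%05d-$jobId$ext"
    } else {
      // Assumption: the non-dynamic case keeps the original format.
      f"part-$split%05d-$jobId$ext"
    }
  }

  def main(args: Array[String]): Unit = {
    // Two attempts of the same task get distinct names.
    println(fileName(3, 0, "job1", ".parquet", dynamicPartitionOverwrite = true))
    println(fileName(3, 1, "job1", ".parquet", dynamicPartitionOverwrite = true))
    // %05d pads to five digits but does not truncate larger values.
    println(fileName(123456, 0, "job1", ".parquet", dynamicPartitionOverwrite = false))
  }
}
```

Under these assumptions, attempts 0 and 1 of split 3 yield `part-00003-00000-job1.parquet` and `part-00003-00001-job1.parquet`, while the non-dynamic path still yields names of the form `part-00003-job1.parquet`.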