turboFei commented on issue #26339: [SPARK-27194][SPARK-29302][SQL] Fix the 
issue that for dynamic partition overwrite a task would conflict with its 
speculative task
URL: https://github.com/apache/spark/pull/26339#issuecomment-610700346
 
 
   
https://github.com/apache/spark/blob/8ab2a0c5f23a59c00a9b4191afd976af50d913ba/core/src/main/scala/org/apache/spark/internal/io/HadoopMapReduceCommitProtocol.scala#L104
   
   
https://github.com/apache/spark/blob/8ab2a0c5f23a59c00a9b4191afd976af50d913ba/core/src/main/scala/org/apache/spark/internal/io/HadoopMapReduceCommitProtocol.scala#L127
   
   In fact, there are three cases.
   - dynamic partition overwrite
   - withAbsOutputPath
   - non-dynamic partition overwrite
   
   
   As mentioned above, for non-dynamic partition overwrite, each task attempt 
has a unique working directory.
   
   For the case with an absolute output path (withAbsOutputPath), the task 
output file name is also unique.
   
https://github.com/apache/spark/blob/8ab2a0c5f23a59c00a9b4191afd976af50d913ba/core/src/main/scala/org/apache/spark/internal/io/HadoopMapReduceCommitProtocol.scala#L134
   
   So, this is an issue only for dynamic partition overwrite.
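
To make the three cases concrete, here is a minimal sketch of how the temp file path is formed in each one. It is a simplified model using plain strings, not the actual `HadoopMapReduceCommitProtocol` code: `Attempt`, `TempPathSketch`, and all of the method names are hypothetical, and the directory layouts (the `.spark-staging-` prefix, the `_temporary/0/_temporary/attempt_...` work dir) are illustrative assumptions rather than exact reproductions.

```scala
import java.util.UUID

// Hypothetical, simplified model of a task attempt; not a Spark or Hadoop class.
final case class Attempt(taskId: Int, attemptNumber: Int)

object TempPathSketch {
  // The file name depends only on the task (split) id and the job id,
  // so every attempt of the same task gets the same name.
  def fileName(a: Attempt, jobId: String, ext: String): String =
    f"part-${a.taskId}%05d-$jobId$ext"

  // Case 1: dynamic partition overwrite.
  // All attempts write under one shared staging dir, so the original and
  // the speculative attempt resolve to the same path.
  def dynamicOverwritePath(stagingDir: String, partition: String,
                           a: Attempt, jobId: String, ext: String): String =
    s"$stagingDir/$partition/${fileName(a, jobId, ext)}"

  // Case 2: absolute output path.
  // A random UUID prefix keeps the temp file name unique per call.
  def absOutputTempPath(stagingDir: String, a: Attempt,
                        jobId: String, ext: String): String =
    s"$stagingDir/${UUID.randomUUID()}-${fileName(a, jobId, ext)}"

  // Case 3: non-dynamic partition overwrite.
  // The committer's work dir is assumed to embed the attempt number,
  // so each attempt writes into its own directory.
  def workDirPath(outputPath: String, partition: String,
                  a: Attempt, jobId: String, ext: String): String = {
    val workDir =
      s"$outputPath/_temporary/0/_temporary/attempt_${a.taskId}_${a.attemptNumber}"
    s"$workDir/$partition/${fileName(a, jobId, ext)}"
  }
}
```

Under these assumptions, `Attempt(0, 0)` and a speculative `Attempt(0, 1)` map to the same file only in `dynamicOverwritePath`; the other two helpers give each attempt a distinct path.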
   
   @Ngone51  @venkata91 

