viirya commented on issue #26086: [SPARK-29302] Make the file name of a task 
for dynamic partition overwrite be unique
URL: https://github.com/apache/spark/pull/26086#issuecomment-547709165
 
 
   
   1. addedAbsPathFiles is safe. It records unique output paths (each 
including a UUID), so it is already free from duplicate content and from 
file-already-exists issues, even without this fix.
   
   2. Dynamic partition overwrite can use partitionPaths. allPartitionPaths 
records the partition paths (like "a=1/b=2"), not the written filenames (the 
output of getFilename, which you changed here). So even if you make 
getFilename produce a unique filename, all files written under those 
partition paths are still moved to the final path, and the duplicate result 
problem remains.
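
A toy model of point 2 may make this concrete. It is only a sketch and not Spark's code: `write_task_output` and `commit_job` are hypothetical names, and the staging area is an in-memory dict. The one assumption taken from the comment above is that the job commit moves everything under each recorded partition path and never looks at individual filenames.

```python
from collections import defaultdict

# Hypothetical in-memory staging area: partition path -> staged file names.
staging = defaultdict(list)
# Stand-in for allPartitionPaths: only partition paths are recorded.
partition_paths = set()


def write_task_output(partition, file_name):
    """Stage one file and record its partition path (not its file name)."""
    staging[partition].append(file_name)
    partition_paths.add(partition)


def commit_job(staging, partition_paths):
    """Move every staged file under each recorded partition path."""
    final = {}
    for partition in sorted(partition_paths):
        # The whole partition directory is moved; file names are not consulted.
        final[partition] = sorted(staging.get(partition, []))
    return final


# Two attempts of the same task write the same rows under unique file names.
write_task_output("a=1/b=2", "part-00000-attempt0.parquet")
write_task_output("a=1/b=2", "part-00000-attempt1.parquet")

final = commit_job(staging, partition_paths)
print(final)
```

In this model both attempt files end up under "a=1/b=2" in the final output, so unique filenames alone do not remove the duplicated rows.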
   
   I hope this clarifies the issue this time.
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
