viirya commented on issue #26086: [SPARK-29302] Make the file name of a task for dynamic partition overwrite be unique
URL: https://github.com/apache/spark/pull/26086#issuecomment-547709165

1. addedAbsPathFiles is safe. It records unique output paths (each including a UUID), so it is already free from the duplicate-content and file-already-exists issues, even without this fix.

2. Dynamic partition overwrite can use partitionPaths. It records the partition paths (like "a=1/b=2") in allPartitionPaths, not the written filenames (which come from the getFilename you changed here). So even if you make getFilename produce a unique filename, all files written under those partition paths will still be moved to the final path, and the duplicate result problem remains.

I hope this clarifies things this time.
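
To illustrate point 2, here is a minimal sketch in Python (not Spark's actual Scala code). It assumes that job commit replaces each recorded partition directory wholesale with its staged counterpart. The helper names write_task_output, commit_job, and run_demo, the file names, and the scenario of two attempts writing the same rows are all hypothetical, chosen only to show why unique filenames alone would not prevent duplicates.

```python
import os
import shutil
import tempfile


def write_task_output(staging_dir, partition_path, filename, rows):
    """Simulate one task attempt writing a file under the staging directory."""
    part_dir = os.path.join(staging_dir, partition_path)
    os.makedirs(part_dir, exist_ok=True)
    with open(os.path.join(part_dir, filename), "w") as f:
        f.write("\n".join(rows))
    # Assumption: only the partition path is recorded, not the filename.
    return partition_path


def commit_job(staging_dir, final_dir, partition_paths):
    """Replace each recorded partition directory with its staged copy."""
    for part in sorted(set(partition_paths)):
        final_part = os.path.join(final_dir, part)
        # Drop the old partition, then move the whole staged directory in.
        shutil.rmtree(final_part, ignore_errors=True)
        os.makedirs(os.path.dirname(final_part), exist_ok=True)
        shutil.move(os.path.join(staging_dir, part), final_part)


def run_demo():
    root = tempfile.mkdtemp()
    staging = os.path.join(root, ".staging-demo")
    final = os.path.join(root, "table")
    os.makedirs(final)
    rows = ["row-1", "row-2"]
    # Hypothetical scenario: two attempts write identical rows,
    # each under its own unique filename.
    recorded = [
        write_task_output(staging, "a=1/b=2", "part-00000-attempt0.txt", rows),
        write_task_output(staging, "a=1/b=2", "part-00000-attempt1.txt", rows),
    ]
    commit_job(staging, final, recorded)
    files = sorted(os.listdir(os.path.join(final, "a=1", "b=2")))
    shutil.rmtree(root)
    return files


if __name__ == "__main__":
    print(run_demo())
```

In this sketch both files end up in the final a=1/b=2 directory, because the commit step moves the directory as a whole and never looks at individual filenames.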
