cxzl25 commented on a change in pull request #30036:
URL: https://github.com/apache/spark/pull/30036#discussion_r583521775
##########
File path: core/src/main/scala/org/apache/spark/deploy/DependencyUtils.scala
##########
@@ -169,12 +169,13 @@ private[deploy] object DependencyUtils extends Logging {
}
/**
- * Merge a sequence of comma-separated file lists, some of which may be null to indicate
- * no files, into a single comma-separated string.
+ * Merge and de-duplicate a sequence of comma-separated file lists,
+ * some of which may be null to indicate no files,
+ * into a single comma-separated string.
*/
def mergeFileLists(lists: String*): String = {
val merged = lists.filterNot(StringUtils.isBlank)
- .flatMap(Utils.stringToSeq)
+ .flatMap(Utils.stringToSeq).distinct
Review comment:
The added `.distinct` avoids uploading the same file more than once. See:
https://github.com/apache/spark/blob/5c7d019b609c87a9427fa9309f3aa03d02f61878/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala#L455-L460
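For illustration, here is a minimal standalone sketch of the merge-and-de-duplicate behavior in the diff above. It does not use Spark itself: `stringToSeq` and `isBlank` are hypothetical stand-ins for `Utils.stringToSeq` and `StringUtils.isBlank`, and returning null for an empty result is an assumption, since the tail of the method is not shown in the hunk.

```scala
object MergeFileListsSketch {
  // Hypothetical stand-in for Utils.stringToSeq:
  // split on commas, trim each entry, drop empty entries.
  def stringToSeq(s: String): Seq[String] =
    s.split(",").map(_.trim).filter(_.nonEmpty).toSeq

  // Hypothetical stand-in for StringUtils.isBlank:
  // true for null, empty, or whitespace-only strings.
  def isBlank(s: String): Boolean = s == null || s.trim.isEmpty

  // Merge the lists and keep only the first occurrence of each entry.
  // Assumption: an empty result is returned as null.
  def mergeFileLists(lists: String*): String = {
    val merged = lists.filterNot(isBlank).flatMap(stringToSeq).distinct
    if (merged.nonEmpty) merged.mkString(",") else null
  }
}
```

Note that `distinct` compares the strings exactly, so two different spellings of the same path (for example `a.jar` and `./a.jar`) would still both be kept.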