Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/4881#discussion_r25748314
  
    --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
    @@ -632,8 +633,9 @@ private[spark] object Utils extends Logging {
           fs: FileSystem,
           conf: SparkConf,
           hadoopConf: Configuration,
    -      fileOverwrite: Boolean): Unit = {
    -    if (!targetDir.mkdir()) {
    +      fileOverwrite: Boolean,
    +      filename: Option[String] = None): Unit = {
    +    if (!targetDir.exists() && !targetDir.mkdir()) {
           throw new IOException(s"Failed to create directory 
${targetDir.getPath}")
         }
         fs.listStatus(path).foreach { fileStatus =>
    --- End diff --
    
    To revive the discussion of the previows PR...
    
    Both the problems @andrewor14 and I mentioned exist because of this line. 
`listStatus` behaves differently for files and directories, so when `path` is a 
directory here you'll get both unwanted behaviors:
    
    - children will be copied to `targetDir` instead of 
`$targetDir/${filename.getOrElse(path.getName())}`
    - the code will try to copy all children to the target directory with the 
same `filename` in the first call from `doFetchFile`
    
    So this code needs to behave differently depending on whether `path` is a 
directory or not. The code Andrew posted 
[here](https://github.com/apache/spark/pull/4880/files#r25740726) is pretty 
close to what needs to be done.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to