Github user sryza commented on a diff in the pull request:

    https://github.com/apache/spark/pull/4142#discussion_r26064555
  
    --- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala ---
    @@ -263,6 +269,45 @@ private[spark] class Client(
           }
         }
     
    +    // Distribute the Hadoop config files. These are only really used by the AM, since executors
    +    // will use the configuration object broadcast by the driver. But this is the easiest way to
    +    // make sure the files are available for the AM. The files are placed in a subdirectory so
    +    // that they do not clash with other user files. This directory is then added to the classpath
    +    // of all processes (both the AM and all the executors), just to make sure that everybody is
    +    // using the same default config.
    +    //
    +    // This follows the order of precedence set by the startup scripts, in which HADOOP_CONF_DIR
    +    // shows up in the classpath before YARN_CONF_DIR.
    +    //
    +    // Currently this makes a shallow copy of the conf directory. If there are cases where a
    +    // Hadoop config directory contains subdirectories, this code will have to be fixed.
    +    val hadoopConfFiles = new HashMap[String, File]()
    +    Seq("HADOOP_CONF_DIR", "YARN_CONF_DIR").foreach { envKey =>
    +      sys.env.get(envKey).foreach { path =>
    +        val dir = new File(path)
    +        if (dir.isDirectory()) {
    +          dir.listFiles().foreach { file =>
    +            if (!hadoopConfFiles.contains(file.getName())) {
    +              hadoopConfFiles(file.getName()) = file
    +            }
    +          }
    +        }
    +      }
    +    }
    +
    +    val hadoopConfPath = new Path(dst, HADOOP_CONF_DIR)
    +    fs.mkdirs(hadoopConfPath)
    +
    +    hadoopConfFiles.foreach { case (name, file) =>
    +      addDistributedUri(file.toURI())
    +      val destPath = copyFileToRemote(hadoopConfPath, new Path(file.toURI()), replication)
    +      distCacheMgr.addResource(fs, hadoopConf, destPath,
    --- End diff --
    
    Arguments should each have their own line.
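    
    For illustration, a minimal sketch of how that call could be split with one argument per line. The arguments after `destPath` are cut off in the diff above, so the trailing parameters shown here (the local resources map, resource type, link name, and stat cache) are placeholders rather than the actual call:
    
        distCacheMgr.addResource(
          fs,
          hadoopConf,
          destPath,
          localResources,          // assumed: resource map passed to the container launch context
          LocalResourceType.FILE,  // assumed: plain config files, not archives
          name,                    // assumed: link name under the Hadoop conf subdirectory
          statCache)               // assumed: cache of FileStatus lookups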

