Github user zjffdu commented on the issue:
https://github.com/apache/spark/pull/15669
@jerryshao spark.files is always passed to the driver, so SparkContext.addFile is called even in yarn-cluster mode.
https://github.com/apache/spark/blob/7bf8a4049866b2ec7fdf0406b1ad0c3a12488645/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala#L609
```
// Load any properties specified through --conf and the default properties file
for ((k, v) <- args.sparkProperties) {
  sysProps.getOrElseUpdate(k, v)
}
```
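To make that concrete, here is a minimal, self-contained sketch of what ends up happening on the driver: if `spark.files` is present in the conf, SparkContext registers every entry through `addFile` during startup. The `local[*]` master and the temp file are stand-ins just for illustration, not the yarn-cluster setup itself:
```
import java.nio.file.Files
import org.apache.spark.{SparkConf, SparkContext, SparkFiles}

object SparkFilesSketch {
  def main(args: Array[String]): Unit = {
    // Throwaway file standing in for something the user passed via --files.
    val tmp = Files.createTempFile("spark-files-sketch", ".txt")

    // Setting spark.files here stands in for the value SparkSubmit forwards
    // into the driver's system properties.
    val conf = new SparkConf()
      .setMaster("local[*]")              // stand-in for yarn-cluster, illustration only
      .setAppName("spark-files-sketch")
      .set("spark.files", tmp.toString)

    // During startup SparkContext reads spark.files and calls addFile on each
    // entry, which is the behaviour discussed above.
    val sc = new SparkContext(conf)
    println(SparkFiles.get(tmp.getFileName.toString))
    sc.stop()
  }
}
```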
It seems the issue is that spark.files doesn't need to be passed to the driver in yarn-cluster mode. In that case it could be fixed in SparkSubmit.scala (see the rough sketch at the end of this comment). Another thing I noticed is some suspicious code in SparkContext.addJar. Is the following code still needed?
https://github.com/apache/spark/blob/39e2bad6a866d27c3ca594d15e574a1da3ee84cc/core/src/main/scala/org/apache/spark/SparkContext.scala#L1710
```
if (master == "yarn" && deployMode == "cluster") {
  // In order for this to work in yarn cluster mode the user must specify the
  // --addJars option to the client to upload the file into the distributed cache
  // of the AM to make it show up in the current working directory.
  val fileName = new Path(uri.getPath).getName()
  try {
    env.rpcEnv.fileServer.addJar(new File(fileName))
  } catch {
    case e: Exception =>
      // For now just log an error but allow to go through so spark examples work.
      // The spark examples don't really need the jar distributed since its also
      // the app jar.
      logError("Error adding jar (" + e + "), was the --addJars option used?")
      null
  }
} else {
```
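On the first point, a very rough, untested sketch of what the SparkSubmit.scala change could look like; names like `sparkProperties`, `sysProps` and `isYarnCluster` stand in for the corresponding values inside SparkSubmit, and this is just to illustrate the idea, not an actual patch:
```
import scala.collection.mutable

// Untested sketch only: skip spark.files when merging properties for the driver
// in yarn-cluster mode, so SparkContext.addFile is not triggered again there.
def mergeDriverProps(
    sparkProperties: Map[String, String],
    sysProps: mutable.Map[String, String],
    isYarnCluster: Boolean): Unit = {
  for ((k, v) <- sparkProperties) {
    if (!(isYarnCluster && k == "spark.files")) {
      sysProps.getOrElseUpdate(k, v)
    }
  }
}
```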