Github user tgravescs commented on the pull request:
https://github.com/apache/spark/pull/12678#issuecomment-215232131
No, those are internal Spark configs. I mean either use the
--jars/--files/--archives options to spark-submit, or use the corresponding
config options: spark.yarn.dist.archives, spark.yarn.dist.files, and
spark.jars. See http://spark.apache.org/docs/latest/running-on-yarn.html for
further descriptions of the configs, or run spark-submit --help.
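For example, a sketch of both approaches (the application jar and file names here are placeholders, not from this thread):

```shell
# Ship a local jar, a plain file, and an archive with the job
# using the spark-submit flags:
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --jars extra-lib.jar \
  --files app.properties \
  --archives deps.zip \
  my-app.jar

# The same distribution expressed via the corresponding configs:
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --conf spark.jars=extra-lib.jar \
  --conf spark.yarn.dist.files=app.properties \
  --conf spark.yarn.dist.archives=deps.zip \
  my-app.jar
```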
On YARN, that causes whatever files you specify to be downloaded to each of
the driver/AM/executor nodes and placed in ./. Since ./ is included in the
classpath, if it's a plain file or jar you don't have to do anything else. If
it's an archive, it is extracted, and if the file you want on the classpath is
under a subdirectory, you need to modify the extraClassPath configs. This
properly handles things in hdfs:// or file://. If you specify something as
file://, Spark looks for it locally on your launcher box, uploads it to the
HDFS staging directory, and it then gets downloaded onto the nodes. If it's
already in HDFS, YARN simply downloads it to the executor before launching.
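To illustrate what that localization means for application code, here is a minimal simulation (not Spark itself; the directory and file names are made up): a file shipped with --files lands in the container's working directory, so the application can open it by its bare relative name.

```python
import os
import tempfile

# Simulate YARN localizing a --files entry into the container's
# working directory ("./"), then reading it by relative path.
container_dir = tempfile.mkdtemp()
with open(os.path.join(container_dir, "app.properties"), "w") as f:
    f.write("spark.option=value\n")

os.chdir(container_dir)  # executors start with cwd = the container dir
with open("./app.properties") as f:  # no absolute path needed
    content = f.read()
print(content.strip())
```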
Note the important point at the bottom of that page:
The --files and --archives options support specifying file names with the #,
similar to Hadoop. For example, you can specify --files
localtest.txt#appSees.txt: this uploads the file you have locally named
localtest.txt into HDFS, but it will be linked to by the name appSees.txt,
so your application should use the name appSees.txt to reference it when
running on YARN.
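A sketch of that rename syntax (my-app.jar is a placeholder; the file names are the ones from the quoted example):

```shell
# localtest.txt is uploaded to the HDFS staging directory but
# localized under the alias appSees.txt in each container's
# working directory, so the app opens "appSees.txt".
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --files localtest.txt#appSees.txt \
  my-app.jar
```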