GitHub user tgravescs commented on the pull request:

    https://github.com/apache/spark/pull/969#issuecomment-46039256
  
    I did actually think of another issue with this, though: if I specify 
spark.yarn.dist.* in the config, in yarn-client mode it's going to default to 
hdfs://, while in yarn-cluster mode it's going to default to file://. So we 
will have to fix that while leaving the env variable defaulting to hdfs://.
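
    The scheme-defaulting behavior described above could be sketched roughly 
like this (the helper name and defaulting rule here are illustrative, not 
Spark's actual internals):

```java
import java.net.URI;

public class SchemeDefault {
    // Hypothetical helper: if a user-supplied path carries no URI scheme,
    // fall back to a mode-dependent default (e.g. "hdfs" in yarn-client
    // mode, "file" in yarn-cluster mode, per the issue described above).
    static String withDefaultScheme(String path, String defaultScheme) {
        URI uri = URI.create(path);
        if (uri.getScheme() != null) {
            return path;                      // explicit scheme, keep as-is
        }
        return defaultScheme + "://" + path;  // e.g. "hdfs" + "://" + "/user/x.jar"
    }

    public static void main(String[] args) {
        // A bare path picks up the mode's default scheme...
        System.out.println(withDefaultScheme("/user/me/lib.jar", "hdfs"));
        // ...while an explicit scheme is left untouched.
        System.out.println(withDefaultScheme("file:///tmp/lib.jar", "hdfs"));
    }
}
```

    The fix would be to apply one consistent default for the config-supplied 
paths regardless of deploy mode, rather than two different ones.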
    
    I think we can simplify the logic for this in the OptionAssigner table in 
SparkSubmit to just use the config for both cluster and client mode. Change 
YarnClientSchedulerBackend to only specify the --archives flag if the env 
variable is set, to keep backwards compatibility. Everything else falls 
through to the config check in ClientArguments.
    
    It would also be nice if we could add a unit test for this.

