[jira] [Commented] (SPARK-21888) Cannot add stuff to Client Classpath for Yarn Cluster Mode
[ https://issues.apache.org/jira/browse/SPARK-21888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16239930#comment-16239930 ]

Apache Spark commented on SPARK-21888:
--------------------------------------

User 'yaooqinn' has created a pull request for this issue:
https://github.com/apache/spark/pull/19663

> Cannot add stuff to Client Classpath for Yarn Cluster Mode
> ----------------------------------------------------------
>
>                 Key: SPARK-21888
>                 URL: https://issues.apache.org/jira/browse/SPARK-21888
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 2.2.0
>            Reporter: Parth Gandhi
>            Priority: Minor
>
> When running Spark on YARN in cluster mode, there is currently no way to add
> config files to the client classpath. For example, suppose you want to run an
> application that uses HBase. Unless we copy the config files required by
> HBase into the Spark config folder, we cannot specify their locations on the
> client-side classpath, which we could do earlier by setting the environment
> variable "SPARK_CLASSPATH".

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-21888) Cannot add stuff to Client Classpath for Yarn Cluster Mode
[ https://issues.apache.org/jira/browse/SPARK-21888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16154006#comment-16154006 ]

Thomas Graves commented on SPARK-21888:
---------------------------------------

The client needs to get the HBase credentials for secure HBase to send along with the job run in cluster mode. If it doesn't load the jars and hbase-site.xml, it won't get the credentials to send along, and the driver/executors won't be able to talk to HBase.
[jira] [Commented] (SPARK-21888) Cannot add stuff to Client Classpath for Yarn Cluster Mode
[ https://issues.apache.org/jira/browse/SPARK-21888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16153888#comment-16153888 ]

Marco Gaido commented on SPARK-21888:
-------------------------------------

[~tgraves] Sorry, I misread. Of course, this doesn't add it to the client, only to the driver and the executors. But in the example you gave, i.e. writing to HBase, I can't see why you would need it: it is enough to load the conf in the driver and the executors.
[jira] [Commented] (SPARK-21888) Cannot add stuff to Client Classpath for Yarn Cluster Mode
[ https://issues.apache.org/jira/browse/SPARK-21888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16153744#comment-16153744 ]

Thomas Graves commented on SPARK-21888:
---------------------------------------

Also note that you can do this in client mode by using the driver extra classpath option.
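The client-mode option mentioned in the comment above can be sketched as follows. This is only an illustration, assuming the "driver extra classpath option" refers to spark-submit's `--driver-class-path` flag (equivalently the `spark.driver.extraClassPath` setting). The function name, the directory, the jar and the class name are all hypothetical placeholders, and the function prints the command instead of running it.

```shell
# Sketch only: print a client-mode submission that puts an HBase conf
# directory on the driver's classpath. In client mode the driver runs in the
# JVM started by spark-submit, so this also covers the submitting side.
# client_mode_submit_cmd is a hypothetical helper, not part of Spark.
client_mode_submit_cmd() {
  hbase_conf_dir="$1"   # directory assumed to contain hbase-site.xml
  app_jar="$2"          # application jar (placeholder)
  main_class="$3"       # application main class (placeholder)
  printf '%s\n' "spark-submit --master yarn --deploy-mode client --driver-class-path $hbase_conf_dir --class $main_class $app_jar"
}

# Example call (placeholder values):
#   client_mode_submit_cmd /etc/hbase/conf my-app.jar com.example.MyApp
```

As the rest of the thread points out, this does not help in cluster mode, where the driver runs on the cluster and the submitting client is a separate JVM.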
[jira] [Commented] (SPARK-21888) Cannot add stuff to Client Classpath for Yarn Cluster Mode
[ https://issues.apache.org/jira/browse/SPARK-21888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16153739#comment-16153739 ]

Thomas Graves commented on SPARK-21888:
---------------------------------------

[~mgaido] I don't think that is true unless something went into master that I'm not aware of. It doesn't work with 2.2 for sure. We need the file/directory to get into the classpath of the client submitting the application. --files does work to get the driver and executors to load it. If I'm missing something, please let me know.
[jira] [Commented] (SPARK-21888) Cannot add stuff to Client Classpath for Yarn Cluster Mode
[ https://issues.apache.org/jira/browse/SPARK-21888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16151128#comment-16151128 ]

Marco Gaido commented on SPARK-21888:
-------------------------------------

It is enough to add {{hbase-site.xml}} using {{--files}} in cluster mode to have it interpreted. The problem is in client mode: in this case it should be added to the Spark conf dir to be added to the classpath.

> Cannot add stuff to Client Classpath for Yarn Cluster Mode
> ----------------------------------------------------------
>
>                 Key: SPARK-21888
>                 URL: https://issues.apache.org/jira/browse/SPARK-21888
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 2.2.0
>            Reporter: Parth Gandhi
>            Priority: Minor
>
> While running Spark on Yarn in cluster mode, currently there is no way to add
> any config files, jars etc. to Client classpath. An example for this is that
> suppose you want to run an application that uses hbase. Then, unless and
> until we do not copy the necessary config files required by hbase to Spark
> Config folder, we cannot specify or set their exact locations in classpath on
> Client end which we could do so earlier by setting the environment variable
> "SPARK_CLASSPATH".
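For reference, the {{--files}} approach from the comment above might look like the sketch below. The helper name and all argument values are hypothetical, and the function prints the command instead of running it. As later replies in this thread note, {{--files}} makes the file available to the driver and executors only, not to the submitting client.

```shell
# Sketch only: print a cluster-mode submission that ships hbase-site.xml to
# the YARN containers with --files. cluster_mode_submit_cmd is a hypothetical
# helper, not part of Spark.
cluster_mode_submit_cmd() {
  hbase_site="$1"   # local path to hbase-site.xml (placeholder)
  app_jar="$2"      # application jar (placeholder)
  main_class="$3"   # application main class (placeholder)
  printf '%s\n' "spark-submit --master yarn --deploy-mode cluster --files $hbase_site --class $main_class $app_jar"
}

# Example call (placeholder values):
#   cluster_mode_submit_cmd /etc/hbase/conf/hbase-site.xml my-app.jar com.example.MyApp
```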
[jira] [Commented] (SPARK-21888) Cannot add stuff to Client Classpath for Yarn Cluster Mode
[ https://issues.apache.org/jira/browse/SPARK-21888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16150530#comment-16150530 ]

Thomas Graves commented on SPARK-21888:
---------------------------------------

Putting things into SPARK_CONF_DIR will work; the question is more about convenience for users. In hosted/multitenant environments there is probably a generic SPARK_CONF_DIR shared by everyone (at least this is how our env works), so for a user to add hbase-site.xml they would have to copy that directory, add their files, and then export SPARK_CONF_DIR. If that user continues to use the copied version, they might miss changes to the cluster version, etc. Previously they didn't have to do this; they just had to set SPARK_CLASSPATH. Of course, even that doesn't always work if your cluster env (spark-env.sh) has SPARK_CLASSPATH set in it.

So the question is more of what we think about this for convenience for users. Personally, I think it would be nice to have a config that would allow users to set an extra classpath on the client side without having to modify SPARK_CONF_DIR.

I think we can move this to an improvement jira; if other people here don't agree or see the usefulness, then we can just close.

> Cannot add stuff to Client Classpath for Yarn Cluster Mode
> ----------------------------------------------------------
>
>                 Key: SPARK-21888
>                 URL: https://issues.apache.org/jira/browse/SPARK-21888
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.2.0
>            Reporter: Parth Gandhi
>            Priority: Minor
>
> While running Spark on Yarn in cluster mode, currently there is no way to add
> any config files, jars etc. to Client classpath. An example for this is that
> suppose you want to run an application that uses hbase. Then, unless and
> until we do not copy the necessary config files required by hbase to Spark
> Config folder, we cannot specify or set their exact locations in classpath on
> Client end which we could do so earlier by setting the environment variable
> "SPARK_CLASSPATH".
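The copy-and-export workaround described in the comment above can be sketched roughly as below. This is a minimal illustration, not a recommended procedure: the function name and all paths are hypothetical, and it assumes the shared conf directory is readable by the user. It has exactly the drawback raised in the comment: the private copy does not pick up later changes to the shared directory.

```shell
# Sketch only: build a private conf directory that contains everything from
# the shared SPARK_CONF_DIR plus a user-supplied hbase-site.xml.
# make_private_conf_dir is a hypothetical helper, not part of Spark.
make_private_conf_dir() {
  shared_conf="$1"    # the cluster-wide conf directory (placeholder)
  hbase_site="$2"     # the hbase-site.xml to add (placeholder)
  private_conf="$3"   # where to create the private copy (placeholder)
  mkdir -p "$private_conf" &&
    cp -R "$shared_conf"/. "$private_conf"/ &&
    cp "$hbase_site" "$private_conf"/hbase-site.xml &&
    printf '%s\n' "$private_conf"
}

# Example use (placeholder paths): point SPARK_CONF_DIR at the copy before
# running spark-submit.
#   SPARK_CONF_DIR=$(make_private_conf_dir /etc/spark/conf /etc/hbase/conf/hbase-site.xml "$HOME/spark-conf")
#   export SPARK_CONF_DIR
```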
[jira] [Commented] (SPARK-21888) Cannot add stuff to Client Classpath for Yarn Cluster Mode
[ https://issues.apache.org/jira/browse/SPARK-21888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16150258#comment-16150258 ]

Saisai Shao commented on SPARK-21888:
-------------------------------------

Jars added by "--jars" will be added to the client classpath in yarn-cluster mode. In your case the only problem is hbase-site.xml. Normally we put this file in SPARK_CONF_DIR, as with hive-site.xml; doesn't that work for you?
[jira] [Commented] (SPARK-21888) Cannot add stuff to Client Classpath for Yarn Cluster Mode
[ https://issues.apache.org/jira/browse/SPARK-21888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16149350#comment-16149350 ]

Parth Gandhi commented on SPARK-21888:
--------------------------------------

The Spark job runs successfully only if hbase-site.xml is placed in SPARK_CONF_DIR. If I add the xml file to --jars, it gets added to the driver classpath, which is required, but HBase fails to get a valid Kerberos token because the xml file is not found in the system classpath on the gateway where I launch the application.
[jira] [Commented] (SPARK-21888) Cannot add stuff to Client Classpath for Yarn Cluster Mode
[ https://issues.apache.org/jira/browse/SPARK-21888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16149315#comment-16149315 ]

Sean Owen commented on SPARK-21888:
-----------------------------------

For your case specifically, shouldn't hbase-site.xml be available from the cluster environment, given its name? What happens if you add the file to --jars anyway; I'd not be surprised if it just ends up on the classpath too. Or, build it into your app jar?
[jira] [Commented] (SPARK-21888) Cannot add stuff to Client Classpath for Yarn Cluster Mode
[ https://issues.apache.org/jira/browse/SPARK-21888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16149263#comment-16149263 ]

Parth Gandhi commented on SPARK-21888:
--------------------------------------

Sorry, I forgot to mention that --jars certainly adds jar files to the client classpath, but not config files like hbase-site.xml.
[jira] [Commented] (SPARK-21888) Cannot add stuff to Client Classpath for Yarn Cluster Mode
[ https://issues.apache.org/jira/browse/SPARK-21888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16149217#comment-16149217 ]

Sean Owen commented on SPARK-21888:
-----------------------------------

You haven't said what you tried. --jars does this.