Re: SPARK_YARN_APP_JAR, SPARK_CLASSPATH and ADD_JARS in a spark-shell on YARN
Good to know, thanks for pointing this out to me!

On 23/04/2014 19:55, Sandy Ryza wrote:
> Ah, you're right about SPARK_CLASSPATH and ADD_JARS. My bad. SPARK_YARN_APP_JAR is going away entirely - https://issues.apache.org/jira/browse/SPARK-1053
>
> On Wed, Apr 23, 2014 at 8:07 AM, Christophe Préaud <christophe.pre...@kelkoo.com> wrote:
>> Hi Sandy,
>>
>> Thanks for your reply!
>>
>> I thought adding the jars to both SPARK_CLASSPATH and ADD_JARS was only required as a temporary workaround in Spark 0.9.0 (see https://issues.apache.org/jira/browse/SPARK-1089), and that it was no longer necessary in 0.9.1.
>>
>> As for SPARK_YARN_APP_JAR, is it really useful, or is it planned to be removed in future versions of Spark? I personally always set it to /dev/null when launching a spark-shell in yarn-client mode.
>>
>> Thanks again for your time!
>> Christophe.
>>
>> On 21/04/2014 19:16, Sandy Ryza wrote:
>>> Hi Christophe,
>>>
>>> Adding the jars to both SPARK_CLASSPATH and ADD_JARS is required. The former makes them available to the spark-shell driver process, and the latter tells Spark to make them available to the executor processes running on the cluster.
>>>
>>> -Sandy
>>>
>>> On Wed, Apr 16, 2014 at 9:27 AM, Christophe Préaud <christophe.pre...@kelkoo.com> wrote:
>>>> Hi,
>>>>
>>>> I am running Spark 0.9.1 on a YARN cluster, and I am wondering what the correct way is to add external jars when running a spark-shell on a YARN cluster.
>>>>
>>>> Packaging all these dependencies in an assembly whose path is then set in SPARK_YARN_APP_JAR (as described in the doc: http://spark.apache.org/docs/latest/running-on-yarn.html) does not work in my case: it pushes the jar to HDFS under .sparkStaging/application_XXX, but the spark-shell is still unable to find it (unless ADD_JARS and/or SPARK_CLASSPATH is defined).
>>>>
>>>> Defining all the dependencies (either in an assembly, or separately) in ADD_JARS or SPARK_CLASSPATH works (even if SPARK_YARN_APP_JAR is set to /dev/null), but defining some dependencies in ADD_JARS and the rest in SPARK_CLASSPATH does not!
>>>>
>>>> Hence I am still wondering what the differences between ADD_JARS and SPARK_CLASSPATH are, and what the purpose of SPARK_YARN_APP_JAR is.
>>>>
>>>> Thanks for any insights!
>>>> Christophe.

Kelkoo SAS
A simplified joint-stock company (Société par Actions Simplifiée) with a capital of €4,168,964.30
Registered office: 8, rue du Sentier 75002 Paris
425 093 069 RCS Paris

This message and its attachments are confidential and intended exclusively for their addressees. If you are not the intended recipient of this message, please delete it and notify the sender.
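For reference, the working setup described in the thread (everything in both ADD_JARS and SPARK_CLASSPATH, SPARK_YARN_APP_JAR set to /dev/null) can be sketched as below. This is a hedged sketch for Spark 0.9.x in yarn-client mode; the jar path is a hypothetical example, not one from the thread:

```shell
# Hypothetical dependency jar; substitute your own assembly or jar list.
DEPS_JAR=/path/to/my-deps-assembly.jar

# Driver side: SPARK_CLASSPATH puts the jar on the spark-shell
# driver process's classpath.
export SPARK_CLASSPATH="$DEPS_JAR"

# Executor side: ADD_JARS tells Spark to make the same jar available
# to the executor processes running on the cluster.
export ADD_JARS="$DEPS_JAR"

# As noted in the thread, SPARK_YARN_APP_JAR can be set to /dev/null;
# it is slated for removal (SPARK-1053).
export SPARK_YARN_APP_JAR=/dev/null

# Launch the shell against YARN in client mode (0.9.x convention).
MASTER=yarn-client ./bin/spark-shell
```

Note that, per Sandy's explanation, the same jars must appear in both variables: each one feeds only one side (driver vs. executors), which is why splitting the dependency list between them fails.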
Re: SPARK_YARN_APP_JAR, SPARK_CLASSPATH and ADD_JARS in a spark-shell on YARN
Hi Sandy, Thanks for your reply ! I thought adding the jars in both SPARK_CLASSPATH and ADD_JARS was only required as a temporary workaround in spark 0.9.0 (see https://issues.apache.org/jira/browse/SPARK-1089), and that it was not necessary anymore in 0.9.1 As for SPARK_YARN_APP_JAR, is it really useful, or is it planned to be removed in future versions of Spark? I personally always set it to /dev/null when launching a spark-shell in yarn-client mode. Thanks again for your time! Christophe. On 21/04/2014 19:16, Sandy Ryza wrote: Hi Christophe, Adding the jars to both SPARK_CLASSPATH and ADD_JARS is required. The former makes them available to the spark-shell driver process, and the latter tells Spark to make them available to the executor processes running on the cluster. -Sandy On Wed, Apr 16, 2014 at 9:27 AM, Christophe Préaud mailto:christophe.pre...@kelkoo.com>> wrote: Hi, I am running Spark 0.9.1 on a YARN cluster, and I am wondering which is the correct way to add external jars when running a spark shell on a YARN cluster. Packaging all this dependencies in an assembly which path is then set in SPARK_YARN_APP_JAR (as written in the doc: http://spark.apache.org/docs/latest/running-on-yarn.html) does not work in my case: it pushes the jar on HDFS in .sparkStaging/application_XXX, but the spark-shell is still unable to find it (unless ADD_JARS and/or SPARK_CLASSPATH is defined) Defining all the dependencies (either in an assembly, or separately) in ADD_JARS or SPARK_CLASSPATH works (even if SPARK_YARN_APP_JAR is set to /dev/null), but defining some dependencies in ADD_JARS and the rest in SPARK_CLASSPATH does not! Hence I'm still wondering which are the differences between ADD_JARS and SPARK_CLASSPATH, and the purpose of SPARK_YARN_APP_JAR. Thanks for any insights! Christophe. 
Kelkoo SAS Société par Actions Simplifiée Au capital de € 4.168.964,30 Siège social : 8, rue du Sentier 75002 Paris 425 093 069 RCS Paris Ce message et les pièces jointes sont confidentiels et établis à l'attention exclusive de leurs destinataires. Si vous n'êtes pas le destinataire de ce message, merci de le détruire et d'en avertir l'expéditeur. Kelkoo SAS Société par Actions Simplifiée Au capital de € 4.168.964,30 Siège social : 8, rue du Sentier 75002 Paris 425 093 069 RCS Paris Ce message et les pièces jointes sont confidentiels et établis à l'attention exclusive de leurs destinataires. Si vous n'êtes pas le destinataire de ce message, merci de le détruire et d'en avertir l'expéditeur.