Re: SPARK_YARN_APP_JAR, SPARK_CLASSPATH and ADD_JARS in a spark-shell on YARN

2014-04-24 Thread Christophe Préaud

Good to know, thanks for pointing this out to me!

On 23/04/2014 19:55, Sandy Ryza wrote:
Ah, you're right about SPARK_CLASSPATH and ADD_JARS.  My bad.

SPARK_YARN_APP_JAR is going away entirely - 
https://issues.apache.org/jira/browse/SPARK-1053


On Wed, Apr 23, 2014 at 8:07 AM, Christophe Préaud <christophe.pre...@kelkoo.com> wrote:
Hi Sandy,

Thanks for your reply!

I thought adding the jars to both SPARK_CLASSPATH and ADD_JARS was only 
required as a temporary workaround in Spark 0.9.0 (see 
https://issues.apache.org/jira/browse/SPARK-1089), and that it was no longer 
necessary in 0.9.1.

As for SPARK_YARN_APP_JAR, is it really useful, or is it planned to be removed 
in future versions of Spark? I personally always set it to /dev/null when 
launching a spark-shell in yarn-client mode.

Thanks again for your time!
Christophe.


On 21/04/2014 19:16, Sandy Ryza wrote:
Hi Christophe,

Adding the jars to both SPARK_CLASSPATH and ADD_JARS is required.  The former 
makes them available to the spark-shell driver process, and the latter tells 
Spark to make them available to the executor processes running on the cluster.
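In other words (just a sketch, assuming a 0.9.1 yarn-client launch and that SPARK_JAR already points at the Spark assembly; the dependency path below is a placeholder), you would list the same jar(s) in both variables before starting the shell:

  # driver side: put the dependency jar on the spark-shell driver's classpath
  export SPARK_CLASSPATH=/path/to/my-deps-assembly.jar
  # executor side: have Spark ship the same jar to the executors on the cluster
  export ADD_JARS=/path/to/my-deps-assembly.jar
  MASTER=yarn-client ./bin/spark-shell

(If I remember correctly, ADD_JARS takes a comma-separated list of jars, whereas SPARK_CLASSPATH uses the usual colon-separated classpath syntax.)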

-Sandy


On Wed, Apr 16, 2014 at 9:27 AM, Christophe Préaud <christophe.pre...@kelkoo.com> wrote:
Hi,

I am running Spark 0.9.1 on a YARN cluster, and I am wondering what the correct
way is to add external jars when running a spark-shell on a YARN cluster.

Packaging all these dependencies in an assembly whose path is then set in
SPARK_YARN_APP_JAR (as described in the doc:
http://spark.apache.org/docs/latest/running-on-yarn.html) does not work in my
case: it pushes the jar to HDFS under .sparkStaging/application_XXX, but the
spark-shell is still unable to find it (unless ADD_JARS and/or SPARK_CLASSPATH
is defined).

Defining all the dependencies (either in an assembly, or separately) in ADD_JARS
or SPARK_CLASSPATH works (even if SPARK_YARN_APP_JAR is set to /dev/null), but
defining some dependencies in ADD_JARS and the rest in SPARK_CLASSPATH does not!
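For reference, the invocation that does work for me looks roughly like this (paths are just placeholders):

  # all external dependencies bundled into a single assembly, listed in ADD_JARS
  export ADD_JARS=/path/to/my-deps-assembly.jar
  # SPARK_YARN_APP_JAR apparently does not need to point at anything real
  SPARK_YARN_APP_JAR=/dev/null MASTER=yarn-client ./bin/spark-shell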

Hence I'm still wondering what the differences are between ADD_JARS and
SPARK_CLASSPATH, and what the purpose of SPARK_YARN_APP_JAR is.

Thanks for any insights!
Christophe.






