Hi Elkhan,

spark-submit depends on several things: the launcher jar (1.3.0+ only), the
spark-core jar, and the spark-yarn jar (in your case). Why do you want to
put it in HDFS, though? AFAIK you can't execute scripts directly from HDFS;
you need to copy them to a local file system first. I don't see a clear
benefit over just running spark-submit from source or from one of the
distributions.
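A rough sketch of what that workflow would look like: pull the Spark distribution out of HDFS to the local filesystem, unpack it, and only then invoke spark-submit locally. All paths, the archive name, and the job file below are assumptions for illustration, not a tested recipe:

```python
import subprocess

def build_commands(hdfs_dist="/apps/spark/spark-dist.tgz",  # hypothetical HDFS path
                   local_dir="/tmp/spark"):                 # hypothetical local dir
    """Return the sequence of commands to fetch Spark from HDFS and run a job."""
    return [
        # 1. Scripts can't execute directly from HDFS: fetch the archive first.
        ["hdfs", "dfs", "-get", hdfs_dist, "/tmp/spark-dist.tgz"],
        # 2. Unpack the distribution on the local filesystem.
        ["tar", "-xzf", "/tmp/spark-dist.tgz", "-C", local_dir],
        # 3. Run spark-submit from the local copy against YARN.
        [local_dir + "/bin/spark-submit",
         "--master", "yarn", "--deploy-mode", "cluster", "my_job.py"],
    ]

if __name__ == "__main__":
    for cmd in build_commands():
        print(" ".join(cmd))
        # subprocess.run(cmd, check=True)  # uncomment on a real cluster
```

The point being: the copy-to-local step is unavoidable, so keeping the distribution in HDFS only buys you a distribution channel, not direct execution.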

-Andrew

2015-06-19 10:12 GMT-07:00 Elkhan Dadashov <elkhan8...@gmail.com>:

> Hi all,
>
> If I want to ship the spark-submit script to HDFS and then call it from the
> HDFS location to start a Spark job, which other files/folders/jars need to
> be transferred into HDFS along with the spark-submit script?
>
> Due to some dependency issues, we can't include Spark in our Java
> application, so instead we will allow limited usage of Spark only with
> Python files.
>
> So if I want to put the spark-submit script into HDFS and call it to
> execute a Spark job on the YARN cluster, what else needs to be put into
> HDFS with it?
>
> (We are using Spark only for executing Spark jobs written in Python.)
>
> Thanks.
>
>