Hi Elkhan, spark-submit depends on several things: the launcher jar (1.3.0+ only), the spark-core jar, and the spark-yarn jar (in your case). Why do you want to put it in HDFS, though? AFAIK you can't execute scripts directly from HDFS; you need to copy them to a local file system first. I don't see a clear benefit over just running spark-submit from a source build or from one of the distributions.
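To make that concrete, a copy-then-run flow might look something like the sketch below. All of the paths, the tarball name, and my_job.py are assumptions for illustration, not something from this thread:

```shell
# Hypothetical layout -- adjust paths to your cluster.
# 1. Pull the Spark distribution (spark-submit plus the jars it needs)
#    out of HDFS onto the local file system, since scripts cannot be
#    executed directly from an HDFS path.
hadoop fs -get hdfs:///apps/spark/spark-1.3.0-bin-hadoop2.4.tgz /tmp/
tar -xzf /tmp/spark-1.3.0-bin-hadoop2.4.tgz -C /opt/

# 2. Run the Python job on YARN using the local copy of spark-submit.
export HADOOP_CONF_DIR=/etc/hadoop/conf
/opt/spark-1.3.0-bin-hadoop2.4/bin/spark-submit \
  --master yarn-cluster \
  my_job.py
```

The point is that only the untar/extract step touches HDFS; the actual execution always happens from a local path.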
-Andrew

2015-06-19 10:12 GMT-07:00 Elkhan Dadashov <elkhan8...@gmail.com>:
> Hi all,
>
> If I want to ship the spark-submit script to HDFS and then call it from the
> HDFS location to start a Spark job, which other files/folders/jars need to be
> transferred into HDFS along with the spark-submit script?
>
> Due to some dependency issues, we cannot include Spark in our Java
> application, so instead we will allow limited usage of Spark only with
> Python files.
>
> So if I want to put the spark-submit script into HDFS and call it to execute
> a Spark job on a YARN cluster, what else needs to be put into HDFS with it?
>
> (We are using Spark only for executing Spark jobs written in Python.)
>
> Thanks.