Re: running spark job with fat jar file

2017-07-17 Thread ayan guha
Hi Mich - YARN uses a specific folder convention comprising the application ID, container ID, attempt number and so on. Once you run spark-submit on YARN, you can see your application on the YARN RM UI page. Once the app finishes, you can see all the logs using yarn logs -applicationId <application ID>. In this log,
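As a rough illustration of the log retrieval step mentioned above, here is a minimal shell sketch. It only prints the command rather than running it, and the application ID in it is a made-up example, not one taken from this thread. The helper names yarn_logs_cmd and is_digits are hypothetical.

```shell
# Sketch: print the `yarn logs` command for a finished YARN application.
# The application ID used at the bottom is a made-up example.

# Succeeds only if the argument is a non-empty string of digits.
is_digits() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;
    *) return 0 ;;
  esac
}

yarn_logs_cmd() {
  ylc_app_id="$1"
  # YARN application IDs look like application_<cluster timestamp>_<sequence>.
  case "$ylc_app_id" in
    application_*_*) ;;
    *) echo "not a YARN application ID: $ylc_app_id" >&2; return 1 ;;
  esac
  ylc_rest="${ylc_app_id#application_}"
  ylc_ts="${ylc_rest%%_*}"
  ylc_seq="${ylc_rest#*_}"
  if ! is_digits "$ylc_ts" || ! is_digits "$ylc_seq"; then
    echo "not a YARN application ID: $ylc_app_id" >&2
    return 1
  fi
  echo "yarn logs -applicationId $ylc_app_id"
}

# Dry run: prints the command instead of executing it.
yarn_logs_cmd "application_1500300000000_0042"
```

On a real cluster you would take the ID from the RM UI page and run the printed command yourself; the aggregated logs are normally only available once the application has finished and log aggregation is enabled.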

Re: running spark job with fat jar file

2017-07-17 Thread Mich Talebzadeh
Great, Ayan. Is that local folder on HDFS? Will it be a hidden folder specific to the user executing the Spark job?

Re: running spark job with fat jar file

2017-07-17 Thread ayan guha
Hi,

Here is my understanding:

1. For each container, a local folder will be created and the application jar will be copied there.
2. Jars mentioned in the --jars switch will be copied to the container and added to the class path of the application.

So to your question, --jars is not required to be

Re: running spark job with fat jar file

2017-07-17 Thread Marcelo Vanzin
Yes.

On Mon, Jul 17, 2017 at 10:47 AM, Mich Talebzadeh wrote:
> thanks Marcelo.
>
> are these files distributed through hdfs?

Re: running spark job with fat jar file

2017-07-17 Thread Mich Talebzadeh
Thanks, Marcelo. Are these files distributed through HDFS?

Re: running spark job with fat jar file

2017-07-17 Thread Marcelo Vanzin
The YARN backend distributes all files and jars you submit with your application.

On Mon, Jul 17, 2017 at 10:45 AM, Mich Talebzadeh wrote:
> thanks guys.
>
> just to clarify let us assume i am doing spark-submit as below:
>
> ${SPARK_HOME}/bin/spark-submit \
>

Re: running spark job with fat jar file

2017-07-17 Thread Mich Talebzadeh
Thanks, guys. Just to clarify, let us assume I am doing spark-submit as below:

${SPARK_HOME}/bin/spark-submit \
    --packages ${PACKAGES} \
    --driver-memory 2G \
    --num-executors 2 \
    --executor-memory 2G \
    --executor-cores

Re: running spark job with fat jar file

2017-07-17 Thread ayan guha
Hi Mich, your jar file can be anywhere in the file system, including HDFS. If using YARN, preferably use cluster mode for deployment. YARN will distribute the jar to each container.

Best
Ayan

On Tue, 18 Jul 2017 at 2:17 am, Marcelo Vanzin wrote:
> Spark
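To make Ayan's two points concrete (the fat jar may sit on the local file system or on HDFS, and cluster mode is suggested on YARN), here is a small dry-run sketch. It only assembles and prints a spark-submit command line. The main class, both jar paths, and the helper name build_submit_cmd are hypothetical placeholders, and the full command from the original post is not reproduced because it is truncated in this archive.

```shell
# Sketch: assemble a spark-submit command for YARN without running it.
# MAIN_CLASS and the jar paths below are hypothetical placeholders.
MAIN_CLASS="com.example.MyApp"

build_submit_cmd() {
  bsc_deploy_mode="$1"   # "client" or "cluster"
  bsc_app_jar="$2"       # local path or hdfs:// URI of the fat jar
  case "$bsc_deploy_mode" in
    client|cluster) ;;
    *) echo "deploy mode must be client or cluster" >&2; return 1 ;;
  esac
  echo "spark-submit --master yarn --deploy-mode $bsc_deploy_mode --class $MAIN_CLASS $bsc_app_jar"
}

# Fat jar on the edge node's local disk, client mode.
build_submit_cmd client /home/user/myapp-assembly.jar

# Fat jar already on HDFS, cluster mode.
build_submit_cmd cluster hdfs:///apps/myapp-assembly.jar
```

In both cases the jar is named once on the command line and YARN ships it to the containers, so it does not need to be copied to each node by hand.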

Re: running spark job with fat jar file

2017-07-17 Thread Marcelo Vanzin
Spark distributes your application jar for you.

On Mon, Jul 17, 2017 at 8:41 AM, Mich Talebzadeh wrote:
> hi guys,
>
> an uber/fat jar file has been created to run with spark in CDH yarn client
> mode.
>
> As usual job is submitted to the edge node.
>
> does the jar

running spark job with fat jar file

2017-07-17 Thread Mich Talebzadeh
Hi guys, an uber/fat jar file has been created to run with Spark in CDH YARN client mode. As usual, the job is submitted to the edge node. Does the jar file have to be placed in the same directory where Spark is running in the cluster to make it work? Also what will happen if say out of 9 nodes