Re: What is best way to run spark job in "yarn-cluster" mode from java program(servlet container) and NOT using spark-submit command.
Sandy Ryza writes:

> Creating a SparkContext and setting master as yarn-cluster unfortunately
> will not work.
>
> SPARK-4924 added APIs for doing this in Spark, but won't be included until
> 1.4.
>
> -Sandy

Did you look into something like [1]? With that you can make a REST API call from your Java code.

Thanks and Regards,
Noorul

[1] https://github.com/spark-jobserver/spark-jobserver
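For the archives: submitting a job to spark-jobserver is just an HTTP POST, so no Spark classes are needed in the web tier. A minimal sketch in plain Java (the host `localhost:8090`, app name `my-app`, and job class `com.example.MyJob` are placeholders; it assumes the application jar was already uploaded to the jobserver under that app name):

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class JobServerClient {
    // Builds the jobserver submission URL; appName/classPath are placeholders.
    static String jobUrl(String host, String appName, String classPath) {
        return "http://" + host + "/jobs?appName=" + appName
                + "&classPath=" + classPath;
    }

    public static void main(String[] args) throws Exception {
        URL url = new URL(jobUrl("localhost:8090", "my-app", "com.example.MyJob"));
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        // Job input goes in the request body (Typesafe Config syntax).
        try (OutputStream out = conn.getOutputStream()) {
            out.write("input.string = \"hello\"".getBytes("UTF-8"));
        }
        // The response is JSON containing the job id, which you can poll
        // later via GET /jobs/<id> to monitor status.
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream()))) {
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println(line);
            }
        }
    }
}
```

Since the jobserver keeps the contexts, nothing Spark-related lives inside your servlet container.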
Re: What is best way to run spark job in "yarn-cluster" mode from java program(servlet container) and NOT using spark-submit command.
Creating a SparkContext and setting master as yarn-cluster unfortunately will not work.

SPARK-4924 added APIs for doing this in Spark, but they won't be included until 1.4.

-Sandy
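For readers landing here later: the API added by SPARK-4924 is `org.apache.spark.launcher.SparkLauncher`, available from Spark 1.4 on. A minimal sketch of how it could be used from a servlet container (the Spark home, jar path, and main class below are placeholders, not values from this thread):

```java
import org.apache.spark.launcher.SparkLauncher;

public class LauncherExample {
    public static void main(String[] args) throws Exception {
        // launch() spawns spark-submit as a child process; no SparkContext
        // is created in this JVM, so it is usable from a web application.
        Process spark = new SparkLauncher()
                .setSparkHome("/opt/spark")              // placeholder
                .setAppResource("/path/to/my-job.jar")   // placeholder
                .setMainClass("com.example.MyJob")       // placeholder
                .setMaster("yarn-cluster")
                .launch();
        int exitCode = spark.waitFor();
        System.out.println("spark-submit exited with " + exitCode);
    }
}
```

This only sketches the launch path; it still requires a Spark 1.4+ installation and the launcher jar on the classpath.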
Re: What is best way to run spark job in "yarn-cluster" mode from java program(servlet container) and NOT using spark-submit command.
Create a SparkContext, set master as yarn-cluster, then run it as a standalone program?

Thanks
Best Regards
Re: What is best way to run spark job in "yarn-cluster" mode from java program(servlet container) and NOT using spark-submit command.
Hi, were you ever able to determine a satisfactory approach for this problem? I have a similar situation and would prefer to execute the job directly from Java code within my JMS listener and/or servlet container.

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/What-is-best-way-to-run-spark-job-in-yarn-cluster-mode-from-java-program-servlet-container-and-NOT-u-tp21817p22086.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
What is best way to run spark job in "yarn-cluster" mode from java program(servlet container) and NOT using spark-submit command.
Hello Spark experts,

I have read the Spark documentation and searched many posts in this forum, but I couldn't find a satisfactory answer to my question. I have recently started using Spark, so I may be missing something, and that's why I am looking for your guidance here.

I am running a web application in Jetty using Spring Boot. My web application receives a REST web service request, and based on that it needs to trigger a Spark calculation job on a YARN cluster. Since my job can take long to run and accesses data from HDFS, I want to run it in yarn-cluster mode, and I don't want to keep a SparkContext alive in my web layer. Another reason is that my application is multi-tenant, so each tenant can run its own job; in yarn-cluster mode each tenant's job can start its own driver and run in its own Spark cluster. In the web app JVM, I assume I can't run multiple SparkContexts in one JVM.

I want to trigger Spark jobs in yarn-cluster mode programmatically from Java code in my web application. What is the best way to achieve this? I am exploring various options and looking for your guidance on which one is best:

1. I can use the *org.apache.spark.deploy.yarn.Client* class and its /submitApplication()/ method. But I assume this class is not a public API and can change between Spark releases. I also noticed that this class was made private to the spark package in Spark 1.2; in version 1.1 it was public. So if I use this method, I risk breaking my code when I upgrade Spark.

2. I can use the *spark-submit* command line shell to submit my jobs. But to trigger it from my web application I need to use either the Java ProcessBuilder API or some package built on top of it. This has two problems. First, it doesn't sound like a clean way of doing it; I should have a programmatic way of triggering my Spark applications on YARN. If the YARN API allows it, then why don't we have this in Spark?
Second, I would lose the ability to monitor the submitted application and get its status. The only crude way of doing that is reading the output stream of the spark-submit shell, which again doesn't sound like a good approach.

Please suggest the best way of doing this with the latest version of Spark (1.2.1). Later I plan to deploy this entire application on Amazon EMR, so the approach should work there as well.

Thanks in advance

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/What-is-best-way-to-run-spark-job-in-yarn-cluster-mode-from-java-program-servlet-container-and-NOT-u-tp21817.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
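To make option 2 concrete, here is a sketch of driving spark-submit through ProcessBuilder, including the crude output-stream monitoring described above (the spark-submit path, main class, and jar path are placeholders, not values from this thread):

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SparkSubmitRunner {
    // Assembles the spark-submit command line; all arguments are placeholders.
    static List<String> buildCommand(String sparkSubmit, String mainClass,
                                     String appJar, String... appArgs) {
        List<String> cmd = new ArrayList<>(Arrays.asList(
                sparkSubmit,
                "--master", "yarn-cluster",
                "--class", mainClass,
                appJar));
        cmd.addAll(Arrays.asList(appArgs));
        return cmd;
    }

    public static void main(String[] args) throws Exception {
        ProcessBuilder pb = new ProcessBuilder(buildCommand(
                "/opt/spark/bin/spark-submit",     // placeholder path
                "com.example.MyJob",               // placeholder class
                "/path/to/my-job.jar"));           // placeholder jar
        pb.redirectErrorStream(true);              // merge stderr into stdout
        Process p = pb.start();
        // Crude monitoring: in yarn-cluster mode the YARN application id
        // appears in this output and could be parsed out for status polling.
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(p.getInputStream()))) {
            String line;
            while ((line = r.readLine()) != null) {
                System.out.println(line);
            }
        }
        System.out.println("exit code: " + p.waitFor());
    }
}
```

This illustrates exactly the weakness noted above: status has to be scraped from text output rather than obtained through an API.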