Archit
We are using yarn-cluster mode and calling Spark via the Client class
directly from a servlet server. It works fine.
Establishing a communication channel to send further requests should be
possible in yarn-client mode, but not in yarn-cluster mode. In yarn-client
mode the Spark driver runs outside the YARN cluster, so it can issue more
commands. In yarn-cluster mode everything, including the Spark driver, runs
inside the YARN cluster, and there is no communication channel with the
client until the job finishes.
If your job is to keep the SparkContext alive and wait for further commands,
then the client will wait forever, because the job never finishes.
I am actually working on some improvements in this area and experimenting
with them in our product. I will create PRs when I feel comfortable with the
solution:
1) Change the Client API to let the caller learn the YARN app resource
capacity before passing arguments.
2) Add a YarnApplicationListener to the Client.
3) Provide a communication channel between the application and the Spark YARN
client in the cluster.
#1 is not directly related to the communication discussed here.
#2 gives the application life-cycle callbacks (app start, end, in progress,
failure, etc.) along with the YARN resource allocations.
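
To make #2 more concrete, here is a minimal sketch of what such a listener could look like. Only the name YarnApplicationListener comes from the list above; the method names, parameters, and the simulateLifecycle helper are assumptions for illustration, not the API in the fork.

```java
import java.util.ArrayList;
import java.util.List;

class ListenerSketch {
    /** Hypothetical callback interface; method names and parameters are assumptions. */
    interface YarnApplicationListener {
        void onApplicationStart(String appId);
        void onApplicationProgress(String appId, int percent);
        void onApplicationEnd(String appId);
        void onApplicationFailed(String appId, String message);
    }

    /** Records each callback so the caller (e.g. a servlet) can inspect it later. */
    static class RecordingListener implements YarnApplicationListener {
        final List<String> events = new ArrayList<>();

        public void onApplicationStart(String appId) {
            events.add("start:" + appId);
        }

        public void onApplicationProgress(String appId, int percent) {
            events.add("progress:" + appId + ":" + percent);
        }

        public void onApplicationEnd(String appId) {
            events.add("end:" + appId);
        }

        public void onApplicationFailed(String appId, String message) {
            events.add("failed:" + appId + ":" + message);
        }
    }

    /** Stand-in for a client that watches the YARN app and forwards state changes. */
    static void simulateLifecycle(String appId, YarnApplicationListener listener, boolean fail) {
        listener.onApplicationStart(appId);
        listener.onApplicationProgress(appId, 50);
        if (fail) {
            listener.onApplicationFailed(appId, "simulated failure");
        } else {
            listener.onApplicationEnd(appId);
        }
    }

    public static void main(String[] args) {
        RecordingListener listener = new RecordingListener();
        simulateLifecycle("app_1", listener, false);
        System.out.println(listener.events);
    }
}
```

The point of the callback style is that the servlet does not have to block on the submission call to find out what happened to the app.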
I have implemented #1 and #2 in a forked Spark. They have worked well on CDH5,
and I am testing against 2.0.5-alpha as well.
For #3, I have not changed anything in Spark yet, as I am not sure of the best
approach. I put the change in the application runner, which launches the Spark
YARN client in the cluster.
The runner in the YARN cluster gets the application's host and port from the
passed configuration (args), creates an Akka actor using the SparkContext's
actor system, and sends a handshake message to the caller outside the cluster.
After that you have two-way communication.
With this approach, I can send Spark listener callbacks, error messages,
app-level messages, etc. to the app.
The runner inside the cluster can also receive requests from outside the
cluster, such as stop.
We are not sure the Akka approach is the best, so I am still experimenting
with it. So far it does what we want.
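
The handshake flow can be sketched roughly as follows. Note that this uses plain TCP sockets rather than Akka actors, purely to keep the example self-contained; the message strings (HANDSHAKE, ACK, STOPPED) and all class and method names are made up for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class HandshakeSketch {
    /** Runner side: would live inside the cluster; caller host/port arrive via args. */
    static void runRunner(String callerHost, int callerPort) throws IOException {
        try (Socket s = new Socket(callerHost, callerPort);
             BufferedReader in = new BufferedReader(
                 new InputStreamReader(s.getInputStream(), StandardCharsets.UTF_8));
             PrintWriter out = new PrintWriter(
                 new OutputStreamWriter(s.getOutputStream(), StandardCharsets.UTF_8), true)) {
            out.println("HANDSHAKE"); // announce ourselves to the caller
            String cmd;
            while ((cmd = in.readLine()) != null) {
                if (cmd.equals("stop")) {
                    out.println("STOPPED"); // a real runner would stop the context here
                    break;
                }
                out.println("ACK " + cmd);
            }
        }
    }

    /** Caller side: waits for the handshake, sends commands, collects the replies. */
    static List<String> runCaller(ServerSocket server, String... commands) throws IOException {
        List<String> received = new ArrayList<>();
        try (Socket s = server.accept();
             BufferedReader in = new BufferedReader(
                 new InputStreamReader(s.getInputStream(), StandardCharsets.UTF_8));
             PrintWriter out = new PrintWriter(
                 new OutputStreamWriter(s.getOutputStream(), StandardCharsets.UTF_8), true)) {
            received.add(in.readLine()); // first line is the handshake
            for (String c : commands) {
                out.println(c);
                received.add(in.readLine());
            }
        }
        return received;
    }

    /** Runs both sides in one JVM over loopback; a thread stands in for the cluster. */
    static List<String> demo() {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback)) {
            String host = loopback.getHostAddress();
            int port = server.getLocalPort();
            Thread runner = new Thread(() -> {
                try {
                    runRunner(host, port);
                } catch (IOException e) {
                    throw new IllegalStateException(e);
                }
            });
            runner.start();
            List<String> replies = runCaller(server, "status", "stop");
            runner.join();
            return replies;
        } catch (IOException | InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The key design point is the direction of the first connection: the runner inside the cluster dials out to the caller, so the caller never needs to know in advance which node the runner landed on.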
Hope this helps
Chester
> On Aug 29, 2014, at 2:36 AM, Archit Thakur wrote:
>
> including u...@spark.apache.org.
>
>
>> On Fri, Aug 29, 2014 at 2:03 PM, Archit Thakur
>> wrote:
>> Hi,
>>
>> My requirement is to run Spark on Yarn without using the script spark-submit.
>>
>> I have a servlet and a Tomcat server. When a request comes in, it creates
>> a new SC and keeps it alive for further requests. I am setting my master
>> in sparkConf
>>
>> as sparkConf.setMaster("yarn-cluster")
>>
>> but the request is stuck indefinitely.
>>
>> This works when I set
>> sparkConf.setMaster("yarn-client")
>>
>> I am not sure why it is not launching the job in yarn-cluster mode.
>>
>> Any thoughts?
>>
>> Thanks and Regards,
>> Archit Thakur.
>