Re: Running Spark On Yarn without Spark-Submit

2014-08-29 Thread Chester @work
Archit
 We are using yarn-cluster mode and calling Spark via the Client class
directly from the servlet server. It works fine.
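For reference, a rough sketch of that direct invocation (the Client
constructor and the flag names vary across Spark versions, so treat this
as an outline rather than a drop-in snippet; the jar path and class name
are placeholders):

    import org.apache.hadoop.conf.Configuration
    import org.apache.spark.SparkConf
    import org.apache.spark.deploy.yarn.{Client, ClientArguments}

    // Build the same arguments spark-submit would hand to the YARN client.
    val clientArgs = Array(
      "--jar",   "/path/to/my-app.jar",     // placeholder application jar
      "--class", "com.example.MySparkApp")  // placeholder main class

    val sparkConf = new SparkConf()
    // Submit the application master to YARN directly, no spark-submit needed.
    new Client(new ClientArguments(clientArgs, sparkConf),
               new Configuration(), sparkConf).run()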
As for establishing a communication channel for further requests: it
should be possible in yarn-client mode, but not in yarn-cluster mode. In
yarn-client mode the Spark driver runs outside the YARN cluster, so it
can issue more commands. In yarn-cluster mode everything, including the
Spark driver, runs inside the YARN cluster, and there is no communication
channel back to the client until the job finishes.
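To make the contrast concrete, a minimal sketch of a long-lived driver in
yarn-client mode (class and app names are illustrative):

    import org.apache.spark.{SparkConf, SparkContext}

    // In yarn-client mode the driver, and hence this SparkContext, lives
    // in the servlet's JVM outside the YARN cluster, so it can keep
    // serving further requests.
    object SparkHolder {
      lazy val sc: SparkContext = new SparkContext(
        new SparkConf()
          .setMaster("yarn-client")
          .setAppName("servlet-driven-spark"))  // illustrative app name
    }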

If your job keeps the SparkContext alive and waits for further commands,
it will therefore wait forever.

I am actually working on some improvements in this area and experimenting
with them in our product; I will create PRs when I am comfortable with the
solution:

1) Change the Client API to let the caller know the YARN app's resource
capacity before passing arguments.
2) Add a YarnApplicationListener to the Client.
3) Provide a communication channel between the application and the Spark
YARN client in the cluster.

#1 is not directly related to the communication channel discussed here.

#2 gives the application lifecycle callbacks (app start, in progress, end,
failure, etc.) together with the YARN resource allocations.
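Roughly, such a listener might look like the following (a sketch of the
interface in my fork; the method names are illustrative and this is not
an existing Spark API):

    // Hypothetical callback interface; not part of stock Spark.
    trait YarnApplicationListener {
      def onApplicationStart(appId: String): Unit
      def onApplicationProgress(appId: String, progress: Float): Unit
      def onApplicationEnd(appId: String, finalStatus: String): Unit
      def onApplicationFailure(appId: String, diagnostics: String): Unit
    }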

I have made changes #1 and #2 in a forked Spark; they work well on CDH5,
and I am testing against 2.0.5-alpha as well.

For #3, I have not changed Spark itself, as I am not yet sure of the best
approach. I put the change in the application runner that launches the
Spark YARN client in the cluster.

The runner in the YARN cluster gets the application's host and port from
the passed configuration (args), then creates an Akka actor using the
SparkContext's actor system and sends a handshake message to the caller
outside the cluster. After that you have two-way communication.
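A rough sketch of that runner-side actor (the message types, the caller's
actor path, and all names here are my assumptions for illustration):

    import akka.actor.{Actor, ActorSelection, Props}

    // Illustrative message types; the real protocol is up to the app.
    case object Handshake
    case object StopRequest

    class RunnerActor(callerHost: String, callerPort: Int) extends Actor {
      // Look up the caller's actor outside the cluster, using the
      // host/port taken from the passed configuration (args).
      val caller: ActorSelection = context.actorSelection(
        s"akka.tcp://CallerSystem@$callerHost:$callerPort/user/caller")

      // Open the two-way channel with a handshake.
      override def preStart(): Unit = caller ! Handshake

      def receive = {
        case StopRequest => context.system.shutdown() // caller asked us to stop
        case msg         => caller ! msg              // relay callbacks, errors, etc.
      }
    }

    // Created on the SparkContext's actor system inside the cluster, e.g.:
    // SparkEnv.get.actorSystem.actorOf(Props(new RunnerActor(host, port)), "runner")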

With this approach, I can send Spark listener callbacks, error messages,
application-level messages, etc. to the app.

The runner inside the cluster can also receive requests from outside the
cluster, such as stop.

We are not sure the Akka approach is the best one, so I am still
experimenting with it. So far it does what we want.

Hope this helps

Chester


Sent from my iPhone

> On Aug 29, 2014, at 2:36 AM, Archit Thakur  wrote:
> 
> including u...@spark.apache.org.
> 
> 
>> On Fri, Aug 29, 2014 at 2:03 PM, Archit Thakur  
>> wrote:
>> Hi,
>> 
>> My requirement is to run Spark on Yarn without using the script spark-submit.
>> 
>> I have a servlet and a Tomcat server. As and when a request comes, it creates 
>> a new SC and keeps it alive for further requests. I am setting my master 
>> in sparkConf
>> 
>> as sparkConf.setMaster("yarn-cluster")
>> 
>> but the request is stuck indefinitely. 
>> 
>> This works when I set
>> sparkConf.setMaster("yarn-client")
>> 
>> I am not sure why it is not launching the job in yarn-cluster mode.
>> 
>> Any thoughts?
>> 
>> Thanks and Regards,
>> Archit Thakur. 
> 


Re: Running Spark On Yarn without Spark-Submit

2014-08-29 Thread Archit Thakur
including u...@spark.apache.org.


On Fri, Aug 29, 2014 at 2:03 PM, Archit Thakur 
wrote:

> Hi,
>
> My requirement is to run Spark on Yarn without using the script
> spark-submit.
>
> I have a servlet and a Tomcat server. As and when a request comes, it
> creates a new SC and keeps it alive for further requests. I am setting
> my master in sparkConf
>
> as sparkConf.setMaster("yarn-cluster")
>
> but the request is stuck indefinitely.
>
> This works when I set
> sparkConf.setMaster("yarn-client")
>
> I am not sure why it is not launching the job in yarn-cluster mode.
>
> Any thoughts?
>
> Thanks and Regards,
> Archit Thakur.


Running Spark On Yarn without Spark-Submit

2014-08-29 Thread Archit Thakur
Hi,

My requirement is to run Spark on Yarn without using the script
spark-submit.

I have a servlet and a Tomcat server. As and when a request comes, it creates
a new SC and keeps it alive for further requests. I am setting my
master in sparkConf

as sparkConf.setMaster("yarn-cluster")

but the request is stuck indefinitely.

This works when I set
sparkConf.setMaster("yarn-client")

I am not sure why it is not launching the job in yarn-cluster mode.

Any thoughts?

Thanks and Regards,
Archit Thakur.