I will try to catch the exception in my Spark driver. Last time I tried a Spark standalone cluster there were some problems with Akka and DNS; I will try again. I am sure there are tasks failing four times. I didn't find a way to throttle the Cassandra load, and I don't think I can know the load on the Cassandra cluster at any given time. Catching the exception and retrying is the best solution I can think of.
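A driver-side catch-and-retry could be sketched roughly as below. This is a minimal illustration in plain Scala, not connector API: the helper name `retryWithBackoff` and the backoff values are made up, and the closure passed to it stands in for the Spark action (e.g. a `count()` or `collect()` on the Cassandra RDD) that may fail under load.

```scala
// Hypothetical driver-side retry helper: retries the given operation
// with exponential backoff, rethrowing once attempts are exhausted.
object RetryExample {
  def retryWithBackoff[T](attemptsLeft: Int, delayMs: Long)(op: () => T): T =
    try op() catch {
      case e: Exception if attemptsLeft > 1 =>
        // Back off before retrying, giving an overloaded Cassandra
        // cluster a chance to recover. Unmatched exceptions (or the
        // last failure) propagate, so the driver exits instead of hanging.
        Thread.sleep(delayMs)
        retryWithBackoff(attemptsLeft - 1, delayMs * 2)(op)
    }

  def main(args: Array[String]): Unit = {
    var calls = 0
    // Simulate an operation that fails twice (as if Cassandra were
    // overloaded) and then succeeds on the third attempt.
    val result = retryWithBackoff(4, 10L) { () =>
      calls += 1
      if (calls < 3) throw new RuntimeException("Cassandra overloaded")
      else 42
    }
    println(s"result=$result after $calls attempts")
  }
}
```

In a real driver, the whole job (or each stage-level action) would be wrapped this way, and a final uncaught exception would let the driver process exit with a non-zero status so the failure is visible.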
-- Sent from my NetEase Mail (tablet edition)

On 2014-08-08 23:07:06, "Martin Weindel" <[email protected]> wrote:

Normally the Spark driver stops by throwing an exception if the same task fails four times. Can you please check in the log whether you have these four failures for the same task?

I recommend first taking a closer look at what's going on in Spark and the Cassandra driver before you dive into the scheduling with Mesos. Or you can run your job on a Spark standalone cluster to verify whether this problem is really related to Mesos.

BTW, your real problem seems to be overloading the Cassandra cluster, so I would think about how to throttle the read load your job is producing.

On Fri, Aug 8, 2014 at 9:19 AM, Xu Zhongxing <[email protected]> wrote:

I am using Spark + Mesos to read data from Cassandra. After some time, the Cassandra driver throws an exception due to high load on the cluster, and the Mesos UI shows many task failures, but the Spark driver just hangs there. I would like the Spark driver to exit so that I can know that the job has failed. Why does the Spark driver program not exit when Mesos stops it after many failures?

The detailed log is at https://github.com/datastax/spark-cassandra-connector/issues/134
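For reference, the four-failure limit mentioned above is governed by Spark's `spark.task.maxFailures` property (default 4), and one crude way to throttle the read load is simply to cap the total cores the job may use, which limits how many Cassandra reads run concurrently. A sketch of the relevant settings (the values are examples only, and `spark.cores.max` applies to standalone and Mesos coarse-grained mode):

```
# spark-defaults.conf (example values)
# A task is retried this many times before the stage is aborted (default: 4).
spark.task.maxFailures   4
# Cap total executor cores to limit concurrent Cassandra reads.
spark.cores.max          8
```

Fewer concurrent tasks means fewer simultaneous reads hitting Cassandra, at the cost of a longer job runtime.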

