I have a long-lived Spark application running on YARN. On some nodes, shuffle map tasks try to write to the shuffle path, but the root path /search/hadoop10/yarn_local/usercache/spark/ was deleted, so the tasks fail. As a result, every shuffle map task scheduled on such a node keeps failing because the root path no longer exists.
I want to know whether I can set a maximum number of task failures per executor. If the failure count on an executor exceeds that threshold, could the executor be taken offline and a new executor be requested by the driver?

The shuffle path is: /search/hadoop10/yarn_local/usercache/spark/appcache/application_1434370929997_155180/spark-local-20150703120414-a376/0e/shuffle_20002_720_0.data
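For context, the only related setting I have found is spark.task.maxFailures, which counts failures of a single task across the whole job rather than per executor, so it does not seem to solve this. A minimal sketch of how I am setting it today (the app name and values are just placeholders):

    import org.apache.spark.{SparkConf, SparkContext}

    // spark.task.maxFailures: how many times one task may fail before the
    // stage (and job) is aborted. It is job-wide, not a per-executor limit.
    val conf = new SparkConf()
      .setAppName("long-lived-app")            // placeholder name
      .set("spark.task.maxFailures", "8")      // allow more retries per task
    val sc = new SparkContext(conf)

What I am looking for is something like this, but counted per executor, so that a bad executor is replaced instead of the whole job eventually failing.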