With long-running apps I see this at times:

13/12/21 12:57:59 INFO scheduler.Stage: Stage 1 is now unavailable on executor 10 (0/66, false)
13/12/21 12:58:19 WARN storage.BlockManagerMasterActor: Removing BlockManager BlockManagerId(1, node10, 33734, 0) with no recent heart beats: 50227ms exceeds 45000ms

Typically this would be because of a Spark service restart. Is there a way to detect this programmatically so that the client can take the correct steps to recover?
