Our Spark cluster is configured to write application history event logs
to a directory on HDFS. This all works fine; I've tested it with the
Spark shell.
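For reference, the event logging setup looks roughly like this (Scala
sketch; the app name and the HDFS path are placeholders, not our actual
values):

    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf()
      .setAppName("long-running-job")                     // placeholder name
      .set("spark.eventLog.enabled", "true")              // write history event logs
      .set("spark.eventLog.dir",
           "hdfs://namenode:8020/spark-events")           // shared HDFS directory
    val sc = new SparkContext(conf)

The history server points at the same directory via
spark.history.fs.logDirectory.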
However, during a large, long-running job that we ran tonight, one of
our machines at the cloud provider had issues and had to be terminated
and replaced in the middle of the job.
The job completed correctly and shows as FINISHED in the
"Completed Applications" section of the Spark GUI. However, when I try
to look at the application's history, the GUI says "Application history
not found" and "Application ... is still in progress".
The cause appears to be the machine that was terminated: when I click
on the executor list for that job, Spark still shows the executor from
that machine as RUNNING.
Any solution/workaround for this? BTW, I'm running Spark v1.3.0.
Thanks,
DR