Hi Community,

I checked out the Spark v0.7.3 tag from GitHub and I am using it in standalone mode on a small cluster with 2 nodes. My Spark application reads files from HDFS and writes the results back to HDFS. Everything seems to be working fine, except that at the very end of execution I get this line in my log files:
ERROR executor.StandaloneExecutorBackend: Driver or worker disconnected! Shutting down.

The last line in my Spark application calls rdd.saveAsTextFile to save the results to HDFS. This call seems to work fine, because the Hadoop _SUCCESS files are being generated in HDFS. When I run the same task in spark-repl on one node, I don't get the error line, and I've compared the output from spark-repl with the output from my Spark application and there is no difference.

Looking at the StandaloneExecutorBackend code, it seems that StandaloneExecutorBackend should receive a StopExecutor message, but instead it is getting a Terminated, RemoteClientDisconnected, or RemoteClientShutdown message. Is it normal for Spark applications to terminate abruptly at the end of their execution? Am I missing something, or is this the intended behavior of Spark with two nodes in standalone mode?

Thanks,
Meisam
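
P.S. For reference, here is a minimal sketch of what my driver program boils down to. The master URL, HDFS paths, and the map step are placeholders rather than my actual job, but the overall structure (read from HDFS, transform, saveAsTextFile as the last call) is the same:

// Minimal sketch of the driver program (host names, paths, and the
// transformation are placeholders).
import spark.SparkContext
import spark.SparkContext._

object MyJob {
  def main(args: Array[String]) {
    // Connect to the standalone master; URL and job name are illustrative.
    val sc = new SparkContext("spark://master-host:7077", "MyJob")

    // Read input from HDFS and apply a stand-in transformation.
    val input = sc.textFile("hdfs://namenode:9000/path/to/input")
    val result = input.map(line => line.toUpperCase)

    // This is the last call in the application; the _SUCCESS marker does show
    // up in HDFS, and the "Driver or worker disconnected" error appears in the
    // executor logs right after it returns.
    result.saveAsTextFile("hdfs://namenode:9000/path/to/output")
  }
}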