[
https://issues.apache.org/jira/browse/SPARK-4006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Andrew Or updated SPARK-4006:
-----------------------------
Fix Version/s: 1.0.3
> Spark Driver crashes whenever an Executor is registered twice
> -------------------------------------------------------------
>
> Key: SPARK-4006
> URL: https://issues.apache.org/jira/browse/SPARK-4006
> Project: Spark
> Issue Type: Bug
> Components: Block Manager, Spark Core
> Affects Versions: 0.9.2, 1.0.2, 1.1.0, 1.2.0
> Environment: Mesos, Coarse Grained
> Reporter: Tal Sliwowicz
> Assignee: Tal Sliwowicz
> Priority: Critical
> Fix For: 1.1.1, 1.2.0, 1.0.3
>
>
> This is a huge robustness issue for us (Taboola) in mission-critical,
> time-sensitive (real-time) Spark jobs.
> We have long-running Spark drivers, and even though we have state-of-the-art
> hardware, executors disconnect from time to time. In many cases the
> RemoveExecutor message is not received, and when the new executor registers,
> the driver crashes. In Mesos coarse-grained mode executor IDs are fixed, so
> the new executor registers under an ID the driver still has on record.
> The issue is the System.exit(1) call in BlockManagerMasterActor:
> {code}
> private def register(id: BlockManagerId, maxMemSize: Long, slaveActor: ActorRef) {
>   if (!blockManagerInfo.contains(id)) {
>     blockManagerIdByExecutor.get(id.executorId) match {
>       case Some(manager) =>
>         // A block manager of the same executor already exists.
>         // This should never happen. Let's just quit.
>         logError("Got two different block manager registrations on " + id.executorId)
>         System.exit(1)
>       case None =>
>         blockManagerIdByExecutor(id.executorId) = id
>     }
>     logInfo("Registering block manager %s with %s RAM".format(
>       id.hostPort, Utils.bytesToString(maxMemSize)))
>     blockManagerInfo(id) =
>       new BlockManagerInfo(id, System.currentTimeMillis(), maxMemSize, slaveActor)
>   }
>   listenerBus.post(SparkListenerBlockManagerAdded(id, maxMemSize))
> }
> {code}
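
For illustration only, here is a minimal sketch of one way a duplicate registration could be tolerated instead of terminating the driver: the stale entry for the executor is dropped and the new block manager takes its place. This is not presented as the change committed for this issue. `BlockManagerRegistry` and the simplified case classes below are hypothetical stand-ins for Spark's internal types, and the actor reference, listener bus and Spark logging are left out so that the example is self-contained.

```scala
import scala.collection.mutable

// Hypothetical, simplified stand-ins for Spark's internal types.
final case class BlockManagerId(executorId: String, host: String, port: Int)
final case class BlockManagerInfo(id: BlockManagerId, registeredAtMs: Long, maxMemSize: Long)

// Hypothetical registry modelling only the two maps used by register().
class BlockManagerRegistry {
  private val blockManagerInfo =
    mutable.HashMap.empty[BlockManagerId, BlockManagerInfo]
  private val blockManagerIdByExecutor =
    mutable.HashMap.empty[String, BlockManagerId]

  /** Registers a block manager; returns the stale id it replaced, if any. */
  def register(id: BlockManagerId, maxMemSize: Long): Option[BlockManagerId] = {
    if (blockManagerInfo.contains(id)) {
      // Same block manager registering again: nothing to change.
      None
    } else {
      val replaced = blockManagerIdByExecutor.get(id.executorId)
      replaced.foreach { oldId =>
        // Assumption: the old entry is stale because its removal was never
        // received, so it is dropped rather than treated as a fatal error.
        Console.err.println(
          s"Replacing block manager $oldId on executor ${id.executorId} with $id")
        blockManagerInfo.remove(oldId)
      }
      blockManagerIdByExecutor(id.executorId) = id
      blockManagerInfo(id) =
        BlockManagerInfo(id, System.currentTimeMillis(), maxMemSize)
      replaced
    }
  }

  /** The block manager currently recorded for an executor, if any. */
  def current(executorId: String): Option[BlockManagerId] =
    blockManagerIdByExecutor.get(executorId)

  /** Number of registered block managers. */
  def size: Int = blockManagerInfo.size
}
```

With this shape, a second registration under the same executor ID returns the replaced ID so the caller can clean up any state tied to it, and the driver keeps running. Whether silently replacing the old entry is safe depends on how the rest of the driver tracks blocks held by the old manager, which this sketch does not model.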