huangtengfei created SPARK-23053:
------------------------------------
Summary: taskBinarySerialization and task partitions calculate in
DagScheduler.submitMissingTasks should keep the same RDD checkpoint status
Key: SPARK-23053
URL: https://issues.apache.org/jira/browse/SPARK-23053
Project: Spark
Issue Type: Bug
Components: Spark Core
Affects Versions: 2.1.0
Reporter: huangtengfei
When we run concurrent jobs using the same rdd which is marked to do
checkpoint. If one job has finished running the job, and start the process of
RDD.doCheckpoint, while another job is submitted, then submitStage and
submitMissingTasks will be called. In
[submitMissingTasks|https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala#L961],
will serialize taskBinaryBytes and calculate task partitions which are both
affected by the status of checkpoint, if the former is calculated before
doCheckpoint finished, while the latter is calculated after doCheckpoint
finished, when run task, rdd.compute will be called, for some rdds with
particular partition type such as
[MapWithStateRDD|https://github.com/apache/spark/blob/master/streaming/src/main/scala/org/apache/spark/streaming/rdd/MapWithStateRDD.scala]
who will do partition type cast, will get a ClassCastException because the
part params is actually a CheckpointRDDPartition.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]