[ 
https://issues.apache.org/jira/browse/SPARK-4838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14247676#comment-14247676
 ] 

Hong Shen commented on SPARK-4838:
----------------------------------

This is the whole stack trace.
All we can tell is that it is thrown from DAGScheduler.submitMissingTasks, while 
serializing stage.rdd.
{code}
    var taskBinary: Broadcast[Array[Byte]] = null
    try {
      // For ShuffleMapTask, serialize and broadcast (rdd, shuffleDep).
      // For ResultTask, serialize and broadcast (rdd, func).
      val taskBinaryBytes: Array[Byte] =
        if (stage.isShuffleMap) {
          closureSerializer.serialize((stage.rdd, stage.shuffleDep.get) : AnyRef).array()
        } else {
          closureSerializer.serialize((stage.rdd, stage.resultOfJob.get.func) : AnyRef).array()
        }
      taskBinary = sc.broadcast(taskBinaryBytes)
    } catch {
      // In the case of a failure during serialization, abort the stage.
      case e: NotSerializableException =>
        abortStage(stage, "Task not serializable: " + e.toString)
        runningStages -= stage
        return
      case NonFatal(e) =>
        abortStage(stage, s"Task serialization failed: $e\n${e.getStackTraceString}")
        runningStages -= stage
        return
    }
{code}
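
The repeating writeObject0 / defaultWriteFields frames in the trace suggest that Java serialization is recursing once per level of a deeply nested object graph. The following is only a minimal sketch of that effect outside Spark, not a reproduction of this issue: the Node class, the chain depth, and the stack sizes are all hypothetical, and the stack size passed to the Thread constructor is treated by the JVM as a hint that some platforms may ignore.

```scala
import java.io.{ByteArrayOutputStream, ObjectOutputStream}
import java.util.concurrent.atomic.AtomicBoolean

// Hypothetical stand-in for a deeply nested object graph
// (for example, a long chain of dependencies). Not a Spark class.
final class Node(val next: Node) extends Serializable

object DeepSerializationSketch {
  // Build a chain of `depth` nodes iteratively, so that
  // construction itself does not recurse.
  def chain(depth: Int): Node = {
    var head: Node = null
    var i = 0
    while (i < depth) {
      head = new Node(head)
      i += 1
    }
    head
  }

  // Serialize `obj` with plain Java serialization on a thread that
  // requests `stackBytes` of stack. Returns true if serialization
  // completed, false if a StackOverflowError was thrown.
  def serializesWithStack(obj: AnyRef, stackBytes: Long): Boolean = {
    val ok = new AtomicBoolean(false)
    val task = new Runnable {
      def run(): Unit =
        try {
          val out = new ObjectOutputStream(new ByteArrayOutputStream())
          out.writeObject(obj) // recurses once per nested Node
          out.close()
          ok.set(true)
        } catch {
          case _: StackOverflowError => ok.set(false)
        }
    }
    val worker = new Thread(null, task, "serializer", stackBytes)
    worker.start()
    worker.join()
    ok.get()
  }

  def main(args: Array[String]): Unit = {
    val deep = chain(100000)
    // Small requested stack: the nested writes are expected to overflow.
    println(serializesWithStack(deep, 256L * 1024))
    // Much larger requested stack: the same graph is expected to fit.
    println(serializesWithStack(deep, 512L * 1024 * 1024))
  }
}
```

If the same mechanism is at work here, a larger thread stack for the driver JVM (the -Xss option) would only postpone the failure, since the required depth grows with the nesting of the serialized graph.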


> StackOverflowError when serialization task
> ------------------------------------------
>
>                 Key: SPARK-4838
>                 URL: https://issues.apache.org/jira/browse/SPARK-4838
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core, SQL
>    Affects Versions: 1.1.0
>            Reporter: Hong Shen
>
> When running a SQL query with more than 2000 partitions, each of which is a 
> HadoopRDD, task serialization fails with java.lang.StackOverflowError.
> The error message from Spark is: Job aborted due to stage failure: Task 
> serialization failed: java.lang.StackOverflowError
> java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1508)
> java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
> java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1177)
> java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1547)
> java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1508)
> java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
> java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1177)
> ......



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

