GitHub user caneGuy opened a pull request:

    https://github.com/apache/spark/pull/21664

    [SPARK-24678][CORE] NoClassDefFoundError will not be catch up which will 
ca…

    …use job hang
    
    ## What changes were proposed in this pull request?
    When NoClassDefFoundError thrown,it will cause job hang.
    `Exception in thread "dag-scheduler-event-loop" 
java.lang.NoClassDefFoundError: 
Lcom/xxx/data/recommend/aggregator/queue/QueueName;
        at java.lang.Class.getDeclaredFields0(Native Method)
        at java.lang.Class.privateGetDeclaredFields(Class.java:2436)
        at java.lang.Class.getDeclaredField(Class.java:1946)
        at 
java.io.ObjectStreamClass.getDeclaredSUID(ObjectStreamClass.java:1659)
        at java.io.ObjectStreamClass.access$700(ObjectStreamClass.java:72)
        at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:480)
        at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:468)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.io.ObjectStreamClass.<init>(ObjectStreamClass.java:468)
        at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:365)
        at java.io.ObjectOutputStream.writeClass(ObjectOutputStream.java:1212)
        at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1119)
        at 
java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1547)
        at 
java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1508)
        at 
java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
        at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1177)
        at 
java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1547)
        at 
java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1508)
        at 
java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
        at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1177)
        at 
java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1547)
        at 
java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1508)
        at 
java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
        at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1177)
        at java.io.ObjectOutputStream.writeArray(ObjectOutputStream.java:1377)
        at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1173)
        at 
java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1547)
        at 
java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1508)
        at 
java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
        at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1177)
        at 
java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1547)
        at 
java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1508)
        at 
java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
        at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1177)
        at 
java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1547)
        at 
java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1508)
        at 
java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
        at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1177)
        at java.io.ObjectOutputStream.writeArray(ObjectOutputStream.java:1377)`
    
    It is caused by NoClassDefFoundError will not catch up during task 
seriazation.
    `var taskBinary: Broadcast[Array[Byte]] = null
        try {
          // For ShuffleMapTask, serialize and broadcast (rdd, shuffleDep).
          // For ResultTask, serialize and broadcast (rdd, func).
          val taskBinaryBytes: Array[Byte] = stage match {
            case stage: ShuffleMapStage =>
              JavaUtils.bufferToArray(
                closureSerializer.serialize((stage.rdd, stage.shuffleDep): 
AnyRef))
            case stage: ResultStage =>
              JavaUtils.bufferToArray(closureSerializer.serialize((stage.rdd, 
stage.func): AnyRef))
          }
    
          taskBinary = sc.broadcast(taskBinaryBytes)
        } catch {
          // In the case of a failure during serialization, abort the stage.
          case e: NotSerializableException =>
            abortStage(stage, "Task not serializable: " + e.toString, Some(e))
            runningStages -= stage
    
            // Abort execution
            return
          case NonFatal(e) =>
            abortStage(stage, s"Task serialization failed: 
$e\n${Utils.exceptionString(e)}", Some(e))
            runningStages -= stage
            return
        }`
    
    ## How was this patch tested?
    UT


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/caneGuy/spark zhoukang/fix-noclassdeferror

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21664.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #21664
    
----
commit 340e07cbc96a31b28eca52d31baf5a337566b15b
Author: zhoukang <zhoukang199191@...>
Date:   2018-06-29T01:45:16Z

    [SPARK][CORE] NoClassDefFoundError will not be catch up which will cause 
job hung

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

Reply via email to