L. C. Hsieh created SPARK-34939:
-----------------------------------

             Summary: Throw fetch failure exception when unable to deserialize 
map statuses
                 Key: SPARK-34939
                 URL: https://issues.apache.org/jira/browse/SPARK-34939
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
    Affects Versions: 3.1.1, 3.0.2, 2.4.7, 3.2.0
            Reporter: L. C. Hsieh


One customer encountered an application error. The log shows it was caused by 
accessing a non-existent broadcast value, namely the broadcast map statuses. 
There is a race condition.

After the map statuses are broadcast, the executors obtain the serialized 
broadcast map statuses. If any fetch failure happens afterwards, the Spark 
scheduler invalidates the cached map statuses and destroys the broadcast value. 
When an executor then tries to deserialize the serialized map statuses and 
access the broadcast value, an {{IOException}} is thrown. Currently we don't 
catch it in {{MapOutputTrackerWorker}}, so the exception fails the application.

Normally we should throw a fetch failure exception in such a case and let the 
Spark scheduler handle it.
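The intended handling can be sketched as follows. This is an illustrative 
standalone example, not Spark's actual code: {{FetchFailureException}} here is 
a hypothetical stand-in for Spark's real fetch failure exception, and 
{{deserializeMapStatuses}} is an invented helper. It shows the pattern of 
catching the {{IOException}} at the deserialization boundary and converting it 
into a fetch-failure-style exception the scheduler can act on, rather than 
letting the raw exception propagate and fail the application.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.ObjectInputStream;

// Hypothetical stand-in for Spark's fetch failure exception (illustrative only).
class FetchFailureException extends RuntimeException {
    FetchFailureException(String msg, Throwable cause) { super(msg, cause); }
}

public class MapStatusDeserializer {
    // Sketch of the proposed fix: if deserializing the broadcast bytes fails
    // (e.g. because the broadcast value was destroyed after a fetch failure),
    // raise a fetch-failure exception so the scheduler can recompute the map
    // output, instead of letting the IOException kill the application.
    static Object deserializeMapStatuses(byte[] bytes) {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new FetchFailureException(
                "Unable to deserialize broadcasted map statuses", e);
        }
    }

    public static void main(String[] args) {
        boolean caught = false;
        try {
            // Corrupt payload stands in for a destroyed broadcast value.
            deserializeMapStatuses(new byte[] {1, 2, 3});
        } catch (FetchFailureException e) {
            caught = true;
        }
        System.out.println(caught ? "fetch failure raised" : "no exception");
    }
}
```

With this pattern, the scheduler-facing exception type carries the original 
{{IOException}} as its cause, so the root failure stays visible in the logs.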



