[ https://issues.apache.org/jira/browse/SPARK-34939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
L. C. Hsieh updated SPARK-34939:
--------------------------------
Summary: Throw fetch failure exception when unable to deserialize
broadcasted map statuses (was: Throw fetch failure exception when unable to
deserialize map statuses)
> Throw fetch failure exception when unable to deserialize broadcasted map
> statuses
> ---------------------------------------------------------------------------------
>
> Key: SPARK-34939
> URL: https://issues.apache.org/jira/browse/SPARK-34939
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 2.4.7, 3.0.2, 3.2.0, 3.1.1
> Reporter: L. C. Hsieh
> Priority: Major
>
> One customer encountered an application error. From the log, it is caused by
> accessing a non-existing broadcasted value. The broadcasted value is the map
> statuses. There is a race condition:
> After the map statuses are broadcasted, the executors obtain the serialized
> broadcasted map statuses. If any fetch failure happens afterwards, the Spark
> scheduler invalidates the cached map statuses and destroys the broadcasted
> value. When an executor then tries to deserialize the serialized broadcasted
> map statuses and access the broadcasted value, an {{IOException}} is thrown.
> Currently we don't catch it in {{MapOutputTrackerWorker}}, so the exception
> fails the application.
> Instead, we should throw a fetch failure exception in this case and let the
> Spark scheduler handle it.