L. C. Hsieh created SPARK-34939:
-----------------------------------
Summary: Throw fetch failure exception when unable to deserialize
map statuses
Key: SPARK-34939
URL: https://issues.apache.org/jira/browse/SPARK-34939
Project: Spark
Issue Type: Bug
Components: Spark Core
Affects Versions: 3.1.1, 3.0.2, 2.4.7, 3.2.0
Reporter: L. C. Hsieh
One customer encountered an application error. The log shows it was caused by
accessing a non-existent broadcast value, namely the broadcast map statuses.
There is a race condition:
After map statuses are broadcast, executors obtain the serialized broadcast
map statuses. If any fetch failure happens afterwards, the Spark scheduler
invalidates the cached map statuses and destroys the broadcast value backing
them. Any executor that then tries to deserialize the serialized broadcast map
statuses and access the broadcast value hits an {{IOException}}. Currently we
don't catch it in {{MapOutputTrackerWorker}}, so the exception fails the
application.
Instead, we should throw a fetch failure exception in this case and let the
Spark scheduler handle it.
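A minimal sketch of the proposed handling, in Scala. {{MetadataFetchFailedException}} and {{MapOutputTracker.deserializeMapStatuses}} are real Spark internals, but the surrounding method shape and the {{fetchSerializedMapStatuses}} helper are assumptions for illustration, not the actual patch:

{code:scala}
import java.io.IOException
import org.apache.spark.shuffle.MetadataFetchFailedException

// Inside MapOutputTrackerWorker (structure assumed for illustration):
def getStatuses(shuffleId: Int): Array[MapStatus] = {
  // Hypothetical helper: fetch the serialized broadcast map statuses.
  val fetchedBytes = fetchSerializedMapStatuses(shuffleId)
  try {
    MapOutputTracker.deserializeMapStatuses(fetchedBytes)
  } catch {
    case e: IOException =>
      // The broadcast value backing the map statuses was destroyed by a
      // concurrent fetch failure. Report a fetch failure (reduceId -1 =
      // unknown partition) so the scheduler recomputes the map output
      // instead of failing the whole application.
      throw new MetadataFetchFailedException(
        shuffleId, -1,
        s"Unable to deserialize broadcasted map statuses: ${e.getMessage}")
  }
}
{code}

With this, the scheduler treats the destroyed broadcast the same way as any other fetch failure and resubmits the map stage.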
--
This message was sent by Atlassian Jira
(v8.3.4#803005)