[
https://issues.apache.org/jira/browse/SPARK-41313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sean R. Owen updated SPARK-41313:
---------------------------------
Priority: Minor (was: Major)
> Combine fixes for SPARK-3900 and SPARK-21138
> --------------------------------------------
>
> Key: SPARK-41313
> URL: https://issues.apache.org/jira/browse/SPARK-41313
> Project: Spark
> Issue Type: Bug
> Components: Spark Core, YARN
> Affects Versions: 3.4.0
> Reporter: Xing Lin
> Priority: Minor
>
> SPARK-3900 fixed the IllegalStateException thrown by cleanupStagingDir in
> ApplicationMaster's shutdown hook. However, SPARK-21138 accidentally
> reverted that change while fixing the "Wrong FS" bug, and we are now seeing
> the SPARK-3900 failure reported by our users at LinkedIn. We need to bring
> back the fix for SPARK-3900.
> The IllegalStateException when creating a new FileSystem object is due to a
> Hadoop limitation: a shutdown hook cannot be registered while shutdown is
> already in progress. When a Spark job fails during pre-launch,
> cleanupStagingDir is called as part of shutdown. If it then creates a
> FileSystem object for the first time, Hadoop tries to register a hook to
> shut down the KeyProviderCache while creating the ClientContext for the
> DFSClient, and we hit the IllegalStateException. We should therefore avoid
> creating a new FileSystem object in cleanupStagingDir() when it is called
> from a shutdown hook. That guard was introduced by SPARK-3900 and
> accidentally reverted by SPARK-21138; this issue restores it.
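>
> A minimal sketch of the pattern the fix restores (the class name,
> constructor, and logging here are illustrative, not the actual
> ApplicationMaster code): create the FileSystem for the staging directory
> eagerly, while the JVM is still running normally, so the shutdown hook only
> reuses the cached instance and never triggers Hadoop's hook registration:
> {code:scala}
> import org.apache.hadoop.conf.Configuration
> import org.apache.hadoop.fs.{FileSystem, Path}
>
> class StagingDirCleaner(stagingDirPath: Path, conf: Configuration) {
>   // Created eagerly, outside any shutdown hook. Creating a FileSystem for
>   // the first time can make Hadoop register its own shutdown hook (e.g. for
>   // KeyProviderCache via DFSClient's ClientContext), which throws
>   // IllegalStateException if attempted while the JVM is already shutting
>   // down.
>   private val stagingDirFs: FileSystem = stagingDirPath.getFileSystem(conf)
>
>   // Safe to call from a shutdown hook: reuses the cached FileSystem rather
>   // than creating a new one.
>   def cleanupStagingDir(): Unit = {
>     try {
>       stagingDirFs.delete(stagingDirPath, true)
>     } catch {
>       case e: java.io.IOException =>
>         // Log and continue; shutdown must proceed regardless.
>         System.err.println(s"Failed to delete staging dir $stagingDirPath: $e")
>     }
>   }
> }
> {code}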
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]