Josh Rosen resolved SPARK-43300.
--------------------------------
Fix Version/s: 3.5.0
Resolution: Fixed
> Cascade failure in Guava cache due to fate-sharing
> --------------------------------------------------
>
> Key: SPARK-43300
> URL: https://issues.apache.org/jira/browse/SPARK-43300
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 3.4.0
> Reporter: Ziqi Liu
> Assignee: Ziqi Liu
> Priority: Major
> Fix For: 3.5.0
>
>
> Guava cache is widely used in Spark. However, it suffers from fate-sharing
> behavior: if multiple requests try to access the same key in the {{cache}}
> at the same time while the key is not in the cache, Guava cache blocks all
> of the requests and creates the object only once. If the creation fails,
> all of the requests fail immediately without retry. As a result, a task
> can fail because of an unrelated failure in another query.
> This fate-sharing behavior might lead to unexpected results in some
> situations.
> We can wrap the Guava cache with a KeyLock to synchronize all requests
> with the same key, so that they run one at a time and each fails or
> succeeds on its own, as if they had arrived sequentially.
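
The per-key locking idea described above could be sketched roughly as follows. This is only an illustration, not Spark's actual KeyLock or the change made for this issue: the class name KeyLockedCache and its get method are hypothetical, a ConcurrentHashMap stands in for the Guava cache so the example needs no external dependency, and lock entries are never cleaned up.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

// Hypothetical sketch: a cache whose loads are serialized per key, so a
// failed load is seen only by the caller that ran it.
class KeyLockedCache<K, V> {
    // Stand-in for the underlying Guava cache (assumption for this sketch).
    private final ConcurrentHashMap<K, V> values = new ConcurrentHashMap<>();
    // One lock per key; never removed here, which a real version should handle.
    private final ConcurrentHashMap<K, ReentrantLock> locks = new ConcurrentHashMap<>();

    // The loader must return a non-null value.
    V get(K key, Supplier<V> loader) {
        V cached = values.get(key);
        if (cached != null) {
            return cached; // fast path: no locking on a hit
        }
        ReentrantLock lock = locks.computeIfAbsent(key, k -> new ReentrantLock());
        lock.lock();
        try {
            // Re-check: an earlier holder of the lock may have loaded the value.
            cached = values.get(key);
            if (cached != null) {
                return cached;
            }
            // If this throws, only the current caller fails. The next waiter
            // acquires the lock and runs its own loader.
            V loaded = loader.get();
            values.put(key, loaded);
            return loaded;
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        KeyLockedCache<String, Integer> cache = new KeyLockedCache<>();
        try {
            cache.get("k", () -> { throw new IllegalStateException("boom"); });
        } catch (IllegalStateException e) {
            System.out.println("first load failed: " + e.getMessage());
        }
        // A later request for the same key retries with its own loader.
        System.out.println("second load: " + cache.get("k", () -> 42));
    }
}
```

In this sketch, requests for the same missing key no longer share one load attempt: each waiter retries after the previous one finishes, at the cost of running loads for a hot key one at a time.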