[
https://issues.apache.org/jira/browse/SPARK-55097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Herman van Hövell reassigned SPARK-55097:
-----------------------------------------
Assignee: Pranav Dev
> Re-adding cached local relations using ref-counting drops blocks silently
> --------------------------------------------------------------------------
>
> Key: SPARK-55097
> URL: https://issues.apache.org/jira/browse/SPARK-55097
> Project: Spark
> Issue Type: Bug
> Components: Connect, Spark Core
> Affects Versions: 4.1.0, 4.1.1
> Reporter: Pranav Dev
> Assignee: Pranav Dev
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.2.0
>
> Original Estimate: 168h
> Remaining Estimate: 168h
>
> After the introduction of the ref-counting logic for cloning sessions
> [[link]|https://github.com/apache/spark/pull/52651], re-adding an identical
> cached artifact (same session, same hash) incorrectly leads to
> deletion of the existing block.
> I verified this bug locally with the following test:
> {code:java}
> test("re-adding the same cache artifact should not remove the block") {
>   val blockManager = spark.sparkContext.env.blockManager
>   val remotePath = Paths.get("cache/duplicate_hash")
>   val blockId = CacheId(spark.sessionUUID, "duplicate_hash")
>   try {
>     // First addition
>     withTempPath { path =>
>       Files.write(path.toPath, "test".getBytes(StandardCharsets.UTF_8))
>       artifactManager.addArtifact(remotePath, path.toPath, None)
>     }
>     assert(blockManager.getLocalBytes(blockId).isDefined)
>     blockManager.releaseLock(blockId)
>
>     // Second addition with same hash - block should still exist
>     withTempPath { path =>
>       Files.write(path.toPath, "test".getBytes(StandardCharsets.UTF_8))
>       artifactManager.addArtifact(remotePath, path.toPath, None)
>     }
>     assert(blockManager.getLocalBytes(blockId).isDefined,
>       "Block should still exist after re-adding the same cache artifact")
>   } finally {
>     blockManager.releaseLock(blockId)
>     blockManager.removeCache(spark.sessionUUID)
>   }
> }
> {code}
> The test fails at the {{assert(blockManager.getLocalBytes(blockId).isDefined)}}
> check after the second addition with the same hash.
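
For readers unfamiliar with this class of bug, the following is a minimal, self-contained sketch of how a ref-counted cache can silently drop a block when the same key is re-added. It is not Spark's implementation: ToyBlockStore, ToyRefCountedCache, RefCountSketch, and the buggy flag are hypothetical names invented for illustration, and the failure mode shown (releasing the "old" reference after the new block was stored under the same id) is only one plausible way to get the behavior the report describes.

```scala
import scala.collection.mutable

// Hypothetical stand-in for a block store; NOT Spark's BlockManager.
final class ToyBlockStore {
  private val blocks = mutable.Map.empty[String, Array[Byte]]
  def put(id: String, bytes: Array[Byte]): Unit = blocks(id) = bytes
  def remove(id: String): Unit = { blocks.remove(id); () }
  def contains(id: String): Boolean = blocks.contains(id)
}

// Hypothetical ref-counted cache keyed by block id (think: session + hash).
// buggy = true models one assumed failure mode, not the actual Spark code path.
final class ToyRefCountedCache(store: ToyBlockStore, buggy: Boolean) {
  private val refCounts = mutable.Map.empty[String, Int]

  // Drops one reference; removes the block when the last reference goes away.
  def release(id: String): Unit = refCounts.get(id).foreach { n =>
    if (n <= 1) { refCounts.remove(id); store.remove(id) }
    else refCounts(id) = n - 1
  }

  def add(id: String, bytes: Array[Byte]): Unit = {
    val alreadyPresent = refCounts.contains(id)
    if (alreadyPresent && !buggy) {
      // Assumed fix: an identical re-add is a no-op, the existing block stays.
      ()
    } else {
      store.put(id, bytes)
      // Buggy path: the previous entry is treated as replaced and released.
      // Both entries share one block id, so this deletes the block just stored.
      if (alreadyPresent) release(id)
      refCounts(id) = refCounts.getOrElse(id, 0) + 1
    }
  }
}

object RefCountSketch {
  // Adds the same (id, content) twice and reports whether the block survived.
  def blockSurvivesReAdd(buggy: Boolean): Boolean = {
    val store = new ToyBlockStore
    val cache = new ToyRefCountedCache(store, buggy)
    val bytes = "test".getBytes("UTF-8")
    cache.add("duplicate_hash", bytes)
    cache.add("duplicate_hash", bytes)
    store.contains("duplicate_hash")
  }
}
```

In this toy model, RefCountSketch.blockSurvivesReAdd(buggy = true) returns false while the cache still records a reference count of 1 for the id, which mirrors the silent drop in the failing test above. With buggy = false the second add leaves the block in place.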