Pranav Dev created SPARK-55097:
----------------------------------

             Summary: Re-adding cached local relations using ref-counting 
drops blocks silently
                 Key: SPARK-55097
                 URL: https://issues.apache.org/jira/browse/SPARK-55097
             Project: Spark
          Issue Type: Bug
          Components: Connect, Spark Core
    Affects Versions: 4.1.1, 4.1.0
            Reporter: Pranav Dev
             Fix For: 4.2.0


After the introduction of the ref-counting logic for cloning sessions 
[[link|https://github.com/apache/spark/pull/52651]], whenever an identical 
cached artifact (same session, same hash) is re-added, the existing block is 
incorrectly deleted.

Verified this bug locally using:
{code:java}
test("re-adding the same cache artifact should not remove the block") {
    val blockManager = spark.sparkContext.env.blockManager
    val remotePath = Paths.get("cache/duplicate_hash")
    val blockId = CacheId(spark.sessionUUID, "duplicate_hash")
    try {
      // First addition
      withTempPath { path =>
        Files.write(path.toPath, "test".getBytes(StandardCharsets.UTF_8))
        artifactManager.addArtifact(remotePath, path.toPath, None)
      }
      assert(blockManager.getLocalBytes(blockId).isDefined)
      blockManager.releaseLock(blockId)
      // Second addition with same hash - block should still exist
      withTempPath { path =>
        Files.write(path.toPath, "test".getBytes(StandardCharsets.UTF_8))
        artifactManager.addArtifact(remotePath, path.toPath, None)
      }
      assert(blockManager.getLocalBytes(blockId).isDefined,
        "Block should still exist after re-adding the same cache artifact")
    } finally {
      blockManager.releaseLock(blockId)
      blockManager.removeCache(spark.sessionUUID)
    }
  } {code}
which fails the {{assert(blockManager.getLocalBytes(blockId).isDefined)}} 
check after the second addition with the same hash.
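For reference, the intended ref-counting semantics can be sketched in isolation. This is a minimal, self-contained illustration only; the class and method names below ({{RefCountedBlockStore}}, {{add}}, {{remove}}) are hypothetical and are not Spark's actual {{ArtifactManager}}/{{BlockManager}} API:

```scala
import scala.collection.mutable

// Hypothetical sketch of ref-counted cache semantics: re-adding an existing
// hash must only bump the ref-count, never drop and re-store the block.
class RefCountedBlockStore {
  private val blocks = mutable.Map.empty[String, Array[Byte]]
  private val refCounts = mutable.Map.empty[String, Int]

  def add(hash: String, data: Array[Byte]): Unit = {
    if (blocks.contains(hash)) {
      // Duplicate addition (same session, same hash): keep the existing block.
      refCounts(hash) += 1
    } else {
      blocks(hash) = data
      refCounts(hash) = 1
    }
  }

  // Removal decrements the count; the block is dropped only when it hits zero.
  def remove(hash: String): Unit = {
    refCounts.get(hash).foreach { n =>
      if (n <= 1) { blocks.remove(hash); refCounts.remove(hash) }
      else refCounts(hash) = n - 1
    }
  }

  def contains(hash: String): Boolean = blocks.contains(hash)
}
```

Under these semantics, the second {{addArtifact}} call in the repro above would leave the block in place, and the failing assertion would pass.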



--
This message was sent by Atlassian Jira
(v8.20.10#820010)