dbtsai commented on PR #56377:
URL: https://github.com/apache/spark/pull/56377#issuecomment-4654554381
One more hardening fix (now at `437c783d81a`). cc @viirya @cloud-fan
The async closing-session cleanup previously removed by
`queryCache.remove(queryKey, value)`. Since case-class equality includes
`expiresAtMs`, that removal could fail if the maintenance thread observed the
just-stopped query first and rewrote the entry to `value.copy(expiresAtMs =
Some(...))` -- leaving the closed session strongly referenced in the cache
until the stopped-query inactivity timeout (the query is stopped, so this is
not the "runs forever" leak, just a delayed reference release).
Switched to removing by **query identity**, which is robust to a concurrent
`expiresAtMs` rewrite while still never evicting a later replacement for the
same key:
```scala
val removed = new AtomicBoolean(false)
queryCache.computeIfPresent(queryKey, (_, current) => {
if (current.query eq query) { removed.set(true); null } else current
})
if (removed.get()) {
tags.foreach { tag => removeTaggedQuery(tag, queryKey) }
}
```
No new test: the change is correct by construction (identity match) and the
existing closing-session / stop-failure / race tests still cover the paths; the
specific maintenance-rewrite interleaving isn't deterministically reproducible
in a unit test.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]