dbtsai commented on PR #56377:
URL: https://github.com/apache/spark/pull/56377#issuecomment-4654531119

   Pushed two more hardening fixes (now at `d53de1ca572`) from a follow-up 
lifecycle review. Both touch the `CleanerCache` / cache-removal paths added 
after the approval, so a re-look would be appreciated. cc @viirya @cloud-fan
   
   **1. Listener leak when registering a runner for an already-closing 
session.** `CleanerCache.registerCleanerForQuery` touched the 
`streamingListener` lazy val (which does `session.streams.addListener(...)`) 
before checking `isClosing`. That listener is not tracked in `listenerCache`, 
so `close()`'s `removeAllListeners()` never removes it — registering 
during/after close would leak a listener that keeps the 
`CleanerCache`/`SessionHolder` reachable. Added a fast path before listener 
init:
   
   ```scala
   if (sessionHolder.isClosing) {
     cleaner.close()
     return
   }
   ```
   
   The post-insert `isClosing` re-check remains for the narrow race window 
where the session starts closing mid-registration. Test now also asserts the 
fast path registers no listener.
   
   **2. Async cache removal could drop a replacement entry.** The 
closing-session branch in `SparkConnectStreamingQueryCache` removed 
unconditionally:
   
   ```scala
   queryCache.remove(queryKey)
   ```
   
   Since this runs asynchronously and `compute` explicitly allows replacing an 
entry for the same key, it could evict a later replacement. Switched to a 
conditional remove so only the exact entry we inserted is dropped:
   
   ```scala
   if (queryCache.remove(queryKey, value)) {
     tags.foreach { tag => removeTaggedQuery(tag, queryKey) }
   }
   ```
   
   The common path is unchanged: an active query is inserted with `expiresAtMs 
= None`, which `getCachedQuery` never mutates, so the conditional remove still 
succeeds.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to