JackieTien97 opened a new pull request, #17196:
URL: https://github.com/apache/iotdb/pull/17196

   When SharedTsBlockQueue.add() encounters memory pressure, it registers an
   async listener on a MemoryReservationFuture to add the TsBlock later. If
   the upstream FragmentInstance finishes and calls abort()/close() before the
   listener executes, the following race occurs:
   1. abort() sets closed=true, clears the queue, frees bufferRetainedSizeInBytes
   2. deRegisterFragmentInstanceFromMemoryPool removes the upstream FI's
      memory mapping
   3. The async listener fires and adds the TsBlock to the closed queue
   4. The downstream consumer calls remove() -> MemoryPool.free() with the
      upstream FI's IDs, but the mapping no longer exists -> NPE
   Fix: Check the `closed` flag inside the async listener before adding the
   TsBlock. When closed, skip the add (memory was already freed by
   abort/close) and complete channelBlocked to prevent hangs.
   Also add a unit test that reproduces this race condition by using a
   manually-controlled SettableFuture to simulate the blocked-on-memory path.
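
The race and the fix described above can be sketched in isolation. This is a minimal illustration, not the IoTDB code: `ClosedAwareQueueSketch` and its members are hypothetical stand-ins for SharedTsBlockQueue, a `String` stands in for a TsBlock, and the JDK's `CompletableFuture` is swapped in for both the MemoryReservationFuture and the SettableFuture used in the real test, so that the example has no dependencies.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.CompletableFuture;

// Hypothetical, simplified stand-in for SharedTsBlockQueue (not the IoTDB class).
class ClosedAwareQueueSketch {
  private final Queue<String> queue = new ArrayDeque<>();
  private boolean closed = false;

  // Adds a block, deferring the add until memoryReserved completes.
  // The returned future plays the role of channelBlocked.
  synchronized CompletableFuture<Void> add(String block, CompletableFuture<Void> memoryReserved) {
    if (closed) {
      return CompletableFuture.completedFuture(null);
    }
    if (memoryReserved.isDone()) {
      queue.add(block);
      return CompletableFuture.completedFuture(null);
    }
    // Memory pressure: register an async listener that adds the block later.
    CompletableFuture<Void> channelBlocked = new CompletableFuture<>();
    memoryReserved.thenRun(
        () -> {
          synchronized (this) {
            // The fix: re-check closed inside the listener. If abort() ran
            // first, skip the add because the memory was already released.
            if (!closed) {
              queue.add(block);
            }
          }
          // Always complete, so a producer waiting on the channel cannot hang.
          channelBlocked.complete(null);
        });
    return channelBlocked;
  }

  // Stand-in for abort()/close(): marks the queue closed and drops its content.
  synchronized void abort() {
    closed = true;
    queue.clear();
  }

  synchronized int size() {
    return queue.size();
  }

  public static void main(String[] args) {
    ClosedAwareQueueSketch q = new ClosedAwareQueueSketch();
    CompletableFuture<Void> memoryReserved = new CompletableFuture<>();
    CompletableFuture<Void> blocked = q.add("tsblock-1", memoryReserved); // blocked on memory
    q.abort();                     // upstream finishes before the listener runs
    memoryReserved.complete(null); // listener fires against the closed queue
    System.out.println("size=" + q.size() + ", unblocked=" + blocked.isDone());
  }
}
```

Completing the future by hand, as `main` does, forces the abort-before-listener ordering deterministically, which is the same idea as the manually controlled future in the unit test. Without the `closed` check in the listener, the block would be added to a queue whose memory accounting has already been torn down.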

