zhztheplayer opened a new pull request, #51849:
URL: https://github.com/apache/spark/pull/51849

   ### What changes were proposed in this pull request?
   
   Change the type of the field `MemoryConsumer#used` from `long` to `AtomicLong`.
   
   ### Why are the changes needed?
   
   `MemoryConsumer` is not internally thread-safe, so developers must add their own locking when multiple threads allocate memory concurrently in the same task.
   
   Consider multiple threads allocating memory in the same task (a special case under Spark's memory model). To keep `MemoryConsumer` thread-safe, the user has to lock around its API invocations. In that case, if one memory consumer spills another concurrently, there is a risk of an ABBA deadlock. E.g.,
   
   1. In thread 1, consumer A acquires memory from TMM (`TaskMemoryManager`).
   2. In thread 2, consumer B acquires memory from TMM and spills consumer A.
   
   The deadlock occurs when thread 1 holds consumer A's lock and waits for TMM's lock, while thread 2 holds TMM's lock and waits for consumer A's lock.
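
The lock-order inversion described above can be reproduced in isolation. The following is a minimal sketch, not Spark code: `consumerALock` and `tmmLock` are hypothetical stand-ins for the developer-added lock around consumer A and for TMM's internal lock, and `tryLock` with a timeout is used in place of a blocking `lock()` so that the demo terminates instead of hanging.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

final class AbbaDeadlockSketch {

  /** Returns how many of the two threads managed to take their second lock. */
  static int run(long timeoutMillis) {
    // Hypothetical stand-ins; neither is a real Spark object.
    ReentrantLock consumerALock = new ReentrantLock(); // developer-added lock around consumer A
    ReentrantLock tmmLock = new ReentrantLock();       // stand-in for TMM's internal lock
    CountDownLatch bothHoldFirst = new CountDownLatch(2);
    CountDownLatch bothAttempted = new CountDownLatch(2);
    boolean[] gotSecond = new boolean[2];

    // Thread 1: lock A, then TMM. Thread 2: lock TMM, then A (spilling A).
    Thread t1 = new Thread(() -> gotSecond[0] =
        attempt(consumerALock, tmmLock, bothHoldFirst, bothAttempted, timeoutMillis));
    Thread t2 = new Thread(() -> gotSecond[1] =
        attempt(tmmLock, consumerALock, bothHoldFirst, bothAttempted, timeoutMillis));
    t1.start();
    t2.start();
    try {
      t1.join();
      t2.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return (gotSecond[0] ? 1 : 0) + (gotSecond[1] ? 1 : 0);
  }

  private static boolean attempt(ReentrantLock first, ReentrantLock second,
      CountDownLatch bothHoldFirst, CountDownLatch bothAttempted, long timeoutMillis) {
    first.lock();
    try {
      // Wait until both threads hold their first lock.
      bothHoldFirst.countDown();
      bothHoldFirst.await();
      // With a plain lock() this call would block forever.
      boolean acquired = second.tryLock(timeoutMillis, TimeUnit.MILLISECONDS);
      if (acquired) {
        second.unlock();
      }
      // Keep the first lock until both attempts have finished.
      bothAttempted.countDown();
      bothAttempted.await();
      return acquired;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    } finally {
      first.unlock();
    }
  }

  public static void main(String[] args) {
    System.out.println("threads that acquired both locks: " + run(200));
  }
}
```

Neither thread can take its second lock while the other still holds it, so `run` reports zero successful acquisitions. With blocking `lock()` calls the same interleaving would never finish.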
   
   To fix this problem, Spark can make `MemoryConsumer` thread-safe by making `MemoryConsumer#used` atomic, so users don't have to add a lock in most cases.
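
As a rough illustration of why an atomic counter removes the need for an external lock, here is a minimal sketch. `SketchConsumer` and its `recordAcquired`/`recordFreed` methods are hypothetical simplifications and not Spark's actual `MemoryConsumer`; only the `AtomicLong used` field and the `getUsed()` name mirror what this PR describes.

```java
import java.util.concurrent.atomic.AtomicLong;

final class AtomicUsedSketch {

  /** Hypothetical, simplified stand-in for a memory consumer. */
  static final class SketchConsumer {
    private final AtomicLong used = new AtomicLong(0L);

    void recordAcquired(long bytes) {
      used.addAndGet(bytes);   // atomic read-modify-write, no lock required
    }

    void recordFreed(long bytes) {
      used.addAndGet(-bytes);
    }

    long getUsed() {
      return used.get();
    }
  }

  /** Each thread acquires 2 * bytes and frees bytes per iteration, without any lock. */
  static long runConcurrently(int threads, int iterations, long bytes) {
    SketchConsumer consumer = new SketchConsumer();
    Thread[] workers = new Thread[threads];
    for (int i = 0; i < threads; i++) {
      workers[i] = new Thread(() -> {
        for (int j = 0; j < iterations; j++) {
          consumer.recordAcquired(2 * bytes);
          consumer.recordFreed(bytes);
        }
      });
      workers[i].start();
    }
    try {
      for (Thread worker : workers) {
        worker.join();
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return consumer.getUsed();
  }

  public static void main(String[] args) {
    // Net usage is threads * iterations * bytes.
    System.out.println(runConcurrently(4, 10_000, 8L));
  }
}
```

With a plain `long` field and `used += bytes`, concurrent updates could be lost unless every call were wrapped in a lock, which is what opens the door to the lock-ordering problem.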
   
   ### Does this PR introduce _any_ user-facing change?
   
   Yes, a developer-facing change:
   
   ```
   protected long used;
   ```
   
   will become
   
   ```
   protected final AtomicLong used = new AtomicLong(0L);
   ```
   
   To handle this, developers who only need to read the value of `used` can call `getUsed()` instead. It works on all Spark versions, so no shim layer is needed for this change.
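
The sketch below illustrates the source-compatibility argument. `OldStyleConsumer` and `NewStyleConsumer` are hypothetical stand-ins for `MemoryConsumer` before and after this change, and the `acquire` method is an assumption made for the demo; only the `used` field declarations and `getUsed()` come from the description above.

```java
import java.util.concurrent.atomic.AtomicLong;

final class GetUsedSketch {

  /** Hypothetical stand-in for the base class before the change. */
  abstract static class OldStyleConsumer {
    protected long used;

    void acquire(long bytes) { used += bytes; }

    public long getUsed() { return used; }
  }

  /** Hypothetical stand-in for the base class after the change. */
  abstract static class NewStyleConsumer {
    protected final AtomicLong used = new AtomicLong(0L);

    void acquire(long bytes) { used.addAndGet(bytes); }

    public long getUsed() { return used.get(); }
  }

  // The describe() bodies are identical: reading through getUsed()
  // does not depend on the declared type of the `used` field.
  static final class OldBuffer extends OldStyleConsumer {
    String describe() { return "used=" + getUsed(); }
  }

  static final class NewBuffer extends NewStyleConsumer {
    String describe() { return "used=" + getUsed(); }
  }

  public static void main(String[] args) {
    OldBuffer oldBuffer = new OldBuffer();
    NewBuffer newBuffer = new NewBuffer();
    oldBuffer.acquire(64L);
    newBuffer.acquire(64L);
    System.out.println(oldBuffer.describe());
    System.out.println(newBuffer.describe());
  }
}
```

A subclass that reads the field directly (for example `long n = used;`) would compile only against the old declaration, which is the case the accessor avoids.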
   
   
   ### How was this patch tested?
   
   No tests are needed on the Spark side. A test case that emulates developers' calls can be added if preferred.
   
   
   ### Was this patch authored or co-authored using generative AI tooling?
   
   No.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

