sunchao commented on PR #56293:
URL: https://github.com/apache/spark/pull/56293#issuecomment-4628304268

   @viirya Thanks for the careful review. I addressed the two concrete 
maintainability points in `f72fa825533`:
   
   - `MemoryConsumer.spill()` now documents that implementations must release 
memory synchronously on the calling thread and must not acquire task memory, 
directly or from another thread.
   - The PR description now explicitly calls out the extra pointer-capacity 
check on shuffle insertion instead of saying successful allocations are 
entirely unchanged.
   
   I kept the `ThreadLocal` guard rather than switching to a consumer-keyed 
set. The current guard intentionally covers any nested page allocation on the 
synchronous recovery call stack, including allocations by a consumer other than 
the one currently being spilled. A consumer-keyed set would require predicting 
and registering every possible nested consumer, while allowing an asynchronous 
spill worker to acquire task memory would conflict with the spill contract and 
can reintroduce blocking/deadlock behavior. The two guard helpers also cannot 
currently nest: each checks the guard before setting it, and the recovery loop 
invokes only one guarded path at a time.
   
   This update also fixes an adversarial shuffle regression found during 
follow-up validation: with `spark.shuffle.sort.initialBufferSize=1`, the prior 
head could create one spill per record. Successful pointer growth is now 
retained, with a one-time minimum-array fallback if it starves the first data 
page. The 4,097-record test now produces 12 TimSort spills and 14 radix-sort 
spills instead of 4,097, and radix no longer OOMs. The focused and broader 
memory/shuffle suites pass (68 tests), as does Checkstyle.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to