andygrove opened a new pull request, #1626:
URL: https://github.com/apache/datafusion-ballista/pull/1626

   # Which issue does this PR close?
   
   Closes #.
   
   # Rationale for this change
   
   In the sort-based shuffle writer, `spill_count` previously counted batches 
written across all output partitions, so a single memory-pressure flush 
inflated the count by one per non-empty partition, which made the metric hard 
to interpret. There was also no log line marking when a spill happened, so 
spill activity was difficult to correlate with executor memory pressure.
   
   # What changes are included in this PR?
   
   - `spill_count` metric now counts spill *events* (one per memory-pressure 
flush in `SortShuffleWriterExec`).
   - Renamed `SpillManager::total_spills` to `total_spilled_batches` for 
clarity; it remains the per-batch counter and is included in the existing 
completion log line.
   - Added an INFO log on each spill event with per-event batches/bytes and 
cumulative totals.
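
To illustrate the difference between the two counters, here is a minimal sketch. It is not the code from this PR: `SpillStats`, `spill_events`, and `record_flush` are hypothetical names, and treating a flush that writes nothing as a non-event is an assumption. Only the name `total_spilled_batches` comes from the description above.

```rust
/// Hypothetical stand-in for spill bookkeeping; not Ballista's actual types.
#[derive(Debug, Default)]
struct SpillStats {
    /// Incremented once per memory-pressure flush (the new `spill_count` meaning).
    spill_events: usize,
    /// Incremented once per batch written to a spill file (the old meaning).
    total_spilled_batches: usize,
}

impl SpillStats {
    /// Record one flush. `batches_per_partition[i]` is the number of batches
    /// written for output partition `i` during this flush.
    /// Returns the number of batches written by this flush.
    fn record_flush(&mut self, batches_per_partition: &[usize]) -> usize {
        let batches: usize = batches_per_partition.iter().sum();
        // Assumption: a flush that writes no batches is not counted as an event.
        if batches == 0 {
            return 0;
        }
        self.spill_events += 1;
        self.total_spilled_batches += batches;
        batches
    }
}
```

Under these assumptions, two flushes that write one batch to each of three non-empty partitions yield `spill_events == 2` and `total_spilled_batches == 6`, so the event counter can never exceed the batch counter.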
   
   # Are there any user-facing changes?
   
   `spill_count` semantics changed from "batches written to spill files" to 
"spill events". The new value is never greater than the old one. Plan output 
that prints this metric will show smaller numbers on jobs that spill.



