andygrove opened a new pull request, #1431: URL: https://github.com/apache/datafusion-ballista/pull/1431
## Summary

- Keeps one `StreamWriter` open per output partition in `SpillManager`, appending across multiple spill calls instead of creating a new file each time
- Reduces file descriptor usage and eliminates redundant IPC header/footer overhead from multiple files
- Streams spill data back batch-by-batch via `open_spill_reader()` instead of loading all spilled batches into memory at once

## Test plan

- [x] Existing sort shuffle unit tests pass (`cargo test -p ballista-core sort_shuffle`, 15 tests)
- [x] `cargo clippy` and `cargo fmt` clean

🤖 Generated with [Claude Code](https://claude.com/claude-code)
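For readers who want a concrete picture of the pattern in the Summary, here is a minimal std-only sketch, not the code in this PR. It assumes length-prefixed byte buffers as a stand-in for Arrow record batches and a `BufWriter<File>` as a stand-in for the IPC `StreamWriter`. The struct layout, the `part-N.spill` file naming, and the `open_files` helper are hypothetical. Only the names `SpillManager` and `open_spill_reader` come from the description above, and their real signatures may differ.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;
use std::fs::File;
use std::io::{self, BufReader, BufWriter, Read, Write};
use std::path::PathBuf;

/// Hypothetical stand-in for `SpillManager`: one writer per output
/// partition, kept open and appended to on every spill call.
pub struct SpillManager {
    dir: PathBuf,
    writers: HashMap<usize, BufWriter<File>>,
}

impl SpillManager {
    pub fn new(dir: PathBuf) -> io::Result<Self> {
        std::fs::create_dir_all(&dir)?;
        Ok(Self { dir, writers: HashMap::new() })
    }

    // Assumed naming scheme: one spill file per partition.
    fn path(&self, partition: usize) -> PathBuf {
        self.dir.join(format!("part-{partition}.spill"))
    }

    /// Append one batch to the partition's single spill file.
    /// The file is created on the first spill and reused afterwards.
    pub fn spill(&mut self, partition: usize, batch: &[u8]) -> io::Result<()> {
        let path = self.path(partition);
        let writer = match self.writers.entry(partition) {
            Entry::Occupied(e) => e.into_mut(),
            Entry::Vacant(e) => e.insert(BufWriter::new(File::create(path)?)),
        };
        // Length prefix so the reader can find batch boundaries.
        writer.write_all(&(batch.len() as u64).to_le_bytes())?;
        writer.write_all(batch)
    }

    /// Number of spill files currently held open (one per spilled partition).
    pub fn open_files(&self) -> usize {
        self.writers.len()
    }

    /// Flush the partition's writer, then return a lazy reader over its file.
    pub fn open_spill_reader(&mut self, partition: usize) -> io::Result<SpillReader> {
        if let Some(writer) = self.writers.get_mut(&partition) {
            writer.flush()?;
        }
        let file = File::open(self.path(partition))?;
        Ok(SpillReader { input: BufReader::new(file) })
    }
}

/// Yields one batch per `next()` call, so only one batch is held in memory.
pub struct SpillReader {
    input: BufReader<File>,
}

impl Iterator for SpillReader {
    type Item = io::Result<Vec<u8>>;

    fn next(&mut self) -> Option<Self::Item> {
        let mut len = [0u8; 8];
        match self.input.read_exact(&mut len) {
            Ok(()) => {}
            // End of file (or a truncated prefix) ends the stream.
            Err(e) if e.kind() == io::ErrorKind::UnexpectedEof => return None,
            Err(e) => return Some(Err(e)),
        }
        let mut batch = vec![0u8; u64::from_le_bytes(len) as usize];
        match self.input.read_exact(&mut batch) {
            Ok(()) => Some(Ok(batch)),
            Err(e) => Some(Err(e)),
        }
    }
}
```

In this sketch, spilling partition 0 twice and partition 1 once leaves two files open rather than three, and iterating the reader for partition 0 returns its two batches in write order, one at a time.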
