2010YOUY01 commented on PR #14975: URL: https://github.com/apache/datafusion/pull/14975#issuecomment-2739748374
@alamb Thanks for the review, I agree there are two things we can improve: 1. Create a new struct for spilling related utilities, instead of putting it inside `DiskManager` 2. Easier-to-use builder for `DiskManager` for configuring `max_temp_dicrectory_size` (@Standing-Man already start helping in https://github.com/apache/datafusion/issues/15319, thank you 🙏🏼 ) I also found another limitation in this spill interface, for certain cases we want: ``` let in_progress_temp_file = spiller.create_in_progress_temp_file(); spiller.append(batch); ... spiller.append(batch); let temp_file = spiller.finish(); ``` Instead of always writing batches to a spill file at once. I plan to address point 1 and the above-mentioned limitation in another PR, and after that rework this PR to add disk limit feature. So closed this PR for now. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org