ivankelly opened a new issue, #15028: URL: https://github.com/apache/datafusion/issues/15028
### Describe the bug

As discussed on Discord, here's another external sort use case that's failing.

Repro: https://github.com/ivankelly/df-repro

To run:

```
$ bash setup.sh # download the source data
$ RUST_LOG=trace cargo run
...
Error: Resources exhausted: Failed to allocate additional 1450451 bytes for ParquetSink(ArrowColumnWriter) with 62770337 bytes already allocated for this reservation - 1107184 bytes remain available for the total pool
```

The code reads a set of Parquet files (889MB in total), sorts them, and tries to write the result to a single Parquet file. Memory is limited to 100MB. Changing the batch size or the number of target partitions doesn't help.

### To Reproduce

_No response_

### Expected behavior

_No response_

### Additional context

_No response_

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org