Re: [I] External sort failing with modest memory limit when writing parquet files [datafusion]

via GitHub Tue, 16 Sep 2025 05:10:26 -0700


alamb commented on issue #15028:
URL: https://github.com/apache/datafusion/issues/15028#issuecomment-3298250229


   @rluvaton  -- I reviewed this issue and agree that the sort problem seems to 
have been solved, so I'll close the issue. Nice work!
   
   
   
   > and it failed with this (no sort this time after always spilling last level)
   > 
   > ```
   > Resources exhausted: Additional allocation failed with top memory consumers (across reservations) as:
   >   ParquetSink(ArrowColumnWriter)#6(can spill: false) consumed 95.0 MB, peak 95.0 MB,
   >   ParquetSink(ArrowColumnWriter)#9(can spill: false) consumed 4.2 MB, peak 4.2 MB,
   >   ParquetSink(ArrowColumnWriter)#10(can spill: false) consumed 204.7 KB, peak 204.7 KB,
   >   ParquetSink(ArrowColumnWriter)#5(can spill: false) consumed 95.2 KB, peak 95.2 KB,
   >   ParquetSink(ArrowColumnWriter)#7(can spill: false) consumed 256.0 B, peak 256.0 B,
   >   ParquetSink(ArrowColumnWriter)#8(can spill: false) consumed 256.0 B, peak 256.0 B,
   >   ParquetSink(SerializedFileWriter)#4(can spill: false) consumed 0.0 B, peak 0.0 B.
   > ```
   
   Given this description, this appears to be related to memory usage while 
writing to parquet, and it is not clear whether it is a bug or just that 
DataFusion requires more memory to write 8 files in parallel than it was given.
   
   I suggest we file a new ticket for optimizing the memory usage for this case 
(writing to parquet) if that is important for anyone.
   
   There are some hints on memory tuning here: 
https://datafusion.apache.org/user-guide/configs.html#memory-limited-queries 
(basically set `target_partitions` to something lower) that might also help.
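   
   As a rough sketch of that hint, the partition count can be lowered for a session with a SQL `SET` statement, for example in `datafusion-cli`. The value `2` below is only an illustrative assumption, not a number recommended in this thread:
   
   ```sql
   -- Lower the number of partitions used for execution.
   -- The value 2 is a placeholder; tune it against your memory limit.
   SET datafusion.execution.target_partitions = 2;
   ```
   
   A lower value may reduce how much data is buffered concurrently, at the cost of less parallelism.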
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

