Re: [PR] Add trace of consumers to OOM error messages [datafusion]

via GitHub Mon, 06 Oct 2025 16:24:07 -0700


wiedld commented on code in PR #17943:
URL: https://github.com/apache/datafusion/pull/17943#discussion_r2408421334



##########
datafusion/core/tests/memory_limit/mod.rs:
##########
@@ -374,8 +393,38 @@ async fn oom_parquet_sink() {
             path.to_string_lossy()
         ))
         .with_expected_errors(vec![
-            "Failed to allocate additional",
-            "for ParquetSink(ArrowColumnWriter)",
+            "Resources exhausted: Additional allocation failed for ParquetSink(ArrowColumnWriter(col=1)) with top memory consumers (across reservations) as:
+  ParquetSink(ArrowColumnWriter(col=8))#ID(can spill: false) consumed x KB, peak x KB:
+stack backtrace:
+   0: ParquetSink(ArrowColumnWriter(col=8))#ID(can spill: false) consumed x KB, peak x KB
+   1: ParquetSink(ParallelColumnWriters)#ID(can spill: false) consumed x B, peak x B
+   2: ParquetSink(ParallelWriter)#ID(can spill: false) consumed x B, peak x B
+,
+  ParquetSink(ArrowColumnWriter(col=14))#ID(can spill: false) consumed x KB, peak x KB:
+stack backtrace:
+   0: ParquetSink(ArrowColumnWriter(col=14))#ID(can spill: false) consumed x KB, peak x KB
+   1: ParquetSink(ParallelColumnWriters)#ID(can spill: false) consumed x B, peak x B
+   2: ParquetSink(ParallelWriter)#ID(can spill: false) consumed x B, peak x B
+,
+  ParquetSink(ArrowColumnWriter(col=0))#ID(can spill: false) consumed x KB, peak x KB:
+stack backtrace:
+   0: ParquetSink(ArrowColumnWriter(col=0))#ID(can spill: false) consumed x KB, peak x KB
+   1: ParquetSink(ParallelColumnWriters)#ID(can spill: false) consumed x B, peak x B
+   2: ParquetSink(ParallelWriter)#ID(can spill: false) consumed x B, peak x B
+,
+  ParquetSink(ArrowColumnWriter(col=2))#ID(can spill: false) consumed x KB, peak x KB:
+stack backtrace:
+   0: ParquetSink(ArrowColumnWriter(col=2))#ID(can spill: false) consumed x KB, peak x KB
+   1: ParquetSink(ParallelColumnWriters)#ID(can spill: false) consumed x B, peak x B
+   2: ParquetSink(ParallelWriter)#ID(can spill: false) consumed x B, peak x B
+,
+  ParquetSink(ArrowColumnWriter(col=1))#ID(can spill: false) consumed x KB, peak x KB:
+stack backtrace:
+   0: ParquetSink(ArrowColumnWriter(col=1))#ID(can spill: false) consumed x KB, peak x KB
+   1: ParquetSink(ParallelColumnWriters)#ID(can spill: false) consumed x B, peak x B
+   2: ParquetSink(ParallelWriter)#ID(can spill: false) consumed x B, peak x B
+.
+Error: Failed to allocate additional x KB for ParquetSink(ArrowColumnWriter(col=1)) with x KB already allocated for this reservation - x KB remain available for the total pool",

Review Comment:
   This is an example of using the parent/child relationship to build a trace of consumers.
   
   Currently, this approach is constrained by how memory reservations work: the parent's bytes (consumed and peak) do NOT include the cumulative bytes of its children. If that is desired, we can make the change using the `ReportedConsumer` snapshot (so that the lock is not held).
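
To illustrate the two points above, here is a minimal standalone sketch, not the code in this PR. `ConsumerSnapshot`, `consumer_trace`, and `cumulative_consumed` are hypothetical names invented for the example, and the snapshot is assumed to be a plain slice in which each entry stores the index of its parent and the parent links contain no cycles. It shows how a trace can be built by walking parent links, and how a cumulative figure could be derived from a snapshot after the fact rather than being stored on the parent.

```rust
/// Hypothetical snapshot of one memory consumer (an assumption for this
/// sketch, not DataFusion's actual type).
#[derive(Debug, Clone)]
struct ConsumerSnapshot {
    name: String,
    /// Index of the parent consumer in the snapshot slice, if any.
    parent: Option<usize>,
    /// Bytes currently reserved by this consumer alone (children excluded).
    consumed: usize,
    /// Highest value `consumed` has reached for this consumer alone.
    peak: usize,
}

/// Walk parent links from `start` up to the root, emitting one frame per
/// consumer. Frame 0 is the consumer itself, the last frame is the root.
fn consumer_trace(snapshots: &[ConsumerSnapshot], start: usize) -> Vec<String> {
    let mut frames = Vec::new();
    let mut current = Some(start);
    while let Some(idx) = current {
        let c = &snapshots[idx];
        let depth = frames.len();
        frames.push(format!(
            "{}: {} consumed {} B, peak {} B",
            depth, c.name, c.consumed, c.peak
        ));
        current = c.parent;
    }
    frames
}

/// True if `idx` is `root` or has `root` somewhere in its parent chain.
fn is_descendant_or_self(snapshots: &[ConsumerSnapshot], mut idx: usize, root: usize) -> bool {
    loop {
        if idx == root {
            return true;
        }
        match snapshots[idx].parent {
            Some(p) => idx = p,
            None => return false,
        }
    }
}

/// Sum `consumed` over `root` and all of its descendants. This is computed
/// from the snapshot only, so no pool lock would need to be held.
fn cumulative_consumed(snapshots: &[ConsumerSnapshot], root: usize) -> usize {
    snapshots
        .iter()
        .enumerate()
        .filter(|(idx, _)| is_descendant_or_self(snapshots, *idx, root))
        .map(|(_, c)| c.consumed)
        .sum()
}
```

In this sketch only `consumed` is aggregated. A cumulative peak cannot be recovered from per-consumer peaks alone, since the children may not have peaked at the same time; summing the peaks would give only an upper bound.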



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


