wiedld commented on code in PR #17943:
URL: https://github.com/apache/datafusion/pull/17943#discussion_r2408421334
##########
datafusion/core/tests/memory_limit/mod.rs:
##########
@@ -374,8 +393,38 @@ async fn oom_parquet_sink() {
path.to_string_lossy()
))
.with_expected_errors(vec![
- "Failed to allocate additional",
- "for ParquetSink(ArrowColumnWriter)",
+ "Resources exhausted: Additional allocation failed for ParquetSink(ArrowColumnWriter(col=1)) with top memory consumers (across reservations) as:
+ ParquetSink(ArrowColumnWriter(col=8))#ID(can spill: false) consumed x KB, peak x KB:
+stack backtrace:
+ 0: ParquetSink(ArrowColumnWriter(col=8))#ID(can spill: false) consumed x KB, peak x KB
+ 1: ParquetSink(ParallelColumnWriters)#ID(can spill: false) consumed x B, peak x B
+ 2: ParquetSink(ParallelWriter)#ID(can spill: false) consumed x B, peak x B
+,
+ ParquetSink(ArrowColumnWriter(col=14))#ID(can spill: false) consumed x KB, peak x KB:
+stack backtrace:
+ 0: ParquetSink(ArrowColumnWriter(col=14))#ID(can spill: false) consumed x KB, peak x KB
+ 1: ParquetSink(ParallelColumnWriters)#ID(can spill: false) consumed x B, peak x B
+ 2: ParquetSink(ParallelWriter)#ID(can spill: false) consumed x B, peak x B
+,
+ ParquetSink(ArrowColumnWriter(col=0))#ID(can spill: false) consumed x KB, peak x KB:
+stack backtrace:
+ 0: ParquetSink(ArrowColumnWriter(col=0))#ID(can spill: false) consumed x KB, peak x KB
+ 1: ParquetSink(ParallelColumnWriters)#ID(can spill: false) consumed x B, peak x B
+ 2: ParquetSink(ParallelWriter)#ID(can spill: false) consumed x B, peak x B
+,
+ ParquetSink(ArrowColumnWriter(col=2))#ID(can spill: false) consumed x KB, peak x KB:
+stack backtrace:
+ 0: ParquetSink(ArrowColumnWriter(col=2))#ID(can spill: false) consumed x KB, peak x KB
+ 1: ParquetSink(ParallelColumnWriters)#ID(can spill: false) consumed x B, peak x B
+ 2: ParquetSink(ParallelWriter)#ID(can spill: false) consumed x B, peak x B
+,
+ ParquetSink(ArrowColumnWriter(col=1))#ID(can spill: false) consumed x KB, peak x KB:
+stack backtrace:
+ 0: ParquetSink(ArrowColumnWriter(col=1))#ID(can spill: false) consumed x KB, peak x KB
+ 1: ParquetSink(ParallelColumnWriters)#ID(can spill: false) consumed x B, peak x B
+ 2: ParquetSink(ParallelWriter)#ID(can spill: false) consumed x B, peak x B
+.
+Error: Failed to allocate additional x KB for ParquetSink(ArrowColumnWriter(col=1)) with x KB already allocated for this reservation - x KB remain available for the total pool",
Review Comment:
This is an example of using the parent/child relationship to build a trace
of consumers.
This approach is currently limited by how memory reservations work: the
parent's bytes (consumed and peak) do NOT include the cumulative bytes of its
children. If that is desired, we can make this change using the snapshot
`ReportedConsumer` (so that we do not hold the lock).
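
To make the limitation concrete, here is a minimal, standalone sketch of the two ideas: walking parent links to build a trace, and rolling child bytes up into ancestors from a snapshot. It does not use DataFusion's actual types; `ConsumerSnapshot`, `trace`, `cumulative`, and `example_snapshots` are hypothetical names for illustration, and the byte values are made up.

```rust
use std::collections::HashMap;

/// Hypothetical point-in-time copy of one consumer (NOT DataFusion's type).
/// Working on copies means no pool lock is held while reporting.
#[derive(Debug, Clone)]
struct ConsumerSnapshot {
    id: usize,
    name: String,
    parent_id: Option<usize>,
    /// Bytes reserved by this consumer only, excluding its children.
    reserved: usize,
}

/// Walk from `id` up through its ancestors, returning names child-first.
/// Assumes the parent links are acyclic.
fn trace(snapshots: &HashMap<usize, ConsumerSnapshot>, id: usize) -> Vec<String> {
    let mut out = Vec::new();
    let mut current = snapshots.get(&id);
    while let Some(c) = current {
        out.push(c.name.clone());
        current = c.parent_id.and_then(|p| snapshots.get(&p));
    }
    out
}

/// Add each consumer's own bytes to itself and to every ancestor, so a
/// parent's total includes all of its descendants.
fn cumulative(snapshots: &HashMap<usize, ConsumerSnapshot>) -> HashMap<usize, usize> {
    let mut totals: HashMap<usize, usize> = HashMap::new();
    for c in snapshots.values() {
        let mut current = Some(c);
        while let Some(node) = current {
            *totals.entry(node.id).or_insert(0) += c.reserved;
            current = node.parent_id.and_then(|p| snapshots.get(&p));
        }
    }
    totals
}

/// Illustrative data shaped like the trace in the expected error above.
fn example_snapshots() -> HashMap<usize, ConsumerSnapshot> {
    let rows = [
        (0, "ParquetSink(ParallelWriter)", None, 0),
        (1, "ParquetSink(ParallelColumnWriters)", Some(0), 0),
        (2, "ParquetSink(ArrowColumnWriter(col=1))", Some(1), 2048),
        (3, "ParquetSink(ArrowColumnWriter(col=8))", Some(1), 1024),
    ];
    rows.into_iter()
        .map(|(id, name, parent_id, reserved)| {
            let snap = ConsumerSnapshot { id, name: name.to_string(), parent_id, reserved };
            (id, snap)
        })
        .collect()
}
```

With today's behavior the two parents in this data would report 0 bytes, matching the "consumed x B" lines in the trace; with a rollup like `cumulative`, each would report the sum over its column writers.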
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]