wiedld commented on code in PR #17213:
URL: https://github.com/apache/datafusion/pull/17213#discussion_r2289603495
##########
datafusion-examples/examples/memory_pool_tracking.rs:
##########
@@ -0,0 +1,191 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+//! This example demonstrates how to use TrackConsumersPool for memory tracking and debugging.
+//!
+//! The TrackConsumersPool provides enhanced error messages that show the top memory consumers
+//! when memory allocation fails, making it easier to debug memory issues in DataFusion queries.
+//!
+//! # Examples
+//!
+//! * [`automatic_usage_example`]: Shows how to use RuntimeEnvBuilder to automatically enable memory tracking
+//! * [`manual_tracking_example`]: Shows how to manually create and configure a TrackConsumersPool
+
+use datafusion::execution::memory_pool::{
+    GreedyMemoryPool, MemoryConsumer, MemoryPool, TrackConsumersPool,
+};
+use datafusion::execution::runtime_env::RuntimeEnvBuilder;
+use datafusion::prelude::*;
+use std::num::NonZeroUsize;
+use std::sync::Arc;
+
+#[tokio::main]
+async fn main() -> Result<(), Box<dyn std::error::Error>> {

Review Comment:
   Output is:
   ```
   === DataFusion Memory Pool Tracking Example ===

   Example 1: Automatic Usage with RuntimeEnvBuilder
   ------------------------------------------------
   Success case: Normal operation with sufficient memory
   ✓ Created table with memory tracking enabled
   ✓ Query executed successfully. Found 1 rows

   --------------------------------------------------
   Error case: Triggering memory limit error with detailed error messages
   ✓ Expected memory limit error during data processing:
   Error: Not enough memory to continue external sort. Consider increasing the memory limit, or decreasing sort_spill_reservation_bytes
   caused by
   Resources exhausted: Additional allocation failed with top memory consumers (across reservations) as:
     ExternalSorter[12]#152(can spill: true) consumed 0.0 B,
     ExternalSorter[9]#143(can spill: true) consumed 0.0 B,
     HashJoinInput[7]#136(can spill: false) consumed 0.0 B,
     RepartitionExec[0]#94(can spill: false) consumed 0.0 B,
     RepartitionExec[7]#126(can spill: false) consumed 0.0 B.
   Error: Failed to allocate additional 10.0 MB for ExternalSorterMerge[0] with 0.0 B already allocated for this reservation - 4.8 MB remain available for the total pool

   Note: The error message above shows which memory consumers
   were using the most memory when the limit was exceeded.

   ============================================================

   Example 2: Manual Memory Consumer Tracking
   ------------------------------------------
   Created TrackConsumersPool with 1KB limit, tracking top 3 consumers
   BigOperation: reserved 400 bytes
   MediumOperation: reserved 200 bytes
   SmallOperation: reserved 100 bytes

   Total reserved: 700 bytes out of 1000 byte limit

   Attempting to reserve 500 bytes (would exceed limit)...
   Expected failure with detailed error message:
   Resources exhausted: Additional allocation failed with top memory consumers (across reservations) as:
     BigOperation#157(can spill: false) consumed 400.0 B,
     MediumOperation#158(can spill: false) consumed 200.0 B,
     SmallOperation#159(can spill: false) consumed 100.0 B.
   Error: Failed to allocate additional 500.0 B for FailingOperation with 0.0 B already allocated for this reservation - 300.0 B remain available for the total pool

   ============================================================
   ```
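
For readers who want to check the arithmetic in the Example 2 output (400 + 200 + 100 = 700 bytes reserved against a 1000-byte limit, so a 500-byte request fails with 300 bytes remaining), here is a toy model of that accounting using only the standard library. It is a sketch, not DataFusion's `TrackConsumersPool`: `ToyTrackedPool`, `try_grow`, `available`, and `top_consumers` are hypothetical names chosen for illustration, and the error text only loosely imitates the real message.

```rust
use std::collections::HashMap;

/// Toy stand-in for a size-limited pool that remembers who reserved what.
/// Hypothetical type for illustration; this is NOT DataFusion's API.
struct ToyTrackedPool {
    limit: usize,
    used: usize,
    by_consumer: HashMap<String, usize>,
    top: usize,
}

impl ToyTrackedPool {
    fn new(limit: usize, top: usize) -> Self {
        Self { limit, used: 0, by_consumer: HashMap::new(), top }
    }

    /// Bytes still free in the pool.
    fn available(&self) -> usize {
        self.limit - self.used
    }

    /// The `top` largest consumers, biggest first (ties broken by name).
    fn top_consumers(&self) -> Vec<(String, usize)> {
        let mut all: Vec<(String, usize)> = self
            .by_consumer
            .iter()
            .map(|(name, bytes)| (name.clone(), *bytes))
            .collect();
        all.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
        all.truncate(self.top);
        all
    }

    /// Reserve `bytes` for `consumer`, or fail with a message that lists
    /// the current top consumers and the remaining capacity.
    fn try_grow(&mut self, consumer: &str, bytes: usize) -> Result<(), String> {
        if bytes > self.available() {
            let top = self
                .top_consumers()
                .iter()
                .map(|(name, used)| format!("{name} consumed {used} B"))
                .collect::<Vec<_>>()
                .join(", ");
            return Err(format!(
                "top consumers: {top}. Failed to allocate additional {bytes} B \
                 for {consumer} - {} B remain available",
                self.available()
            ));
        }
        self.used += bytes;
        *self.by_consumer.entry(consumer.to_string()).or_insert(0) += bytes;
        Ok(())
    }
}

fn main() {
    // Same numbers as the Example 2 output: 1000-byte limit, top 3 consumers.
    let mut pool = ToyTrackedPool::new(1000, 3);
    for (name, bytes) in [
        ("BigOperation", 400),
        ("MediumOperation", 200),
        ("SmallOperation", 100),
    ] {
        pool.try_grow(name, bytes).expect("fits under the limit");
    }

    // 500 > 300 remaining, so this request is rejected and nothing changes.
    let err = pool.try_grow("FailingOperation", 500).unwrap_err();
    println!("{err}");
}
```

The failed request leaves the pool untouched, which is why the reported remaining capacity is still 300 bytes after the error.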