erenavsarogullari opened a new pull request, #20387:
URL: https://github.com/apache/datafusion/pull/20387

   ## Which issue does this PR close?
   - Closes #20386.
   
   ## Rationale for this change
   Setting `memory_limit` (`RuntimeEnvBuilder::new().with_memory_limit()`) 
uses the `greedy` memory pool by default. If `memory_pool` 
(`RuntimeEnvBuilder::new().with_memory_pool()`) is set, it overrides that 
default with the configured pool, such as `fair`. If neither `memory_limit` 
nor `memory_pool` is set, the `unbounded` memory pool is used. It is therefore 
useful to expose the pool that was ultimately selected in the Resources 
Exhausted error message: the user can see which pool was in effect and may 
decide to switch to another one (`greedy`, `fair`, `unbounded`).
   - Also, [this comparison 
table](https://github.com/lance-format/lance/issues/3601#issuecomment-2752838168)
 is an example use case comparing the runtime behavior of the greedy and fair 
memory pools. Exposing the pool in use as part of the native logs makes this 
kind of comparison easier.
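The selection rules described above can be sketched as a small, self-contained function. This is only an illustration of the described behavior, not DataFusion's implementation: `PoolKind` and `effective_pool` are hypothetical names introduced here, and the real builder works with pool objects rather than an enum.

```rust
/// Hypothetical enum standing in for the three pool types named in this PR.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PoolKind {
    Greedy,
    Fair,
    Unbounded,
}

impl PoolKind {
    /// Short name as it would appear in an error message.
    pub fn name(self) -> &'static str {
        match self {
            PoolKind::Greedy => "greedy",
            PoolKind::Fair => "fair",
            PoolKind::Unbounded => "unbounded",
        }
    }
}

/// Resolve which pool ends up in use (assumed rules, taken from the text above):
/// - an explicitly configured pool always wins;
/// - a memory limit without an explicit pool implies the greedy pool;
/// - with neither set, the pool is unbounded.
pub fn effective_pool(memory_limit: Option<usize>, explicit_pool: Option<PoolKind>) -> PoolKind {
    match (explicit_pool, memory_limit) {
        (Some(pool), _) => pool,
        (None, Some(_)) => PoolKind::Greedy,
        (None, None) => PoolKind::Unbounded,
    }
}
```

For example, `effective_pool(Some(10 * 1024 * 1024), None).name()` yields `"greedy"`, which is the pool name a user would want to see when a query fails under a plain memory limit.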
   
   **Case1**: datafusion-cli result when `memory-limit` and 
`top-memory-consumers > 0` are set:
   ```
   eren.avsarogullari@AWGNPWVK961 debug % ./datafusion-cli --memory-limit 10M --command 'select * from generate_series(1,500000) as t1(v1) order by v1;' --top-memory-consumers 3
   DataFusion CLI v52.1.0
   Error: Not enough memory to continue external sort. Consider increasing the memory limit config: 'datafusion.runtime.memory_limit', or decreasing the config: 'datafusion.execution.sort_spill_reservation_bytes'.
   caused by
   Resources exhausted: Additional allocation failed for ExternalSorter[0] with top memory consumers (across reservations) using 'greedy' pool as:
     ExternalSorterMerge[0]#2(can spill: false) consumed 10.0 MB, peak 10.0 MB,
     DataFusion-Cli#0(can spill: false) consumed 0.0 B, peak 0.0 B,
     ExternalSorter[0]#1(can spill: true) consumed 0.0 B, peak 0.0 B.
   Error: Failed to allocate additional 128.0 KB for ExternalSorter[0] with 0.0 B already allocated for this reservation - 0.0 B remain available for the total 'greedy' pool
   ```
   **Case2**: datafusion-cli result when `memory-limit` and 
`top-memory-consumers = 0` (disabling top memory consumers logging) are set:
   ```
   eren.avsarogullari@AWGNPWVK961 debug % ./datafusion-cli --memory-limit 10M --command 'select * from generate_series(1,500000) as t1(v1) order by v1;' --top-memory-consumers 0
   DataFusion CLI v52.1.0
   Error: Not enough memory to continue external sort. Consider increasing the memory limit config: 'datafusion.runtime.memory_limit', or decreasing the config: 'datafusion.execution.sort_spill_reservation_bytes'.
   caused by
   Resources exhausted: Failed to allocate additional 128.0 KB for ExternalSorter[0] with 0.0 B already allocated for this reservation - 0.0 B remain available for the total 'greedy' pool
   ```
   **Case3**: datafusion-cli result when `memory-limit`, `mem-pool-type` 
(`fair`) and `top-memory-consumers > 0` are set:
   ```
   eren.avsarogullari@AWGNPWVK961 debug % ./datafusion-cli --memory-limit 10M --mem-pool-type fair --top-memory-consumers 3 --command 'select * from generate_series(1,500000) as t1(v1) order by v1;'
   DataFusion CLI v52.1.0
   Error: Not enough memory to continue external sort. Consider increasing the memory limit config: 'datafusion.runtime.memory_limit', or decreasing the config: 'datafusion.execution.sort_spill_reservation_bytes'.
   caused by
   Resources exhausted: Additional allocation failed for ExternalSorter[0] with top memory consumers (across reservations) using 'fair' pool as:
     ExternalSorterMerge[0]#2(can spill: false) consumed 10.0 MB, peak 10.0 MB,
     DataFusion-Cli#0(can spill: false) consumed 0.0 B, peak 0.0 B,
     ExternalSorter[0]#1(can spill: true) consumed 0.0 B, peak 0.0 B.
   Error: Failed to allocate additional 128.0 KB for ExternalSorter[0] with 0.0 B already allocated for this reservation - 0.0 B remain available for the total 'fair' pool
   ```
   
   ## What changes are included in this PR?
   - Add a `name` property to `MemoryPool` instances.
   - Expose the name of the `MemoryPool` in use in Resources Exhausted error messages.
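As a rough illustration of the two changes listed above, the sketch below gives a pool a name and includes that name in the allocation-failure message. It is a simplified stand-in, not the code in this PR: `NamedPool`, `GreedyLikePool` and `ResourcesExhausted` are hypothetical types, sizes are printed as raw bytes, and only the wording of the message tail follows the CLI output shown earlier.

```rust
use std::fmt;

/// Hypothetical error type carrying the formatted message.
#[derive(Debug, PartialEq)]
pub struct ResourcesExhausted {
    pub message: String,
}

impl fmt::Display for ResourcesExhausted {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "Resources exhausted: {}", self.message)
    }
}

/// Hypothetical, simplified pool trait: every pool reports its own name.
pub trait NamedPool {
    fn name(&self) -> &str;
    fn try_grow(&mut self, consumer: &str, additional: usize) -> Result<(), ResourcesExhausted>;
}

/// First-come-first-served pool with a fixed byte limit (assumption: this
/// only approximates what a "greedy" pool does).
pub struct GreedyLikePool {
    limit: usize,
    used: usize,
}

impl GreedyLikePool {
    pub fn new(limit: usize) -> Self {
        GreedyLikePool { limit, used: 0 }
    }
}

impl NamedPool for GreedyLikePool {
    fn name(&self) -> &str {
        "greedy"
    }

    fn try_grow(&mut self, consumer: &str, additional: usize) -> Result<(), ResourcesExhausted> {
        let available = self.limit.saturating_sub(self.used);
        if additional > available {
            // The pool name is appended so the user can tell which pool rejected the request.
            return Err(ResourcesExhausted {
                message: format!(
                    "Failed to allocate additional {} B for {} - {} B remain available for the total '{}' pool",
                    additional,
                    consumer,
                    available,
                    self.name()
                ),
            });
        }
        self.used += additional;
        Ok(())
    }
}
```

Keeping the name on the pool itself means the error path needs no extra configuration lookup, so whichever pool was ultimately selected is the one that gets reported.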
   
   ## Are these changes tested?
   Yes, by updating the existing test cases.
   
   ## Are there any user-facing changes?
   Yes, the Resources Exhausted error messages are updated to include the name of the memory pool in use.
   

