adriangb opened a new pull request, #20631:
URL: https://github.com/apache/datafusion/pull/20631

   ## Which issue does this PR close?
   
   N/A — discovered while running benchmarks with `bench.sh`.
   
   ## Rationale for this change
   
   When running benchmarks via `bench.sh` / `dfbench`, setting 
`DATAFUSION_RUNTIME_MEMORY_LIMIT=2G` has no effect on memory pool enforcement. 
Most `DATAFUSION_*` env vars work because `SessionConfig::from_env()` picks 
them up, but the memory limit is a special case: it requires constructing a 
`MemoryPool` in the `RuntimeEnv`, which `dfbench` only did when 
`--memory-limit` was passed as a CLI flag.
   
   ## What changes are included in this PR?
   
   In `runtime_env_builder()`, when `self.memory_limit` (CLI flag) is `None`, 
fall back to reading the `DATAFUSION_RUNTIME_MEMORY_LIMIT` env var using the 
existing `parse_memory_limit()` function. The CLI flag still takes precedence 
when provided.
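
The precedence rule described above can be sketched roughly as follows. This is a minimal, standalone illustration and not the code in this PR: `resolve_memory_limit` is a hypothetical helper, and the `parse_memory_limit` shown here is a simplified stand-in (assumed to accept a number followed by `K`, `M`, or `G`) rather than the existing function in the benchmarks crate.

```rust
use std::env;

/// Simplified stand-in for the existing `parse_memory_limit()` (assumption:
/// a number followed by a single `K`, `M`, or `G` unit, e.g. "2G" or "1.5M").
fn parse_memory_limit(limit: &str) -> Result<usize, String> {
    let limit = limit.trim();
    // Require at least one digit plus a unit; ASCII-only keeps `split_at` safe.
    if limit.len() < 2 || !limit.is_ascii() {
        return Err(format!("invalid memory limit '{limit}'"));
    }
    let (number, unit) = limit.split_at(limit.len() - 1);
    let number: f64 = number
        .parse()
        .map_err(|_| format!("invalid number in memory limit '{limit}'"))?;
    if !number.is_finite() || number < 0.0 {
        return Err(format!("memory limit '{limit}' must be a non-negative number"));
    }
    let multiplier = match unit {
        "K" => 1024.0,
        "M" => 1024.0 * 1024.0,
        "G" => 1024.0 * 1024.0 * 1024.0,
        _ => return Err(format!("unsupported unit '{unit}' in memory limit '{limit}'")),
    };
    Ok((number * multiplier) as usize)
}

/// Hypothetical helper: the CLI value wins when present; otherwise the raw
/// env var value (if any) is parsed; otherwise no limit is configured.
fn resolve_memory_limit(
    cli_limit: Option<usize>,
    env_value: Option<&str>,
) -> Result<Option<usize>, String> {
    match (cli_limit, env_value) {
        (Some(bytes), _) => Ok(Some(bytes)),
        (None, Some(raw)) => parse_memory_limit(raw).map(Some),
        (None, None) => Ok(None),
    }
}

fn main() {
    // Read the env var once and pass it in, so the resolution logic itself
    // does not depend on process-global state.
    let env_value = env::var("DATAFUSION_RUNTIME_MEMORY_LIMIT").ok();
    let cli_limit: Option<usize> = None; // as if `--memory-limit` was not passed
    match resolve_memory_limit(cli_limit, env_value.as_deref()) {
        Ok(Some(bytes)) => println!("memory pool limit: {bytes} bytes"),
        Ok(None) => println!("no memory limit configured"),
        Err(e) => eprintln!("invalid memory limit: {e}"),
    }
}
```

Passing the env var value in as a parameter, rather than reading it inside the helper, is a design choice made only for this sketch: it lets the precedence logic be checked without mutating the process environment.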
   
   ## Are these changes tested?
   
   Yes — added `test_runtime_env_builder_reads_env_var` which sets the env var, 
constructs a `CommonOpt` with no CLI memory limit, and verifies the resulting 
`RuntimeEnv` has a `Finite(2GB)` memory pool.
   
   ## Are there any user-facing changes?
   
   `dfbench` now honors the `DATAFUSION_RUNTIME_MEMORY_LIMIT` environment 
variable as a fallback when `--memory-limit` is not passed on the command line. 
No breaking changes — existing CLI flag behavior is unchanged.
   
   🤖 Generated with [Claude Code](https://claude.com/claude-code)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

