turboFei opened a new pull request, #56781:
URL: https://github.com/apache/spark/pull/56781

   ### What changes were proposed in this pull request?
   
   Add `export MALLOC_ARENA_MAX="${MALLOC_ARENA_MAX:-4}"` to Spark's Kubernetes 
container entrypoint script (`entrypoint.sh`), placed just before the command 
dispatch (`case "$1" in`).
   
   The value defaults to 4 but respects any pre-set value in the container 
environment, allowing operators to override it without rebuilding images.
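
   As a rough illustration of the defaulting behavior described above, the 
sketch below wraps the same `export` line in a helper and runs it with the 
variable unset, preset, and set to an empty string. The helper name 
`resolve_arena_max` is hypothetical and is not part of `entrypoint.sh`.

```shell
#!/bin/sh
# resolve_arena_max is a hypothetical helper used only for illustration.
# It runs in a subshell so the caller's environment is left untouched.
resolve_arena_max() {
  (
    if [ "$#" -gt 0 ]; then
      MALLOC_ARENA_MAX="$1"      # simulate a value preset in the container env
    else
      unset MALLOC_ARENA_MAX     # simulate a container with no preset value
    fi
    # Same line as the one proposed for entrypoint.sh.
    export MALLOC_ARENA_MAX="${MALLOC_ARENA_MAX:-4}"
    printf '%s\n' "$MALLOC_ARENA_MAX"
  )
}

resolve_arena_max       # unset         -> prints 4
resolve_arena_max 2     # preset to 2   -> prints 2
resolve_arena_max ""    # set but empty -> prints 4
```

   Note that `:-` treats an empty value the same as an unset one, so an 
operator cannot use an empty string to remove the cap.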
   
   ### Why are the changes needed?
   
   `MALLOC_ARENA_MAX` limits the number of glibc malloc arenas. Without this 
cap, glibc scales the arena limit with the CPU count (by default up to 8 arenas 
per core on 64-bit systems), which can cause excessive virtual memory usage on 
hosts with many cores. This is a well-known problem in containerized 
environments where cgroup memory limits are strictly enforced.
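
   To give a sense of scale, the sketch below estimates the default arena 
limit for a few core counts. It assumes the 64-bit glibc default of 8 arenas 
per core; the helper name `default_arena_cap` is hypothetical, and the number 
of arenas actually created depends on thread contention at runtime.

```shell
#!/bin/sh
# default_arena_cap is a hypothetical helper, not part of glibc or Spark.
# Assumption: 64-bit glibc, where the default arena limit is 8 x cores.
default_arena_cap() {
  # $1: number of CPU cores visible to the process
  echo $(( $1 * 8 ))
}

for cores in 4 16 64; do
  printf '%s cores: default limit %s arenas, limit with this change 4\n' \
    "$cores" "$(default_arena_cap "$cores")"
done
```

   On a real host, `default_arena_cap "$(nproc)"` gives the same estimate for 
the local machine.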
   
   Hadoop already sets `MALLOC_ARENA_MAX=4` in its YARN startup scripts 
(`hadoop-functions.sh`). Spark on YARN therefore inherits this setting, but 
Spark on Kubernetes does not, which creates inconsistent memory allocation 
behavior across deployment modes. This change closes that gap.
   
   ### Does this PR introduce _any_ user-facing change?
   
   No. The default behavior is equivalent to what YARN users already 
experience. Users who need a different value can override it by setting 
`MALLOC_ARENA_MAX` in their pod spec environment.
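
   Besides the pod spec, one way to inject the variable is through Spark's 
own environment settings at submit time. The sketch below only prints the 
relevant `spark-submit` flags; it assumes the documented 
`spark.kubernetes.driverEnv.[Name]` and `spark.executorEnv.[Name]` properties, 
and the helper name `arena_max_conf_flags` is hypothetical.

```shell
#!/bin/sh
# arena_max_conf_flags is a hypothetical helper used only for illustration.
# It prints the --conf flags that would set MALLOC_ARENA_MAX for both the
# driver and the executors; it does not invoke spark-submit itself.
arena_max_conf_flags() {
  printf '%s %s\n' \
    "--conf spark.kubernetes.driverEnv.MALLOC_ARENA_MAX=$1" \
    "--conf spark.executorEnv.MALLOC_ARENA_MAX=$1"
}

arena_max_conf_flags 2
```

   Because the entrypoint uses `:-`, a value injected this way would be kept 
rather than replaced by the default of 4.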
   
   ### How was this patch tested?
   
   No automated test was added, because the change is a single shell variable 
export with no logic to unit-test. Manual verification: confirmed in a local 
shell session that the variable is exported correctly and that a pre-set value 
is preserved (not overwritten) by the `:-` default syntax.
   
   ### Was this patch authored or co-authored using generative AI tooling?
   
   Generated-by: Claude Sonnet 4.6


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

