[ 
https://issues.apache.org/jira/browse/SPARK-54596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-54596:
----------------------------------
    Affects Version/s: 4.2.0
                           (was: 4.2)

> Burst-aware memoryOverhead Allocation Algorithm for Spark@K8S
> -------------------------------------------------------------
>
>                 Key: SPARK-54596
>                 URL: https://issues.apache.org/jira/browse/SPARK-54596
>             Project: Spark
>          Issue Type: Improvement
>          Components: Kubernetes, Spark Core
>    Affects Versions: 4.2.0
>            Reporter: Nan Zhu
>            Priority: Major
>
> {{memoryOverhead}} is one of the largest contributors to memory consumption 
> in Spark workloads. To avoid being killed by the OOM Killer in cluster 
> environments, users typically reserve a large {{memoryOverhead}} budget. 
> However, the actual usage pattern of this space—covering 
> serialization/deserialization buffers, compression/decompression workspaces, 
> direct memory allocations, etc.—is highly bursty. As a result, users must 
> provision for peak {{memoryOverhead}} even though most Spark tasks utilize 
> only a small fraction of it for the majority of their lifespan.
>
> To address this inefficiency in {*}Spark on Kubernetes{*}, we implemented the 
> algorithm proposed in the paper _“Towards Resource Efficiency: Practical 
> Insights into Large-Scale Spark Workloads at ByteDance”_ (VLDB 2024). Our 
> approach introduces a configurable parameter that splits a user’s requested 
> {{memoryOverhead}} into two segments:
>  * *Guaranteed memoryOverhead* — exclusive to each Spark pod, ensuring 
> baseline reliability.
>  * *Shared memoryOverhead* — pooled and time-shared across multiple pods to 
> absorb bursty usage.
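>
> To make the split concrete, here is a minimal Scala sketch assuming a single hypothetical 
> fraction knob; the actual configuration key and its default are not named in this ticket:
> {code:scala}
> // Hypothetical knob: the fraction of the user-requested memoryOverhead that is
> // guaranteed (exclusive to the pod); the remainder becomes the shared segment.
> val requestedOverheadMiB = 4096L   // user-requested memoryOverhead, in MiB
> val guaranteedFraction = 0.5       // hypothetical configuration value
>
> val guaranteedOverheadMiB = (requestedOverheadMiB * guaranteedFraction).toLong
> val sharedOverheadMiB = requestedOverheadMiB - guaranteedOverheadMiB
> {code}
>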
> In the Kubernetes environment, this translates to the following resource 
> settings for each Spark executor pod:
>  * *memory request* = user-requested on-heap memory + guaranteed 
> {{memoryOverhead}}
>  * *memory limit* = user-requested on-heap memory + guaranteed 
> {{memoryOverhead}} + shared {{memoryOverhead}}
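>
> Continuing that sketch, the pod-level values could be derived roughly as below. The 
> {{ResourceRequirementsBuilder}} and {{Quantity}} classes come from the fabric8 Kubernetes 
> client used by the Spark K8s backend; all variable names and numbers are illustrative:
> {code:scala}
> import io.fabric8.kubernetes.api.model.{Quantity, ResourceRequirementsBuilder}
>
> // Values carried over from the sketch above (all in MiB); names are illustrative.
> val onHeapMiB = 8192L              // user-requested on-heap executor memory
> val guaranteedOverheadMiB = 2048L  // guaranteed segment of memoryOverhead
> val sharedOverheadMiB = 2048L      // shared (time-multiplexed) segment
>
> // memory request = on-heap memory + guaranteed memoryOverhead
> val requestMiB = onHeapMiB + guaranteedOverheadMiB
> // memory limit = memory request + shared memoryOverhead
> val limitMiB = requestMiB + sharedOverheadMiB
>
> val executorMemoryResources = new ResourceRequirementsBuilder()
>   .addToRequests("memory", new Quantity(s"${requestMiB}Mi"))
>   .addToLimits("memory", new Quantity(s"${limitMiB}Mi"))
>   .build()
> {code}
> Because the Kubernetes scheduler bin-packs pods on their requests while the limit only caps 
> what a pod may actually use, the shared segment is effectively time-multiplexed across the 
> pods on a node, which is what absorbs the bursts.
>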
> This design preserves reliability while substantially improving cluster-level 
> memory utilization by avoiding the over-provisioning traditionally required 
> for worst-case {{memoryOverhead}} peaks.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
