We didn't split the design out into a separate doc since the main idea is relatively simple.

For the request/limit calculation, I described it in Q4 of the SPIP doc:
https://docs.google.com/document/d/1v5PQel1ygVayBFS8rdtzIH8l1el6H1TDjULD3EyBeIc/edit?tab=t.0#heading=h.q4vjslmnfuo0

It is calculated per resource profile (you could say per stage): when the cluster manager composes the pod spec, it calculates the new memory overhead based on what the user asks for in that resource profile.

On Mon, Dec 8, 2025 at 9:49 PM Wenchen Fan <[email protected]> wrote:

> Do we have a design sketch? How to determine the memory request and limit?
> Is it per stage or per executor?
>
> On Tue, Dec 9, 2025 at 1:40 PM Nan Zhu <[email protected]> wrote:
>
>> yeah, the implementation is basically relying on the request/limit
>> concept in K8S, ...
>>
>> but if there is any other cluster manager coming in future, as long as
>> it has a similar concept, it can leverage this easily as the main logic is
>> implemented in ResourceProfile
>>
>> On Mon, Dec 8, 2025 at 9:34 PM Wenchen Fan <[email protected]> wrote:
>>
>>> This feature is only available on k8s because it allows containers to
>>> have dynamic resources?
>>>
>>> On Mon, Dec 8, 2025 at 12:46 PM Yao <[email protected]> wrote:
>>>
>>>> Hi Folks,
>>>>
>>>> We are proposing a burst-aware memoryOverhead allocation algorithm for
>>>> Spark@K8S to improve memory utilization of spark clusters.
>>>> Please see more details in SPIP doc
>>>> <https://docs.google.com/document/d/1v5PQel1ygVayBFS8rdtzIH8l1el6H1TDjULD3EyBeIc/edit?tab=t.0>.
>>>> Feedback and discussion are welcome.
>>>>
>>>> Thanks Chao for being shepherd of this feature.
>>>> Also want to thank the authors of the original paper
>>>> <https://www.vldb.org/pvldb/vol17/p3759-shi.pdf> from ByteDance,
>>>> specifically Rui ([email protected]) and Yixin ([email protected]).
>>>>
>>>> Thank you.
>>>> Yao Wang
