On November 21, 2025 at 03:20, "Michal Hocko" <[email protected]> wrote:
>
> On Thu 20-11-25 09:29:52, [email protected] wrote:
> [...]
> > > I generally agree with an idea to use BPF for various memcg-related
> > > policies, but I'm not sure how specific callbacks can be used in
> > > practice.
> >
> > Hi Roman,
> >
> > Following are some ideas that can use ebpf memcg:
> >
> > Priority‑Based Reclaim and Limits in Multi‑Tenant Environments:
> >   On a single machine with multiple tenants / namespaces / containers,
> >   under memory pressure it’s hard to decide “who should be squeezed first”
> >   with static policies baked into the kernel.
> >   Assign a BPF profile to each tenant’s memcg:
> >   Under high global pressure, BPF can decide:
> >     Which memcgs’ memory.high should be raised (delaying reclaim),
> >     Which memcgs should be scanned and reclaimed more aggressively.
> >
> > Online Profiling / Diagnosing Memory Hotspots:
> >   A cgroup’s memory keeps growing, but without patching the kernel it’s
> >   difficult to obtain fine‑grained information.
> >   Attach BPF to the memcg charge/uncharge path:
> >     Record large allocations (greater than N KB) with call stacks and
> >     owning file/module, and send them to user space via a BPF ring buffer.
> >   Based on sampled data, generate:
> >     “Top N memory allocation stacks in this container over the last 10
> >     minutes,”
> >     Reports of which objects / call paths are growing fastest.
> >   This makes it possible to pinpoint the root cause of host memory
> >   anomalies without changing application code, which is very useful
> >   in operations/ops scenarios.
> >
> > SLO‑Driven Auto Throttling / Scale‑In/Out Signals:
> >   Use eBPF to observe memory usage slope, frequent reclaim,
> >   or near‑OOM behavior within a memcg.
> >   When it decides “OOM is imminent,” instead of just killing/raising
> >   limits, it can emit a signal to a control‑plane component.
> >   For example, send an event to a user‑space agent to trigger
> >   automatic scaling, QPS adjustment, or throttling.
> >
> > Prevent a cgroup from launching a large‑scale fork+malloc attack:
> >   BPF checks per‑uid or per‑cgroup allocation behavior over the
> >   last few seconds during memcg charge.
> >
> AFAIU, these are just very high level ideas rather than anything you are
> trying to target with this patch series, right?
>
> All I can see is that you add a reclaim hook but it is not really clear
> to me how feasible it is to actually implement a real memory reclaim
> strategy this way.
>
> In principle I am not really opposed but the memory reclaim process is
> a rather involved process and I would really like to see there is
> something real to be done without exporting all the MM code to BPF for
> any practical use. Is there any POC out there?

Hi Michal,

I apologize for not delivering a more substantial POC.

I was hesitant to add extensive eBPF support to memcg because I wasn't
certain it aligned with the community's vision, and such support would
require introducing many eBPF hooks into memcg.

I will add more eBPF hooks to memcg and provide a more meaningful POC in
the next version.

Best,
Hui

> --
> Michal Hocko
> SUSE Labs
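As a rough illustration of the "SLO-driven signal" idea quoted above, here
is a minimal user-space sketch in Python of how an agent might turn sampled
memcg usage into an "OOM is imminent" decision. It is not part of the patch
series and says nothing about the kernel-side hooks. The function names, the
sample format, and the 30-second horizon are all assumptions made for
illustration; the samples could come from polling memory.current or from
events a BPF program pushes to user space, and the limit would be whatever
memory.max (or memory.high) the agent cares about.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical sample format: (timestamp in seconds, memcg usage in bytes).
Sample = Tuple[float, int]


def usage_slope(samples: Sequence[Sample]) -> float:
    """Least-squares slope of usage over time, in bytes per second."""
    n = len(samples)
    if n < 2:
        return 0.0
    mean_t = sum(t for t, _ in samples) / n
    mean_u = sum(u for _, u in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        # All samples share one timestamp, so no slope can be estimated.
        return 0.0
    cov = sum((t - mean_t) * (u - mean_u) for t, u in samples)
    return cov / var_t


def seconds_until_limit(samples: Sequence[Sample],
                        limit_bytes: int) -> Optional[float]:
    """Projected time until usage reaches limit_bytes.

    Returns None when there are no samples or usage is not growing.
    """
    if not samples:
        return None
    slope = usage_slope(samples)
    if slope <= 0:
        return None
    headroom = limit_bytes - samples[-1][1]
    return max(headroom, 0) / slope


def should_signal(samples: Sequence[Sample], limit_bytes: int,
                  horizon_s: float = 30.0) -> bool:
    """True if the linear projection hits the limit within horizon_s.

    horizon_s is an assumed tuning knob, not a value from this thread.
    """
    eta = seconds_until_limit(samples, limit_bytes)
    return eta is not None and eta <= horizon_s
```

An agent could call should_signal() on a sliding window of recent samples
and, when it returns True, notify the control plane to scale out or
throttle instead of waiting for the OOM killer. A linear fit is only the
simplest possible predictor; a real policy would likely also look at
reclaim frequency, as the quoted idea suggests.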

