On Fri, Apr 16, 2021 at 7:28 PM Peter Zijlstra <[email protected]> wrote:
>
> On Fri, Apr 16, 2021 at 11:29:30AM +0200, Peter Zijlstra wrote:
>
> > > So I think we've had proposals for being able to close fds in the past;
> > > while preserving groups etc. We've always pushed back on that because of
> > > the resource limit issue. By having each counter be a filedesc we get a
> > > natural limit on the amount of resources you can consume. And in that
> > > respect, having to use 400k fds is things working as designed.
> > >
> > > Anyway, there might be a way around this..
>
> So how about we flip the whole thing sideways, instead of doing one
> event for multiple cgroups, do an event for multiple-cpus.
>
> Basically, allow:
>
>         perf_event_open(.pid=fd, cpu=-1, .flag=PID_CGROUP);
>
> Which would have the kernel create nr_cpus events [the corollary is that
> we'd probably also allow: (.pid=-1, cpu=-1) ].

Do you mean it'd have separate perf_events per cpu internally?
From a cpu's perspective, there's nothing changed, right?
Then it will have the same performance problem as of now.

>
> Output could be done by adding FORMAT_PERCPU, which takes the current
> read() format and writes a copy for each CPU event. (p)read(v)() could
> be used to explode or partial read that.

Yeah, I think it's good for read.  But what about mmap?
I don't think we can use file offset since it's taken for auxtrace.
Maybe we can simply disallow that..

>
> This gets rid of the nasty variadic nature of the
> 'get-me-these-n-cgroups'. While still getting rid of the n*m fd issue
> you're facing.

As I said, it's not just a file descriptor problem.
In fact, performance is more concerning.

Thanks,
Namhyung
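
For reference, here is a minimal sketch of how cgroup events are opened with
the existing interface, which is where the n*m fd issue discussed above comes
from: one perf_event_open() call, and so one fd, per event per cgroup per CPU.
This is an illustration only, not code from this thread. The helper names
open_cgroup_cycles() and fds_needed() are made up for the example, and the
cycles counter is an arbitrary choice.

```c
#define _GNU_SOURCE
#include <linux/perf_event.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* glibc has no wrapper for perf_event_open(2), so go through syscall(2). */
static long sys_perf_event_open(struct perf_event_attr *attr, pid_t pid,
                                int cpu, int group_fd, unsigned long flags)
{
        return syscall(SYS_perf_event_open, attr, pid, cpu, group_fd, flags);
}

/*
 * Hypothetical helper: open a cycles counter for one cgroup on one CPU.
 * With PERF_FLAG_PID_CGROUP the pid argument carries an fd opened on the
 * cgroup directory, and cpu has to name a real CPU (cpu == -1 is rejected
 * today).  Returns the event fd, or -1 with errno set.
 */
int open_cgroup_cycles(int cgroup_fd, int cpu)
{
        struct perf_event_attr attr;

        memset(&attr, 0, sizeof(attr));
        attr.size = sizeof(attr);
        attr.type = PERF_TYPE_HARDWARE;
        attr.config = PERF_COUNT_HW_CPU_CYCLES;

        return (int)sys_perf_event_open(&attr, cgroup_fd, cpu, -1,
                                        PERF_FLAG_PID_CGROUP);
}

/*
 * Hypothetical helper: number of fds the existing interface needs, one
 * per (event, cgroup, cpu) combination.
 */
unsigned long fds_needed(unsigned long n_events, unsigned long n_cgroups,
                         unsigned long n_cpus)
{
        return n_events * n_cgroups * n_cpus;
}
```

A caller would loop over every cgroup fd and every online CPU and call
open_cgroup_cycles() for each pair. With the proposal quoted above the fd
count would no longer scale with the number of CPUs, but as noted in the
reply, the kernel would presumably still keep one event per CPU internally.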

