On Thu, 26 Mar 2026 at 17:26, Samuel Wu <[email protected]> wrote:
>
> On Thu, Mar 26, 2026 at 8:02 AM Alexei Starovoitov
> <[email protected]> wrote:
> >
> > On Thu, Mar 26, 2026 at 7:54 AM Kumar Kartikeya Dwivedi
> > <[email protected]> wrote:
> > >
> > > On Thu, 26 Mar 2026 at 13:20, Puranjay Mohan <[email protected]> wrote:
> > > >
> > > > Samuel Wu <[email protected]> writes:
> > > >
> > > > > This patchset adds the requisite kfuncs for BPF programs to safely
> > > > > traverse wakeup_sources, and puts a config flag around the sysfs
> > > > > interface.
> > > > >
> > > > > Currently, a traversal of wakeup sources requires going through
> > > > > /sys/class/wakeup/* or /d/wakeup_sources/*. The repeated syscalls to
> > > > > query sysfs are inefficient, as there can be hundreds of
> > > > > wakeup_sources, with each wakeup source also having multiple
> > > > > attributes. debugfs is unstable and insecure.
> > > > >
> > > > > Adding kfuncs to lock/unlock wakeup sources allows BPF programs to
> > > > > safely traverse the wakeup sources list. The head address of
> > > > > wakeup_sources can safely be resolved through BPF helper functions
> > > > > or variable attributes.
> > > > >
> > > > > On a quiescent Pixel 6 traversing 150 wakeup_sources, I am seeing a
> > > > > ~34x speedup (sampled 75 times in the table below). For a device
> > > > > under load, the speedup is greater.
> > > > > +-------+----+----------+----------+
> > > > > |       | n  | AVG (ms) | STD (ms) |
> > > > > +-------+----+----------+----------+
> > > > > | sysfs | 75 | 44.9     | 12.6     |
> > > > > +-------+----+----------+----------+
> > > > > | BPF   | 75 | 1.3      | 0.7      |
> > > > > +-------+----+----------+----------+
> > > > >
> > > > > The initial attempts at BPF traversal of wakeup_sources were with
> > > > > BPF iterators [1]. However, BPF already allows traversal of a simple
> > > > > list with bpf_for(), and the current patchset has the added benefit
> > > > > of being ~2-3x more performant than BPF iterators.
> > > >
> > > > I left some inline comments on patch 1, but the high level concern is
> > > > that encoding the SRCU index into a fake pointer to get KF_ACQUIRE/
> > > > KF_RELEASE tracking is working against the verifier rather than with it.
> > > > Nothing actually prevents a BPF program from walking the list without
> > > > the lock, and the whole pointer encoding trick goes away if this is done
> > > > as an open-coded iterator instead.
> > >
> > > Which is fine; the critical section is only doing CO-RE accesses, and
> > > the SRCU lock is just there to be able to read things in a valid state
> > > while walking the list. It is all best-effort.
> > > Open-coded iterators were already explored as an option in earlier
> > > iterations of the series and discarded as a no-go.
> >
> > kinda best-effort...
> > The way it's written, bpf_wakeup_sources_get_head() returns a
> > trusted list_head. It's then CO-RE-read anyway.
> > Ideally it should be trusted only within that SRCU critical section
> > and invalidated by the verifier, similar to KF_RCU_PROTECTED,
> > but that's a bigger task.
> > Instead, let's make bpf_wakeup_sources_get_head() return 'void *',
> > so it's clearly untrusted.
>
> Thanks all for the fruitful discussion; this approach is more rigorous.
> I'll update v3 so that `bpf_wakeup_sources_get_head()` returns
> `void *`, and I'll add a corresponding selftest that directly
> dereferences the head and expects a verifier failure.

You could also use bpf_core_cast() instead of using macros to read every
field; it should be equivalent. You may still need the macros for
bitfields, but it should work otherwise.
