On Wed, 1 Apr 2026 at 11:11, Greg Kroah-Hartman <[email protected]> wrote: > > On Tue, Mar 31, 2026 at 08:34:10AM -0700, Samuel Wu wrote: > > Iterating through wakeup sources via sysfs or debugfs can be inefficient > > or restricted. Introduce BPF kfuncs to allow high-performance and safe > > in-kernel traversal of the wakeup_sources list. > > What exactly is "inefficient"? I think you might have some numbers in > your 0/2 patch, but putting it in here would be best. > > And who is going to be calling these functions, just ebpf scripts? > > > The new kfuncs include: > > - bpf_wakeup_sources_get_head() to obtain the list head. > > - bpf_wakeup_sources_read_lock/unlock() to manage the SRCU lock. > > Does this mean we can stop exporting wakeup_sources_read_lock() now? > > > For verifier safety, the underlying SRCU index is wrapped in an opaque > > 'struct bpf_ws_lock' pointer. This enables the use of KF_ACQUIRE and > > KF_RELEASE flags, allowing the BPF verifier to strictly enforce paired > > lock/unlock cycles and prevent resource leaks. > > But it's an index, not a lock. Is this just a verifier thing?
It's a verifier thing. The index must be passed to SRCU unlock wrapped by the unlock kfunc for correctness. The verifier understands such acquire/release tracking for pointers (e.g., taking refcount and putting it), but not for scalar values, so we need to launder it through a pointer to an empty struct, which isn't really usable except for passing it eventually to unlock. If the program doesn't do the unlock, the verifier will reject it. > > > > > Signed-off-by: Samuel Wu <[email protected]> > > --- > > drivers/base/power/power.h | 7 ++++ > > drivers/base/power/wakeup.c | 72 +++++++++++++++++++++++++++++++++++-- > > 2 files changed, 77 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/base/power/power.h b/drivers/base/power/power.h > > index 922ed457db19..8823aceeac8b 100644 > > --- a/drivers/base/power/power.h > > +++ b/drivers/base/power/power.h > > @@ -168,3 +168,10 @@ static inline void device_pm_init(struct device *dev) > > device_pm_sleep_init(dev); > > pm_runtime_init(dev); > > } > > + > > +#ifdef CONFIG_BPF_SYSCALL > > +struct bpf_ws_lock { }; > > An empty structure? This is just an int, so you are casting an int to a > pointer? Can we make wakeup_sources_read_lock() actually use a > structure instead to make this simpler? See above. > > > +struct bpf_ws_lock *bpf_wakeup_sources_read_lock(void); > > +void bpf_wakeup_sources_read_unlock(struct bpf_ws_lock *lock); > > +void *bpf_wakeup_sources_get_head(void); > > +#endif > > diff --git a/drivers/base/power/wakeup.c b/drivers/base/power/wakeup.c > > index b8e48a023bf0..8eda7d35d9cc 100644 > > --- a/drivers/base/power/wakeup.c > > +++ b/drivers/base/power/wakeup.c > > @@ -1168,11 +1168,79 @@ static const struct file_operations > > wakeup_sources_stats_fops = { > > .release = seq_release_private, > > }; > > > > -static int __init wakeup_sources_debugfs_init(void) > > +#ifdef CONFIG_BPF_SYSCALL > > +#include <linux/btf.h> > > + > > +__bpf_kfunc_start_defs(); > > + > > +/** > > + * bpf_wakeup_sources_read_lock - Acquire the SRCU lock for wakeup sources > > + * > > + * The underlying SRCU lock returns an integer index. However, the BPF > > verifier > > + * requires a pointer (PTR_TO_BTF_ID) to strictly track the state of > > acquired > > + * resources using KF_ACQUIRE and KF_RELEASE semantics. We use an opaque > > + * structure pointer (struct bpf_ws_lock *) to satisfy the verifier while > > + * safely encoding the integer index within the pointer address itself. > > + * > > + * Return: An opaque pointer encoding the SRCU lock index + 1 (to avoid > > NULL). > > + */ > > +__bpf_kfunc struct bpf_ws_lock *bpf_wakeup_sources_read_lock(void) > > +{ > > + return (struct bpf_ws_lock *)(long)(wakeup_sources_read_lock() + 1); > > Why are you incrementing this by 1? I think SRCU indices can be 0, so it would appear as a NULL pointer to the program. > > > +} > > + > > +/** > > + * bpf_wakeup_sources_read_unlock - Release the SRCU lock for wakeup > > sources > > + * @lock: The opaque pointer returned by bpf_wakeup_sources_read_lock() > > + * > > + * The BPF verifier guarantees that @lock is a valid, unreleased pointer > > from > > + * the acquire function. We decode the pointer back into the integer SRCU > > index > > + * by subtracting 1 and release the lock. > > + */ > > +__bpf_kfunc void bpf_wakeup_sources_read_unlock(struct bpf_ws_lock *lock) > > +{ > > + wakeup_sources_read_unlock((int)(long)lock - 1); > > Why decrementing by one? > > So it's really an int, but you are casting it to a pointer, incrementing > it by one to make it a "fake" pointer value (i.e. unaligned mess), and > then when unlocking casting the pointer back to an int, and then > decrementing the value? > > This feels "odd" :( It isn't readable, though, because it's an empty struct, so I don't think it would cause any issues in practice. > > [...]

