On Wed, Jan 30, 2019 at 01:34:19PM -0800, Alexei Starovoitov wrote:
> On Wed, Jan 30, 2019 at 10:05:29PM +0100, Peter Zijlstra wrote:
> > 
> > Would something like the below work for you instead?
> > 
> > I find it easier to read, and the additional CONFIG symbol would give
> > architectures (say ARM) an easy way to force the issue.
> > 
> > 
> > --- a/kernel/bpf/helpers.c
> > +++ b/kernel/bpf/helpers.c
> > @@ -221,6 +221,72 @@ const struct bpf_func_proto bpf_get_curr
> >     .arg2_type      = ARG_CONST_SIZE,
> >  };
> >  
> > +#if defined(CONFIG_QUEUED_SPINLOCKS) || defined(CONFIG_BPF_ARCH_SPINLOCK)
> > +
> > +static inline void __bpf_spin_lock(struct bpf_spin_lock *lock)
> > +{
> > +   arch_spinlock_t *l = (void *)lock;
> > +   BUILD_BUG_ON(sizeof(*l) != sizeof(__u32));
> > +   if (1) {
> > +           union {
> > +                   __u32 val;
> > +                   arch_spinlock_t lock;
> > +           } u = { .lock = __ARCH_SPIN_LOCK_UNLOCKED };
> > +           compiletime_assert(u.val == 0, "__ARCH_SPIN_LOCK_UNLOCKED not 
> > 0");
> > +   }
> > +   arch_spin_lock(l);
> 
> And archs can select CONFIG_BPF_ARCH_SPINLOCK when they don't
> use qspinlock and their arch_spinlock_t is compatible ?
> Nice. I like the idea!

Exactly, took me a little while to come up with that test for
__ARCH_SPIN_LOCK_UNLOCKED, but it now checks for both assumptions, so no
surprises when people get it wrong by accident.

> > +}
> > +
> > +static inline void __bpf_spin_unlock(struct bpf_spin_lock *lock)
> > +{
> > +   arch_spinlock_t *l = (void *)lock;
> > +   arch_spin_unlock(l);
> > +}
> > +
> > +#else
> > +
> > +static inline void __bpf_spin_lock(struct bpf_spin_lock *lock)
> > +{
> > +   atomic_t *l = (void *)lock;
> > +   do {
> > +           atomic_cond_read_relaxed(l, !VAL);
> 
> wow. that's quite a macro magic.

Yeah, C sucks for not having lambdas, this was the best we could come up
with.

This basically allows architectures to optimize the
wait-for-variable-to-change thing. Currently only ARM64 does that, I
have a horrible horrible patch that makes x86 use MONITOR/MWAIT for
this, and I suppose POWER should use it but doesn't.

> Should it be
> atomic_cond_read_relaxed(l, (!VAL));
> like qspinlock.c does ?

Extra parens doesn't hurt of course, but I don't think it's strictly
needed, the atomic_cond_read_*() wrappers already add extra parent
before passing it on to smp_cond_load_*().

Reply via email to