On Mon, Apr 17, 2017 at 06:47:37AM +0530, Madhavan Srinivasan wrote:
> Local atomic operations are fast and highly reentrant per-CPU counters,
> used for percpu variable updates. Local atomic operations only guarantee
> atomicity of the variable modification wrt the CPU which owns the data,
> and they need to be executed in a preemption-safe way.
>
> Here is the design of this patch. Since local_* operations only need
> to be atomic with respect to interrupts (IIUC), we have two options:
> either replay the "op" if interrupted, or replay the interrupt after
> the "op". The initial patchset posted implemented local_* operations
> based on CR5, which replays the "op". That patchset had issues when
> rewinding an address pointer within an array, which made the slow path
> really slow. And since the CR5-based implementation proposed using
> __ex_table to find the rewind address, it raised concerns about the
> size of __ex_table and vmlinux.
>
> https://lists.ozlabs.org/pipermail/linuxppc-dev/2014-December/123115.html
>
> +static __inline__ int local_add_unless(local_t *l, long a, long u)
> +{
> +	long t;
> +	unsigned long flags;
> +
> +	powerpc_local_irq_pmu_save(flags);
> +	__asm__ __volatile__ (
> +	PPC_LL" %0,0(%1)\n\
> +	cmpw	0,%0,%3 \n\
This loads a long (a doubleword on 64-bit) but compares only a single
word. Replace cmpw with PPC_LCMP.

> +	beq-	2f \n\
> +	add	%0,%2,%0 \n"
> +	PPC_STL" %0,0(%1) \n"
> +"	subf	%0,%2,%0 \n\
> +2:"
> +	: "=&r" (t)
> +	: "r" (&(l->a.counter)), "r" (a), "r" (u)
> +	: "cc", "memory");
> +	powerpc_local_irq_pmu_restore(flags);
> +
> +	return t != u;
> +}
> +
> +#define local_inc_not_zero(l)		local_add_unless((l), 1, 0)
> +
> +#define local_sub_and_test(a, l)	(local_sub_return((a), (l)) == 0)
> +#define local_dec_and_test(l)		(local_dec_return((l)) == 0)
> +
> +/*
> + * Atomically test *l and decrement if it is greater than 0.
> + * The function returns the old value of *l minus 1.
> + */
> +static __inline__ long local_dec_if_positive(local_t *l)
> +{
> +	long t;
> +	unsigned long flags;
> +
> +	powerpc_local_irq_pmu_save(flags);
> +	__asm__ __volatile__(
> +	PPC_LL" %0,0(%1)\n\
> +	cmpwi	%0,1\n\

Same issue here. Replace cmpwi with PPC_LCMPI.

> +	addi	%0,%0,-1\n\
> +	blt-	2f\n"
> +	PPC_STL "%0,0(%1)\n"
> +	"\n\
> +2:"	: "=&b" (t)
> +	: "r" (&(l->a.counter))
> +	: "cc", "memory");
> +	powerpc_local_irq_pmu_restore(flags);
> +
> +	return t;
> +}
> +
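To make the suggestion concrete, here is an untested sketch of both
routines with the comparisons fixed. PPC_LCMP and PPC_LCMPI are the
width-matching compare macros from asm/asm-compat.h (cmpd/cmpdi on
64-bit, cmpw/cmpwi on 32-bit, IIRC), so the compare now matches the
PPC_LL/PPC_STL access size. Everything else is as in the patch above;
I have only reflowed the asm strings so each line is a separate literal:

static __inline__ int local_add_unless(local_t *l, long a, long u)
{
	long t;
	unsigned long flags;

	powerpc_local_irq_pmu_save(flags);
	__asm__ __volatile__ (
	PPC_LL"	%0,0(%1)\n"	/* load the full long */
	PPC_LCMP" 0,%0,%3\n"	/* ... and compare it at the same width */
"	beq-	2f\n"
"	add	%0,%2,%0\n"
	PPC_STL" %0,0(%1)\n"
"	subf	%0,%2,%0\n"
"2:"
	: "=&r" (t)
	: "r" (&(l->a.counter)), "r" (a), "r" (u)
	: "cc", "memory");
	powerpc_local_irq_pmu_restore(flags);

	return t != u;
}

static __inline__ long local_dec_if_positive(local_t *l)
{
	long t;
	unsigned long flags;

	powerpc_local_irq_pmu_save(flags);
	__asm__ __volatile__(
	PPC_LL"	%0,0(%1)\n"	/* load the full long */
	PPC_LCMPI" %0,1\n"	/* ... and compare it at the same width */
"	addi	%0,%0,-1\n"
"	blt-	2f\n"
	PPC_STL" %0,0(%1)\n"
"2:"
	: "=&b" (t)
	: "r" (&(l->a.counter))
	: "cc", "memory");
	powerpc_local_irq_pmu_restore(flags);

	return t;
}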