On Sun, 29 Jul 2012, Eric Dumazet wrote:

> On Sun, 2012-07-29 at 12:10 +0200, Eric Dumazet wrote:
> 
> > You can probably design something needing no more than 4 bytes per cpu,
> > and this thing could use non locked operations as bonus.
> > 
> > like the following ...
> 
> Coming back from my bike ride, here is a more polished version with
> proper synchronization/barriers.
> 
> struct percpu_rw_semaphore {
>       /* percpu_sem_down_read() uses the following in fast path */
>       unsigned int __percpu *active_counters;
> 
>       unsigned int __percpu *counters;
>       struct rw_semaphore     sem; /* used in slow path and by writers */
> };
> 
> static inline int percpu_sem_init(struct percpu_rw_semaphore *p)
> {
>       p->counters = alloc_percpu(unsigned int);
>       if (!p->counters)
>               return -ENOMEM;
>       init_rwsem(&p->sem);
>       rcu_assign_pointer(p->active_counters, p->counters);
>       return 0;
> }
> 
> 
> static inline bool percpu_sem_down_read(struct percpu_rw_semaphore *p)
> {
>       unsigned int __percpu *counters;
> 
>       rcu_read_lock();
>       counters = rcu_dereference(p->active_counters);
>       if (counters) {
>               this_cpu_inc(*counters);
>               smp_wmb(); /* paired with smp_rmb() in percpu_count() */

Why is this barrier needed? RCU works as a barrier, doesn't it?
RCU is unlocked when the CPU passes through a quiescent state, and I
suppose that entering the quiescent state acts as a barrier. Or does it not?
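
To make the question concrete, here is a minimal userspace sketch of the
usual pattern that a write-barrier/read-barrier pair serves. It is only an
analogue, not kernel code: C11 fences stand in for smp_wmb()/smp_rmb() (and
are somewhat stronger), and the names publish(), try_consume(), payload and
ready are hypothetical, not taken from the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

static int payload;        /* plain data, published through the flag */
static atomic_int ready;   /* 0 = not yet published, 1 = published */

/* Writer side: store the data, fence, then set the flag. */
void publish(int value)
{
	payload = value;
	/* stands in for smp_wmb(): the data store is ordered before the flag store */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&ready, 1, memory_order_relaxed);
}

/* Reader side: returns 1 and fills *out once the flag has been observed. */
int try_consume(int *out)
{
	assert(out != NULL);
	if (!atomic_load_explicit(&ready, memory_order_relaxed))
		return 0;
	/* stands in for smp_rmb(): the flag load is ordered before the data load */
	atomic_thread_fence(memory_order_acquire);
	*out = payload;
	return 1;
}
```

In this pattern each barrier sits between two accesses on its own side. In
the quoted code I do not see which second store the smp_wmb() is ordering
against, hence the question.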

>               rcu_read_unlock();
>               return true;
>       }
>       rcu_read_unlock();
>       down_read(&p->sem);
>       return false;
> }
> 
> static inline void percpu_sem_up_read(struct percpu_rw_semaphore *p, bool fastpath)
> {
>       if (fastpath)
>               this_cpu_dec(*p->counters);
>       else
>               up_read(&p->sem);
> }
> 
> static inline unsigned int percpu_count(unsigned int __percpu *counters)
> {
>       unsigned int total = 0;
>       int cpu;
> 
>       for_each_possible_cpu(cpu)
>               total += *per_cpu_ptr(counters, cpu);
> 
>       return total;
> }
> 
> static inline void percpu_sem_down_write(struct percpu_rw_semaphore *p)
> {
>       down_write(&p->sem);
>       p->active_counters = NULL;
>       synchronize_rcu();
>       smp_rmb(); /* paired with smp_wmb() in percpu_sem_down_read() */

Why is a barrier needed here? Doesn't synchronize_rcu() act as a barrier?

Mikulas

>       while (percpu_count(p->counters))
>               schedule();
> }
> 
> static inline void percpu_sem_up_write(struct percpu_rw_semaphore *p)
> {
>       rcu_assign_pointer(p->active_counters, p->counters);
>       up_write(&p->sem);
> }
> 
> 
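
For reference, the counting logic of the proposal can be sketched as a
sequential userspace model. This is an illustration under assumptions, not
the patch itself: it has no RCU, no barriers and no blocking, NR_SLOTS is a
hypothetical stand-in for the number of possible CPUs, and all model_*
names are made up for this sketch.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_SLOTS 4	/* hypothetical stand-in for the possible CPUs */

struct model_sem {
	unsigned int counters[NR_SLOTS];	/* one reader count per slot */
	bool fast_path_enabled;			/* models active_counters != NULL */
};

void model_init(struct model_sem *s)
{
	for (int i = 0; i < NR_SLOTS; i++)
		s->counters[i] = 0;
	s->fast_path_enabled = true;
}

/* Returns true if the fast path was taken, like percpu_sem_down_read(). */
bool model_down_read(struct model_sem *s, int slot)
{
	assert(slot >= 0 && slot < NR_SLOTS);
	if (s->fast_path_enabled) {
		s->counters[slot]++;
		return true;
	}
	return false;	/* the real reader would block in down_read() here */
}

/* The decrement may hit a different slot than the increment did. */
void model_up_read(struct model_sem *s, int slot, bool fastpath)
{
	assert(slot >= 0 && slot < NR_SLOTS);
	if (fastpath)
		s->counters[slot]--;
}

/* Sum over all slots; unsigned wraparound makes the total come out right. */
unsigned int model_count(const struct model_sem *s)
{
	unsigned int total = 0;

	for (int i = 0; i < NR_SLOTS; i++)
		total += s->counters[i];
	return total;
}

/* Writer: disable the fast path, then wait until no fast readers remain. */
void model_begin_write(struct model_sem *s)
{
	s->fast_path_enabled = false;
}

bool model_writer_may_proceed(const struct model_sem *s)
{
	return model_count(s) == 0;
}

void model_end_write(struct model_sem *s)
{
	s->fast_path_enabled = true;
}
```

The model shows why summing unsigned counters is enough even if a reader
increments on one CPU and decrements on another: the individual counter
wraps, but the total is still zero once every reader has left.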