On Fri, Sep 22, 2017 at 11:22:06AM +0800, Boqun Feng wrote:
> Hi Mathieu,
> 
> On Tue, Sep 19, 2017 at 06:13:41PM -0400, Mathieu Desnoyers wrote:
> > Provide a new command allowing processes to register their intent to use
> > the private expedited command.
> > 
> > This allows PowerPC to skip the full memory barrier in switch_mm(), and
> > only issue the barrier when scheduling into a task belonging to a
> > process that has registered to use expedited private.
> > 
> > Processes are now required to register before using
> > MEMBARRIER_CMD_PRIVATE_EXPEDITED, otherwise that command returns EPERM.
> > 
> 
> Sorry I'm late for the party, but I couldn't stop thinking whether we
> could avoid the register thing at all, because the registering makes
> sys_membarrier() more complex(both for the interface and the
> implementation). So how about we trade-off a little bit by taking
> some(not all) the rq->locks?
> 
> The idea is in membarrier_private_expedited(), we go through all ->curr
> on each CPU and 
> 
> 1)    If it's a userspace task and its ->mm is matched, we send an ipi
> 
> 2)    If it's a kernel task, we skip
> 
>       (Because there will be a smp_mb() implied by mmdrop(), when it
>       switchs to userspace task).
> 
> 3)    If it's a userspace task and its ->mm is not matched, we take
>       the corresponding rq->lock and check rq->curr again, if its ->mm
>       matched, we send an ipi, otherwise we do nothing.
> 
>       (Because if we observe rq->curr is not matched with rq->lock
>       held, when a task having matched ->mm schedules in, the rq->lock
>       pairing along with the smp_mb__after_spinlock() will guarantee
>       it observes all memory ops before sys_membarrir()).
> 
> membarrier_private_expedited() will look like this if we choose this
> way:
> 
> void membarrier_private_expedited()
> {
>       int cpu;
>       bool fallback = false;
>       cpumask_var_t tmpmask;
>       struct rq_flags rf;
> 
> 
>       if (num_online_cpus() == 1)
>               return;
> 
>       smp_mb();
> 
>       if (!zalloc_cpumask_var(&tmpmask, GFP_NOWAIT)) {
>               /* Fallback for OOM. */
>               fallback = true;
>       }
> 
>       cpus_read_lock();
>       for_each_online_cpu(cpu) {
>               struct task_struct *p;
> 
>               if (cpu == raw_smp_processor_id())
>                       continue;
> 
>               rcu_read_lock();
>               p = task_rcu_dereference(&cpu_rq(cpu)->curr);
> 
>               if (!p) {
>                       rcu_read_unlock();
>                       continue;
>               }
> 
>               if (p->mm == current->mm) {
>                       if (!fallback)
>                               __cpumask_set_cpu(cpu, tmpmask);
>                       else
>                               smp_call_function_single(cpu, ipi_mb, NULL, 1);
>               }
> 
>               if (p->mm == current->mm || !p->mm) {
>                       rcu_read_unlock();
>                       continue;
>               }
> 
>               rcu_read_unlock();
>               
>               /*
>                * This should be a arch-specific code, as we don't
>                * need it at else place other than some archs without
>                * a smp_mb() in switch_mm() (i.e. powerpc)
>                */
>               rq_lock_irq(cpu_rq(cpu), &rf);
>               if (p->mm == current->mm) {

Oops, this one should be

                if (cpu_curr(cpu)->mm == current->mm)

>                       if (!fallback)
>                               __cpumask_set_cpu(cpu, tmpmask);
>                       else
>                               smp_call_function_single(cpu, ipi_mb, NULL, 1);

, and this better be moved out of the lock rq->lock critical section.

Regards,
Boqun

>               }
>               rq_unlock_irq(cpu_rq(cpu), &rf);
>       }
>       if (!fallback) {
>               smp_call_function_many(tmpmask, ipi_mb, NULL, 1);
>               free_cpumask_var(tmpmask);
>       }
>       cpus_read_unlock();
> 
>       smp_mb();
> }
> 
> Thoughts?
> 
> Regards,
> Boqun
> 
[...]

Attachment: signature.asc
Description: PGP signature

Reply via email to