----- On Sep 4, 2019, at 6:53 AM, Oleg Nesterov [email protected] wrote:

> On 09/03, Mathieu Desnoyers wrote:
>>
>> @@ -1130,6 +1130,10 @@ struct task_struct {
>>      unsigned long                   numa_pages_migrated;
>>  #endif /* CONFIG_NUMA_BALANCING */
>>
>> +#ifdef CONFIG_MEMBARRIER
>> +    atomic_t membarrier_state;
>> +#endif
> 
> ...
> 
>> +static inline void membarrier_prepare_task_switch(struct task_struct *t)
>> +{
>> +    if (!t->mm)
>> +            return;
>> +    atomic_set(&t->membarrier_state,
>> +               atomic_read(&t->mm->membarrier_state));
>> +}
> 
> Why not
> 
>       rq->membarrier_state = next->mm ? next->mm->membarrier_state : 0;
> 
> and
> 
>       if (cpu_rq(cpu)->membarrier_state & MEMBARRIER_STATE_GLOBAL_EXPEDITED) {
>               ...
>       }
> 
> in membarrier_global_expedited() ? (I removed atomic_ to simplify)
> 
> IOW, why this new member has to live in task_struct, not in rq?

As I replied to Linus, if we copy the membarrier_state into the rq, we'd
need to ensure we have full memory barriers between:

prior user-space memory accesses  /  setting the runqueue membarrier state

and

setting the runqueue membarrier state  /  following user-space memory accesses

because membarrier does not take any runqueue lock when it iterates over
the runqueues.
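
To make that concrete, here is a rough sketch of where those barriers
would have to sit in the rq-based scheme. This is hypothetical and
untested; the names follow your pseudo-code above, and the smp_mb()
placement is the point, not the exact code:

	/* scheduler, on context switch (hypothetical rq-based scheme): */
	smp_mb();   /* order prev's user-space accesses before the store */
	rq->membarrier_state = next->mm ? next->mm->membarrier_state : 0;
	smp_mb();   /* order the store before next's user-space accesses */

	/* membarrier_global_expedited(), iterating without the rq lock: */
	if (cpu_rq(cpu)->membarrier_state & MEMBARRIER_STATE_GLOBAL_EXPEDITED) {
		...
	}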

I'm trying to avoid putting too many memory-barrier constraints on the
scheduler for membarrier's sake, but if that is really the way forward,
it could be done.

And the basic question remains: is it acceptable, performance-wise, to
load mm->membarrier_state from sched switch prepare?

Thanks,

Mathieu


-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
