Re: [PATCH] percpu_rwsem: fix missed wakeup due to reordering of load

2018-12-21 Thread Andrea Parri
On Fri, Dec 21, 2018 at 12:59:13PM +0530, Prateek Sood wrote:
> P1 is releaseing the cpu_hotplug_lock and P2 is acquiring
> cpu_hotplug_lock.
> 
> P1   P2
> percpu_up_read() path  percpu_down_write() path
> 
>   rcu_sync_enter() 
> //gp_state=GP_PASSED
> 
> rcu_sync_is_idle() //returns falsedown_write(rw_sem)
> 
> __percpu_up_read()
> 
> [L] task = rcu_dereference(w->task) //NULL
> 
> smp_rmb()  [S] w->task = current
> 
> smp_mb()
> 
>[L] readers_active_check() //fails
>schedule()
> 
> [S] __this_cpu_dec(read_count)
> 
> Since load of task can result in NULL, it can lead to missed wakeup
> in rcuwait_wake_up(). Above sequence violated the following constraint
> in rcuwait_wake_up():
> 
>WAITWAKE
> [S] tsk = current   [S] cond = true
> MB (A)MB (B)
> [L] cond[L] tsk
> 
> This can happen as smp_rmb() in rcuwait_wake_up() will provide ordering
> of load before barrier with load and store after barrier for arm64
> architecture. Here the requirement is to order store before smp_rmb()
> with load after the smp_rmb().
> 
> For the usage of rcuwait_wake_up() in __percpu_up_read() full barrier
> (smp_mb) is required to complete the constraint of rcuwait_wake_up().
> 
> Signed-off-by: Prateek Sood 
> Acked-by: Davidlohr Bueso 

It looks like Peter has already queued this, c.f.,

  
https://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git/commit/?h=locking/core=73685b8af253cf32b1b41b3045f2828c6fb2439e

with a modified changelog and my Reviewed-by (that I confirm).

I can't tell how/when this is going to be upstreamed (guess via -tip),
Peter?

  Andrea


> 
> ---
>  kernel/exit.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/exit.c b/kernel/exit.c
> index ac1a814..696e0e1 100644
> --- a/kernel/exit.c
> +++ b/kernel/exit.c
> @@ -298,7 +298,7 @@ void rcuwait_wake_up(struct rcuwait *w)
>   /*
>* Order condition vs @task, such that everything prior to the load
>* of @task is visible. This is the condition as to why the user called
> -  * rcuwait_trywake() in the first place. Pairs with set_current_state()
> +  * rcuwait_wake_up() in the first place. Pairs with set_current_state()
>* barrier (A) in rcuwait_wait_event().
>*
>*WAITWAKE
> @@ -306,7 +306,7 @@ void rcuwait_wake_up(struct rcuwait *w)
>*MB (A)  MB (B)
>*[L] cond[L] tsk
>*/
> - smp_rmb(); /* (B) */
> + smp_mb(); /* (B) */
>  
>   /*
>* Avoid using task_rcu_dereference() magic as long as we are careful,
> -- 
> Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, 
> Inc., 
> is a member of Code Aurora Forum, a Linux Foundation Collaborative Project.
> 


[PATCH] percpu_rwsem: fix missed wakeup due to reordering of load

2018-12-20 Thread Prateek Sood
P1 is releaseing the cpu_hotplug_lock and P2 is acquiring
cpu_hotplug_lock.

P1   P2
percpu_up_read() path  percpu_down_write() path

  rcu_sync_enter() //gp_state=GP_PASSED

rcu_sync_is_idle() //returns falsedown_write(rw_sem)

__percpu_up_read()

[L] task = rcu_dereference(w->task) //NULL

smp_rmb()  [S] w->task = current

smp_mb()

   [L] readers_active_check() //fails
 schedule()

[S] __this_cpu_dec(read_count)

Since load of task can result in NULL, it can lead to missed wakeup
in rcuwait_wake_up(). Above sequence violated the following constraint
in rcuwait_wake_up():

 WAITWAKE
[S] tsk = current [S] cond = true
MB (A)  MB (B)
[L] cond  [L] tsk

This can happen as smp_rmb() in rcuwait_wake_up() will provide ordering
of load before barrier with load and store after barrier for arm64
architecture. Here the requirement is to order store before smp_rmb()
with load after the smp_rmb().

For the usage of rcuwait_wake_up() in __percpu_up_read() full barrier
(smp_mb) is required to complete the constraint of rcuwait_wake_up().

Signed-off-by: Prateek Sood 
Acked-by: Davidlohr Bueso 

---
 kernel/exit.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/exit.c b/kernel/exit.c
index ac1a814..696e0e1 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -298,7 +298,7 @@ void rcuwait_wake_up(struct rcuwait *w)
/*
 * Order condition vs @task, such that everything prior to the load
 * of @task is visible. This is the condition as to why the user called
-* rcuwait_trywake() in the first place. Pairs with set_current_state()
+* rcuwait_wake_up() in the first place. Pairs with set_current_state()
 * barrier (A) in rcuwait_wait_event().
 *
 *WAITWAKE
@@ -306,7 +306,7 @@ void rcuwait_wake_up(struct rcuwait *w)
 *MB (A)  MB (B)
 *[L] cond[L] tsk
 */
-   smp_rmb(); /* (B) */
+   smp_mb(); /* (B) */
 
/*
 * Avoid using task_rcu_dereference() magic as long as we are careful,
-- 
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc., 
is a member of Code Aurora Forum, a Linux Foundation Collaborative Project.



Re: [PATCH] percpu_rwsem: fix missed wakeup due to reordering of load

2018-12-20 Thread Prateek Sood
On 12/12/2018 08:58 PM, Andrea Parri wrote:
> On Fri, Nov 30, 2018 at 08:40:56PM +0530, Prateek Sood wrote:
>> In a scenario where cpu_hotplug_lock percpu_rw_semaphore is already
>> acquired for read operation by P1 using percpu_down_read().
>>
>> Now we have P1 in the path of releaseing the cpu_hotplug_lock and P2
>> is in the process of acquiring cpu_hotplug_lock.
>>
>> P1   P2
>> percpu_up_read() path  percpu_down_write() path
>>
>>   rcu_sync_enter() 
>> //gp_state=GP_PASSED
>>
>> rcu_sync_is_idle() //returns falsedown_write(rw_sem)
>>
>> __percpu_up_read()
>>
>> [L] task = rcu_dereference(w->task) //NULL
>>
>> smp_rmb()  [S] w->task = current
>>
>> smp_mb()
>>
>>[L] readers_active_check() //fails
>>   schedule()
>>
>> [S] __this_cpu_dec(read_count)
>>
>> Since load of task can result in NULL. This can lead to missed wakeup
>> in rcuwait_wake_up(). Above sequence violated the following constraint
>> in rcuwait_wake_up():
>>
>>   WAITWAKE
>> [S] tsk = current  [S] cond = true
>> MB (A)   MB (B)
>> [L] cond   [L] tsk
>>
>> This can happen as smp_rmb() in rcuwait_wake_up() will provide ordering
>> of load before barrier with load and store after barrier for arm64
>> architecture. Here the requirement is to order store before smp_rmb()
>> with load after the smp_rmb().
>>
>> For the usage of rcuwait_wake_up() in __percpu_up_read() full barrier
>> (smp_mb) is required to complete the constraint of rcuwait_wake_up().
>>
>> Signed-off-by: Prateek Sood 
> 
> I know this is going to sound ridiculous (coming from me or from
> the Italian that I am), but it looks like we could both work on
> our English. ;-)
> 
> But the fix seems correct to me:
> 
> Reviewed-by: Andrea Parri 
> 
> It might be a good idea to integrate this fix with fixes to the
> inline comments/annotations: for example, I see that the comment
> in rcuwait_wake_up() mentions a non-existing rcuwait_trywake();
Ok, I will update the comment in next version of the patch.

> moreover, the memory-barrier annotation "B" is used also for the
> smp_mb() preceding the __this_cpu_dec() in __percpu_up_read().
In this annotation "B" is corresponding to annotation "A" in
rcuwait_wait_event(). So this seems to be correct.

> 
>   Andrea
> 
> 
>> ---
>>  kernel/exit.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/kernel/exit.c b/kernel/exit.c
>> index f1d74f0..a10820d 100644
>> --- a/kernel/exit.c
>> +++ b/kernel/exit.c
>> @@ -306,7 +306,7 @@ void rcuwait_wake_up(struct rcuwait *w)
>>   *MB (A)  MB (B)
>>   *[L] cond[L] tsk
>>   */
>> -smp_rmb(); /* (B) */
>> +smp_mb(); /* (B) */
>>  
>>  /*
>>   * Avoid using task_rcu_dereference() magic as long as we are careful,
>> -- 
>> Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, 
>> Inc., 
>> is a member of Code Aurora Forum, a Linux Foundation Collaborative Project.
>>


-- 
Qualcomm India Private Limited, on behalf of Qualcomm Innovation
Center, Inc., is a member of Code Aurora Forum, a Linux Foundation
Collaborative Project


Re: [PATCH] percpu_rwsem: fix missed wakeup due to reordering of load

2018-12-12 Thread Davidlohr Bueso

On 2018-12-12 06:26, Prateek Sood wrote:

Please confirm if the suspicion of smp_rmb is correct.
IMO, it should be smp_mb() translating to dmb ish.


Feel free to add my ack. This should also be Cc to stable as of v4.11.

Fixes: 8f95c90ceb54 (sched/wait, RCU: Introduce rcuwait machinery)

Thanks,
Davidlohr


Re: [PATCH] percpu_rwsem: fix missed wakeup due to reordering of load

2018-12-12 Thread Andrea Parri
On Fri, Nov 30, 2018 at 08:40:56PM +0530, Prateek Sood wrote:
> In a scenario where cpu_hotplug_lock percpu_rw_semaphore is already
> acquired for read operation by P1 using percpu_down_read().
> 
> Now we have P1 in the path of releaseing the cpu_hotplug_lock and P2
> is in the process of acquiring cpu_hotplug_lock.
> 
> P1   P2
> percpu_up_read() path  percpu_down_write() path
> 
>   rcu_sync_enter() 
> //gp_state=GP_PASSED
> 
> rcu_sync_is_idle() //returns falsedown_write(rw_sem)
> 
> __percpu_up_read()
> 
> [L] task = rcu_dereference(w->task) //NULL
> 
> smp_rmb()  [S] w->task = current
> 
> smp_mb()
> 
>[L] readers_active_check() //fails
>schedule()
> 
> [S] __this_cpu_dec(read_count)
> 
> Since load of task can result in NULL. This can lead to missed wakeup
> in rcuwait_wake_up(). Above sequence violated the following constraint
> in rcuwait_wake_up():
> 
>WAITWAKE
> [S] tsk = current   [S] cond = true
> MB (A)MB (B)
> [L] cond[L] tsk
> 
> This can happen as smp_rmb() in rcuwait_wake_up() will provide ordering
> of load before barrier with load and store after barrier for arm64
> architecture. Here the requirement is to order store before smp_rmb()
> with load after the smp_rmb().
> 
> For the usage of rcuwait_wake_up() in __percpu_up_read() full barrier
> (smp_mb) is required to complete the constraint of rcuwait_wake_up().
> 
> Signed-off-by: Prateek Sood 

I know this is going to sound ridiculous (coming from me or from
the Italian that I am), but it looks like we could both work on
our English. ;-)

But the fix seems correct to me:

Reviewed-by: Andrea Parri 

It might be a good idea to integrate this fix with fixes to the
inline comments/annotations: for example, I see that the comment
in rcuwait_wake_up() mentions a non-existing rcuwait_trywake();
moreover, the memory-barrier annotation "B" is used also for the
smp_mb() preceding the __this_cpu_dec() in __percpu_up_read().

  Andrea


> ---
>  kernel/exit.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/kernel/exit.c b/kernel/exit.c
> index f1d74f0..a10820d 100644
> --- a/kernel/exit.c
> +++ b/kernel/exit.c
> @@ -306,7 +306,7 @@ void rcuwait_wake_up(struct rcuwait *w)
>*MB (A)  MB (B)
>*[L] cond[L] tsk
>*/
> - smp_rmb(); /* (B) */
> + smp_mb(); /* (B) */
>  
>   /*
>* Avoid using task_rcu_dereference() magic as long as we are careful,
> -- 
> Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, 
> Inc., 
> is a member of Code Aurora Forum, a Linux Foundation Collaborative Project.
> 


Re: [PATCH] percpu_rwsem: fix missed wakeup due to reordering of load

2018-12-12 Thread Prateek Sood
On 12/04/2018 01:06 AM, Prateek Sood wrote:
> On 12/03/2018 12:08 PM, Davidlohr Bueso wrote:
>> On 2018-11-30 07:10, Prateek Sood wrote:
>>> In a scenario where cpu_hotplug_lock percpu_rw_semaphore is already
>>> acquired for read operation by P1 using percpu_down_read().
>>>
>>> Now we have P1 in the path of releaseing the cpu_hotplug_lock and P2
>>> is in the process of acquiring cpu_hotplug_lock.
>>>
>>> P1   P2
>>> percpu_up_read() path  percpu_down_write() path
>>>
>>>   rcu_sync_enter() 
>>> //gp_state=GP_PASSED
>>>
>>> rcu_sync_is_idle() //returns false    down_write(rw_sem)
>>>
>>> __percpu_up_read()
>>>
>>> [L] task = rcu_dereference(w->task) //NULL
>>>
>>> smp_rmb()  [S] w->task = current
>>>
>>>     smp_mb()
>>>
>>>    [L] readers_active_check() 
>>> //fails
>>>  schedule()
>>>
>>> [S] __this_cpu_dec(read_count)
>>>
>>> Since load of task can result in NULL. This can lead to missed wakeup
>>> in rcuwait_wake_up(). Above sequence violated the following constraint
>>> in rcuwait_wake_up():
>>>
>>>  WAIT    WAKE
>>> [S] tsk = current  [S] cond = true
>>> MB (A)    MB (B)
>>> [L] cond  [L] tsk
>>>
>>
>> Hmm yeah we don't want rcu_wake_up() to get hoisted over the 
>> __this_cpu_dec(read_count). The smp_rmb() does not make sense to me here in 
>> the first place. Did you run into this scenario by code inspection or you 
>> actually it the issue?
>>
>> Thanks,
>> Davidlohr
> 
> I have checked one issue where it seems that cpu hotplug code
> path is not able to get cpu_hotplug_lock in write mode and there
> is a reader pending for cpu hotplug path to release
> percpu_rw_semaphore->rwsem to acquire cpu_hotplug_lock.
> This caused a deadlock.
> 
> From code inspection also it seems to be not adhering to arm64
> smp_rmb() constraint of load/load-store ordering guarantee.
> 
> 
> Thanks,
> Prateek
> 

Folks,

Please confirm if the suspicion of smp_rmb is correct.
IMO, it should be smp_mb() translating to dmb ish.


Thanks
Prateek

-- 
Qualcomm India Private Limited, on behalf of Qualcomm Innovation
Center, Inc., is a member of Code Aurora Forum, a Linux Foundation
Collaborative Project


Re: [PATCH] percpu_rwsem: fix missed wakeup due to reordering of load

2018-12-03 Thread Prateek Sood
On 12/03/2018 12:08 PM, Davidlohr Bueso wrote:
> On 2018-11-30 07:10, Prateek Sood wrote:
>> In a scenario where cpu_hotplug_lock percpu_rw_semaphore is already
>> acquired for read operation by P1 using percpu_down_read().
>>
>> Now we have P1 in the path of releaseing the cpu_hotplug_lock and P2
>> is in the process of acquiring cpu_hotplug_lock.
>>
>> P1   P2
>> percpu_up_read() path  percpu_down_write() path
>>
>>   rcu_sync_enter() 
>> //gp_state=GP_PASSED
>>
>> rcu_sync_is_idle() //returns false    down_write(rw_sem)
>>
>> __percpu_up_read()
>>
>> [L] task = rcu_dereference(w->task) //NULL
>>
>> smp_rmb()  [S] w->task = current
>>
>>     smp_mb()
>>
>>    [L] readers_active_check() //fails
>>  schedule()
>>
>> [S] __this_cpu_dec(read_count)
>>
>> Since load of task can result in NULL. This can lead to missed wakeup
>> in rcuwait_wake_up(). Above sequence violated the following constraint
>> in rcuwait_wake_up():
>>
>>  WAIT    WAKE
>> [S] tsk = current  [S] cond = true
>> MB (A)    MB (B)
>> [L] cond  [L] tsk
>>
> 
> Hmm yeah we don't want rcu_wake_up() to get hoisted over the 
> __this_cpu_dec(read_count). The smp_rmb() does not make sense to me here in 
> the first place. Did you run into this scenario by code inspection or you 
> actually it the issue?
> 
> Thanks,
> Davidlohr

I have checked one issue where it seems that cpu hotplug code
path is not able to get cpu_hotplug_lock in write mode and there
is a reader pending for cpu hotplug path to release
percpu_rw_semaphore->rwsem to acquire cpu_hotplug_lock.
This caused a deadlock.

>From code inspection also it seems to be not adhering to arm64
smp_rmb() constraint of load/load-store ordering guarantee.


Thanks,
Prateek

-- 
Qualcomm India Private Limited, on behalf of Qualcomm Innovation
Center, Inc., is a member of Code Aurora Forum, a Linux Foundation
Collaborative Project


Re: [PATCH] percpu_rwsem: fix missed wakeup due to reordering of load

2018-12-03 Thread Prateek Sood
On 12/03/2018 12:08 PM, Davidlohr Bueso wrote:
> On 2018-11-30 07:10, Prateek Sood wrote:
>> In a scenario where cpu_hotplug_lock percpu_rw_semaphore is already
>> acquired for read operation by P1 using percpu_down_read().
>>
>> Now we have P1 in the path of releaseing the cpu_hotplug_lock and P2
>> is in the process of acquiring cpu_hotplug_lock.
>>
>> P1   P2
>> percpu_up_read() path  percpu_down_write() path
>>
>>   rcu_sync_enter() 
>> //gp_state=GP_PASSED
>>
>> rcu_sync_is_idle() //returns false    down_write(rw_sem)
>>
>> __percpu_up_read()
>>
>> [L] task = rcu_dereference(w->task) //NULL
>>
>> smp_rmb()  [S] w->task = current
>>
>>     smp_mb()
>>
>>    [L] readers_active_check() //fails
>>  schedule()
>>
>> [S] __this_cpu_dec(read_count)
>>
>> Since load of task can result in NULL. This can lead to missed wakeup
>> in rcuwait_wake_up(). Above sequence violated the following constraint
>> in rcuwait_wake_up():
>>
>>  WAIT    WAKE
>> [S] tsk = current  [S] cond = true
>> MB (A)    MB (B)
>> [L] cond  [L] tsk
>>
> 
> Hmm yeah we don't want rcu_wake_up() to get hoisted over the 
> __this_cpu_dec(read_count). The smp_rmb() does not make sense to me here in 
> the first place. Did you run into this scenario by code inspection or you 
> actually it the issue?
> 
> Thanks,
> Davidlohr

I have checked one issue where it seems that cpu hotplug code
path is not able to get cpu_hotplug_lock in write mode and there
is a reader pending for cpu hotplug path to release
percpu_rw_semaphore->rwsem to acquire cpu_hotplug_lock.
This caused a deadlock.

>From code inspection also it seems to be not adhering to arm64
smp_rmb() constraint of load/load-store ordering guarantee.


Thanks,
Prateek

-- 
Qualcomm India Private Limited, on behalf of Qualcomm Innovation
Center, Inc., is a member of Code Aurora Forum, a Linux Foundation
Collaborative Project


Re: [PATCH] percpu_rwsem: fix missed wakeup due to reordering of load

2018-12-02 Thread Davidlohr Bueso

On 2018-11-30 07:10, Prateek Sood wrote:

In a scenario where cpu_hotplug_lock percpu_rw_semaphore is already
acquired for read operation by P1 using percpu_down_read().

Now we have P1 in the path of releaseing the cpu_hotplug_lock and P2
is in the process of acquiring cpu_hotplug_lock.

P1   P2
percpu_up_read() path  percpu_down_write() path

  rcu_sync_enter() 
//gp_state=GP_PASSED


rcu_sync_is_idle() //returns falsedown_write(rw_sem)

__percpu_up_read()

[L] task = rcu_dereference(w->task) //NULL

smp_rmb()  [S] w->task = current

smp_mb()

   [L] readers_active_check() 
//fails

 schedule()

[S] __this_cpu_dec(read_count)

Since load of task can result in NULL. This can lead to missed wakeup
in rcuwait_wake_up(). Above sequence violated the following constraint
in rcuwait_wake_up():

 WAITWAKE
[S] tsk = current [S] cond = true
MB (A)  MB (B)
[L] cond  [L] tsk



Hmm yeah we don't want rcu_wake_up() to get hoisted over the 
__this_cpu_dec(read_count). The smp_rmb() does not make sense to me here 
in the first place. Did you run into this scenario by code inspection or 
you actually it the issue?


Thanks,
Davidlohr


Re: [PATCH] percpu_rwsem: fix missed wakeup due to reordering of load

2018-12-02 Thread Davidlohr Bueso

On 2018-11-30 07:10, Prateek Sood wrote:

In a scenario where cpu_hotplug_lock percpu_rw_semaphore is already
acquired for read operation by P1 using percpu_down_read().

Now we have P1 in the path of releaseing the cpu_hotplug_lock and P2
is in the process of acquiring cpu_hotplug_lock.

P1   P2
percpu_up_read() path  percpu_down_write() path

  rcu_sync_enter() 
//gp_state=GP_PASSED


rcu_sync_is_idle() //returns falsedown_write(rw_sem)

__percpu_up_read()

[L] task = rcu_dereference(w->task) //NULL

smp_rmb()  [S] w->task = current

smp_mb()

   [L] readers_active_check() 
//fails

 schedule()

[S] __this_cpu_dec(read_count)

Since load of task can result in NULL. This can lead to missed wakeup
in rcuwait_wake_up(). Above sequence violated the following constraint
in rcuwait_wake_up():

 WAITWAKE
[S] tsk = current [S] cond = true
MB (A)  MB (B)
[L] cond  [L] tsk



Hmm yeah we don't want rcu_wake_up() to get hoisted over the 
__this_cpu_dec(read_count). The smp_rmb() does not make sense to me here 
in the first place. Did you run into this scenario by code inspection or 
you actually it the issue?


Thanks,
Davidlohr


[PATCH] percpu_rwsem: fix missed wakeup due to reordering of load

2018-11-30 Thread Prateek Sood
In a scenario where cpu_hotplug_lock percpu_rw_semaphore is already
acquired for read operation by P1 using percpu_down_read().

Now we have P1 in the path of releaseing the cpu_hotplug_lock and P2
is in the process of acquiring cpu_hotplug_lock.

P1   P2
percpu_up_read() path  percpu_down_write() path

  rcu_sync_enter() //gp_state=GP_PASSED

rcu_sync_is_idle() //returns falsedown_write(rw_sem)

__percpu_up_read()

[L] task = rcu_dereference(w->task) //NULL

smp_rmb()  [S] w->task = current

smp_mb()

   [L] readers_active_check() //fails
 schedule()

[S] __this_cpu_dec(read_count)

Since load of task can result in NULL. This can lead to missed wakeup
in rcuwait_wake_up(). Above sequence violated the following constraint
in rcuwait_wake_up():

 WAITWAKE
[S] tsk = current [S] cond = true
MB (A)  MB (B)
[L] cond  [L] tsk

This can happen as smp_rmb() in rcuwait_wake_up() will provide ordering
of load before barrier with load and store after barrier for arm64
architecture. Here the requirement is to order store before smp_rmb()
with load after the smp_rmb().

For the usage of rcuwait_wake_up() in __percpu_up_read() full barrier
(smp_mb) is required to complete the constraint of rcuwait_wake_up().

Signed-off-by: Prateek Sood 
---
 kernel/exit.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/exit.c b/kernel/exit.c
index f1d74f0..a10820d 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -306,7 +306,7 @@ void rcuwait_wake_up(struct rcuwait *w)
 *MB (A)  MB (B)
 *[L] cond[L] tsk
 */
-   smp_rmb(); /* (B) */
+   smp_mb(); /* (B) */
 
/*
 * Avoid using task_rcu_dereference() magic as long as we are careful,
-- 
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc., 
is a member of Code Aurora Forum, a Linux Foundation Collaborative Project.



[PATCH] percpu_rwsem: fix missed wakeup due to reordering of load

2018-11-30 Thread Prateek Sood
In a scenario where cpu_hotplug_lock percpu_rw_semaphore is already
acquired for read operation by P1 using percpu_down_read().

Now we have P1 in the path of releaseing the cpu_hotplug_lock and P2
is in the process of acquiring cpu_hotplug_lock.

P1   P2
percpu_up_read() path  percpu_down_write() path

  rcu_sync_enter() //gp_state=GP_PASSED

rcu_sync_is_idle() //returns falsedown_write(rw_sem)

__percpu_up_read()

[L] task = rcu_dereference(w->task) //NULL

smp_rmb()  [S] w->task = current

smp_mb()

   [L] readers_active_check() //fails
 schedule()

[S] __this_cpu_dec(read_count)

Since load of task can result in NULL. This can lead to missed wakeup
in rcuwait_wake_up(). Above sequence violated the following constraint
in rcuwait_wake_up():

 WAITWAKE
[S] tsk = current [S] cond = true
MB (A)  MB (B)
[L] cond  [L] tsk

This can happen as smp_rmb() in rcuwait_wake_up() will provide ordering
of load before barrier with load and store after barrier for arm64
architecture. Here the requirement is to order store before smp_rmb()
with load after the smp_rmb().

For the usage of rcuwait_wake_up() in __percpu_up_read() full barrier
(smp_mb) is required to complete the constraint of rcuwait_wake_up().

Signed-off-by: Prateek Sood 
---
 kernel/exit.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/exit.c b/kernel/exit.c
index f1d74f0..a10820d 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -306,7 +306,7 @@ void rcuwait_wake_up(struct rcuwait *w)
 *MB (A)  MB (B)
 *[L] cond[L] tsk
 */
-   smp_rmb(); /* (B) */
+   smp_mb(); /* (B) */
 
/*
 * Avoid using task_rcu_dereference() magic as long as we are careful,
-- 
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc., 
is a member of Code Aurora Forum, a Linux Foundation Collaborative Project.