Hi,

> This works, but have you looked elsewhere in the kernel for kthread
> examples we could copy that do a race-free check without a timeout?

If you refer to other implementations in kernel, the following
modifications may be better.
The important thing is to call kthread_should_stop() after
set_current_state(TASK_INTERRUPTIBLE).  How is this fix?

--- fs/dlm/recoverd.c.orig      2017-08-10 16:42:31.131296351 +0900
+++ fs/dlm/recoverd.c   2017-08-10 16:43:22.483295232 +0900
@@ -287,8 +287,10 @@
        set_bit(LSFL_RECOVER_LOCK, &ls->ls_flags);
        wake_up(&ls->ls_recover_lock_wait);

-       while (!kthread_should_stop()) {
+       while (1) {
                set_current_state(TASK_INTERRUPTIBLE);
+               if (kthread_should_stop())
+                       break;
                if (!test_bit(LSFL_RECOVER_WORK, &ls->ls_flags) &&
                    !test_bit(LSFL_RECOVER_DOWN, &ls->ls_flags))
                        schedule();

thanks,
-- owa

-----Original Message-----
From: David Teigland [mailto:teigl...@redhat.com] 
Sent: Thursday, August 10, 2017 1:19 AM
To: owa tsutomu(大輪 勤 TMC ○SSDジ□ES技○ES五)
Cc: cluster-devel@redhat.com; miyauchi tadashi(宮内 忠志 TOPS (SW開)[基本])
Subject: Re: [Cluster-devel] [PATCH 10/17] dlm: use schedule_timeout instead of 
schedule in dlm_recoverd

On Wed, Aug 09, 2017 at 05:51:01AM +0000, tsutomu....@toshiba.co.jp wrote:
> When dlm_recoverd_stop() is called between kthread_should_stop() and
> set_task_state(), dlm_recoverd will not wake up.

This works, but have you looked elsewhere in the kernel for kthread
examples we could copy that do a race-free check without a timeout?
Thanks

> 
> Signed-off-by: Tadashi Miyauchi <miyau...@toshiba-tops.co.jp>
> Signed-off-by: Tsutomu Owa <tsutomu....@toshiba.co.jp>
> ---
>  fs/dlm/recoverd.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/dlm/recoverd.c b/fs/dlm/recoverd.c
> index 6859b4b..d3956cc 100644
> --- a/fs/dlm/recoverd.c
> +++ b/fs/dlm/recoverd.c
> @@ -276,6 +276,7 @@ static void do_ls_recovery(struct dlm_ls *ls)
>  static int dlm_recoverd(void *arg)
>  {
>       struct dlm_ls *ls;
> +     unsigned long timeout = (dlm_config.ci_recover_timer * HZ) >> 1;
>  
>       ls = dlm_find_lockspace_local(arg);
>       if (!ls) {
> @@ -291,7 +292,7 @@ static int dlm_recoverd(void *arg)
>               set_current_state(TASK_INTERRUPTIBLE);
>               if (!test_bit(LSFL_RECOVER_WORK, &ls->ls_flags) &&
>                   !test_bit(LSFL_RECOVER_DOWN, &ls->ls_flags))
> -                     schedule();
> +                     schedule_timeout(timeout);
>               set_current_state(TASK_RUNNING);
>  
>               if (test_and_clear_bit(LSFL_RECOVER_DOWN, &ls->ls_flags)) {
> -- 
> 2.7.4
> 
> 
> 



Reply via email to