Hi, > This works, but have you looked elsewhere in the kernel for kthread > examples we could copy that do a race-free check without a timeout?
If you refer to other implementations in kernel, the following modifications may be better. The important thing is to call kthread_should_stop() after set_current_state(TASK_INTERRUPTIBLE). How is this fix? --- fs/dlm/recoverd.c.orig 2017-08-10 16:42:31.131296351 +0900 +++ fs/dlm/recoverd.c 2017-08-10 16:43:22.483295232 +0900 @@ -287,8 +287,10 @@ set_bit(LSFL_RECOVER_LOCK, &ls->ls_flags); wake_up(&ls->ls_recover_lock_wait); - while (!kthread_should_stop()) { + while (1) { set_current_state(TASK_INTERRUPTIBLE); + if (kthread_should_stop()) + break; if (!test_bit(LSFL_RECOVER_WORK, &ls->ls_flags) && !test_bit(LSFL_RECOVER_DOWN, &ls->ls_flags)) schedule(); thanks, -- owa -----Original Message----- From: David Teigland [mailto:teigl...@redhat.com] Sent: Thursday, August 10, 2017 1:19 AM To: owa tsutomu(大輪 勤 TMC ○SSDジ□ES技○ES五) Cc: cluster-devel@redhat.com; miyauchi tadashi(宮内 忠志 TOPS (SW開)[基本]) Subject: Re: [Cluster-devel] [PATCH 10/17] dlm: use schedule_timeout instead of schedule in dlm_recoverd On Wed, Aug 09, 2017 at 05:51:01AM +0000, tsutomu....@toshiba.co.jp wrote: > When dlm_recoverd_stop() is called between kthread_should_stop() and > set_task_state(), dlm_recoverd will not wake up. This works, but have you looked elsewhere in the kernel for kthread examples we could copy that do a race-free check without a timeout? Thanks > > Signed-off-by: Tadashi Miyauchi <miyau...@toshiba-tops.co.jp> > Signed-off-by: Tsutomu Owa <tsutomu....@toshiba.co.jp> > --- > fs/dlm/recoverd.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/fs/dlm/recoverd.c b/fs/dlm/recoverd.c > index 6859b4b..d3956cc 100644 > --- a/fs/dlm/recoverd.c > +++ b/fs/dlm/recoverd.c > @@ -276,6 +276,7 @@ static void do_ls_recovery(struct dlm_ls *ls) > static int dlm_recoverd(void *arg) > { > struct dlm_ls *ls; > + unsigned long timeout = (dlm_config.ci_recover_timer * HZ) >> 1; > > ls = dlm_find_lockspace_local(arg); > if (!ls) { > @@ -291,7 +292,7 @@ static int dlm_recoverd(void *arg) > set_current_state(TASK_INTERRUPTIBLE); > if (!test_bit(LSFL_RECOVER_WORK, &ls->ls_flags) && > !test_bit(LSFL_RECOVER_DOWN, &ls->ls_flags)) > - schedule(); > + schedule_timeout(timeout); > set_current_state(TASK_RUNNING); > > if (test_and_clear_bit(LSFL_RECOVER_DOWN, &ls->ls_flags)) { > -- > 2.7.4 > > >