Looks good to me.

There is one test in show_pwq()
"""
 worker == pwq->wq->rescuer ? "(RESCUER)" : "",
"""
I'm wondering if it needs to be updated to
"""
worker->rescue_wq ? "(RESCUER)" : "",
"""

And document "/* MD: rescue worker */" might be better
than current "/* I: rescue worker */", although ->rescuer can
be accessed without wq_mayday_lock lock in some code.

Reviewed-by: Lai Jiangshan <jiangshan...@gmail.com>



On Thu, Sep 19, 2019 at 9:43 AM Tejun Heo <t...@kernel.org> wrote:
>
> Before actually destrying a workqueue, destroy_workqueue() checks
> whether it's actually idle.  If it isn't, it prints out a bunch of
> warning messages and leaves the workqueue dangling.  It unfortunately
> has a couple issues.
>
> * Mayday list queueing increments pwq's refcnts which gets detected as
>   busy and fails the sanity checks.  However, because mayday list
>   queueing is asynchronous, this condition can happen without any
>   actual work items left in the workqueue.
>
> * Sanity check failure leaves the sysfs interface behind too which can
>   lead to init failure of newer instances of the workqueue.
>
> This patch fixes the above two by
>
> * If a workqueue has a rescuer, disable and kill the rescuer before
>   sanity checks.  Disabling and killing is guaranteed to flush the
>   existing mayday list.
>
> * Remove sysfs interface before sanity checks.
>
> Signed-off-by: Tejun Heo <t...@kernel.org>
> Reported-by: Marcin Pawlowski <mpawlow...@fb.com>
> Reported-by: "Williams, Gerald S" <gerald.s.willi...@intel.com>
> Cc: sta...@vger.kernel.org
> ---
> Applying to wq/for-5.4.
>
> Thanks.
>
>  kernel/workqueue.c | 24 +++++++++++++++++++-----
>  1 file changed, 19 insertions(+), 5 deletions(-)
>
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index 95aea04ff722..73e3ea9e31d3 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -4318,9 +4318,28 @@ void destroy_workqueue(struct workqueue_struct *wq)
>         struct pool_workqueue *pwq;
>         int node;
>
> +       /*
> +        * Remove it from sysfs first so that sanity check failure doesn't
> +        * lead to sysfs name conflicts.
> +        */
> +       workqueue_sysfs_unregister(wq);
> +
>         /* drain it before proceeding with destruction */
>         drain_workqueue(wq);
>
> +       /* kill rescuer, if sanity checks fail, leave it w/o rescuer */
> +       if (wq->rescuer) {
> +               struct worker *rescuer = wq->rescuer;
> +
> +               /* this prevents new queueing */
> +               spin_lock_irq(&wq_mayday_lock);
> +               wq->rescuer = NULL;
> +               spin_unlock_irq(&wq_mayday_lock);
> +
> +               /* rescuer will empty maydays list before exiting */
> +               kthread_stop(rescuer->task);
> +       }
> +
>         /* sanity checks */
>         mutex_lock(&wq->mutex);
>         for_each_pwq(pwq, wq) {
> @@ -4352,11 +4371,6 @@ void destroy_workqueue(struct workqueue_struct *wq)
>         list_del_rcu(&wq->list);
>         mutex_unlock(&wq_pool_mutex);
>
> -       workqueue_sysfs_unregister(wq);
> -
> -       if (wq->rescuer)
> -               kthread_stop(wq->rescuer->task);
> -
>         if (!(wq->flags & WQ_UNBOUND)) {
>                 wq_unregister_lockdep(wq);
>                 /*
> --
> 2.17.1
>

Reply via email to