On Tue, Jun 13, 2017 at 04:58:37PM -0400, Tejun Heo wrote:
> Hello, Paul.
> 
> On Fri, May 05, 2017 at 10:11:59AM -0700, Paul E. McKenney wrote:
> > Just following up...  I have hit this bug a couple of times over the
> > past few days.  Anything I can do to help?
> 
> My apologies for dropping the ball on this.  I've gone over the hot
> plug code in workqueue several times but can't really find how this
> would happen.  Can you please apply the following patch and see what
> it says when the problem happens?

I have fired it up, thank you!

Last time I saw one failure in 21 hours of test runs, so I have kicked
off 42 one-hour test runs.  Will see what happens tomorrow morning,
Pacific Time.

                                                        Thanx, Paul

> Thanks.
> 
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index c74bf39ef764..bd2ce3cbfb41 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -1691,13 +1691,20 @@ static struct worker *alloc_worker(int node)
>  static void worker_attach_to_pool(struct worker *worker,
>                                  struct worker_pool *pool)
>  {
> +     int ret;
> +
>       mutex_lock(&pool->attach_mutex);
> 
>       /*
>        * set_cpus_allowed_ptr() will fail if the cpumask doesn't have any
>        * online CPUs.  It'll be re-applied when any of the CPUs come up.
>        */
> -     set_cpus_allowed_ptr(worker->task, pool->attrs->cpumask);
> +     ret = set_cpus_allowed_ptr(worker->task, pool->attrs->cpumask);
> +
> +     WARN(ret && !(pool->flags & POOL_DISASSOCIATED),
> +          "set_cpus_allowed_ptr failed, ret=%d pool->cpu/flags=%d/0x%x cpumask=%*pbl online=%*pbl active=%*pbl\n",
> +          ret, pool->cpu, pool->flags, cpumask_pr_args(pool->attrs->cpumask),
> +          cpumask_pr_args(cpu_online_mask), cpumask_pr_args(cpu_active_mask));
> 
>       /*
>        * The pool->attach_mutex ensures %POOL_DISASSOCIATED remains
> @@ -2037,8 +2044,11 @@ __acquires(&pool->lock)
>       lockdep_copy_map(&lockdep_map, &work->lockdep_map);
>  #endif
>       /* ensure we're on the correct CPU */
> -     WARN_ON_ONCE(!(pool->flags & POOL_DISASSOCIATED) &&
> -                  raw_smp_processor_id() != pool->cpu);
> +     if (WARN_ON_ONCE(!(pool->flags & POOL_DISASSOCIATED) &&
> +                      raw_smp_processor_id() != pool->cpu))
> +             printk_once("XXX workfn=%pf pool->cpu/flags=%d/0x%x curcpu=%d online=%*pbl active=%*pbl\n",
> +                         work->func, pool->cpu, pool->flags, raw_smp_processor_id(),
> +                         cpumask_pr_args(cpu_online_mask), cpumask_pr_args(cpu_active_mask));
> 
>       /*
>        * A single work shouldn't be executed concurrently by
> 
