Peter Zijlstra <pet...@infradead.org> writes:
> @@ -7883,14 +7889,6 @@ int sched_cpu_deactivate(unsigned int cp
>       set_cpu_active(cpu, false);
>  
>       /*
> -      * From this point forward, this CPU will refuse to run any task that
> -      * is not: migrate_disable() or KTHREAD_IS_PER_CPU, and will actively
> -      * push those tasks away until this gets cleared, see
> -      * sched_cpu_dying().
> -      */
> -     balance_push_set(cpu, true);
> -
> -     /*
>        * We've cleared cpu_active_mask / set balance_push, wait for all
>        * preempt-disabled and RCU users of this state to go away such that
>        * all new such users will observe it.
> @@ -7910,6 +7908,14 @@ int sched_cpu_deactivate(unsigned int cp
>       }
>       rq_unlock_irqrestore(rq, &rf);
>  
> +     /*
> +      * From this point forward, this CPU will refuse to run any task that
> +      * is not: migrate_disable() or KTHREAD_IS_PER_CPU, and will actively
> +      * push those tasks away until this gets cleared, see
> +      * sched_cpu_dying().
> +      */
> +     balance_push_set(cpu, true);
> +

AIUI with cpu_dying_mask being flipped before even entering
sched_cpu_deactivate(), we don't need this to be before the
synchronize_rcu() anymore; is there more than that to why you're punting it
back this side of it?

>  #ifdef CONFIG_SCHED_SMT
>       /*
>        * When going down, decrement the number of cores with SMT present.

> @@ -8206,7 +8212,7 @@ void __init sched_init(void)
>               rq->sd = NULL;
>               rq->rd = NULL;
>               rq->cpu_capacity = rq->cpu_capacity_orig = SCHED_CAPACITY_SCALE;
> -             rq->balance_callback = NULL;
> +             rq->balance_callback = &balance_push_callback;
>               rq->active_balance = 0;
>               rq->next_balance = jiffies;
>               rq->push_cpu = 0;
> @@ -8253,6 +8259,7 @@ void __init sched_init(void)
>  
>  #ifdef CONFIG_SMP
>       idle_thread_set_boot_cpu();
> +     balance_push_set(smp_processor_id(), false);
>  #endif
>       init_sched_fair_class();
>

I don't get what these two changes do - the end result is the same as
before, no?


Also, AIUI this patch covers the cpu_dying -> !cpu_dying rollback case,
since balance_push gets numbed down by !cpu_dying. What about the other
direction (hot-plug failure + rollback)? We may have allowed !pcpu tasks
onto the now-dying CPU, and we'd need to re-install the balance_push
callback.

I'm starting to think we'd need to have

  rq->balance_callback = &balance_push_callback

for any CPU with hotplug state < CPUHP_AP_ACTIVE. Thus we would
need:

  balance_push_set(cpu, true) in sched_init() and sched_cpu_deactivate()
  balance_push_set(cpu, false) in sched_cpu_activate()

and the rest would be driven by the cpu_dying_mask.
