On 19 February 2013 11:29, Vincent Guittot <vincent.guit...@linaro.org> wrote:
> On 18 February 2013 16:40, Frederic Weisbecker <fweis...@gmail.com> wrote:
>> 2013/2/18 Vincent Guittot <vincent.guit...@linaro.org>:
>>> On 18 February 2013 15:38, Frederic Weisbecker <fweis...@gmail.com> wrote:
>>>> I pasted the original at: http://pastebin.com/DMm5U8J8
>>>
>>> We can clear the idle flag only in nohz_kick_needed(), which will not
>>> be called if the sched_domain is NULL, so the sequence will be:
>>>
>>> = CPU 0 =                            = CPU 1 =
>>>
>>> detach_and_destroy_domain {
>>>     rcu_assign_pointer(cpu1_dom, NULL);
>>> }
>>>
>>> dom = new_domain(...) {
>>>     nr_cpus_busy = 0;
>>>     set_idle(CPU 1);
>>> }
>>>                                      dom = rcu_dereference(cpu1_dom)
>>>                                      //dom == NULL, return
>>>
>>> rcu_assign_pointer(cpu1_dom, dom);
>>>
>>>                                      dom = rcu_dereference(cpu1_dom)
>>>                                      //dom != NULL,
>>>                                      nohz_kick_needed {
>>>                                          set_idle(CPU 1)
>>>                                          dom = rcu_dereference(cpu1_dom)
>>>                                          //dec nr_cpus_busy,
>>>                                      }
>>>
>>> Vincent
>>
>> Ok but CPU 0 can assign NULL to the domain of cpu1 while CPU 1 is
>> already in the middle of nohz_kick_needed().
>
> Yes, nothing prevents the sequence below from occurring:
>
> = CPU 0 =                            = CPU 1 =
>                                      dom = rcu_dereference(cpu1_dom)
>                                      //dom != NULL
> detach_and_destroy_domain {
>     rcu_assign_pointer(cpu1_dom, NULL);
> }
>
> dom = new_domain(...) {
>     nr_cpus_busy = 0;
>     //nr_cpus_busy in the new_dom
>     set_idle(CPU 1);
> }
>                                      nohz_kick_needed {
>                                          clear_idle(CPU 1)
>                                          dom = rcu_dereference(cpu1_dom)
>                                          //cpu1_dom == old_dom
>                                          inc nr_cpus_busy,
>                                          //nr_cpus_busy in the old_dom
>                                      }
>
> rcu_assign_pointer(cpu1_dom, dom);
> //cpu1_dom == new_dom

The sequence above is not correct, and it also became unreadable after
going through gmail.

The correct and readable version is at:
https://pastebin.linaro.org/1750/

Vincent

>
> I'm not sure that this can happen in practice because CPU1 is in an
> interrupt handler, but we don't have any mechanism to prevent the
> sequence.
>
> The NULL sched_domain can be used to detect this situation, and the
> set_cpu_sd_state_busy function can be modified as below:
>
> inline void set_cpu_sd_state_busy(void)
>  {
>         struct sched_domain *sd;
>         int cpu = smp_processor_id();
> +       int clear = 0;
>
>         if (!test_bit(NOHZ_IDLE, nohz_flags(cpu)))
>                 return;
> -       clear_bit(NOHZ_IDLE, nohz_flags(cpu));
>
>         rcu_read_lock();
>         for_each_domain(cpu, sd) {
>                 atomic_inc(&sd->groups->sgp->nr_busy_cpus);
> +               clear = 1;
>         }
>         rcu_read_unlock();
> +
> +       if (likely(clear))
> +               clear_bit(NOHZ_IDLE, nohz_flags(cpu));
>  }
>
> The NOHZ_IDLE flag will not be cleared if we have a NULL sched_domain
> attached to the CPU.
> With this implementation, we still don't need to get the sched_domain
> to test the NOHZ_IDLE flag, which happens each time the CPU becomes idle.
>
> Patch 2 becomes useless.
>
> Vincent
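
For anyone following along, the "only clear the flag if at least one
domain was visited" idea from the diff above can be modeled in userspace
roughly as below. This is only a sketch under assumptions: struct domain,
struct cpu_state and set_cpu_busy() are hypothetical stand-ins, not kernel
types or functions; a plain pointer replaces the RCU-protected domain
pointer, so it models the NULL-domain case only and says nothing about the
concurrent old_dom/new_dom sequence discussed earlier.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one sched_domain level (not the kernel struct). */
struct domain {
	atomic_int nr_busy_cpus;
	struct domain *parent;
};

/* Hypothetical per-CPU state: the idle flag and the attached domain chain. */
struct cpu_state {
	atomic_bool nohz_idle;
	struct domain *sd;	/* NULL while the domains are being rebuilt */
};

/*
 * Mark the CPU busy. The idle flag is cleared only if at least one
 * domain was walked, so a CPU with a NULL domain keeps the flag set
 * and will retry on a later call. Returns true if the flag was cleared.
 */
static bool set_cpu_busy(struct cpu_state *cs)
{
	bool visited = false;

	assert(cs != NULL);

	/* Nothing to do if the CPU is not currently accounted as idle. */
	if (!atomic_load(&cs->nohz_idle))
		return false;

	/* The kernel would walk this chain under rcu_read_lock(). */
	for (struct domain *d = cs->sd; d != NULL; d = d->parent) {
		atomic_fetch_add(&d->nr_busy_cpus, 1);
		visited = true;
	}

	if (visited)
		atomic_store(&cs->nohz_idle, false);

	return visited;
}
```

With no domain attached, the call returns false and the flag stays set;
once a domain is attached, the next call increments the counter once and
clears the flag, and further calls do nothing.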