On 06/26, Peter Zijlstra wrote:
>
> On Fri, Jun 26, 2015 at 04:14:55AM +0200, Oleg Nesterov wrote:
> > Not sure.
> >
> > And note that this series kills stop_cpus_mutex, so that multiple
> > stop_cpus()'s / stop_machine()'s can run in parallel if cpumask's
> > do not overlap.
> >
> > Note also the changelog in 6/6, we can simplify + optimize this code
> > a bit more.
> >
> > What do you think?
>
> The problem I have with this is that it makes the better operation
> (stop_two_cpus) slower while improving the worse operation (stop_cpus).
>
> I would much prefer to keep stop_two_cpus() as proposed with taking two
> cpu_stopper::lock instances and replacing the stop_cpu_mutex with a
> percpu-rwsem.

OK, lets avoid cpumask in stop_two_cpus,

        int stop_two_cpus(unsigned int cpu1, unsigned int cpu2, cpu_stop_fn_t 
fn, void *arg)
        {
                struct multi_stop_data msdata;
                struct cpu_stop_done done;
                struct cpu_stop_work *work1, *work2;

                msdata = (struct multi_stop_data){
                        .fn = fn,
                        .data = arg,
                        .num_threads = 2,
                        .active_cpus = cpumask_of(cpu1),
                };

                cpu_stop_init_done(&done, 2);
                set_state(&msdata, MULTI_STOP_PREPARE);

                if (cpu1 > cpu2)
                        swap(cpu1, cpu2);
                work1 = stop_work_alloc_one(cpu1, true);
                work2 = stop_work_alloc_one(cpu1, true);

                *work1 = *work2 = (struct cpu_stop_work) {
                        .fn = multi_cpu_stop,
                        .arg = &msdata,
                        .done = &done
                };

                preempt_disable();
                cpu_stop_queue_work(cpu1, work1);
                cpu_stop_queue_work(cpu2, work2);
                preempt_enable();

                wait_for_completion(&done.completion);

                stop_work_free_one(cpu1);
                stop_work_free_one(cpu2);
                wake_up(&stop_work_wq);

                return done.executed ? done.ret : -ENOENT;
        }

2 cmpxchg()'s vs 2 spin_lock()'s. Plus wake_up(), but we can check
waitqueue_active().

Do you think thi will be noticeably slower?

Of course, if it races with another stop_two_cpus/stop_cpus it will
sleep, but in this case we need to wait anyway.


And I don't think that percpu-rwsem instead of stop_cpu_mutex makes
sense. at least I don't understand how can it help. OK, stop_two_cpus()
can use percpu_down_read() to avoid the deadlock with stop_cpus(), but
you still need double-lock... So I don't think this will make it faster,
this will just penalize stop_cpus(). Or I misunderstood.

So I am still not convinced... But probably I am too biased ;)


Btw. I can't understand the cpu_active() checks in stop_two_cpus().
Do we really need them?

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to