On Mon, Feb 03, 2014 at 12:21:16PM +0530, Viresh Kumar wrote: > Sorry was away for short vacation. > > On 28 January 2014 19:20, Frederic Weisbecker <[email protected]> wrote: > > On Thu, Jan 23, 2014 at 07:50:40PM +0530, Viresh Kumar wrote: > >> Wait, I got the wrong code here. That's wasn't my initial intention. > >> I actually wanted to write something like this: > >> > >> - wake_up_nohz_cpu(cpu); > >> + if (!tbase_get_deferrable(timer->base) || idle_cpu(cpu)) > >> + wake_up_nohz_cpu(cpu); > >> > >> Will that work? > > Something is seriously wrong with me, again wrote rubbish code. > Let me phrase what I wanted to write :) > > "don't send IPI to a idle CPU for a deferrable timer." > > Probably I code it correctly this time atleast. > > - wake_up_nohz_cpu(cpu); > + if (!(tbase_get_deferrable(timer->base) && idle_cpu(cpu))) > + wake_up_nohz_cpu(cpu);
Yeah but that's racy if the target is nohz full. We may be seeing it idle whereas it woke up lately and run in userspace tickless for a while. > > > Well, this is going to wake up the target from its idle state, which is > > what we want to avoid if the timer is deferrable, right? > > Yeah, sorry for doing it for second time :( I'm certainly not blaming you for being confused, that would be the pot calling the kettle black ;) > > > The simplest thing we want is: > > > > if (!tbase_get_deferrable(timer->base) || > > tick_nohz_full_cpu(cpu)) > > wake_up_nohz_cpu(cpu); > > > > This spares the IPI for the common case where the timer is deferrable and > > we run > > in periodic or dynticks-idle mode (which should be 99.99% of the existing > > workloads). > > I wasn't looking at this problem with NO_HZ_FULL in mind. As I thought its > only about if the CPU is idle or not. And so the solution I was > talking about was: > > "don't send IPI to a idle CPU for a deferrable timer." > > But I see that still failing with the code you wrote. For normal cases where > we > don't enable NO_HZ_FULL, we will still end up waking up idle CPUs which > is what Lei Wen reported initially. Not with the small change I proposed above. I'm applying it. > > Also if a CPU is marked for NO_HZ_FULL and is not idle currently then we > wouldn't send a IPI for a deferrable timer. But we actually need that, so that > we can reevaluate the timers order again? Right. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [email protected] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/

