On Tue, Dec 01, 2015 at 09:41:09PM +0100, Peter Zijlstra wrote:
> On Fri, Nov 13, 2015 at 03:22:04PM +0100, Frederic Weisbecker wrote:
> > The tick dependency is evaluated on every IRQ. This is a batch of checks
> > which determine whether it is safe to stop the tick or not. These checks
> > are often split into many details: posix cpu timers, scheduler, sched clock,
> > perf events. Each of these is made of smaller details: the posix cpu
> > timer check involves checking process wide timers then thread wide timers.
> > Perf involves checking freq events then more per cpu details.
> > 
> > Checking these details asynchronously every time we update the full
> > dynticks state brings avoidable overhead and a messy layout.
> > 
> > Let's introduce tick dependency masks instead: one for system wide
> > dependency (unstable sched clock), one for CPU wide dependency (sched,
> > perf), and task/signal level dependencies. The subsystems are responsible
> > for setting and clearing their dependency through a set of APIs that will
> > take care of concurrent dependency mask modifications and kick targets
> > to restart the relevant CPU tick whenever needed.
> 
> Maybe better explain why we need the per task and per signal thingy?

I'll detail that some more in the changelog. The only user of the per
task/per signal tick dependency is posix cpu timers. I first proposed a
global tick dependency, set as soon as any posix cpu timer is armed. It
simplified everything but some reviewers complained (eg: some users might
want to run posix timers on housekeepers without bothering full dynticks
CPUs). I could remove the per signal dependency by dispatching it through
all threads in the group each time there is an update, but that's the best
I can think of.
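
To make the levels concrete, here is a minimal userspace sketch of how the
four masks could combine into one "can the tick stop?" decision. Every name
in it (the `sketch_*` structs, the `DEP_*` bits, `sketch_can_stop_tick()`)
is hypothetical and only illustrates the layering; it is not the code from
the patch.

```c
#include <stdbool.h>

/* Hypothetical bit layout; the patch defines its own TICK_*_MASK values. */
#define DEP_POSIX_TIMER    (1UL << 0)
#define DEP_PERF_EVENTS    (1UL << 1)
#define DEP_SCHED          (1UL << 2)
#define DEP_CLOCK_UNSTABLE (1UL << 3)

/* Per signal mask: shared by all threads of a process. */
struct sketch_signal {
	unsigned long tick_dep;
};

/* Per task mask: one thread, plus a pointer to its group-wide state. */
struct sketch_task {
	unsigned long tick_dep;
	struct sketch_signal *signal;
};

/* Per CPU mask, plus the task currently running there. */
struct sketch_cpu {
	unsigned long tick_dep;
	struct sketch_task *curr;
};

/* System wide mask (e.g. unstable sched clock). */
static unsigned long sketch_global_dep;

/* The tick may stop only if no level holds a dependency. */
static bool sketch_can_stop_tick(const struct sketch_cpu *cpu)
{
	if (sketch_global_dep)
		return false;
	if (cpu->tick_dep)
		return false;
	if (cpu->curr->tick_dep)
		return false;
	if (cpu->curr->signal->tick_dep)
		return false;
	return true;
}
```

With this layering, a process wide timer armed in one thread group only
holds the tick on CPUs that run threads of that group. A full dynticks CPU
running an unrelated task is not affected, which is the case the reviewers
cared about.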

> 
> > +static void trace_tick_dependency(unsigned long dep)
> > +{
> > +   if (dep & TICK_POSIX_TIMER_MASK) {
> > +           trace_tick_stop(0, "posix timers running\n");
> > +           return;
> > +   }
> > +
> > +   if (dep & TICK_PERF_EVENTS_MASK) {
> > +           trace_tick_stop(0, "perf events running\n");
> > +           return;
> > +   }
> > +
> > +   if (dep & TICK_SCHED_MASK) {
> > +           trace_tick_stop(0, "more than 1 task in runqueue\n");
> > +           return;
> > +   }
> > +
> > +   if (dep & TICK_CLOCK_UNSTABLE_MASK)
> > +           trace_tick_stop(0, "unstable sched clock\n");
> > +}
> 
> I would suggest ditching the strings and using the

Using a code value instead?
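
If a code value is what is meant, one possible shape is sketched below. This
is an assumption about the direction, not something settled in this thread:
the `EX_*` masks, the `ex_tick_dep_reason` enum and `ex_dep_to_reason()` are
all hypothetical names, and the priority order simply mirrors the order of
the checks in trace_tick_dependency() above.

```c
/* Hypothetical masks standing in for the TICK_*_MASK values in the patch. */
#define EX_POSIX_TIMER_MASK    (1UL << 0)
#define EX_PERF_EVENTS_MASK    (1UL << 1)
#define EX_SCHED_MASK          (1UL << 2)
#define EX_CLOCK_UNSTABLE_MASK (1UL << 3)

/* One code per reason; a tracepoint could record this instead of a string. */
enum ex_tick_dep_reason {
	EX_REASON_NONE = 0,
	EX_REASON_POSIX_TIMER,
	EX_REASON_PERF_EVENTS,
	EX_REASON_SCHED,
	EX_REASON_CLOCK_UNSTABLE,
};

/* Report the first matching dependency, in the same order as the patch. */
static enum ex_tick_dep_reason ex_dep_to_reason(unsigned long dep)
{
	if (dep & EX_POSIX_TIMER_MASK)
		return EX_REASON_POSIX_TIMER;
	if (dep & EX_PERF_EVENTS_MASK)
		return EX_REASON_PERF_EVENTS;
	if (dep & EX_SCHED_MASK)
		return EX_REASON_SCHED;
	if (dep & EX_CLOCK_UNSTABLE_MASK)
		return EX_REASON_CLOCK_UNSTABLE;
	return EX_REASON_NONE;
}
```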

> 
> > +static void kick_all_work_fn(struct work_struct *work)
> > +{
> > +       tick_nohz_full_kick_all();
> > +}
> > +static DECLARE_WORK(kick_all_work, kick_all_work_fn);
> > +
> > +void __tick_nohz_set_dep_delayed(enum tick_dependency_bit bit, unsigned long *dep)
> > +{
> > +   unsigned long prev;
> > +
> > +   prev = fetch_or(dep, BIT_MASK(bit));
> > +   if (!prev) {
> > +           /*
> > +           * We need the IPIs to be sent from sane process context.
> 
> Why ?

Because posix timers code is all called with interrupts disabled and we can't
send IPIs then.
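
The pattern can be shown in isolation with a small userspace sketch. It uses
C11 atomic_fetch_or() as a stand-in for the kernel's fetch_or(), and a plain
counter as a stand-in for schedule_work(); the `demo_*` names are
hypothetical. Only the caller that takes the mask from empty to non-empty
requests a kick, and the kick itself is deferred rather than done in place.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Stand-in for a tick dependency mask. */
static atomic_ulong demo_tick_dep;

/* Stand-in for queued work: counts how many deferred kicks were requested. */
static int demo_kicks_requested;

/*
 * Atomically set one bit and report whether the mask was empty before.
 * Concurrent setters all see a consistent previous value, so exactly one
 * of them observes the empty -> non-empty transition.
 */
static bool demo_set_dep(atomic_ulong *dep, unsigned int bit)
{
	unsigned long prev = atomic_fetch_or(dep, 1UL << bit);

	return prev == 0;
}

/*
 * Variant for callers that cannot kick directly (irqs disabled in the
 * kernel case): record the request and let another context do the kick.
 */
static void demo_set_dep_delayed(unsigned int bit)
{
	if (demo_set_dep(&demo_tick_dep, bit))
		demo_kicks_requested++; /* kernel code would schedule_work() here */
}
```

Setting a second bit, or the same bit again, does not request another kick
because the mask is already non-empty at that point.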

> 
> > +           * The posix cpu timers are always set with irqs disabled.
> > +           */
> > +           schedule_work(&kick_all_work);
> > +   }
> > +}
> > +
> > +/*
> > + * Set a global tick dependency. Lets do the wide IPI kick asynchronously
> > + * for callers with irqs disabled.
> 
> This seems to suggest you can call this with IRQs disabled

Ah right, that's a misleading comment. We need to use the _delayed() version
when interrupts are disabled.

Thanks.

> 
> > + */
> > +void tick_nohz_set_dep(enum tick_dependency_bit bit)
> > +{
> > +   unsigned long prev;
> > +
> > +   prev = fetch_or(&tick_dependency, BIT_MASK(bit));
> > +   if (!prev)
> > +           tick_nohz_full_kick_all();
> 
> But that function seems implemented using smp_call_function_many() which
> cannot be called with IRQs disabled.