Re: [PATCH 4/5] irqtime: Move irqtime entry accounting after irq offset incrementation

2020-12-29 Thread Qais Yousef
On 12/29/20 15:30, Frederic Weisbecker wrote:
> On Tue, Dec 29, 2020 at 02:12:31PM +0000, Qais Yousef wrote:
> > On 12/29/20 14:41, Frederic Weisbecker wrote:
> > > > > -void vtime_account_irq(struct task_struct *tsk)
> > > > > +void vtime_account_irq(struct task_struct *tsk, unsigned int offset)
> > > > >  {
> > > > > - if (hardirq_count()) {
> > > > > + unsigned int pc = preempt_count() - offset;
> > > > > +
> > > > > + if (pc & HARDIRQ_OFFSET) {
> > > > 
> > > > Shouldn't this be HARDIRQ_MASK like above?
> > > 
> > > In the rare cases of nested hardirqs happening with broken drivers, only
> > > the outer hardirq matters. All the time spent in the inner hardirqs is
> > > included in the outer one.
> > 
> > Ah I see. The original code was doing hardirq_count(), which apparently
> > wasn't right either.
> > 
> > Shouldn't it be pc == HARDIRQ_OFFSET then? All odd nest counts will trigger
> > this otherwise, and IIUC we want this to trigger once on first entry only.
> 
> Right but we must also handle hardirqs interrupting either preempt-disabled
> sections or softirq servicing/disabled sections.
> 
> 3 stacking hardirqs should be rare enough that we don't really care. In the
> worst case we are going to account the third IRQ separately. Not a correctness
> issue, just a rare unoptimized case.

I admit I need to wrap my head around some more details to fully comprehend
that, but that's my own confusion to clear up :-)

Thanks for your answer.

Cheers

--
Qais Yousef


Re: [PATCH 4/5] irqtime: Move irqtime entry accounting after irq offset incrementation

2020-12-29 Thread Frederic Weisbecker
On Tue, Dec 29, 2020 at 02:12:31PM +0000, Qais Yousef wrote:
> On 12/29/20 14:41, Frederic Weisbecker wrote:
> > > > -void vtime_account_irq(struct task_struct *tsk)
> > > > +void vtime_account_irq(struct task_struct *tsk, unsigned int offset)
> > > >  {
> > > > -   if (hardirq_count()) {
> > > > +   unsigned int pc = preempt_count() - offset;
> > > > +
> > > > +   if (pc & HARDIRQ_OFFSET) {
> > > 
> > > Shouldn't this be HARDIRQ_MASK like above?
> > 
> > In the rare cases of nested hardirqs happening with broken drivers, only
> > the outer hardirq matters. All the time spent in the inner hardirqs is
> > included in the outer one.
> 
> Ah I see. The original code was doing hardirq_count(), which apparently wasn't
> right either.
> 
> Shouldn't it be pc == HARDIRQ_OFFSET then? All odd nest counts will trigger
> this otherwise, and IIUC we want this to trigger once on first entry only.

Right but we must also handle hardirqs interrupting either preempt-disabled
sections or softirq servicing/disabled sections.

3 stacking hardirqs should be rare enough that we don't really care. In the
worst case we are going to account the third IRQ separately. Not a correctness
issue, just a rare unoptimized case.
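
Concretely, a sketch of the cases (illustrative, assuming the usual preempt.h
layout where HARDIRQ_OFFSET is the low bit of the hardirq field):

	/*
	 * Possible values of pc = preempt_count() - offset seen on hardirq
	 * *exit* (offset == 0). Each delta below was spent in a hardirq, so
	 * the check must fire for all of them; an exact test
	 * "pc == HARDIRQ_OFFSET" would only catch the first:
	 *
	 *   HARDIRQ_OFFSET                           irq hit a task
	 *   HARDIRQ_OFFSET | PREEMPT_OFFSET          irq hit preempt_disable()
	 *   HARDIRQ_OFFSET | SOFTIRQ_OFFSET          irq hit softirq servicing
	 *   HARDIRQ_OFFSET | SOFTIRQ_DISABLE_OFFSET  irq hit a bh-disabled region
	 *
	 * With nesting, depth 2 gives 2 * HARDIRQ_OFFSET: the bit is clear,
	 * so the inner time stays folded into the outer hardirq. Only odd
	 * depths (the rare depth-3 case) set the bit again and get
	 * accounted separately.
	 */
	if (pc & HARDIRQ_OFFSET)
		vtime_account_hardirq(tsk);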

> 
> Thanks
> 
> --
> Qais Yousef


Re: [PATCH 4/5] irqtime: Move irqtime entry accounting after irq offset incrementation

2020-12-29 Thread Qais Yousef
On 12/29/20 14:41, Frederic Weisbecker wrote:
> On Mon, Dec 28, 2020 at 02:15:29AM +0000, Qais Yousef wrote:
> > Hi Frederic
> > 
> > On 12/02/20 12:57, Frederic Weisbecker wrote:
> > > @@ -66,9 +68,9 @@ void irqtime_account_irq(struct task_struct *curr)
> > >* in that case, so as not to confuse scheduler with a special task
> > >* that do not consume any time, but still wants to run.
> > >*/
> > > - if (hardirq_count())
> > > + if (pc & HARDIRQ_MASK)
> > >   irqtime_account_delta(irqtime, delta, CPUTIME_IRQ);
> > > - else if (in_serving_softirq() && curr != this_cpu_ksoftirqd())
> > > + else if ((pc & SOFTIRQ_OFFSET) && curr != this_cpu_ksoftirqd())
> > 
> > Noob question. Why for SOFTIRQs do we do softirq_count() & *SOFTIRQ_OFFSET*? It
> > seems we're in-softirq only if the count is odd numbered.
> > 
> > /me tries to dig more
> > 
> > Hmm could it be because the softirq count is actually 1 bit and the rest is
> > for SOFTIRQ_DISABLE_OFFSET (BH disabled)?
> 
> Exactly!
> 
> > 
> > IOW, 1 bit is for "we're in softirq context", and the remaining 7 bits are to
> > count BH disable nesting, right?
> > 
> > I guess this would make sense; we don't nest softirq processing AFAIK. But
> > I could be misreading the code too :-)
> 
> You got it right!
> 
> This is commented in softirq.c somewhere:
> 
> /*
>  * preempt_count and SOFTIRQ_OFFSET usage:
>  * - preempt_count is changed by SOFTIRQ_OFFSET on entering or leaving
>  *   softirq processing.
>  * - preempt_count is changed by SOFTIRQ_DISABLE_OFFSET (= 2 * SOFTIRQ_OFFSET)
>  *   on local_bh_disable or local_bh_enable.
>  * This lets us distinguish between whether we are currently processing
>  * softirq and whether we just have bh disabled.
>  */
> 
> But we should elaborate on the fact that, indeed, softirq processing can't
> nest, while softirq disablement can. I should try to send a patch and comment
> more thoroughly on the subtleties of the preempt mask in preempt.h.

Thanks for the info!

> 
> > 
> > >   irqtime_account_delta(irqtime, delta, CPUTIME_SOFTIRQ);
> > >  }
> > >  
> > > @@ -417,11 +419,13 @@ void vtime_task_switch(struct task_struct *prev)
> > >  }
> > >  # endif
> > >  
> > > -void vtime_account_irq(struct task_struct *tsk)
> > > +void vtime_account_irq(struct task_struct *tsk, unsigned int offset)
> > >  {
> > > - if (hardirq_count()) {
> > > + unsigned int pc = preempt_count() - offset;
> > > +
> > > + if (pc & HARDIRQ_OFFSET) {
> > 
> > Shouldn't this be HARDIRQ_MASK like above?
> 
> In the rare cases of nested hardirqs happening with broken drivers, only the
> outer hardirq matters. All the time spent in the inner hardirqs is included
> in the outer one.

Ah I see. The original code was doing hardirq_count(), which apparently wasn't
right either.

Shouldn't it be pc == HARDIRQ_OFFSET then? All odd nest counts will trigger
this otherwise, and IIUC we want this to trigger once on first entry only.

Thanks

--
Qais Yousef


Re: [PATCH 4/5] irqtime: Move irqtime entry accounting after irq offset incrementation

2020-12-29 Thread Frederic Weisbecker
On Mon, Dec 28, 2020 at 02:15:29AM +0000, Qais Yousef wrote:
> Hi Frederic
> 
> On 12/02/20 12:57, Frederic Weisbecker wrote:
> > @@ -66,9 +68,9 @@ void irqtime_account_irq(struct task_struct *curr)
> >  * in that case, so as not to confuse scheduler with a special task
> >  * that do not consume any time, but still wants to run.
> >  */
> > -   if (hardirq_count())
> > +   if (pc & HARDIRQ_MASK)
> > irqtime_account_delta(irqtime, delta, CPUTIME_IRQ);
> > -   else if (in_serving_softirq() && curr != this_cpu_ksoftirqd())
> > +   else if ((pc & SOFTIRQ_OFFSET) && curr != this_cpu_ksoftirqd())
> 
> Noob question. Why for SOFTIRQs do we do softirq_count() & *SOFTIRQ_OFFSET*? It
> seems we're in-softirq only if the count is odd numbered.
> 
> /me tries to dig more
> 
> Hmm could it be because the softirq count is actually 1 bit and the rest is
> for SOFTIRQ_DISABLE_OFFSET (BH disabled)?

Exactly!

> 
> IOW, 1 bit is for "we're in softirq context", and the remaining 7 bits are to
> count BH disable nesting, right?
> 
> I guess this would make sense; we don't nest softirq processing AFAIK. But
> I could be misreading the code too :-)

You got it right!

This is commented in softirq.c somewhere:

/*
 * preempt_count and SOFTIRQ_OFFSET usage:
 * - preempt_count is changed by SOFTIRQ_OFFSET on entering or leaving
 *   softirq processing.
 * - preempt_count is changed by SOFTIRQ_DISABLE_OFFSET (= 2 * SOFTIRQ_OFFSET)
 *   on local_bh_disable or local_bh_enable.
 * This lets us distinguish between whether we are currently processing
 * softirq and whether we just have bh disabled.
 */

But we should elaborate on the fact that, indeed, softirq processing can't nest,
while softirq disablement can. I should try to send a patch and comment more
thoroughly on the subtleties of the preempt mask in preempt.h.
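
To spell it out, a sketch of the resulting layout (the in_*() helpers are the
actual preempt.h definitions; the constant values are shown for illustration):

	/* One bit: set while servicing softirqs, which can't nest */
	#define SOFTIRQ_OFFSET		(1UL << 8)
	/* local_bh_disable() nesting is counted in the bits above it */
	#define SOFTIRQ_DISABLE_OFFSET	(2 * SOFTIRQ_OFFSET)

	/* Servicing a softirq right now? Only the low softirq bit matters. */
	#define in_serving_softirq()	(softirq_count() & SOFTIRQ_OFFSET)
	/* Servicing a softirq OR inside a bh-disabled section? */
	#define in_softirq()		(softirq_count())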

> 
> > irqtime_account_delta(irqtime, delta, CPUTIME_SOFTIRQ);
> >  }
> >  
> > @@ -417,11 +419,13 @@ void vtime_task_switch(struct task_struct *prev)
> >  }
> >  # endif
> >  
> > -void vtime_account_irq(struct task_struct *tsk)
> > +void vtime_account_irq(struct task_struct *tsk, unsigned int offset)
> >  {
> > -   if (hardirq_count()) {
> > +   unsigned int pc = preempt_count() - offset;
> > +
> > +   if (pc & HARDIRQ_OFFSET) {
> 
> Shouldn't this be HARDIRQ_MASK like above?

In the rare cases of nested hardirqs happening with broken drivers, only the
outer hardirq matters. All the time spent in the inner hardirqs is included
in the outer one.

Thanks.

> 
> > vtime_account_hardirq(tsk);
> > -   } else if (in_serving_softirq()) {
> > +   } else if (pc & SOFTIRQ_OFFSET) {
> > vtime_account_softirq(tsk);
> > } else if (!IS_ENABLED(CONFIG_HAVE_VIRT_CPU_ACCOUNTING_IDLE) &&
> >is_idle_task(tsk)) {
> 
> Thanks
> 
> --
> Qais Yousef


Re: [PATCH 4/5] irqtime: Move irqtime entry accounting after irq offset incrementation

2020-12-27 Thread Qais Yousef
Hi Frederic

On 12/02/20 12:57, Frederic Weisbecker wrote:
>  #endif /* _LINUX_KERNEL_VTIME_H */
> diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
> index 02163d4260d7..5f611658eeab 100644
> --- a/kernel/sched/cputime.c
> +++ b/kernel/sched/cputime.c
> @@ -44,12 +44,13 @@ static void irqtime_account_delta(struct irqtime *irqtime, u64 delta,
>  }
>  
>  /*
> - * Called before incrementing preempt_count on {soft,}irq_enter
> + * Called after incrementing preempt_count on {soft,}irq_enter
>   * and before decrementing preempt_count on {soft,}irq_exit.
>   */
> -void irqtime_account_irq(struct task_struct *curr)
> +void irqtime_account_irq(struct task_struct *curr, unsigned int offset)
>  {
> 	struct irqtime *irqtime = this_cpu_ptr(&cpu_irqtime);
> + unsigned int pc;
>   s64 delta;
>   int cpu;
>  
> @@ -59,6 +60,7 @@ void irqtime_account_irq(struct task_struct *curr)
>   cpu = smp_processor_id();
>   delta = sched_clock_cpu(cpu) - irqtime->irq_start_time;
>   irqtime->irq_start_time += delta;
> + pc = preempt_count() - offset;
>  
>   /*
>* We do not account for softirq time from ksoftirqd here.
> @@ -66,9 +68,9 @@ void irqtime_account_irq(struct task_struct *curr)
>* in that case, so as not to confuse scheduler with a special task
>* that do not consume any time, but still wants to run.
>*/
> - if (hardirq_count())
> + if (pc & HARDIRQ_MASK)
>   irqtime_account_delta(irqtime, delta, CPUTIME_IRQ);
> - else if (in_serving_softirq() && curr != this_cpu_ksoftirqd())
> + else if ((pc & SOFTIRQ_OFFSET) && curr != this_cpu_ksoftirqd())

Noob question. Why for SOFTIRQs do we do softirq_count() & *SOFTIRQ_OFFSET*? It
seems we're in-softirq only if the count is odd numbered.

/me tries to dig more

Hmm could it be because the softirq count is actually 1 bit and the rest is
for SOFTIRQ_DISABLE_OFFSET (BH disabled)?

IOW, 1 bit is for "we're in softirq context", and the remaining 7 bits are to
count BH disable nesting, right?

I guess this would make sense; we don't nest softirq processing AFAIK. But
I could be misreading the code too :-)

>   irqtime_account_delta(irqtime, delta, CPUTIME_SOFTIRQ);
>  }
>  
> @@ -417,11 +419,13 @@ void vtime_task_switch(struct task_struct *prev)
>  }
>  # endif
>  
> -void vtime_account_irq(struct task_struct *tsk)
> +void vtime_account_irq(struct task_struct *tsk, unsigned int offset)
>  {
> - if (hardirq_count()) {
> + unsigned int pc = preempt_count() - offset;
> +
> + if (pc & HARDIRQ_OFFSET) {

Shouldn't this be HARDIRQ_MASK like above?

>   vtime_account_hardirq(tsk);
> - } else if (in_serving_softirq()) {
> + } else if (pc & SOFTIRQ_OFFSET) {
>   vtime_account_softirq(tsk);
>   } else if (!IS_ENABLED(CONFIG_HAVE_VIRT_CPU_ACCOUNTING_IDLE) &&
>  is_idle_task(tsk)) {

Thanks

--
Qais Yousef


Re: [PATCH 4/5] irqtime: Move irqtime entry accounting after irq offset incrementation

2020-12-02 Thread Peter Zijlstra
On Wed, Dec 02, 2020 at 12:57:31PM +0100, Frederic Weisbecker wrote:
> IRQ time entry is currently accounted before HARDIRQ_OFFSET or
> SOFTIRQ_OFFSET are incremented. This is convenient to decide to which
> index the cputime to account is dispatched.
> 
> Unfortunately it prevents tick_irq_enter() from being called under
> HARDIRQ_OFFSET because tick_irq_enter() has to be called before the IRQ
> entry accounting due to the necessary clock catch up. As a result we
> don't benefit from appropriate lockdep coverage on tick_irq_enter().
> 
> To prepare for fixing this, move the IRQ entry cputime accounting after
> the preempt offset is incremented. This requires the cputime dispatch
> code to handle the extra offset.
> 
> Signed-off-by: Frederic Weisbecker 

Acked-by: Peter Zijlstra (Intel) 
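
For reference, the reordering the changelog describes, sketched around the
hardirq entry path (helper names as discussed in this thread; a sketch, not
the final diff):

	/* Before: accounting ran on the old, pre-increment preempt_count */
	account_irq_enter_time(current);
	preempt_count_add(HARDIRQ_OFFSET);

	/*
	 * After: the offset is already in, so the accounting helper
	 * subtracts it internally and pc = preempt_count() - HARDIRQ_OFFSET
	 * still describes the interrupted context.
	 */
	preempt_count_add(HARDIRQ_OFFSET);
	account_hardirq_enter_time(current);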


Re: [PATCH 4/5] irqtime: Move irqtime entry accounting after irq offset incrementation

2020-12-01 Thread Thomas Gleixner
On Tue, Dec 01 2020 at 16:01, Peter Zijlstra wrote:
> On Tue, Dec 01, 2020 at 03:35:45PM +0100, Frederic Weisbecker wrote:
>> And that one too makes things simple. But note that
>> 
>> account_hardirq_enter_time()
>> 
>> will still need some preempt count checks to see if
>> this is a nested hardirq, a hardirq interrupting a softirq
>> or a hardirq interrupting a task.
>
> So the current tests get that all correct in a single function.
> Splitting it out will just result in more lines to get wrong.
>
> That is, I don't think you can do it saner than:
>
>   account_softirq_enter() := irqtime_account_irq(curr, SOFTIRQ_OFFSET);
>   account_softirq_exit()  := irqtime_account_irq(curr, 0);
>   account_hardirq_enter() := irqtime_account_irq(curr, HARDIRQ_OFFSET);
>   account_hardirq_exit()  := irqtime_account_irq(curr, 0);
>
> Fundamentally you have to determine the previous context to determine
> where to account the delta to. Note that when the previous context is
> task context we throw away the delta.

Fair enough.


Re: [PATCH 4/5] irqtime: Move irqtime entry accounting after irq offset incrementation

2020-12-01 Thread Peter Zijlstra
On Tue, Dec 01, 2020 at 03:35:45PM +0100, Frederic Weisbecker wrote:
> And that one too makes things simple. But note that
> 
> account_hardirq_enter_time()
> 
> will still need some preempt count checks to see if
> this is a nested hardirq, a hardirq interrupting a softirq
> or a hardirq interrupting a task.

So the current tests get that all correct in a single function.
Splitting it out will just result in more lines to get wrong.

That is, I don't think you can do it saner than:

  account_softirq_enter() := irqtime_account_irq(curr, SOFTIRQ_OFFSET);
  account_softirq_exit()  := irqtime_account_irq(curr, 0);
  account_hardirq_enter() := irqtime_account_irq(curr, HARDIRQ_OFFSET);
  account_hardirq_exit()  := irqtime_account_irq(curr, 0);

Fundamentally you have to determine the previous context to determine
where to account the delta to. Note that when the previous context is
task context we throw away the delta.
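
To illustrate, a hypothetical trace of the subtraction (a sketch using the
mapping above, not code from the patch):

	/*
	 * softirq entry: irqtime_account_irq() runs after preempt_count
	 * was raised by SOFTIRQ_OFFSET, so subtracting the offset recovers
	 * the interrupted context:
	 *
	 *	pc = preempt_count() - SOFTIRQ_OFFSET;
	 *	pc == 0: previous context was the task -> delta thrown away
	 *
	 * softirq exit: runs before the decrement, with offset == 0:
	 *
	 *	pc = preempt_count() - 0;
	 *	pc & SOFTIRQ_OFFSET: the delta was spent in the softirq
	 *	-> accounted as CPUTIME_SOFTIRQ (unless curr is ksoftirqd)
	 */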


Re: [PATCH 4/5] irqtime: Move irqtime entry accounting after irq offset incrementation

2020-12-01 Thread Frederic Weisbecker
On Tue, Dec 01, 2020 at 02:34:49PM +0100, Thomas Gleixner wrote:
> On Tue, Dec 01 2020 at 12:40, Frederic Weisbecker wrote:
> > On Tue, Dec 01, 2020 at 12:33:26PM +0100, Thomas Gleixner wrote:
> >> >  /*
> >> >   * We do not account for softirq time from ksoftirqd here.
> >> >   * We want to continue accounting softirq time to ksoftirqd thread
> >> >   * in that case, so as not to confuse scheduler with a special task
> >> >   * that do not consume any time, but still wants to run.
> >> >   */
> >> >  if (pc & HARDIRQ_MASK)
> >> >  irqtime_account_delta(irqtime, delta, CPUTIME_IRQ);
> >> >  else if ((pc & SOFTIRQ_OFFSET) && curr != this_cpu_ksoftirqd())
> >> >  irqtime_account_delta(irqtime, delta, CPUTIME_SOFTIRQ);
> >> > }
> >> 
> >> Why not making all of this explicit instead of these conditionals?
> >
> > Hmm, I'm not sure I get what you suggest?
> 
> Instead of playing games with preeempt count and offsets and checking
> for ksoftirqd, can't you just have:
> 
> account_hardirqtime()
> account_softirqtime()
> 
> and call them from the right spots. See the below for illustration (it's
> obviously incomplete).
> 
> Thanks,
> 
> tglx
> ---
> --- a/kernel/softirq.c
> +++ b/kernel/softirq.c
> @@ -377,6 +377,7 @@ static inline void invoke_softirq(void)
>   return;
>  
>   if (!force_irqthreads) {
> + account_softirq_enter_time(current);
>  #ifdef CONFIG_HAVE_IRQ_EXIT_ON_IRQ_STACK
>   /*
>* We can safely execute softirq on the current stack if
> @@ -391,6 +392,7 @@ static inline void invoke_softirq(void)
>* to prevent from any overrun.
>*/
>   do_softirq_own_stack();
> + account_softirq_exit_time(current);

Indeed for the softirq part it simplifies things.


>  #endif
>   } else {
>   wakeup_softirqd();
> @@ -417,7 +419,7 @@ static inline void __irq_exit_rcu(void)
>  #else
>   lockdep_assert_irqs_disabled();
>  #endif
> - account_irq_exit_time(current);
> + account_hardirq_exit_time(current);

And that one too makes things simple. But note that

account_hardirq_enter_time()

will still need some preempt count checks to see if
this is a nested hardirq, a hardirq interrupting a softirq
or a hardirq interrupting a task.

But I think it's a win in the end. I'll try that.

Thanks.

>   preempt_count_sub(HARDIRQ_OFFSET);
>   if (!in_interrupt() && local_softirq_pending())
>   invoke_softirq();


Re: [PATCH 4/5] irqtime: Move irqtime entry accounting after irq offset incrementation

2020-12-01 Thread Thomas Gleixner
On Tue, Dec 01 2020 at 12:40, Frederic Weisbecker wrote:
> On Tue, Dec 01, 2020 at 12:33:26PM +0100, Thomas Gleixner wrote:
>> >/*
>> > * We do not account for softirq time from ksoftirqd here.
>> > * We want to continue accounting softirq time to ksoftirqd thread
>> > * in that case, so as not to confuse scheduler with a special task
>> > * that do not consume any time, but still wants to run.
>> > */
>> >if (pc & HARDIRQ_MASK)
>> >irqtime_account_delta(irqtime, delta, CPUTIME_IRQ);
>> >else if ((pc & SOFTIRQ_OFFSET) && curr != this_cpu_ksoftirqd())
>> >irqtime_account_delta(irqtime, delta, CPUTIME_SOFTIRQ);
>> > }
>> 
>> Why not making all of this explicit instead of these conditionals?
>
> Hmm, I'm not sure I get what you suggest?

Instead of playing games with preempt count and offsets and checking
for ksoftirqd, can't you just have:

account_hardirqtime()
account_softirqtime()

and call them from the right spots. See the below for illustration (it's
obviously incomplete).

Thanks,

tglx
---
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -377,6 +377,7 @@ static inline void invoke_softirq(void)
return;
 
if (!force_irqthreads) {
+   account_softirq_enter_time(current);
 #ifdef CONFIG_HAVE_IRQ_EXIT_ON_IRQ_STACK
/*
 * We can safely execute softirq on the current stack if
@@ -391,6 +392,7 @@ static inline void invoke_softirq(void)
 * to prevent from any overrun.
 */
do_softirq_own_stack();
+   account_softirq_exit_time(current);
 #endif
} else {
wakeup_softirqd();
@@ -417,7 +419,7 @@ static inline void __irq_exit_rcu(void)
 #else
lockdep_assert_irqs_disabled();
 #endif
-   account_irq_exit_time(current);
+   account_hardirq_exit_time(current);
preempt_count_sub(HARDIRQ_OFFSET);
if (!in_interrupt() && local_softirq_pending())
invoke_softirq();


Re: [PATCH 4/5] irqtime: Move irqtime entry accounting after irq offset incrementation

2020-12-01 Thread Frederic Weisbecker
On Tue, Dec 01, 2020 at 12:33:26PM +0100, Thomas Gleixner wrote:
> On Tue, Dec 01 2020 at 10:20, Peter Zijlstra wrote:
> > On Tue, Dec 01, 2020 at 01:12:25AM +0100, Frederic Weisbecker wrote:
> > Why not something like:
> >
> > void irqtime_account_irq(struct task_struct *curr, unsigned int offset)
> > {
> > struct irqtime *irqtime = this_cpu_ptr(&cpu_irqtime);
> > unsigned int pc = preempt_count() - offset;
> > s64 delta;
> > int cpu;
> >
> > if (!sched_clock_irqtime)
> > return;
> >
> > cpu = smp_processor_id();
> > delta = sched_clock_cpu(cpu) - irqtime->irq_start_time;
> > irqtime->irq_start_time += delta;
> >
> > /*
> >  * We do not account for softirq time from ksoftirqd here.
> >  * We want to continue accounting softirq time to ksoftirqd thread
> >  * in that case, so as not to confuse scheduler with a special task
> >  * that do not consume any time, but still wants to run.
> >  */
> > if (pc & HARDIRQ_MASK)
> > irqtime_account_delta(irqtime, delta, CPUTIME_IRQ);
> > else if ((pc & SOFTIRQ_OFFSET) && curr != this_cpu_ksoftirqd())
> > irqtime_account_delta(irqtime, delta, CPUTIME_SOFTIRQ);
> > }
> 
> Why not making all of this explicit instead of these conditionals?

Hmm, I'm not sure I get what you suggest?

Thanks.


Re: [PATCH 4/5] irqtime: Move irqtime entry accounting after irq offset incrementation

2020-12-01 Thread Thomas Gleixner
On Tue, Dec 01 2020 at 10:20, Peter Zijlstra wrote:
> On Tue, Dec 01, 2020 at 01:12:25AM +0100, Frederic Weisbecker wrote:
> Why not something like:
>
> void irqtime_account_irq(struct task_struct *curr, unsigned int offset)
> {
> 	struct irqtime *irqtime = this_cpu_ptr(&cpu_irqtime);
>   unsigned int pc = preempt_count() - offset;
>   s64 delta;
>   int cpu;
>
>   if (!sched_clock_irqtime)
>   return;
>
>   cpu = smp_processor_id();
>   delta = sched_clock_cpu(cpu) - irqtime->irq_start_time;
>   irqtime->irq_start_time += delta;
>
>   /*
>* We do not account for softirq time from ksoftirqd here.
>* We want to continue accounting softirq time to ksoftirqd thread
>* in that case, so as not to confuse scheduler with a special task
>* that do not consume any time, but still wants to run.
>*/
>   if (pc & HARDIRQ_MASK)
>   irqtime_account_delta(irqtime, delta, CPUTIME_IRQ);
>   else if ((pc & SOFTIRQ_OFFSET) && curr != this_cpu_ksoftirqd())
>   irqtime_account_delta(irqtime, delta, CPUTIME_SOFTIRQ);
> }

Why not making all of this explicit instead of these conditionals?

Thanks,

tglx


Re: [PATCH 4/5] irqtime: Move irqtime entry accounting after irq offset incrementation

2020-12-01 Thread Frederic Weisbecker
On Tue, Dec 01, 2020 at 10:20:11AM +0100, Peter Zijlstra wrote:
> On Tue, Dec 01, 2020 at 01:12:25AM +0100, Frederic Weisbecker wrote:
> > +static s64 irqtime_get_delta(struct irqtime *irqtime)
> >  {
> > +   int cpu = smp_processor_id();
> > s64 delta;
> >  
> > delta = sched_clock_cpu(cpu) - irqtime->irq_start_time;
> > irqtime->irq_start_time += delta;
> >  
> > +   return delta;
> > +}
> > +
> > +/* Called after incrementing preempt_count on {soft,}irq_enter */
> > +void irqtime_account_enter(struct task_struct *curr)
> > +{
> > +   struct irqtime *irqtime = this_cpu_ptr(&cpu_irqtime);
> > +   u64 delta;
> > +
> > +   if (!sched_clock_irqtime)
> > +   return;
> > +
> > +   delta = irqtime_get_delta(irqtime);
> > +   /*
> > +* We do not account for softirq time from ksoftirqd here.
> > +* We want to continue accounting softirq time to ksoftirqd thread
> > +* in that case, so as not to confuse scheduler with a special task
> > +* that do not consume any time, but still wants to run.
> > +*/
> > +   if ((irq_count() == (SOFTIRQ_OFFSET | HARDIRQ_OFFSET)) &&
> > +   curr != this_cpu_ksoftirqd())
> > +   irqtime_account_delta(irqtime, delta, CPUTIME_SOFTIRQ);
> > +}
> > +
> > +/* Called before decrementing preempt_count on {soft,}irq_exit */
> > +void irqtime_account_exit(struct task_struct *curr)
> > +{
> > +   struct irqtime *irqtime = this_cpu_ptr(&cpu_irqtime);
> > +   u64 delta;
> > +
> > +   if (!sched_clock_irqtime)
> > +   return;
> > +
> > +   delta = irqtime_get_delta(irqtime);
> > /*
> >  * We do not account for softirq time from ksoftirqd here.
> >  * We want to continue accounting softirq time to ksoftirqd thread
> 
> 
> Urgh...
> 
> 
> Why not something like:
> 
> void irqtime_account_irq(struct task_struct *curr, unsigned int offset)
> {
> 	struct irqtime *irqtime = this_cpu_ptr(&cpu_irqtime);
>   unsigned int pc = preempt_count() - offset;
>   s64 delta;
>   int cpu;
> 
>   if (!sched_clock_irqtime)
>   return;
> 
>   cpu = smp_processor_id();
>   delta = sched_clock_cpu(cpu) - irqtime->irq_start_time;
>   irqtime->irq_start_time += delta;
> 
>   /*
>* We do not account for softirq time from ksoftirqd here.
>* We want to continue accounting softirq time to ksoftirqd thread
>* in that case, so as not to confuse scheduler with a special task
>* that do not consume any time, but still wants to run.
>*/
>   if (pc & HARDIRQ_MASK)
>   irqtime_account_delta(irqtime, delta, CPUTIME_IRQ);
>   else if ((pc & SOFTIRQ_OFFSET) && curr != this_cpu_ksoftirqd())
>   irqtime_account_delta(irqtime, delta, CPUTIME_SOFTIRQ);
> }
> 
> 

Right that's better, I'll do that.

Thanks.


Re: [PATCH 4/5] irqtime: Move irqtime entry accounting after irq offset incrementation

2020-12-01 Thread Peter Zijlstra
On Tue, Dec 01, 2020 at 01:12:25AM +0100, Frederic Weisbecker wrote:
> +static s64 irqtime_get_delta(struct irqtime *irqtime)
>  {
> + int cpu = smp_processor_id();
>   s64 delta;
>  
>   delta = sched_clock_cpu(cpu) - irqtime->irq_start_time;
>   irqtime->irq_start_time += delta;
>  
> + return delta;
> +}
> +
> +/* Called after incrementing preempt_count on {soft,}irq_enter */
> +void irqtime_account_enter(struct task_struct *curr)
> +{
> + struct irqtime *irqtime = this_cpu_ptr(&cpu_irqtime);
> + u64 delta;
> +
> + if (!sched_clock_irqtime)
> + return;
> +
> + delta = irqtime_get_delta(irqtime);
> + /*
> +  * We do not account for softirq time from ksoftirqd here.
> +  * We want to continue accounting softirq time to ksoftirqd thread
> +  * in that case, so as not to confuse scheduler with a special task
> +  * that do not consume any time, but still wants to run.
> +  */
> + if ((irq_count() == (SOFTIRQ_OFFSET | HARDIRQ_OFFSET)) &&
> + curr != this_cpu_ksoftirqd())
> + irqtime_account_delta(irqtime, delta, CPUTIME_SOFTIRQ);
> +}
> +
> +/* Called before decrementing preempt_count on {soft,}irq_exit */
> +void irqtime_account_exit(struct task_struct *curr)
> +{
> + struct irqtime *irqtime = this_cpu_ptr(&cpu_irqtime);
> + u64 delta;
> +
> + if (!sched_clock_irqtime)
> + return;
> +
> + delta = irqtime_get_delta(irqtime);
>   /*
>* We do not account for softirq time from ksoftirqd here.
>* We want to continue accounting softirq time to ksoftirqd thread


Urgh...


Why not something like:

void irqtime_account_irq(struct task_struct *curr, unsigned int offset)
{
struct irqtime *irqtime = this_cpu_ptr(&cpu_irqtime);
unsigned int pc = preempt_count() - offset;
s64 delta;
int cpu;

if (!sched_clock_irqtime)
return;

cpu = smp_processor_id();
delta = sched_clock_cpu(cpu) - irqtime->irq_start_time;
irqtime->irq_start_time += delta;

/*
 * We do not account for softirq time from ksoftirqd here.
 * We want to continue accounting softirq time to ksoftirqd thread
 * in that case, so as not to confuse scheduler with a special task
 * that do not consume any time, but still wants to run.
 */
if (pc & HARDIRQ_MASK)
irqtime_account_delta(irqtime, delta, CPUTIME_IRQ);
else if ((pc & SOFTIRQ_OFFSET) && curr != this_cpu_ksoftirqd())
irqtime_account_delta(irqtime, delta, CPUTIME_SOFTIRQ);
}