Re: [PATCH 1/3] sched: define a function to report the number of context switches on a CPU

2019-08-21 Thread Peter Zijlstra
On Wed, Aug 21, 2019 at 08:20:48AM +, Long Li wrote:
> >>>Subject: Re: [PATCH 1/3] sched: define a function to report the number of
> >>>context switches on a CPU
> >>>
> >>>On Mon, Aug 19, 2019 at 11:14:27PM -0700, lon...@linuxonhyperv.com
> >>>wrote:
>  From: Long Li 
> 
>  The number of context switches on a CPU is useful to determine how
>  busy this CPU is processing IRQs. Export this information so it can
>  be used by device drivers.
> >>>
> >>>Please do explain that; because I'm not seeing how number of switches
> >>>relates to processing IRQs _at_all_!
> 
> Some kernel components rely on context switches to make progress, for
> example the watchdog and RCU. On a CPU with a reasonable interrupt
> load, context switches keep happening, normally several per second.

That isn't true; RCU is perfectly fine with a single task always running
and not making context switches, and so is the watchdog.


RE: [PATCH 1/3] sched: define a function to report the number of context switches on a CPU

2019-08-21 Thread Long Li
>>>Subject: Re: [PATCH 1/3] sched: define a function to report the number of
>>>context switches on a CPU
>>>
>>>On Mon, Aug 19, 2019 at 11:14:27PM -0700, lon...@linuxonhyperv.com
>>>wrote:
 From: Long Li 

 The number of context switches on a CPU is useful to determine how
 busy this CPU is processing IRQs. Export this information so it can
 be used by device drivers.
>>>
>>>Please do explain that; because I'm not seeing how number of switches
>>>relates to processing IRQs _at_all_!

Some kernel components rely on context switches to make progress, for example 
the watchdog and RCU. On a CPU with a reasonable interrupt load, context 
switches keep happening, normally several per second.

While observing a CPU under heavy interrupt load, I see that it spends all its 
time in IRQ and softIRQ context and does not get a chance to make a context 
switch (to call __schedule()) for a long time. This results in the system 
becoming unresponsive at times. The purpose is to find out whether a CPU is in 
this state, and to implement some throttling mechanism to help reduce the 
number of interrupts. I think the number of switches is not the most precise 
way to detect this condition, but maybe it's good enough.

I agree this may not be the best way. If you have other ideas on how to detect 
a CPU swamped by interrupts, please point me at where to look.

Thanks

Long




Re: [PATCH 1/3] sched: define a function to report the number of context switches on a CPU

2019-08-20 Thread Peter Zijlstra
On Mon, Aug 19, 2019 at 11:14:27PM -0700, lon...@linuxonhyperv.com wrote:

> +u64 get_cpu_rq_switches(int cpu)
> +{
> + return cpu_rq(cpu)->nr_switches;
> +}
> +EXPORT_SYMBOL_GPL(get_cpu_rq_switches);

Also, that is broken on 32bit.


Re: [PATCH 1/3] sched: define a function to report the number of context switches on a CPU

2019-08-20 Thread Peter Zijlstra
On Mon, Aug 19, 2019 at 11:14:27PM -0700, lon...@linuxonhyperv.com wrote:
> From: Long Li 
> 
> The number of context switches on a CPU is useful to determine how busy this
> CPU is processing IRQs. Export this information so it can be used by device
> drivers.

Please do explain that; because I'm not seeing how number of switches
relates to processing IRQs _at_all_!


[PATCH 1/3] sched: define a function to report the number of context switches on a CPU

2019-08-20 Thread longli
From: Long Li 

The number of context switches on a CPU is useful to determine how busy this
CPU is processing IRQs. Export this information so it can be used by device
drivers.

Signed-off-by: Long Li 
---
 include/linux/sched.h | 1 +
 kernel/sched/core.c   | 6 ++
 2 files changed, 7 insertions(+)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 9b35aff09f70..575f1ef7b159 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1500,6 +1500,7 @@ current_restore_flags(unsigned long orig_flags, unsigned long flags)
 
 extern int cpuset_cpumask_can_shrink(const struct cpumask *cur, const struct cpumask *trial);
 extern int task_can_attach(struct task_struct *p, const struct cpumask *cs_cpus_allowed);
+extern u64 get_cpu_rq_switches(int cpu);
 #ifdef CONFIG_SMP
 extern void do_set_cpus_allowed(struct task_struct *p, const struct cpumask *new_mask);
 extern int set_cpus_allowed_ptr(struct task_struct *p, const struct cpumask *new_mask);
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 4a8e7207cafa..1a76f0e97c2d 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1143,6 +1143,12 @@ int set_cpus_allowed_ptr(struct task_struct *p, const struct cpumask *new_mask)
 }
 EXPORT_SYMBOL_GPL(set_cpus_allowed_ptr);
 
+u64 get_cpu_rq_switches(int cpu)
+{
+   return cpu_rq(cpu)->nr_switches;
+}
+EXPORT_SYMBOL_GPL(get_cpu_rq_switches);
+
 void set_task_cpu(struct task_struct *p, unsigned int new_cpu)
 {
 #ifdef CONFIG_SCHED_DEBUG
-- 
2.17.1