Re: [PATCH v5 00/45] CPU hotplug: stop_machine()-free CPU hotplug
On 18 February 2013 20:53, Steven Rostedt wrote:
> On Mon, 2013-02-18 at 17:50 +0100, Vincent Guittot wrote:
>
>> yes for sure.
>> The problem is more linked to cpuidle and the function tracer.
>>
>> cpu hotplug and function tracer work when cpuidle is disabled.
>> cpu hotplug and cpuidle work if I don't enable the function tracer.
>> My platform is dead as soon as I enable the function tracer if cpuidle
>> is enabled. It looks like some notrace annotations are missing in my
>> platform driver, but we haven't completely fixed the issue yet.
>
> You can bisect to find out exactly which function is the problem:
>
> cat /debug/tracing/available_filter_functions > t
>
> f(t) {
>     num=`wc -l < t`
>     let num=num/2
>     sed -ne "1,${num}p" t > t1
>     let num=num+1
>     sed -ne "${num},\$p" t > t2
>
>     cat t1 > /debug/tracing/set_ftrace_filter
>     # note this may take a long time to finish
>
>     echo function > /debug/tracing/current_tracer
>
>     # failed? bisect f(t1), if not bisect f(t2)
> }

Thanks, I'm going to have a look

Vincent

> -- Steve

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
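Steve's halving procedure can be scripted so each round of the bisection is mechanical. The sketch below is illustrative, not part of the thread: the `TRACING` mount point and helper names are assumptions (the debugfs tracing directory is commonly `/sys/kernel/debug/tracing`, or `/debug/tracing` as in Steve's mail), and the debugfs writes are kept in a commented usage note since they require root and a rebooting test cycle.

```shell
#!/bin/sh
# Sketch of the bisection idea from the thread: split the candidate
# function list in half, enable function tracing for one half, and see
# whether the machine survives. Names/paths here are assumptions.
TRACING=${TRACING:-/sys/kernel/debug/tracing}

# Split candidate list $1 into a lower half $1.lo and an upper half $1.hi
split_half() {
    total=$(wc -l < "$1")
    half=$((total / 2))
    sed -n "1,${half}p" "$1" > "$1.lo"
    sed -n "$((half + 1)),\$p" "$1" > "$1.hi"
}

# Apply one half as the ftrace filter and start the function tracer.
# If the platform dies, the culprit is in this half; otherwise bisect
# the other half on the next boot.
try_half() {
    cat "$1" > "$TRACING/set_ftrace_filter"   # may take a long time
    echo function > "$TRACING/current_tracer"
}

# Typical first round (root, ftrace enabled):
#   cat "$TRACING/available_filter_functions" > funcs
#   split_half funcs
#   try_half funcs.lo
```

Each surviving round narrows the list by half, so even tens of thousands of candidate functions converge in a few dozen reboots at most.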
Re: [PATCH v5 00/45] CPU hotplug: stop_machine()-free CPU hotplug
On Mon, 2013-02-18 at 17:50 +0100, Vincent Guittot wrote:
> yes for sure.
> The problem is more linked to cpuidle and the function tracer.
>
> cpu hotplug and function tracer work when cpuidle is disabled.
> cpu hotplug and cpuidle work if I don't enable the function tracer.
> My platform is dead as soon as I enable the function tracer if cpuidle
> is enabled. It looks like some notrace annotations are missing in my
> platform driver, but we haven't completely fixed the issue yet.

You can bisect to find out exactly which function is the problem:

cat /debug/tracing/available_filter_functions > t

f(t) {
    num=`wc -l < t`
    let num=num/2
    sed -ne "1,${num}p" t > t1
    let num=num+1
    sed -ne "${num},\$p" t > t2

    cat t1 > /debug/tracing/set_ftrace_filter
    # note this may take a long time to finish

    echo function > /debug/tracing/current_tracer

    # failed? bisect f(t1), if not bisect f(t2)
}

-- Steve
Re: [PATCH v5 00/45] CPU hotplug: stop_machine()-free CPU hotplug
On 18 February 2013 16:30, Steven Rostedt wrote:
> On Mon, 2013-02-18 at 11:58 +0100, Vincent Guittot wrote:
>
>> My tests have been done without cpuidle because I have some issues
>> with the function tracer and cpuidle.
>>
>> But cpu hotplug and cpuidle work well when I run the tests without
>> enabling the function tracer.
>
> I know suspend and resume have issues with function tracing (because it
> makes things like calling smp_processor_id() crash the system), but I'm
> unaware of issues with hotplug itself. Could be some of the same issues.
>
> Can you give me more details? I'll try to investigate it.

yes for sure.
The problem is more linked to cpuidle and the function tracer.

cpu hotplug and function tracer work when cpuidle is disabled.
cpu hotplug and cpuidle work if I don't enable the function tracer.
My platform is dead as soon as I enable the function tracer if cpuidle
is enabled. It looks like some notrace annotations are missing in my
platform driver, but we haven't completely fixed the issue yet.

Vincent

> Thanks,
>
> -- Steve
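Since the hang only appears when cpuidle and the function tracer are enabled together, one way to narrow it down (not discussed in the thread, just a debugging sketch) is to disable the deeper cpuidle states one at a time via sysfs and re-try the tracer after each step. This assumes a kernel that exposes the per-state `disable` attribute; `CPUIDLE_ROOT` is parameterized here only so the walk can be exercised off-target, and on a real system it would be `/sys/devices/system/cpu`.

```shell
#!/bin/sh
# Sketch: disable every cpuidle state deeper than a given index on all
# CPUs, to find which idle state interacts badly with function tracing.
# CPUIDLE_ROOT and the helper name are assumptions for illustration.
CPUIDLE_ROOT=${CPUIDLE_ROOT:-/sys/devices/system/cpu}

# Disable every cpuidle state with an index greater than $1.
disable_deep_states() {
    min_state=$1
    for state in "$CPUIDLE_ROOT"/cpu[0-9]*/cpuidle/state[0-9]*; do
        [ -d "$state" ] || continue
        idx=${state##*state}                 # trailing state number
        if [ "$idx" -gt "$min_state" ] && [ -w "$state/disable" ]; then
            echo 1 > "$state/disable"
        fi
    done
}

# e.g. keep only the shallowest state, then re-enable function tracer:
#   disable_deep_states 0
```

If the platform survives with deep states disabled, the missing notrace annotations are likely in the low-level idle entry path of the platform driver.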
Re: [PATCH v5 00/45] CPU hotplug: stop_machine()-free CPU hotplug
On Mon, 2013-02-18 at 11:58 +0100, Vincent Guittot wrote:
> My tests have been done without cpuidle because I have some issues
> with the function tracer and cpuidle.
>
> But cpu hotplug and cpuidle work well when I run the tests without
> enabling the function tracer.

I know suspend and resume have issues with function tracing (because it
makes things like calling smp_processor_id() crash the system), but I'm
unaware of issues with hotplug itself. Could be some of the same issues.

Can you give me more details? I'll try to investigate it.

Thanks,

-- Steve
Re: [PATCH v5 00/45] CPU hotplug: stop_machine()-free CPU hotplug
On 02/18/2013 04:24 PM, Thomas Gleixner wrote:
> On Mon, 18 Feb 2013, Srivatsa S. Bhat wrote:
>> Lockup observed while running this patchset, with CPU_IDLE and
>> INTEL_IDLE turned on in the .config:
>>
>> smpboot: CPU 1 is now offline
>> Kernel panic - not syncing: Watchdog detected hard LOCKUP on cpu 11
>> Pid: 0, comm: swapper/11 Not tainted 3.8.0-rc7+stpmch13-1 #8
>> Call Trace:
>>  [] do_raw_spin_lock+0x7e/0x150
>>  [] _raw_spin_lock_irqsave+0x61/0x70
>>  [] ? clockevents_notify+0x28/0x150
>>  [] ? _raw_spin_unlock_irqrestore+0x77/0x80
>>  [] clockevents_notify+0x28/0x150
>>  [] intel_idle+0xaf/0xe0
>>  [] ? disable_cpuidle+0x20/0x20
>>  [] cpuidle_enter+0x19/0x20
>>  [] cpuidle_wrap_enter+0x41/0xa0
>>  [] cpuidle_enter_tk+0x10/0x20
>>  [] cpuidle_enter_state+0x17/0x50
>>  [] cpuidle_idle_call+0xd9/0x290
>>  [] cpu_idle+0xe5/0x140
>>  [] start_secondary+0xdd/0xdf
>>
>> BUG: spinlock lockup suspected on CPU#2, migration/2/19
>>  lock: clockevents_lock+0x0/0x40, .magic: dead4ead, .owner: swapper/8/0,
>>  .owner_cpu: 8
>
> Unfortunately there is no back trace for cpu8.

Yes :-( I had run this several times hoping to get a backtrace on the
lock-holder, expecting trigger_all_cpu_backtrace() to get it right at
least once. But I hadn't succeeded even once.

> That's probably caused by the watchdog -> panic setting.

Oh, ok..

> So we have no idea why cpu2 and 11 get stuck on the clockevents_lock
> and without that information it's impossible to decode.

But thankfully, the issue seems to have been resolved by the diff I
posted in my previous mail, along with the fixes related to memory
barriers.

Regards,
Srivatsa S. Bhat
Re: [PATCH v5 00/45] CPU hotplug: stop_machine()-free CPU hotplug
On 18 February 2013 11:51, Srivatsa S. Bhat wrote:
> On 02/18/2013 04:04 PM, Srivatsa S. Bhat wrote:
>> On 02/18/2013 03:54 PM, Vincent Guittot wrote:
>>> On 15 February 2013 20:40, Srivatsa S. Bhat wrote:
>>>> Hi Vincent,
>>>>
>>>> On 02/15/2013 06:58 PM, Vincent Guittot wrote:
>>>>> Hi Srivatsa,
>>>>>
>>>>> I have run some tests with your branch (thanks Paul for the git
>>>>> tree) and you will find results below.
>>>>
>>>> Thank you very much for testing this patchset!
>>>>
>>>>> The test conditions are:
>>>>> - 5-CPU system in 2 clusters
>>>>> - The test plugs/unplugs CPU2 and increases the system load every
>>>>>   20 plug/unplug sequences with more cyclictest threads
>>>>> - The test is done with all CPUs online and with only CPU0 and CPU2
>>>>>
>>>>> The main conclusion is that there is no difference with and without
>>>>> your patches with my stress tests. I'm not sure that was the
>>>>> expected result, but the cpu_down is already quite low: 4-5ms on
>>>>> average.
>>>>
>>>> At least my patchset doesn't perform _worse_ than mainline, with
>>>> respect to cpu_down duration :-)
>>>
>>> yes exactly, and it has passed more than 400 consecutive plug/unplug
>>> cycles on an ARM platform
>>
>> Great! However, did you turn on CPU_IDLE during your tests?
>>
>> In my tests, I had turned off cpu idle in the .config, like I had
>> mentioned in the cover letter. I'm struggling to get it working with
>> CPU_IDLE/INTEL_IDLE turned on, because it gets into a lockup almost
>> immediately. It appears that the lock-holder of clockevents_lock never
>> releases it, for some reason.. See below for the full log. Lockdep has
>> not been useful in debugging this, unfortunately :-(
>
> Ah, nevermind, the following diff fixes it :-) I had applied this fix
> on v5 and tested, but it still had races where I used to hit the
> lockups. Now, after I fixed all the memory barrier issues that Paul
> and Oleg pointed out in v5, I applied this fix again and tested it
> just now - it works beautifully! :-)

My tests have been done without cpuidle because I have some issues with
the function tracer and cpuidle.

But cpu hotplug and cpuidle work well when I run the tests without
enabling the function tracer.

Vincent

> I'll include this fix and post a v6 soon.
>
> Regards,
> Srivatsa S. Bhat
>
> --->
>
> diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c
> index 30b6de0..ca340fd 100644
> --- a/kernel/time/clockevents.c
> +++ b/kernel/time/clockevents.c
> @@ -17,6 +17,7 @@
>  #include <linux/init.h>
>  #include <linux/module.h>
>  #include <linux/smp.h>
> +#include <linux/cpu.h>
>
>  #include "tick-internal.h"
>
> @@ -431,6 +432,7 @@ void clockevents_notify(unsigned long reason, void *arg)
>  	unsigned long flags;
>  	int cpu;
>
> +	get_online_cpus_atomic();
>  	raw_spin_lock_irqsave(&clockevents_lock, flags);
>  	clockevents_do_notify(reason, arg);
>
> @@ -459,6 +461,7 @@ void clockevents_notify(unsigned long reason, void *arg)
>  		break;
>  	}
>  	raw_spin_unlock_irqrestore(&clockevents_lock, flags);
> +	put_online_cpus_atomic();
>  }
>  EXPORT_SYMBOL_GPL(clockevents_notify);
>  #endif
Re: [PATCH v5 00/45] CPU hotplug: stop_machine()-free CPU hotplug
On Mon, 18 Feb 2013, Srivatsa S. Bhat wrote:
> Lockup observed while running this patchset, with CPU_IDLE and
> INTEL_IDLE turned on in the .config:
>
> smpboot: CPU 1 is now offline
> Kernel panic - not syncing: Watchdog detected hard LOCKUP on cpu 11
> Pid: 0, comm: swapper/11 Not tainted 3.8.0-rc7+stpmch13-1 #8
> Call Trace:
>  [] do_raw_spin_lock+0x7e/0x150
>  [] _raw_spin_lock_irqsave+0x61/0x70
>  [] ? clockevents_notify+0x28/0x150
>  [] ? _raw_spin_unlock_irqrestore+0x77/0x80
>  [] clockevents_notify+0x28/0x150
>  [] intel_idle+0xaf/0xe0
>  [] ? disable_cpuidle+0x20/0x20
>  [] cpuidle_enter+0x19/0x20
>  [] cpuidle_wrap_enter+0x41/0xa0
>  [] cpuidle_enter_tk+0x10/0x20
>  [] cpuidle_enter_state+0x17/0x50
>  [] cpuidle_idle_call+0xd9/0x290
>  [] cpu_idle+0xe5/0x140
>  [] start_secondary+0xdd/0xdf
>
> BUG: spinlock lockup suspected on CPU#2, migration/2/19
>  lock: clockevents_lock+0x0/0x40, .magic: dead4ead, .owner: swapper/8/0,
>  .owner_cpu: 8

Unfortunately there is no back trace for cpu8. That's probably caused
by the watchdog -> panic setting.

So we have no idea why cpu2 and 11 get stuck on the clockevents_lock,
and without that information it's impossible to decode.

Thanks,

	tglx
Re: [PATCH v5 00/45] CPU hotplug: stop_machine()-free CPU hotplug
On 02/18/2013 04:04 PM, Srivatsa S. Bhat wrote:
> On 02/18/2013 03:54 PM, Vincent Guittot wrote:
>> On 15 February 2013 20:40, Srivatsa S. Bhat wrote:
>>> Hi Vincent,
>>>
>>> On 02/15/2013 06:58 PM, Vincent Guittot wrote:
>>>> Hi Srivatsa,
>>>>
>>>> I have run some tests with your branch (thanks Paul for the git
>>>> tree) and you will find results below.
>>>
>>> Thank you very much for testing this patchset!
>>>
>>>> The test conditions are:
>>>> - 5-CPU system in 2 clusters
>>>> - The test plugs/unplugs CPU2 and increases the system load every
>>>>   20 plug/unplug sequences with more cyclictest threads
>>>> - The test is done with all CPUs online and with only CPU0 and CPU2
>>>>
>>>> The main conclusion is that there is no difference with and without
>>>> your patches with my stress tests. I'm not sure that was the
>>>> expected result, but the cpu_down is already quite low: 4-5ms on
>>>> average.
>>>
>>> At least my patchset doesn't perform _worse_ than mainline, with
>>> respect to cpu_down duration :-)
>>
>> yes exactly, and it has passed more than 400 consecutive plug/unplug
>> cycles on an ARM platform
>
> Great! However, did you turn on CPU_IDLE during your tests?
>
> In my tests, I had turned off cpu idle in the .config, like I had
> mentioned in the cover letter. I'm struggling to get it working with
> CPU_IDLE/INTEL_IDLE turned on, because it gets into a lockup almost
> immediately. It appears that the lock-holder of clockevents_lock never
> releases it, for some reason.. See below for the full log. Lockdep has
> not been useful in debugging this, unfortunately :-(

Ah, nevermind, the following diff fixes it :-) I had applied this fix on
v5 and tested, but it still had races where I used to hit the lockups.
Now, after I fixed all the memory barrier issues that Paul and Oleg
pointed out in v5, I applied this fix again and tested it just now - it
works beautifully! :-)

I'll include this fix and post a v6 soon.

Regards,
Srivatsa S. Bhat

--->

diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c
index 30b6de0..ca340fd 100644
--- a/kernel/time/clockevents.c
+++ b/kernel/time/clockevents.c
@@ -17,6 +17,7 @@
 #include <linux/init.h>
 #include <linux/module.h>
 #include <linux/smp.h>
+#include <linux/cpu.h>
 
 #include "tick-internal.h"
 
@@ -431,6 +432,7 @@ void clockevents_notify(unsigned long reason, void *arg)
 	unsigned long flags;
 	int cpu;
 
+	get_online_cpus_atomic();
 	raw_spin_lock_irqsave(&clockevents_lock, flags);
 	clockevents_do_notify(reason, arg);
 
@@ -459,6 +461,7 @@ void clockevents_notify(unsigned long reason, void *arg)
 		break;
 	}
 	raw_spin_unlock_irqrestore(&clockevents_lock, flags);
+	put_online_cpus_atomic();
 }
 EXPORT_SYMBOL_GPL(clockevents_notify);
 #endif
Re: [PATCH v5 00/45] CPU hotplug: stop_machine()-free CPU hotplug
On 02/18/2013 03:54 PM, Vincent Guittot wrote:
> On 15 February 2013 20:40, Srivatsa S. Bhat wrote:
>> Hi Vincent,
>>
>> On 02/15/2013 06:58 PM, Vincent Guittot wrote:
>>> Hi Srivatsa,
>>>
>>> I have run some tests with your branch (thanks Paul for the git tree)
>>> and you will find results below.
>>
>> Thank you very much for testing this patchset!
>>
>>> The test conditions are:
>>> - 5-CPU system in 2 clusters
>>> - The test plugs/unplugs CPU2 and increases the system load every 20
>>>   plug/unplug sequences with more cyclictest threads
>>> - The test is done with all CPUs online and with only CPU0 and CPU2
>>>
>>> The main conclusion is that there is no difference with and without
>>> your patches with my stress tests. I'm not sure that was the expected
>>> result, but the cpu_down is already quite low: 4-5ms on average.
>>
>> At least my patchset doesn't perform _worse_ than mainline, with
>> respect to cpu_down duration :-)
>
> yes exactly, and it has passed more than 400 consecutive plug/unplug
> cycles on an ARM platform

Great! However, did you turn on CPU_IDLE during your tests?

In my tests, I had turned off cpu idle in the .config, like I had
mentioned in the cover letter. I'm struggling to get it working with
CPU_IDLE/INTEL_IDLE turned on, because it gets into a lockup almost
immediately. It appears that the lock-holder of clockevents_lock never
releases it, for some reason.. See below for the full log. Lockdep has
not been useful in debugging this, unfortunately :-(

>> So, here is the analysis:
>> Stop-machine doesn't really slow down the CPU-down operation if the
>> rest of the CPUs are mostly running in userspace all the time,
>> because CPUs running userspace workloads cooperate very eagerly with
>> the stop-machine dance - they receive the resched IPI and allow the
>> per-cpu cpu-stopper thread to monopolize the CPU almost immediately.
>>
>> The scenario where stop_machine() takes longer to take effect is when
>> most of the online CPUs are running in kernelspace, because then the
>> probability that they call preempt_disable() frequently (and hence
>> inhibit stop-machine) is higher. That's why, in my tests, I ran
>> genload from LTP, which generated a lot of system time (system time
>> in 'top' indicates activity in kernelspace). Hence my patchset showed
>> significant improvement over mainline in my tests.
>
> ok, I hadn't noticed this important point for the test
>
>> However, your test is very useful too, if we measure a different
>> parameter: the latency impact on the workloads running on the system
>> (cyclictest). One other important aim of this patchset is to make
>> hotplug as unintrusive as possible for other workloads running on the
>> system. So if you measure the cyclictest numbers, I would expect my
>> patchset to show better numbers than mainline when you do cpu-hotplug
>> in parallel (same test that you did). Mainline would run stop-machine
>> and hence interrupt the cyclictest tasks too often. My patchset
>> wouldn't do that, and hence cyclictest should ideally show better
>> numbers.
>
> In fact, I haven't looked at the results, as I was more interested by
> the load that was generated
>
>> I'd really appreciate if you could try that out and let me know how
>> it goes.. :-) Thank you very much!
>
> ok, I'm going to try to run a test series

Great! Thank you :-)

Regards,
Srivatsa S. Bhat

Lockup observed while running this patchset, with CPU_IDLE and
INTEL_IDLE turned on in the .config:

smpboot: CPU 1 is now offline
Kernel panic - not syncing: Watchdog detected hard LOCKUP on cpu 11
Pid: 0, comm: swapper/11 Not tainted 3.8.0-rc7+stpmch13-1 #8
Call Trace:
 <NMI>  [815a319e] panic+0xc9/0x1ee
 [810fdd41] watchdog_overflow_callback+0xb1/0xc0
 [8113ab5c] __perf_event_overflow+0x9c/0x330
 [81028a88] ? x86_perf_event_set_period+0xd8/0x160
 [8113b514] perf_event_overflow+0x14/0x20
 [8102ee54] intel_pmu_handle_irq+0x1c4/0x360
 [815a8ef1] perf_event_nmi_handler+0x21/0x30
 [815a8366] nmi_handle+0xb6/0x200
 [815a82b0] ? oops_begin+0xd0/0xd0
 [815a85c8] default_do_nmi+0x68/0x220
 [815a8840] do_nmi+0xc0/0x110
 [815a7911] end_repeat_nmi+0x1e/0x2e
 [812a3f98] ? delay_tsc+0x38/0xb0
 [812a3f98] ? delay_tsc+0x38/0xb0
 [812a3f98] ? delay_tsc+0x38/0xb0
 <EOE>  [812a3f1f] __delay+0xf/0x20
 [812aba1e] do_raw_spin_lock+0x7e/0x150
 [815a64c1] _raw_spin_lock_irqsave+0x61/0x70
 [810c0758] ? clockevents_notify+0x28/0x150
 [815a6d37] ? _raw_spin_unlock_irqrestore+0x77/0x80
 [810c0758] clockevents_notify+0x28/0x150
 [8130459f] intel_idle+0xaf/0xe0
 [81472ee0] ? disable_cpuidle+0x20/0x20
 [81472ef9] cpuidle_enter+0x19/0x20
 [814734c1] cpuidle_wrap_enter+0x41/0xa0
 [81473530] cpuidle_enter_tk+0x10/0x20
 [81472f17] cpuidle_enter_state+0x17/0x50
 [] cpuidle_idle_call+0xd9/0x290
 [] cpu_idle+0xe5/0x140
 [] start_secondary+0xdd/0xdf

BUG: spinlock lockup suspected on CPU#2, migration/2/19
 lock: clockevents_lock+0x0/0x40, .magic: dead4ead, .owner: swapper/8/0,
 .owner_cpu: 8
Pid: 19, comm: migration/2 Not tainted 3.8.0-rc7+stpmch13-1 #8
Call Trace:
 [] spin_dump+0x78/0xc0
 []
Re: [PATCH v5 00/45] CPU hotplug: stop_machine()-free CPU hotplug
On 15 February 2013 20:40, Srivatsa S. Bhat wrote:
> Hi Vincent,
>
> On 02/15/2013 06:58 PM, Vincent Guittot wrote:
>> Hi Srivatsa,
>>
>> I have run some tests with your branch (thanks Paul for the git tree)
>> and you will find results below.
>
> Thank you very much for testing this patchset!
>
>> The test conditions are:
>> - 5-CPU system in 2 clusters
>> - The test plugs/unplugs CPU2 and increases the system load every 20
>>   plug/unplug sequences with more cyclictest threads
>> - The test is done with all CPUs online and with only CPU0 and CPU2
>>
>> The main conclusion is that there is no difference with and without
>> your patches with my stress tests. I'm not sure that was the expected
>> result, but the cpu_down is already quite low: 4-5ms on average.
>
> At least my patchset doesn't perform _worse_ than mainline, with
> respect to cpu_down duration :-)

yes exactly, and it has passed more than 400 consecutive plug/unplug
cycles on an ARM platform

> So, here is the analysis:
> Stop-machine doesn't really slow down the CPU-down operation if the
> rest of the CPUs are mostly running in userspace all the time, because
> CPUs running userspace workloads cooperate very eagerly with the
> stop-machine dance - they receive the resched IPI and allow the
> per-cpu cpu-stopper thread to monopolize the CPU almost immediately.
>
> The scenario where stop_machine() takes longer to take effect is when
> most of the online CPUs are running in kernelspace, because then the
> probability that they call preempt_disable() frequently (and hence
> inhibit stop-machine) is higher. That's why, in my tests, I ran
> genload from LTP, which generated a lot of system time (system time in
> 'top' indicates activity in kernelspace). Hence my patchset showed
> significant improvement over mainline in my tests.

ok, I hadn't noticed this important point for the test

> However, your test is very useful too, if we measure a different
> parameter: the latency impact on the workloads running on the system
> (cyclictest). One other important aim of this patchset is to make
> hotplug as unintrusive as possible for other workloads running on the
> system. So if you measure the cyclictest numbers, I would expect my
> patchset to show better numbers than mainline when you do cpu-hotplug
> in parallel (same test that you did). Mainline would run stop-machine
> and hence interrupt the cyclictest tasks too often. My patchset
> wouldn't do that, and hence cyclictest should ideally show better
> numbers.

In fact, I haven't looked at the results, as I was more interested by
the load that was generated

> I'd really appreciate if you could try that out and let me know how it
> goes.. :-) Thank you very much!

ok, I'm going to try to run a test series

Vincent

> Regards,
> Srivatsa S. Bhat
>
>> On 12 February 2013 04:58, Srivatsa S. Bhat wrote:
>>> On 02/12/2013 12:38 AM, Paul E. McKenney wrote:
>>>> On Mon, Feb 11, 2013 at 05:53:41PM +0530, Srivatsa S. Bhat wrote:
>>>>> On 02/11/2013 05:28 PM, Vincent Guittot wrote:
>>>>>> On 8 February 2013 19:09, Srivatsa S. Bhat wrote:
>>>>
>>>> [ . . . ]
>>>>
>>>>>>> Adding Vincent to CC, who had previously evaluated the
>>>>>>> performance and latency implications of CPU hotplug on ARM
>>>>>>> platforms, IIRC.
>>>>>>
>>>>>> Hi Srivatsa,
>>>>>>
>>>>>> I can try to run some of our stress tests on your patches.
>>>>>
>>>>> Great!
>>>>>
>>>>>> Have you got a git tree that I can pull?
>>>>>
>>>>> Unfortunately, no, none at the moment.. :-(
>>>>
>>>> You do need to create an externally visible git tree.
>>>
>>> Ok, I'll do that soon.
>>>
>>>> In the meantime, I have added your series at rcu/bhat.2013.01.21a
>>>> on -rcu:
>>>>
>>>> git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
>>>>
>>>> This should appear soon on a kernel.org mirror near you. ;-)
>>>
>>> Thank you very much, Paul! :-)
>>>
>>> Regards,
>>> Srivatsa S. Bhat
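The stress test Vincent describes (toggling CPU2 offline/online while cyclictest threads generate load) can be sketched roughly as below. This is illustrative, not the actual test harness from the thread: the helper names are invented, the cyclictest options are only an example, and `HOTPLUG_ROOT` is parameterized purely so the toggle logic can be exercised without root; on a real system it is the standard sysfs hotplug interface under `/sys/devices/system/cpu`.

```shell
#!/bin/sh
# Sketch of the plug/unplug stress loop described in the thread.
# HOTPLUG_ROOT and the helper name are assumptions for illustration.
HOTPLUG_ROOT=${HOTPLUG_ROOT:-/sys/devices/system/cpu}

# Toggle cpu$1 offline and back online, $2 times.
hotplug_cycles() {
    cpu=$1
    cycles=$2
    i=0
    while [ "$i" -lt "$cycles" ]; do
        echo 0 > "$HOTPLUG_ROOT/cpu$cpu/online"   # cpu_down()
        echo 1 > "$HOTPLUG_ROOT/cpu$cpu/online"   # cpu_up()
        i=$((i + 1))
    done
}

# On a real system, run the latency measurement in parallel, e.g.:
#   cyclictest -t 4 -p 80 -i 1000 &   # options illustrative
#   hotplug_cycles 2 400              # 400 plug/unplug cycles of CPU2
#   wait
```

Comparing the cyclictest max-latency histograms with and without the patchset, as Srivatsa suggests, is what would expose the stop_machine() interruption cost that the raw cpu_down duration does not.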
Re: [PATCH v5 00/45] CPU hotplug: stop_machine()-free CPU hotplug
On 02/18/2013 04:04 PM, Srivatsa S. Bhat wrote:
> On 02/18/2013 03:54 PM, Vincent Guittot wrote:
>> On 15 February 2013 20:40, Srivatsa S. Bhat
>> <srivatsa.b...@linux.vnet.ibm.com> wrote:
>>> Hi Vincent,
>>>
>>> On 02/15/2013 06:58 PM, Vincent Guittot wrote:
>>>> Hi Srivatsa,
>>>>
>>>> I have run some tests with your branch (thanks Paul for the git
>>>> tree) and you will find the results below.
>>>
>>> Thank you very much for testing this patchset!
>>>
>>>> The test conditions are:
>>>> - 5 CPUs system in 2 clusters
>>>> - The test plugs/unplugs CPU2 and it increases the system load each
>>>>   20 plug/unplug sequences with more cyclictest threads
>>>> - The test is done with all CPUs online and with only CPU0 and CPU2
>>>>
>>>> The main conclusion is that there is no difference with and without
>>>> your patches with my stress tests. I'm not sure that it was the
>>>> expected result but the cpu_down is already quite low: 4-5 ms on
>>>> average.
>>>
>>> At least my patchset doesn't perform _worse_ than mainline, with
>>> respect to cpu_down duration :-)
>>
>> yes exactly, and it has passed more than 400 consecutive plug/unplug
>> cycles on an ARM platform
>
> Great!
>
> However, did you turn on CPU_IDLE during your tests? In my tests, I
> had turned off cpu idle in the .config, like I had mentioned in the
> cover letter. I'm struggling to get it working with CPU_IDLE/INTEL_IDLE
> turned on, because it gets into a lockup almost immediately. It appears
> that the lock-holder of clockevents_lock never releases it, for some
> reason.. See below for the full log. Lockdep has not been useful in
> debugging this, unfortunately :-(

Ah, nevermind, the following diff fixes it :-)

I had applied this fix on v5 and tested it, but it still had races where
I used to hit the lockups. Now, after I fixed all the memory barrier
issues that Paul and Oleg pointed out in v5, I applied this fix again
and tested it just now - it works beautifully! :-)

I'll include this fix and post a v6 soon.

Regards,
Srivatsa S. Bhat

---

diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c
index 30b6de0..ca340fd 100644
--- a/kernel/time/clockevents.c
+++ b/kernel/time/clockevents.c
@@ -17,6 +17,7 @@
 #include <linux/module.h>
 #include <linux/notifier.h>
 #include <linux/smp.h>
+#include <linux/cpu.h>
 
 #include "tick-internal.h"
 
@@ -431,6 +432,7 @@ void clockevents_notify(unsigned long reason, void *arg)
 	unsigned long flags;
 	int cpu;
 
+	get_online_cpus_atomic();
 	raw_spin_lock_irqsave(&clockevents_lock, flags);
 
 	clockevents_do_notify(reason, arg);
 
@@ -459,6 +461,7 @@
 		break;
 	}
 	raw_spin_unlock_irqrestore(&clockevents_lock, flags);
+	put_online_cpus_atomic();
 }
 EXPORT_SYMBOL_GPL(clockevents_notify);
 #endif

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v5 00/45] CPU hotplug: stop_machine()-free CPU hotplug
On Mon, 18 Feb 2013, Srivatsa S. Bhat wrote:
> Lockup observed while running this patchset, with CPU_IDLE and
> INTEL_IDLE turned on in the .config:
>
> smpboot: CPU 1 is now offline
> Kernel panic - not syncing: Watchdog detected hard LOCKUP on cpu 11
> Pid: 0, comm: swapper/11 Not tainted 3.8.0-rc7+stpmch13-1 #8
> Call Trace:
>  [812aba1e] do_raw_spin_lock+0x7e/0x150
>  [815a64c1] _raw_spin_lock_irqsave+0x61/0x70
>  [810c0758] ? clockevents_notify+0x28/0x150
>  [815a6d37] ? _raw_spin_unlock_irqrestore+0x77/0x80
>  [810c0758] clockevents_notify+0x28/0x150
>  [8130459f] intel_idle+0xaf/0xe0
>  [81472ee0] ? disable_cpuidle+0x20/0x20
>  [81472ef9] cpuidle_enter+0x19/0x20
>  [814734c1] cpuidle_wrap_enter+0x41/0xa0
>  [81473530] cpuidle_enter_tk+0x10/0x20
>  [81472f17] cpuidle_enter_state+0x17/0x50
>  [81473899] cpuidle_idle_call+0xd9/0x290
>  [810203d5] cpu_idle+0xe5/0x140
>  [8159c603] start_secondary+0xdd/0xdf
>
> BUG: spinlock lockup suspected on CPU#2, migration/2/19
>  lock: clockevents_lock+0x0/0x40, .magic: dead4ead, .owner: swapper/8/0, .owner_cpu: 8

Unfortunately there is no back trace for cpu8. That's probably caused by
the watchdog - panic setting.

So we have no idea why cpu2 and 11 get stuck on the clockevents_lock,
and without that information it's impossible to decode.

Thanks,

	tglx
Re: [PATCH v5 00/45] CPU hotplug: stop_machine()-free CPU hotplug
On 18 February 2013 11:51, Srivatsa S. Bhat
<srivatsa.b...@linux.vnet.ibm.com> wrote:
> On 02/18/2013 04:04 PM, Srivatsa S. Bhat wrote:
>> On 02/18/2013 03:54 PM, Vincent Guittot wrote:
>>> On 15 February 2013 20:40, Srivatsa S. Bhat
>>> <srivatsa.b...@linux.vnet.ibm.com> wrote:
>>>> Hi Vincent,
>>>>
>>>> On 02/15/2013 06:58 PM, Vincent Guittot wrote:
>>>>> Hi Srivatsa,
>>>>>
>>>>> I have run some tests with your branch (thanks Paul for the git
>>>>> tree) and you will find the results below.
>>>>
>>>> Thank you very much for testing this patchset!
>>>>
>>>>> The test conditions are:
>>>>> - 5 CPUs system in 2 clusters
>>>>> - The test plugs/unplugs CPU2 and it increases the system load
>>>>>   each 20 plug/unplug sequences with more cyclictest threads
>>>>> - The test is done with all CPUs online and with only CPU0 and
>>>>>   CPU2
>>>>>
>>>>> The main conclusion is that there is no difference with and
>>>>> without your patches with my stress tests. I'm not sure that it
>>>>> was the expected result but the cpu_down is already quite low:
>>>>> 4-5 ms on average.
>>>>
>>>> At least my patchset doesn't perform _worse_ than mainline, with
>>>> respect to cpu_down duration :-)
>>>
>>> yes exactly, and it has passed more than 400 consecutive plug/unplug
>>> cycles on an ARM platform
>>
>> Great!
>>
>> However, did you turn on CPU_IDLE during your tests? In my tests, I
>> had turned off cpu idle in the .config, like I had mentioned in the
>> cover letter. I'm struggling to get it working with
>> CPU_IDLE/INTEL_IDLE turned on, because it gets into a lockup almost
>> immediately. It appears that the lock-holder of clockevents_lock
>> never releases it, for some reason.. See below for the full log.
>> Lockdep has not been useful in debugging this, unfortunately :-(
>
> Ah, nevermind, the following diff fixes it :-)
>
> I had applied this fix on v5 and tested it, but it still had races
> where I used to hit the lockups. Now, after I fixed all the memory
> barrier issues that Paul and Oleg pointed out in v5, I applied this
> fix again and tested it just now - it works beautifully! :-)

My tests have been done without cpuidle because I have some issues with
the function tracer and cpuidle. But cpu hotplug and cpuidle work well
when I run the tests without enabling the function tracer.

Vincent

> I'll include this fix and post a v6 soon.
>
> Regards,
> Srivatsa S. Bhat
>
> ---
>
> diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c
> index 30b6de0..ca340fd 100644
> --- a/kernel/time/clockevents.c
> +++ b/kernel/time/clockevents.c
> @@ -17,6 +17,7 @@
>  #include <linux/module.h>
>  #include <linux/notifier.h>
>  #include <linux/smp.h>
> +#include <linux/cpu.h>
>  
>  #include "tick-internal.h"
>  
> @@ -431,6 +432,7 @@ void clockevents_notify(unsigned long reason, void *arg)
>  	unsigned long flags;
>  	int cpu;
>  
> +	get_online_cpus_atomic();
>  	raw_spin_lock_irqsave(&clockevents_lock, flags);
>  
>  	clockevents_do_notify(reason, arg);
>  
> @@ -459,6 +461,7 @@
>  		break;
>  	}
>  	raw_spin_unlock_irqrestore(&clockevents_lock, flags);
> +	put_online_cpus_atomic();
>  }
>  EXPORT_SYMBOL_GPL(clockevents_notify);
>  #endif
Re: [PATCH v5 00/45] CPU hotplug: stop_machine()-free CPU hotplug
On 02/18/2013 04:24 PM, Thomas Gleixner wrote:
> On Mon, 18 Feb 2013, Srivatsa S. Bhat wrote:
>> Lockup observed while running this patchset, with CPU_IDLE and
>> INTEL_IDLE turned on in the .config:
>>
>> smpboot: CPU 1 is now offline
>> Kernel panic - not syncing: Watchdog detected hard LOCKUP on cpu 11
>> Pid: 0, comm: swapper/11 Not tainted 3.8.0-rc7+stpmch13-1 #8
>> Call Trace:
>>  [812aba1e] do_raw_spin_lock+0x7e/0x150
>>  [815a64c1] _raw_spin_lock_irqsave+0x61/0x70
>>  [810c0758] ? clockevents_notify+0x28/0x150
>>  [815a6d37] ? _raw_spin_unlock_irqrestore+0x77/0x80
>>  [810c0758] clockevents_notify+0x28/0x150
>>  [8130459f] intel_idle+0xaf/0xe0
>>  [81472ee0] ? disable_cpuidle+0x20/0x20
>>  [81472ef9] cpuidle_enter+0x19/0x20
>>  [814734c1] cpuidle_wrap_enter+0x41/0xa0
>>  [81473530] cpuidle_enter_tk+0x10/0x20
>>  [81472f17] cpuidle_enter_state+0x17/0x50
>>  [81473899] cpuidle_idle_call+0xd9/0x290
>>  [810203d5] cpu_idle+0xe5/0x140
>>  [8159c603] start_secondary+0xdd/0xdf
>>
>> BUG: spinlock lockup suspected on CPU#2, migration/2/19
>>  lock: clockevents_lock+0x0/0x40, .magic: dead4ead, .owner: swapper/8/0, .owner_cpu: 8
>
> Unfortunately there is no back trace for cpu8.

Yes :-( I had run this several times hoping to get a backtrace on the
lock-holder, expecting trigger_all_cpu_backtrace() to get it right at
least once. But I hadn't succeeded even once.

> That's probably caused by the watchdog - panic setting.

Oh, ok..

> So we have no idea why cpu2 and 11 get stuck on the clockevents_lock
> and without that information it's impossible to decode.

But thankfully, the issue seems to have been resolved by the diff I
posted in my previous mail, along with the fixes related to memory
barriers.

Regards,
Srivatsa S. Bhat
Re: [PATCH v5 00/45] CPU hotplug: stop_machine()-free CPU hotplug
On Mon, 2013-02-18 at 11:58 +0100, Vincent Guittot wrote:
> My tests have been done without cpuidle because I have some issues
> with the function tracer and cpuidle. But cpu hotplug and cpuidle work
> well when I run the tests without enabling the function tracer.

I know suspend and resume have issues with function tracing (because it
makes things like calling smp_processor_id() crash the system), but I'm
unaware of issues with hotplug itself. Could be some of the same issues.

Can you give me more details? I'll try to investigate it.

Thanks,

-- Steve
Re: [PATCH v5 00/45] CPU hotplug: stop_machine()-free CPU hotplug
On 18 February 2013 16:30, Steven Rostedt <rost...@goodmis.org> wrote:
> On Mon, 2013-02-18 at 11:58 +0100, Vincent Guittot wrote:
>> My tests have been done without cpuidle because I have some issues
>> with the function tracer and cpuidle. But cpu hotplug and cpuidle
>> work well when I run the tests without enabling the function tracer.
>
> I know suspend and resume have issues with function tracing (because
> it makes things like calling smp_processor_id() crash the system), but
> I'm unaware of issues with hotplug itself. Could be some of the same
> issues.
>
> Can you give me more details? I'll try to investigate it.

yes for sure.
The problem is more linked to cpuidle and the function tracer.

cpu hotplug and the function tracer work when cpuidle is disabled.
cpu hotplug and cpuidle work if I don't enable the function tracer.
my platform is dead as soon as I enable the function tracer if cpuidle
is enabled. It looks like some notrace annotations are missing in my
platform driver, but we haven't completely fixed the issue yet.

Vincent

> Thanks,
>
> -- Steve
Re: [PATCH v5 00/45] CPU hotplug: stop_machine()-free CPU hotplug
On Mon, 2013-02-18 at 17:50 +0100, Vincent Guittot wrote:
> yes for sure.
> The problem is more linked to cpuidle and the function tracer.
>
> cpu hotplug and the function tracer work when cpuidle is disabled.
> cpu hotplug and cpuidle work if I don't enable the function tracer.
> my platform is dead as soon as I enable the function tracer if cpuidle
> is enabled. It looks like some notrace annotations are missing in my
> platform driver, but we haven't completely fixed the issue yet.

You can bisect to find out exactly what function is the problem:

  cat /debug/tracing/available_filter_functions > t

  f(t) {
	num=`wc -l t`
	sed -ne "1,${num}p" t > t1
	let num=num+1
	sed -ne "${num},$p" t > t2

	cat t1 > /debug/tracing/set_ftrace_filter
	# note this may take a long time to finish

	echo function > /debug/tracing/current_tracer

	failed? bisect f(t1), if not bisect f(t2)
  }

-- Steve
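For what it's worth, the list-halving step of the recipe above can be made concrete. A rough, runnable sketch of just that step in plain POSIX shell (the split_half name and the file names are illustrative, not from Steve's mail; feeding each half into /debug/tracing/set_ftrace_filter and re-running the crash test is left out):

```shell
#!/bin/sh
# Sketch of the list-halving step from the bisection recipe above.
# split_half FILE writes the first half of FILE to FILE.1 and the
# second half to FILE.2; each half can then be fed in turn to
# /debug/tracing/set_ftrace_filter (not done here).
split_half() {
    src=$1
    total=$(wc -l < "$src")
    half=$(( (total + 1) / 2 ))
    sed -n "1,${half}p" "$src" > "$src.1"
    sed -n "$(( half + 1 )),\$p" "$src" > "$src.2"
}

# Example: split a 5-entry function list into 3 + 2 entries.
printf '%s\n' f1 f2 f3 f4 f5 > funcs
split_half funcs
```

Repeating split_half on whichever half still reproduces the hang converges on the offending function in log2(N) rounds.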
Re: [PATCH v5 00/45] CPU hotplug: stop_machine()-free CPU hotplug
Hi Vincent,

On 02/15/2013 06:58 PM, Vincent Guittot wrote:
> Hi Srivatsa,
>
> I have run some tests with your branch (thanks Paul for the git tree)
> and you will find the results below.

Thank you very much for testing this patchset!

> The test conditions are:
> - 5 CPUs system in 2 clusters
> - The test plugs/unplugs CPU2 and it increases the system load each 20
>   plug/unplug sequences with more cyclictest threads
> - The test is done with all CPUs online and with only CPU0 and CPU2
>
> The main conclusion is that there is no difference with and without
> your patches with my stress tests. I'm not sure that it was the
> expected result but the cpu_down is already quite low: 4-5 ms on
> average.

At least my patchset doesn't perform _worse_ than mainline, with respect
to cpu_down duration :-)

So, here is the analysis:

Stop-machine doesn't really slow down the CPU-down operation if the rest
of the CPUs are mostly running in userspace all the time, because CPUs
running userspace workloads cooperate very eagerly with the stop-machine
dance - they receive the resched IPI and allow the per-cpu cpu-stopper
thread to monopolize the CPU almost immediately.

The scenario where stop_machine() takes longer to take effect is when
most of the online CPUs are running in kernelspace, because then the
probability that they call preempt_disable() frequently (and hence
inhibit stop-machine) is higher. That's why, in my tests, I ran genload
from LTP, which generated a lot of system time (system time in 'top'
indicates activity in kernelspace). Hence my patchset showed a
significant improvement over mainline in my tests.

However, your test is very useful too, if we measure a different
parameter: the latency impact on the workloads running on the system
(cyclictest). One other important aim of this patchset is to make
hotplug as unintrusive as possible for the other workloads running on
the system. So if you measure the cyclictest numbers, I would expect my
patchset to show better numbers than mainline when you do cpu-hotplug in
parallel (the same test that you did). Mainline would run stop-machine
and hence interrupt the cyclictest tasks too often. My patchset wouldn't
do that, and hence cyclictest should ideally show better numbers.

I'd really appreciate it if you could try that out and let me know how
it goes.. :-)

Thank you very much!

Regards,
Srivatsa S. Bhat

> On 12 February 2013 04:58, Srivatsa S. Bhat
> <srivatsa.b...@linux.vnet.ibm.com> wrote:
>> On 02/12/2013 12:38 AM, Paul E. McKenney wrote:
>>> On Mon, Feb 11, 2013 at 05:53:41PM +0530, Srivatsa S. Bhat wrote:
>>>> On 02/11/2013 05:28 PM, Vincent Guittot wrote:
>>>>> On 8 February 2013 19:09, Srivatsa S. Bhat
>>>>> <srivatsa.b...@linux.vnet.ibm.com> wrote:
>>>
>>> [ . . . ]
>>>
>>>>>> Adding Vincent to CC, who had previously evaluated the
>>>>>> performance and latency implications of CPU hotplug on ARM
>>>>>> platforms, IIRC.
>>>>>
>>>>> Hi Srivatsa,
>>>>>
>>>>> I can try to run some of our stress tests on your patches.
>>>>
>>>> Great!
>>>>
>>>>> Have you got a git tree that I can pull?
>>>>
>>>> Unfortunately, no, none at the moment.. :-(
>>>
>>> You do need to create an externally visible git tree.
>>
>> Ok, I'll do that soon.
>>
>>> In the meantime, I have added your series at rcu/bhat.2013.01.21a
>>> on -rcu:
>>>
>>>   git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
>>>
>>> This should appear soon on a kernel.org mirror near you. ;-)
>>
>> Thank you very much, Paul! :-)
>>
>> Regards,
>> Srivatsa S. Bhat
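A plug/unplug stress test like the one described in this thread can be driven entirely from the standard sysfs CPU-hotplug interface. A rough sketch, not from the thread itself: the sysfs path is the standard one, but the CPU index, iteration count, and the elapsed_ms helper are illustrative, GNU date (%N) is assumed, and the loop is guarded so it only runs as root on a hotplug-capable machine:

```shell
#!/bin/sh
# Sketch of an offline/online stress loop for one CPU, timing each
# cpu_down via nanosecond timestamps. CPU index and iteration count
# are illustrative defaults.
CPU=${CPU:-2}
ITER=${ITER:-20}

elapsed_ms() {   # elapsed_ms START_NS END_NS -> whole milliseconds
    echo $(( ($2 - $1) / 1000000 ))
}

SYSFS="/sys/devices/system/cpu/cpu${CPU}/online"
if [ -w "$SYSFS" ]; then
    i=1
    while [ "$i" -le "$ITER" ]; do
        t0=$(date +%s%N)
        echo 0 > "$SYSFS"          # triggers cpu_down()
        t1=$(date +%s%N)
        echo "iteration $i: offline took $(elapsed_ms "$t0" "$t1") ms"
        echo 1 > "$SYSFS"          # bring the CPU back online
        i=$((i + 1))
    done
fi
```

Running cyclictest on the remaining CPUs while this loop executes measures exactly the latency impact discussed above: with stop_machine(), every iteration briefly freezes all CPUs; without it, the cyclictest threads should be disturbed far less.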
Re: [PATCH v5 00/45] CPU hotplug: stop_machine()-free CPU hotplug
On 02/12/2013 12:38 AM, Paul E. McKenney wrote:
> On Mon, Feb 11, 2013 at 05:53:41PM +0530, Srivatsa S. Bhat wrote:
>> On 02/11/2013 05:28 PM, Vincent Guittot wrote:
>>> On 8 February 2013 19:09, Srivatsa S. Bhat
>>> <srivatsa.b...@linux.vnet.ibm.com> wrote:
>
> [ . . . ]
>
>>>> Adding Vincent to CC, who had previously evaluated the performance
>>>> and latency implications of CPU hotplug on ARM platforms, IIRC.
>>>
>>> Hi Srivatsa,
>>>
>>> I can try to run some of our stress tests on your patches.
>>
>> Great!
>>
>>> Have you got a git tree that I can pull?
>>
>> Unfortunately, no, none at the moment.. :-(
>
> You do need to create an externally visible git tree.

Ok, I'll do that soon.

> In the meantime, I have added your series at rcu/bhat.2013.01.21a
> on -rcu:
>
>   git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
>
> This should appear soon on a kernel.org mirror near you. ;-)

Thank you very much, Paul! :-)

Regards,
Srivatsa S. Bhat
Re: [PATCH v5 00/45] CPU hotplug: stop_machine()-free CPU hotplug
On Mon, Feb 11, 2013 at 05:53:41PM +0530, Srivatsa S. Bhat wrote: > On 02/11/2013 05:28 PM, Vincent Guittot wrote: > > On 8 February 2013 19:09, Srivatsa S. Bhat > > wrote: [ . . . ] > >> Adding Vincent to CC, who had previously evaluated the performance and > >> latency implications of CPU hotplug on ARM platforms, IIRC. > >> > > > > Hi Srivatsa, > > > > I can try to run some of our stress tests on your patches. > > Great! > > > Have you > > got a git tree that i can pull ? > > > > Unfortunately, no, none at the moment.. :-( You do need to create an externally visible git tree. In the meantime, I have added your series at rcu/bhat.2013.01.21a on -rcu: git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git This should appear soon on a kernel.org mirror near you. ;-) Thanx, Paul -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v5 00/45] CPU hotplug: stop_machine()-free CPU hotplug
On 02/11/2013 05:28 PM, Vincent Guittot wrote: > On 8 February 2013 19:09, Srivatsa S. Bhat > wrote: >> On 02/08/2013 10:14 PM, Srivatsa S. Bhat wrote: >>> On 02/08/2013 09:11 PM, Russell King - ARM Linux wrote: On Thu, Feb 07, 2013 at 11:41:34AM +0530, Srivatsa S. Bhat wrote: > On 02/07/2013 09:44 AM, Rusty Russell wrote: >> "Srivatsa S. Bhat" writes: >>> On 01/22/2013 01:03 PM, Srivatsa S. Bhat wrote: >>> Avg. latency of 1 CPU offline (ms) [stop-cpu/stop-m/c >>> latency] >>> >>> # online CPUsMainline (with stop-m/c) This patchset (no >>> stop-m/c) >>> >>> 8 17.04 7.73 >>> >>> 16 18.05 6.44 >>> >>> 32 17.31 7.39 >>> >>> 64 32.40 9.28 >>> >>> 128 98.23 7.35 >> >> Nice! > > Thank you :-) > >> I wonder how the ARM guys feel with their quad-cpu systems... >> > > That would be definitely interesting to know :-) That depends what exactly you'd like tested (and how) and whether you'd like it to be a test-chip based quad core, or an OMAP dual-core SoC. >>> >>> The effect of stop_machine() doesn't really depend on the CPU architecture >>> used underneath or the platform. It depends only on the _number_ of >>> _logical_ CPUs used. >>> >>> And stop_machine() has 2 noticeable drawbacks: >>> 1. It makes the hotplug operation itself slow >>> 2. and it causes disruptions to the workloads running on the other >>> CPUs by hijacking the entire machine for significant amounts of time. >>> >>> In my experiments (mentioned above), I tried to measure how my patchset >>> improves (reduces) the duration of hotplug (CPU offline) itself. Which is >>> also slightly indicative of the impact it has on the rest of the system. >>> >>> But what would be nice to test, is a setup where the workloads running on >>> the rest of the system are latency-sensitive, and measure the impact of >>> CPU offline on them, with this patchset applied. That would tell us how >>> far is this useful in making CPU hotplug less disruptive on the system. 
>>> >>> Of course, it would be nice to also see whether we observe any reduction >>> in hotplug duration itself (point 1 above) on ARM platforms with lot >>> of CPUs. [This could potentially speed up suspend/resume, which is used >>> rather heavily on ARM platforms]. >>> >>> The benefits from this patchset over mainline (both in terms of points >>> 1 and 2 above) is expected to increase, with increasing number of CPUs in >>> the system. >>> >> >> Adding Vincent to CC, who had previously evaluated the performance and >> latency implications of CPU hotplug on ARM platforms, IIRC. >> > > Hi Srivatsa, > > I can try to run some of our stress tests on your patches. Great! > Have you > got a git tree that i can pull ? > Unfortunately, no, none at the moment.. :-( Regards, Srivatsa S. Bhat -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v5 00/45] CPU hotplug: stop_machine()-free CPU hotplug
Hi Srivatsa, I can try to run some of our stress tests on your patches. Have you got a git tree that i can pull ? Regards, Vincent On 8 February 2013 19:09, Srivatsa S. Bhat wrote: > On 02/08/2013 10:14 PM, Srivatsa S. Bhat wrote: >> On 02/08/2013 09:11 PM, Russell King - ARM Linux wrote: >>> On Thu, Feb 07, 2013 at 11:41:34AM +0530, Srivatsa S. Bhat wrote: On 02/07/2013 09:44 AM, Rusty Russell wrote: > "Srivatsa S. Bhat" writes: >> On 01/22/2013 01:03 PM, Srivatsa S. Bhat wrote: >> Avg. latency of 1 CPU offline (ms) [stop-cpu/stop-m/c >> latency] >> >> # online CPUsMainline (with stop-m/c) This patchset (no >> stop-m/c) >> >> 8 17.04 7.73 >> >> 16 18.05 6.44 >> >> 32 17.31 7.39 >> >> 64 32.40 9.28 >> >> 128 98.23 7.35 > > Nice! Thank you :-) > I wonder how the ARM guys feel with their quad-cpu systems... > That would be definitely interesting to know :-) >>> >>> That depends what exactly you'd like tested (and how) and whether you'd >>> like it to be a test-chip based quad core, or an OMAP dual-core SoC. >>> >> >> The effect of stop_machine() doesn't really depend on the CPU architecture >> used underneath or the platform. It depends only on the _number_ of >> _logical_ CPUs used. >> >> And stop_machine() has 2 noticeable drawbacks: >> 1. It makes the hotplug operation itself slow >> 2. and it causes disruptions to the workloads running on the other >> CPUs by hijacking the entire machine for significant amounts of time. >> >> In my experiments (mentioned above), I tried to measure how my patchset >> improves (reduces) the duration of hotplug (CPU offline) itself. Which is >> also slightly indicative of the impact it has on the rest of the system. >> >> But what would be nice to test, is a setup where the workloads running on >> the rest of the system are latency-sensitive, and measure the impact of >> CPU offline on them, with this patchset applied. That would tell us how >> far is this useful in making CPU hotplug less disruptive on the system. 
>> >> Of course, it would be nice to also see whether we observe any reduction >> in hotplug duration itself (point 1 above) on ARM platforms with lot >> of CPUs. [This could potentially speed up suspend/resume, which is used >> rather heavily on ARM platforms]. >> >> The benefits from this patchset over mainline (both in terms of points >> 1 and 2 above) is expected to increase, with increasing number of CPUs in >> the system. >> > > Adding Vincent to CC, who had previously evaluated the performance and > latency implications of CPU hotplug on ARM platforms, IIRC. > > Regards, > Srivatsa S. Bhat > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v5 00/45] CPU hotplug: stop_machine()-free CPU hotplug
On 02/11/2013 05:28 PM, Vincent Guittot wrote:
> On 8 February 2013 19:09, Srivatsa S. Bhat srivatsa.b...@linux.vnet.ibm.com wrote:
>
> [ . . . ]
>
>> Adding Vincent to CC, who had previously evaluated the performance and
>> latency implications of CPU hotplug on ARM platforms, IIRC.
>
> Hi Srivatsa,
>
> I can try to run some of our stress tests on your patches.

Great!

> Have you got a git tree that i can pull ?

Unfortunately, no, none at the moment.. :-(

Regards,
Srivatsa S. Bhat
Re: [PATCH v5 00/45] CPU hotplug: stop_machine()-free CPU hotplug
On Mon, Feb 11, 2013 at 05:53:41PM +0530, Srivatsa S. Bhat wrote:
> On 02/11/2013 05:28 PM, Vincent Guittot wrote:
>> On 8 February 2013 19:09, Srivatsa S. Bhat srivatsa.b...@linux.vnet.ibm.com wrote:
>
> [ . . . ]
>
>>> Adding Vincent to CC, who had previously evaluated the performance and
>>> latency implications of CPU hotplug on ARM platforms, IIRC.
>>
>> Hi Srivatsa,
>>
>> I can try to run some of our stress tests on your patches.
>
> Great!
>
>> Have you got a git tree that i can pull ?
>
> Unfortunately, no, none at the moment.. :-(

You do need to create an externally visible git tree. In the meantime,
I have added your series at rcu/bhat.2013.01.21a on -rcu:

git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git

This should appear soon on a kernel.org mirror near you. ;-)

							Thanx, Paul
Re: [PATCH v5 00/45] CPU hotplug: stop_machine()-free CPU hotplug
On 02/12/2013 12:38 AM, Paul E. McKenney wrote:
> On Mon, Feb 11, 2013 at 05:53:41PM +0530, Srivatsa S. Bhat wrote:
>
> [ . . . ]
>
>> Unfortunately, no, none at the moment.. :-(
>
> You do need to create an externally visible git tree.

Ok, I'll do that soon.

> In the meantime, I have added your series at rcu/bhat.2013.01.21a on -rcu:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
>
> This should appear soon on a kernel.org mirror near you. ;-)

Thank you very much, Paul! :-)

Regards,
Srivatsa S. Bhat
Re: [PATCH v5 00/45] CPU hotplug: stop_machine()-free CPU hotplug
On 02/08/2013 10:14 PM, Srivatsa S. Bhat wrote:
> On 02/08/2013 09:11 PM, Russell King - ARM Linux wrote:
>
> [ . . . ]
>
>> That depends what exactly you'd like tested (and how) and whether you'd
>> like it to be a test-chip based quad core, or an OMAP dual-core SoC.
>
> The effect of stop_machine() doesn't really depend on the CPU architecture
> used underneath or the platform. It depends only on the _number_ of
> _logical_ CPUs used.
>
> And stop_machine() has 2 noticeable drawbacks:
> 1. It makes the hotplug operation itself slow
> 2. and it causes disruptions to the workloads running on the other
>    CPUs by hijacking the entire machine for significant amounts of time.
>
> In my experiments (mentioned above), I tried to measure how my patchset
> improves (reduces) the duration of hotplug (CPU offline) itself. Which is
> also slightly indicative of the impact it has on the rest of the system.
>
> But what would be nice to test, is a setup where the workloads running on
> the rest of the system are latency-sensitive, and measure the impact of
> CPU offline on them, with this patchset applied. That would tell us how
> far is this useful in making CPU hotplug less disruptive on the system.
>
> Of course, it would be nice to also see whether we observe any reduction
> in hotplug duration itself (point 1 above) on ARM platforms with lot
> of CPUs. [This could potentially speed up suspend/resume, which is used
> rather heavily on ARM platforms].
>
> The benefits from this patchset over mainline (both in terms of points
> 1 and 2 above) is expected to increase, with increasing number of CPUs in
> the system.

Adding Vincent to CC, who had previously evaluated the performance and
latency implications of CPU hotplug on ARM platforms, IIRC.

Regards,
Srivatsa S. Bhat
Re: [PATCH v5 00/45] CPU hotplug: stop_machine()-free CPU hotplug
On 02/08/2013 09:11 PM, Russell King - ARM Linux wrote:
> On Thu, Feb 07, 2013 at 11:41:34AM +0530, Srivatsa S. Bhat wrote:
>> On 02/07/2013 09:44 AM, Rusty Russell wrote:
>>> "Srivatsa S. Bhat" writes:
>>>> On 01/22/2013 01:03 PM, Srivatsa S. Bhat wrote:
>>>>
>>>> Avg. latency of 1 CPU offline (ms)        [stop-cpu/stop-m/c latency]
>>>>
>>>> # online CPUs   Mainline (with stop-m/c)   This patchset (no stop-m/c)
>>>>        8               17.04                        7.73
>>>>       16               18.05                        6.44
>>>>       32               17.31                        7.39
>>>>       64               32.40                        9.28
>>>>      128               98.23                        7.35
>>>
>>> Nice!
>>
>> Thank you :-)
>>
>>> I wonder how the ARM guys feel with their quad-cpu systems...
>>
>> That would be definitely interesting to know :-)
>
> That depends what exactly you'd like tested (and how) and whether you'd
> like it to be a test-chip based quad core, or an OMAP dual-core SoC.

The effect of stop_machine() doesn't really depend on the CPU architecture
used underneath or the platform. It depends only on the _number_ of
_logical_ CPUs used.

And stop_machine() has 2 noticeable drawbacks:
1. It makes the hotplug operation itself slow
2. and it causes disruptions to the workloads running on the other
   CPUs by hijacking the entire machine for significant amounts of time.

In my experiments (mentioned above), I tried to measure how my patchset
improves (reduces) the duration of hotplug (CPU offline) itself. Which is
also slightly indicative of the impact it has on the rest of the system.

But what would be nice to test, is a setup where the workloads running on
the rest of the system are latency-sensitive, and measure the impact of
CPU offline on them, with this patchset applied. That would tell us how
far is this useful in making CPU hotplug less disruptive on the system.

Of course, it would be nice to also see whether we observe any reduction
in hotplug duration itself (point 1 above) on ARM platforms with lot
of CPUs. [This could potentially speed up suspend/resume, which is used
rather heavily on ARM platforms].

The benefits from this patchset over mainline (both in terms of points
1 and 2 above) is expected to increase, with increasing number of CPUs in
the system.

Regards,
Srivatsa S. Bhat
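[Editorial note: the offline-duration measurement discussed above can be sketched from userspace. The following is a minimal toy sketch, not part of the original thread; it assumes the standard sysfs hotplug interface (`/sys/devices/system/cpu/cpuN/online`), and the demo deliberately writes to a temporary stand-in file so it runs without root or a hotpluggable CPU.]

```python
import os
import tempfile
import time

def time_cpu_offline(online_path):
    """Write '0' to a sysfs 'online' file, timing how long the write
    (i.e. the offline operation, on a real sysfs file) takes, then
    bring the CPU back by writing '1'. Returns latency in milliseconds."""
    start = time.monotonic()
    with open(online_path, "w") as f:
        f.write("0")            # on real sysfs, this triggers _cpu_down()
    latency_ms = (time.monotonic() - start) * 1000.0
    with open(online_path, "w") as f:
        f.write("1")            # bring the CPU back online
    return latency_ms

# Demo: a temporary file stands in for /sys/devices/system/cpu/cpu1/online.
# Point online_path at the real file (as root) to measure actual hotplug latency.
with tempfile.TemporaryDirectory() as tmp:
    fake_online = os.path.join(tmp, "online")
    with open(fake_online, "w") as f:
        f.write("1")
    print("offline latency: %.3f ms" % time_cpu_offline(fake_online))
```

Averaging many such offline/online cycles is roughly how the per-CPU-count latency table quoted above could be reproduced on an ARM box.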
Re: [PATCH v5 00/45] CPU hotplug: stop_machine()-free CPU hotplug
On Thu, Feb 07, 2013 at 11:41:34AM +0530, Srivatsa S. Bhat wrote:
> On 02/07/2013 09:44 AM, Rusty Russell wrote:
>> "Srivatsa S. Bhat" writes:
>>> On 01/22/2013 01:03 PM, Srivatsa S. Bhat wrote:
>>>
>>> Avg. latency of 1 CPU offline (ms)        [stop-cpu/stop-m/c latency]
>>>
>>> # online CPUs   Mainline (with stop-m/c)   This patchset (no stop-m/c)
>>>        8               17.04                        7.73
>>>       16               18.05                        6.44
>>>       32               17.31                        7.39
>>>       64               32.40                        9.28
>>>      128               98.23                        7.35
>>
>> Nice!
>
> Thank you :-)
>
>> I wonder how the ARM guys feel with their quad-cpu systems...
>
> That would be definitely interesting to know :-)

That depends what exactly you'd like tested (and how) and whether you'd
like it to be a test-chip based quad core, or an OMAP dual-core SoC.
Re: [PATCH v5 00/45] CPU hotplug: stop_machine()-free CPU hotplug
On 02/07/2013 09:44 AM, Rusty Russell wrote:
> "Srivatsa S. Bhat" writes:
>> On 01/22/2013 01:03 PM, Srivatsa S. Bhat wrote:
>>
>> Avg. latency of 1 CPU offline (ms)        [stop-cpu/stop-m/c latency]
>>
>> # online CPUs   Mainline (with stop-m/c)   This patchset (no stop-m/c)
>>        8               17.04                        7.73
>>       16               18.05                        6.44
>>       32               17.31                        7.39
>>       64               32.40                        9.28
>>      128               98.23                        7.35
>
> Nice!

Thank you :-)

> I wonder how the ARM guys feel with their quad-cpu systems...

That would be definitely interesting to know :-)

Regards,
Srivatsa S. Bhat
Re: [PATCH v5 00/45] CPU hotplug: stop_machine()-free CPU hotplug
"Srivatsa S. Bhat" writes: > On 01/22/2013 01:03 PM, Srivatsa S. Bhat wrote: > Avg. latency of 1 CPU offline (ms) [stop-cpu/stop-m/c > latency] > > # online CPUsMainline (with stop-m/c) This patchset (no stop-m/c) > > 8 17.04 7.73 > > 16 18.05 6.44 > > 32 17.31 7.39 > > 64 32.40 9.28 > > 128 98.23 7.35 Nice! I wonder how the ARM guys feel with their quad-cpu systems... Thanks! Rusty. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v5 00/45] CPU hotplug: stop_machine()-free CPU hotplug
Hi,

This patchset removes CPU hotplug's dependence on stop_machine() from the CPU
offline path and provides an alternative (set of APIs) to preempt_disable()
to prevent CPUs from going offline, which can be invoked from atomic context.
The motivation behind the removal of stop_machine() is to avoid its
ill-effects and thus improve the design of CPU hotplug. (More description
regarding this is available in the patches.)

All the users of preempt_disable()/local_irq_disable() who used to use it to
prevent CPU offline, have been converted to the new primitives introduced in
the patchset. Also, the CPU_DYING notifiers have been audited to check whether
they can cope up with the removal of stop_machine() or whether they need to
use new locks for synchronization (all CPU_DYING notifiers looked OK, without
the need for any new locks).

Applies on v3.8-rc4. It currently has some locking issues with cpuidle (on
which even lockdep didn't provide any insight, unfortunately). So for now, it
works with CONFIG_CPU_IDLE=n.

Overview of the patches:
------------------------

Patches 1 to 6 introduce a generic, flexible Per-CPU Reader-Writer Locking
scheme.

Patch 7 uses this synchronization mechanism to build the
get/put_online_cpus_atomic() APIs which can be used from atomic context, to
prevent CPUs from going offline.

Patch 8 is a cleanup; it converts preprocessor macros to static inline
functions.

Patches 9 to 42 convert various call-sites to use the new APIs.

Patch 43 is the one which actually removes stop_machine() from the CPU
offline path.

Patch 44 decouples stop_machine() and CPU hotplug from Kconfig.

Patch 45 updates the documentation to reflect the new APIs.

Changes in v5:
--------------
* Exposed a new generic locking scheme: Flexible Per-CPU Reader-Writer locks,
  based on the synchronization schemes already discussed in the previous
  versions, and used it in CPU hotplug, to implement the new APIs.

* Audited the CPU_DYING notifiers in the kernel source tree and replaced
  usages of preempt_disable() with the new get/put_online_cpus_atomic() APIs
  where necessary.

Changes in v4:
--------------
The synchronization scheme has been simplified quite a bit, which makes it
look a lot less complex than before. Some highlights:

* Implicit ACKs: The earlier design required the readers to explicitly ACK
  the writer's signal. The new design uses implicit ACKs instead. The reader
  switching over to rwlock implicitly tells the writer to stop waiting for
  that reader.

* No atomic operations: Since we got rid of explicit ACKs, we no longer have
  the need for a reader and a writer to update the same counter. So we can
  get rid of atomic ops too.

Changes in v3:
--------------
* Dropped the _light() and _full() variants of the APIs. Provided a single
  interface: get/put_online_cpus_atomic().

* Completely redesigned the synchronization mechanism again, to make it fast
  and scalable at the reader-side in the fast-path (when no hotplug writers
  are active). This new scheme also ensures that there is no possibility of
  deadlocks due to circular locking dependency. In summary, this provides the
  scalability and speed of per-cpu rwlocks (without actually using them),
  while avoiding the downside (deadlock possibilities) which is inherent in
  any per-cpu locking scheme that is meant to compete with
  preempt_disable()/enable() in terms of flexibility.

  The problem with using per-cpu locking to replace preempt_disable()/enable()
  was explained here: https://lkml.org/lkml/2012/12/6/290

  Basically we use per-cpu counters (for scalability) when no writers are
  active, and then switch to global rwlocks (for lock-safety) when a writer
  becomes active. It is a slightly complex scheme, but it is based on
  standard principles of distributed algorithms.

Changes in v2:
--------------
* Completely redesigned the synchronization scheme to avoid using any extra
  cpumasks.

* Provided APIs for 2 types of atomic hotplug readers: "light" (for
  light-weight) and "full". We wish to have more "light" readers than the
  "full" ones, to avoid indirectly inducing the "stop_machine effect" without
  even actually using stop_machine(). And the patches show that it _is_
  generally true: 5 patches deal with "light" readers, whereas only 1 patch
  deals with a "full" reader. Also, the "light" readers happen to be in very
  hot paths. So it makes a lot of sense to have such a distinction and a
  corresponding light-weight API.

Links to previous versions:
v4: https://lkml.org/lkml/2012/12/11/209
v3: https://lkml.org/lkml/2012/12/7/287
v2: https://lkml.org/lkml/2012/12/5/322
v1: https://lkml.org/lkml/2012/12/4/88

--

Paul E. McKenney (1):
  cpu: No more __stop_machine() in _cpu_down()

Srivatsa S. Bhat (44):
  percpu_rwlock: Introduce the global reader-writer lock backend
  percpu_rwlock: Introduce per-CPU variables for the reader and the writer
  percpu_rwlock: Provide a way to define and init
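[Editorial note: the v3 scheme summarized above (per-cpu counters on the read-side fast path, falling back to a global rwlock while a writer is active) can be illustrated with a toy model. The sketch below is not from the patchset: the class name and structure are invented, it is single-threaded Python rather than kernel C, and it omits the memory barriers, preemption control, and per-CPU variable machinery that the real patches provide.]

```python
from collections import defaultdict
import threading

class FlexPercpuRwlock:
    """Toy model of the flexible per-CPU reader-writer lock idea:
    readers bump a per-CPU refcount on the fast path; once a writer
    announces itself, new readers fall back to a global lock, and the
    writer waits for the fast-path refcounts to drain."""

    def __init__(self):
        self.reader_refcnt = defaultdict(int)   # per-"CPU" fast-path counts
        self.writer_active = False
        self.global_rwlock = threading.Lock()   # stand-in for a global rwlock

    def read_lock(self, cpu):
        if not self.writer_active:
            self.reader_refcnt[cpu] += 1        # fast path: no shared counter
            return "fast"
        self.global_rwlock.acquire()            # slow path: writer active
        return "slow"

    def read_unlock(self, cpu, path):
        if path == "fast":
            self.reader_refcnt[cpu] -= 1
        else:
            self.global_rwlock.release()

    def write_lock(self):
        self.writer_active = True               # new readers take the slow path
        while any(self.reader_refcnt.values()):
            pass                                # wait for fast-path readers to drain
        self.global_rwlock.acquire()

    def write_unlock(self):
        self.global_rwlock.release()
        self.writer_active = False              # readers return to the fast path

lock = FlexPercpuRwlock()
path = lock.read_lock(cpu=0)
print(path)                 # fast path, since no writer (hotplug) is active
lock.read_unlock(0, path)
lock.write_lock()           # e.g. a CPU-offline operation in progress
lock.write_unlock()
print(lock.read_lock(cpu=1))
```

The point of the design is visible even in this toy: in the common case (no hotplug in flight) readers touch only their own counter, so there is no cache-line bouncing and no global serialization; the global lock is paid for only while a writer is actually active.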