Re: [PATCH v5 00/45] CPU hotplug: stop_machine()-free CPU hotplug

2013-02-19 Thread Vincent Guittot
On 18 February 2013 20:53, Steven Rostedt <rost...@goodmis.org> wrote:
> On Mon, 2013-02-18 at 17:50 +0100, Vincent Guittot wrote:
>
>> Yes, for sure.
>> The problem is more linked to cpuidle and the function tracer.
>>
>> CPU hotplug and the function tracer work when cpuidle is disabled.
>> CPU hotplug and cpuidle work if I don't enable the function tracer.
>> My platform is dead as soon as I enable the function tracer if cpuidle is
>> enabled. It looks like some notrace annotations are missing in my platform
>> driver, but we haven't completely fixed the issue yet.
>>
>
> You can bisect to find out exactly what function is the problem:
>
>  cat /debug/tracing/available_filter_functions > t
>
> f(t) {
>  num=`wc -l < t`       # count lines only; plain `wc -l t` would include the filename
>  num=$((num / 2))      # bisect: take the first half of the function list
>  sed -ne "1,${num}p" t > t1
>  let num=num+1
>  sed -ne "${num},\$p" t > t2
>
>  cat t1 > /debug/tracing/set_ftrace_filter
>  # note this may take a long time to finish
>
>  echo function > /debug/tracing/current_tracer
>
>  <failed? bisect f(t1), if not bisect f(t2)>
> }
>

Thanks, I'm going to have a look.

Vincent

> -- Steve
>
>
>


Re: [PATCH v5 00/45] CPU hotplug: stop_machine()-free CPU hotplug

2013-02-18 Thread Steven Rostedt
On Mon, 2013-02-18 at 17:50 +0100, Vincent Guittot wrote:

> Yes, for sure.
> The problem is more linked to cpuidle and the function tracer.
>
> CPU hotplug and the function tracer work when cpuidle is disabled.
> CPU hotplug and cpuidle work if I don't enable the function tracer.
> My platform is dead as soon as I enable the function tracer if cpuidle is
> enabled. It looks like some notrace annotations are missing in my platform
> driver, but we haven't completely fixed the issue yet.
> 

You can bisect to find out exactly what function is the problem:

 cat /debug/tracing/available_filter_functions > t
 
f(t) {
 num=`wc -l < t`       # count lines only; plain `wc -l t` would include the filename
 num=$((num / 2))      # bisect: take the first half of the function list
 sed -ne "1,${num}p" t > t1
 let num=num+1
 sed -ne "${num},\$p" t > t2

 cat t1 > /debug/tracing/set_ftrace_filter
 # note this may take a long time to finish

 echo function > /debug/tracing/current_tracer

 <failed? bisect f(t1), if not bisect f(t2)>
}
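
A minimal script automating this recursive bisection could look like the
sketch below. It assumes debugfs is mounted at /debug and that run_my_test
is a user-supplied command that exits non-zero when the problem reproduces;
if the box hard-hangs instead, the bisection has to stay manual, as in the
recipe above.

 #!/bin/bash
 TRACING=/debug/tracing

 try() {
  # trace only the functions listed in "$1", then run the reproducer
  cat "$1" > $TRACING/set_ftrace_filter   # may take a long time
  echo function > $TRACING/current_tracer
  run_my_test
  local rc=$?
  echo nop > $TRACING/current_tracer      # stop tracing again
  return $rc
 }

 bisect() {
  local list=$1 num half t1 t2
  num=$(wc -l < "$list")
  if [ "$num" -le 1 ]; then
   echo "suspect function: $(cat "$list")"
   return
  fi
  half=$((num / 2))
  t1=$(mktemp); t2=$(mktemp)
  sed -ne "1,${half}p" "$list" > "$t1"
  sed -ne "$((half + 1)),\$p" "$list" > "$t2"
  if try "$t1"; then
   bisect "$t2"   # first half traced cleanly; suspect is in the second
  else
   bisect "$t1"   # first half reproduced the problem
  fi
 }

 cat $TRACING/available_filter_functions > /tmp/all_funcs
 bisect /tmp/all_funcs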

-- Steve





Re: [PATCH v5 00/45] CPU hotplug: stop_machine()-free CPU hotplug

2013-02-18 Thread Vincent Guittot
On 18 February 2013 16:30, Steven Rostedt <rost...@goodmis.org> wrote:
> On Mon, 2013-02-18 at 11:58 +0100, Vincent Guittot wrote:
>
>> My tests have been done without cpuidle because i have some issues
>> with function tracer and cpuidle
>>
>> But the cpu hotplug and cpuidle work well when I run the tests without
>> enabling the function tracer
>>
>
> I know suspend and resume have issues with function tracing (because it
> makes things like calling smp_processor_id() crash the system), but I'm
> unaware of issues with hotplug itself. It could be some of the same issues.
>
> Can you give me more details, I'll try to investigate it.

Yes, for sure.
The problem is more linked to cpuidle and the function tracer.

CPU hotplug and the function tracer work when cpuidle is disabled.
CPU hotplug and cpuidle work if I don't enable the function tracer.
My platform is dead as soon as I enable the function tracer if cpuidle is
enabled. It looks like some notrace annotations are missing in my platform
driver, but we haven't completely fixed the issue yet.
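
For context, notrace is the kernel annotation that keeps ftrace from
instrumenting a function; a platform idle driver would mark its low-level
power-down path along these lines (an illustrative sketch, not the actual
driver code):

 /* Illustrative sketch only -- not the real platform driver. Code on
  * the way into a low-power state must not be traced, because the
  * ftrace handler may touch per-CPU state that is no longer valid once
  * the CPU is partially powered down. */
 static int notrace my_platform_enter_lowpower(unsigned int cpu)
 {
 	/* platform-specific power-down sequence goes here */
 	return 0;
 }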

Vincent

>
> Thanks,
>
> -- Steve
>
>


Re: [PATCH v5 00/45] CPU hotplug: stop_machine()-free CPU hotplug

2013-02-18 Thread Steven Rostedt
On Mon, 2013-02-18 at 11:58 +0100, Vincent Guittot wrote:

> My tests have been done without cpuidle because i have some issues
> with function tracer and cpuidle
> 
> But the cpu hotplug and cpuidle work well when I run the tests without
> enabling the function tracer
> 

I know suspend and resume have issues with function tracing (because it
makes things like calling smp_processor_id() crash the system), but I'm
unaware of issues with hotplug itself. It could be some of the same issues.

Can you give me more details, I'll try to investigate it.

Thanks,

-- Steve




Re: [PATCH v5 00/45] CPU hotplug: stop_machine()-free CPU hotplug

2013-02-18 Thread Srivatsa S. Bhat
On 02/18/2013 04:24 PM, Thomas Gleixner wrote:
> On Mon, 18 Feb 2013, Srivatsa S. Bhat wrote:
>> Lockup observed while running this patchset, with CPU_IDLE and INTEL_IDLE
>> turned on in the .config:
>>
>>  smpboot: CPU 1 is now offline
>> Kernel panic - not syncing: Watchdog detected hard LOCKUP on cpu 11
>> Pid: 0, comm: swapper/11 Not tainted 3.8.0-rc7+stpmch13-1 #8
>> Call Trace:
>>  [812aba1e] do_raw_spin_lock+0x7e/0x150
>>  [815a64c1] _raw_spin_lock_irqsave+0x61/0x70
>>  [810c0758] ? clockevents_notify+0x28/0x150
>>  [815a6d37] ? _raw_spin_unlock_irqrestore+0x77/0x80
>>  [810c0758] clockevents_notify+0x28/0x150
>>  [8130459f] intel_idle+0xaf/0xe0
>>  [81472ee0] ? disable_cpuidle+0x20/0x20
>>  [81472ef9] cpuidle_enter+0x19/0x20
>>  [814734c1] cpuidle_wrap_enter+0x41/0xa0
>>  [81473530] cpuidle_enter_tk+0x10/0x20
>>  [81472f17] cpuidle_enter_state+0x17/0x50
>>  [81473899] cpuidle_idle_call+0xd9/0x290
>>  [810203d5] cpu_idle+0xe5/0x140
>>  [8159c603] start_secondary+0xdd/0xdf
> 
>> BUG: spinlock lockup suspected on CPU#2, migration/2/19
>>  lock: clockevents_lock+0x0/0x40, .magic: dead4ead, .owner: swapper/8/0, .owner_cpu: 8
> 
> Unfortunately there is no back trace for cpu8.

Yes :-(

I had run this several times hoping to get a backtrace on the lock-holder,
expecting trigger_all_cpu_backtrace() to get it right at least once, but I
didn't succeed even once.
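
For reference, an all-CPU backtrace can also be requested by hand via the
magic sysrq interface (illustrative; assumes CONFIG_MAGIC_SYSRQ is enabled):

 # ask the kernel to dump a backtrace on every active CPU
 echo l > /proc/sysrq-trigger
 dmesg | tail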

> That's probably caused
> by the watchdog -> panic setting.
> 

Oh, ok..

> So we have no idea why cpu2 and 11 get stuck on the clockevents_lock
> and without that information it's impossible to decode.
> 

But thankfully, the issue seems to have been resolved by the diff I posted
in my previous mail, along with the fixes related to memory barriers.

Regards,
Srivatsa S. Bhat



Re: [PATCH v5 00/45] CPU hotplug: stop_machine()-free CPU hotplug

2013-02-18 Thread Vincent Guittot
On 18 February 2013 11:51, Srivatsa S. Bhat
<srivatsa.b...@linux.vnet.ibm.com> wrote:
> On 02/18/2013 04:04 PM, Srivatsa S. Bhat wrote:
>> On 02/18/2013 03:54 PM, Vincent Guittot wrote:
>>> On 15 February 2013 20:40, Srivatsa S. Bhat
>>> <srivatsa.b...@linux.vnet.ibm.com> wrote:
>>>> Hi Vincent,
>>>>
>>>> On 02/15/2013 06:58 PM, Vincent Guittot wrote:
>>>>> Hi Srivatsa,
>>>>>
>>>>> I have run some tests with your branch (thanks, Paul, for the git tree)
>>>>> and you will find the results below.
>>>>>
>>>>
>>>> Thank you very much for testing this patchset!
>>>>
>>>>> The test conditions are:
>>>>> - a 5-CPU system in 2 clusters
>>>>> - the test plugs/unplugs CPU2 and increases the system load every 20
>>>>> plug/unplug sequences by adding more cyclictest threads
>>>>> - the test is done with all CPUs online and with only CPU0 and CPU2
>>>>>
>>>>> The main conclusion is that there is no difference with and without
>>>>> your patches in my stress tests. I'm not sure that this was the
>>>>> expected result, but cpu_down is already quite fast: 4-5 ms on
>>>>> average.
>>>>>
>>>>
>>>> At least my patchset doesn't perform _worse_ than mainline, with respect
>>>> to cpu_down duration :-)
>>>
>>> Yes, exactly, and it has passed more than 400 consecutive plug/unplug
>>> cycles on an ARM platform.
>>>
>>
>> Great! However, did you turn on CPU_IDLE during your tests?
>>
>> In my tests, I had turned off cpu idle in the .config, like I had mentioned
>> in the cover letter. I'm struggling to get it working with CPU_IDLE/INTEL_IDLE
>> turned on, because it gets into a lockup almost immediately. It appears that
>> the lock-holder of clockevents_lock never releases it, for some reason.
>> See below for the full log. Lockdep has not been useful in debugging this,
>> unfortunately :-(
>>
>
> Ah, never mind, the following diff fixes it :-) I had applied this fix on v5
> and tested it, but it still had races where I used to hit the lockups. Now,
> after fixing all the memory barrier issues that Paul and Oleg pointed out in
> v5, I applied this fix again and tested it just now - it works beautifully! :-)

My tests have been done without cpuidle because I have some issues
with the function tracer and cpuidle.

But cpu hotplug and cpuidle work well when I run the tests without
enabling the function tracer.

Vincent

>
> I'll include this fix and post a v6 soon.
>
> Regards,
> Srivatsa S. Bhat
>
> --->
>
>
> diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c
> index 30b6de0..ca340fd 100644
> --- a/kernel/time/clockevents.c
> +++ b/kernel/time/clockevents.c
> @@ -17,6 +17,7 @@
>  #include <linux/module.h>
>  #include <linux/notifier.h>
>  #include <linux/smp.h>
> +#include <linux/cpu.h>
>
>  #include "tick-internal.h"
>
> @@ -431,6 +432,7 @@ void clockevents_notify(unsigned long reason, void *arg)
> unsigned long flags;
> int cpu;
>
> +   get_online_cpus_atomic();
> raw_spin_lock_irqsave(&clockevents_lock, flags);
> clockevents_do_notify(reason, arg);
>
> @@ -459,6 +461,7 @@ void clockevents_notify(unsigned long reason, void *arg)
> break;
> }
> raw_spin_unlock_irqrestore(&clockevents_lock, flags);
> +   put_online_cpus_atomic();
>  }
>  EXPORT_SYMBOL_GPL(clockevents_notify);
>  #endif
>


Re: [PATCH v5 00/45] CPU hotplug: stop_machine()-free CPU hotplug

2013-02-18 Thread Thomas Gleixner
On Mon, 18 Feb 2013, Srivatsa S. Bhat wrote:
> Lockup observed while running this patchset, with CPU_IDLE and INTEL_IDLE
> turned on in the .config:
> 
>  smpboot: CPU 1 is now offline
> Kernel panic - not syncing: Watchdog detected hard LOCKUP on cpu 11
> Pid: 0, comm: swapper/11 Not tainted 3.8.0-rc7+stpmch13-1 #8
> Call Trace:
>  [812aba1e] do_raw_spin_lock+0x7e/0x150
>  [815a64c1] _raw_spin_lock_irqsave+0x61/0x70
>  [810c0758] ? clockevents_notify+0x28/0x150
>  [815a6d37] ? _raw_spin_unlock_irqrestore+0x77/0x80
>  [810c0758] clockevents_notify+0x28/0x150
>  [8130459f] intel_idle+0xaf/0xe0
>  [81472ee0] ? disable_cpuidle+0x20/0x20
>  [81472ef9] cpuidle_enter+0x19/0x20
>  [814734c1] cpuidle_wrap_enter+0x41/0xa0
>  [81473530] cpuidle_enter_tk+0x10/0x20
>  [81472f17] cpuidle_enter_state+0x17/0x50
>  [81473899] cpuidle_idle_call+0xd9/0x290
>  [810203d5] cpu_idle+0xe5/0x140
>  [8159c603] start_secondary+0xdd/0xdf

> BUG: spinlock lockup suspected on CPU#2, migration/2/19
>  lock: clockevents_lock+0x0/0x40, .magic: dead4ead, .owner: swapper/8/0, .owner_cpu: 8

Unfortunately there is no back trace for cpu8. That's probably caused
by the watchdog -> panic setting.

So we have no idea why cpu2 and 11 get stuck on the clockevents_lock
and without that information it's impossible to decode.

Thanks,

tglx



Re: [PATCH v5 00/45] CPU hotplug: stop_machine()-free CPU hotplug

2013-02-18 Thread Srivatsa S. Bhat
On 02/18/2013 04:04 PM, Srivatsa S. Bhat wrote:
> On 02/18/2013 03:54 PM, Vincent Guittot wrote:
>> On 15 February 2013 20:40, Srivatsa S. Bhat
>> <srivatsa.b...@linux.vnet.ibm.com> wrote:
>>> Hi Vincent,
>>>
>>> On 02/15/2013 06:58 PM, Vincent Guittot wrote:
>>>> Hi Srivatsa,
>>>>
>>>> I have run some tests with your branch (thanks, Paul, for the git tree)
>>>> and you will find the results below.
>>>>
>>>
>>> Thank you very much for testing this patchset!
>>>
>>>> The test conditions are:
>>>> - a 5-CPU system in 2 clusters
>>>> - the test plugs/unplugs CPU2 and increases the system load every 20
>>>> plug/unplug sequences by adding more cyclictest threads
>>>> - the test is done with all CPUs online and with only CPU0 and CPU2
>>>>
>>>> The main conclusion is that there is no difference with and without
>>>> your patches in my stress tests. I'm not sure that this was the
>>>> expected result, but cpu_down is already quite fast: 4-5 ms on
>>>> average.
>>>>
>>>
>>> At least my patchset doesn't perform _worse_ than mainline, with respect
>>> to cpu_down duration :-)
>>
>> Yes, exactly, and it has passed more than 400 consecutive plug/unplug
>> cycles on an ARM platform.
>>
>
> Great! However, did you turn on CPU_IDLE during your tests?
>
> In my tests, I had turned off cpu idle in the .config, like I had mentioned in
> the cover letter. I'm struggling to get it working with CPU_IDLE/INTEL_IDLE
> turned on, because it gets into a lockup almost immediately. It appears that
> the lock-holder of clockevents_lock never releases it, for some reason.
> See below for the full log. Lockdep has not been useful in debugging this,
> unfortunately :-(
>

Ah, never mind, the following diff fixes it :-) I had applied this fix on v5
and tested it, but it still had races where I used to hit the lockups. Now,
after fixing all the memory barrier issues that Paul and Oleg pointed out in
v5, I applied this fix again and tested it just now - it works beautifully! :-)

I'll include this fix and post a v6 soon.

Regards,
Srivatsa S. Bhat

--->


diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c
index 30b6de0..ca340fd 100644
--- a/kernel/time/clockevents.c
+++ b/kernel/time/clockevents.c
@@ -17,6 +17,7 @@
 #include <linux/module.h>
 #include <linux/notifier.h>
 #include <linux/smp.h>
+#include <linux/cpu.h>
 
 #include "tick-internal.h"
 
@@ -431,6 +432,7 @@ void clockevents_notify(unsigned long reason, void *arg)
unsigned long flags;
int cpu;
 
+   get_online_cpus_atomic();
raw_spin_lock_irqsave(&clockevents_lock, flags);
clockevents_do_notify(reason, arg);
 
@@ -459,6 +461,7 @@ void clockevents_notify(unsigned long reason, void *arg)
break;
}
raw_spin_unlock_irqrestore(&clockevents_lock, flags);
+   put_online_cpus_atomic();
 }
 EXPORT_SYMBOL_GPL(clockevents_notify);
 #endif
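
For readers following along: get_online_cpus_atomic() and
put_online_cpus_atomic() are the atomic-context hotplug read-side primitives
introduced earlier in this patchset. The pattern the diff applies is, in
sketch form (primitive names from the patchset; the body is illustrative,
not a quote from the patch):

 get_online_cpus_atomic();            /* hold off CPU offline */
 raw_spin_lock_irqsave(&clockevents_lock, flags);
 /* ... notify clockevent devices, touch per-CPU tick state ... */
 raw_spin_unlock_irqrestore(&clockevents_lock, flags);
 put_online_cpus_atomic();            /* re-enable CPU offline */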



Re: [PATCH v5 00/45] CPU hotplug: stop_machine()-free CPU hotplug

2013-02-18 Thread Srivatsa S. Bhat
On 02/18/2013 03:54 PM, Vincent Guittot wrote:
> On 15 February 2013 20:40, Srivatsa S. Bhat
> <srivatsa.b...@linux.vnet.ibm.com> wrote:
>> Hi Vincent,
>>
>> On 02/15/2013 06:58 PM, Vincent Guittot wrote:
>>> Hi Srivatsa,
>>>
>>> I have run some tests with your branch (thanks, Paul, for the git tree)
>>> and you will find the results below.
>>>
>>
>> Thank you very much for testing this patchset!
>>
>>> The test conditions are:
>>> - a 5-CPU system in 2 clusters
>>> - the test plugs/unplugs CPU2 and increases the system load every 20
>>> plug/unplug sequences by adding more cyclictest threads
>>> - the test is done with all CPUs online and with only CPU0 and CPU2
>>>
>>> The main conclusion is that there is no difference with and without
>>> your patches in my stress tests. I'm not sure that this was the
>>> expected result, but cpu_down is already quite fast: 4-5 ms on
>>> average.
>>>
>>
>> At least my patchset doesn't perform _worse_ than mainline, with respect
>> to cpu_down duration :-)
>
> Yes, exactly, and it has passed more than 400 consecutive plug/unplug
> cycles on an ARM platform.
>

Great! However, did you turn on CPU_IDLE during your tests?

In my tests, I had turned off cpu idle in the .config, like I had mentioned in
the cover letter. I'm struggling to get it working with CPU_IDLE/INTEL_IDLE
turned on, because it gets into a lockup almost immediately. It appears that
the lock-holder of clockevents_lock never releases it, for some reason.
See below for the full log. Lockdep has not been useful in debugging this,
unfortunately :-(

>>
>> So, here is the analysis:
>> Stop-machine() doesn't really slow down the CPU-down operation if the rest
>> of the CPUs are mostly running in userspace all the time. Because, the
>> CPUs running userspace workloads cooperate very eagerly with the stop-machine
>> dance - they receive the resched IPI, and allow the per-cpu cpu-stopper
>> thread to monopolize the CPU, almost immediately.
>>
>> The scenario where stop-machine() takes longer to take effect is when
>> most of the online CPUs are running in kernelspace, because, then, the
>> probability that they call preempt_disable() frequently (and hence inhibit
>> stop-machine) is higher. That's why, in my tests, I ran genload from LTP,
>> which generated a lot of system-time (system-time in 'top' indicates activity
>> in kernelspace). Hence my patchset showed significant improvement over
>> mainline in my tests.
>>
>
> OK, I hadn't noticed this important point of the test.
>
>> However, your test is very useful too, if we measure a different parameter:
>> the latency impact on the workloads running on the system (cyclictest).
>> One other important aim of this patchset is to make hotplug as unintrusive
>> as possible for other workloads running on the system. So if you measure
>> the cyclictest numbers, I would expect my patchset to show better numbers
>> than mainline, when you do cpu-hotplug in parallel (same test that you did).
>> Mainline would run stop-machine and hence interrupt the cyclictest tasks
>> too often. My patchset wouldn't do that, and hence cyclictest should
>> ideally show better numbers.
>
> In fact, I haven't looked at the results, as I was more interested in the
> load that was generated.
>
>> I'd really appreciate it if you could try that out and let me know how it
>> goes. :-) Thank you very much!
>
> OK, I'm going to try to run a test series.
> 

Great! Thank you :-)
Regards,
Srivatsa S. Bhat



Lockup observed while running this patchset, with CPU_IDLE and INTEL_IDLE turned
on in the .config:

 smpboot: CPU 1 is now offline
Kernel panic - not syncing: Watchdog detected hard LOCKUP on cpu 11
Pid: 0, comm: swapper/11 Not tainted 3.8.0-rc7+stpmch13-1 #8
Call Trace:
 <NMI>  [815a319e] panic+0xc9/0x1ee
 [810fdd41] watchdog_overflow_callback+0xb1/0xc0
 [8113ab5c] __perf_event_overflow+0x9c/0x330
 [81028a88] ? x86_perf_event_set_period+0xd8/0x160
 [8113b514] perf_event_overflow+0x14/0x20
 [8102ee54] intel_pmu_handle_irq+0x1c4/0x360
 [815a8ef1] perf_event_nmi_handler+0x21/0x30
 [815a8366] nmi_handle+0xb6/0x200
 [815a82b0] ? oops_begin+0xd0/0xd0
 [815a85c8] default_do_nmi+0x68/0x220
 [815a8840] do_nmi+0xc0/0x110
 [815a7911] end_repeat_nmi+0x1e/0x2e
 [812a3f98] ? delay_tsc+0x38/0xb0
 [812a3f98] ? delay_tsc+0x38/0xb0
 [812a3f98] ? delay_tsc+0x38/0xb0
 <EOE>  [812a3f1f] __delay+0xf/0x20
 [812aba1e] do_raw_spin_lock+0x7e/0x150
 [815a64c1] _raw_spin_lock_irqsave+0x61/0x70
 [810c0758] ? clockevents_notify+0x28/0x150
 [815a6d37] ? _raw_spin_unlock_irqrestore+0x77/0x80
 [810c0758] clockevents_notify+0x28/0x150
 [8130459f] intel_idle+0xaf/0xe0
 [81472ee0] ? disable_cpuidle+0x20/0x20
 [81472ef9] cpuidle_enter+0x19/0x20
 [814734c1] cpuidle_wrap_enter+0x41/0xa0
 [81473530] cpuidle_enter_tk+0x10/0x20
 [81472f17] cpuidle_enter_state+0x17/0x50
 [81473899] cpuidle_idle_call+0xd9/0x290
 [810203d5] cpu_idle+0xe5/0x140
 [8159c603] start_secondary+0xdd/0xdf
BUG: spinlock lockup suspected on CPU#2, migration/2/19
 lock: clockevents_lock+0x0/0x40, .magic: dead4ead, .owner: swapper/8/0, .owner_cpu: 8
Pid: 19, comm: migration/2 Not tainted 3.8.0-rc7+stpmch13-1 #8
Call Trace:
 [] spin_dump+0x78/0xc0
 []

Re: [PATCH v5 00/45] CPU hotplug: stop_machine()-free CPU hotplug

2013-02-18 Thread Vincent Guittot
On 15 February 2013 20:40, Srivatsa S. Bhat
<srivatsa.b...@linux.vnet.ibm.com> wrote:
> Hi Vincent,
>
> On 02/15/2013 06:58 PM, Vincent Guittot wrote:
>> Hi Srivatsa,
>>
>> I have run some tests with your branch (thanks, Paul, for the git tree)
>> and you will find the results below.
>>
>
> Thank you very much for testing this patchset!
>
>> The test conditions are:
>> - a 5-CPU system in 2 clusters
>> - the test plugs/unplugs CPU2 and increases the system load every 20
>> plug/unplug sequences by adding more cyclictest threads
>> - the test is done with all CPUs online and with only CPU0 and CPU2
>>
>> The main conclusion is that there is no difference with and without
>> your patches in my stress tests. I'm not sure that this was the
>> expected result, but cpu_down is already quite fast: 4-5 ms on
>> average.
>>
>
> At least my patchset doesn't perform _worse_ than mainline, with respect
> to cpu_down duration :-)

Yes, exactly, and it has passed more than 400 consecutive plug/unplug
cycles on an ARM platform.

>
> So, here is the analysis:
> Stop-machine() doesn't really slow down the CPU-down operation if the rest
> of the CPUs are mostly running in userspace all the time. Because, the
> CPUs running userspace workloads cooperate very eagerly with the stop-machine
> dance - they receive the resched IPI, and allow the per-cpu cpu-stopper
> thread to monopolize the CPU, almost immediately.
>
> The scenario where stop-machine() takes longer to take effect is when
> most of the online CPUs are running in kernelspace, because, then, the
> probability that they call preempt_disable() frequently (and hence inhibit
> stop-machine) is higher. That's why, in my tests, I ran genload from LTP,
> which generated a lot of system-time (system-time in 'top' indicates activity
> in kernelspace). Hence my patchset showed significant improvement over
> mainline in my tests.
>

OK, I hadn't noticed this important point of the test.

> However, your test is very useful too, if we measure a different parameter:
> the latency impact on the workloads running on the system (cyclictest).
> One other important aim of this patchset is to make hotplug as unintrusive
> as possible for other workloads running on the system. So if you measure
> the cyclictest numbers, I would expect my patchset to show better numbers
> than mainline, when you do cpu-hotplug in parallel (same test that you did).
> Mainline would run stop-machine and hence interrupt the cyclictest tasks
> too often. My patchset wouldn't do that, and hence cyclictest should
> ideally show better numbers.

In fact, I haven't looked at the results, as I was more interested in the
load that was generated.

>
> I'd really appreciate it if you could try that out and let me know how it
> goes. :-) Thank you very much!

OK, I'm going to try to run a test series.

Vincent
>
> Regards,
> Srivatsa S. Bhat
>
>>
>>
>> On 12 February 2013 04:58, Srivatsa S. Bhat
>> <srivatsa.b...@linux.vnet.ibm.com> wrote:
>>> On 02/12/2013 12:38 AM, Paul E. McKenney wrote:
 On Mon, Feb 11, 2013 at 05:53:41PM +0530, Srivatsa S. Bhat wrote:
> On 02/11/2013 05:28 PM, Vincent Guittot wrote:
>> On 8 February 2013 19:09, Srivatsa S. Bhat
>> <srivatsa.b...@linux.vnet.ibm.com> wrote:

 [ . . . ]

>>> Adding Vincent to CC, who had previously evaluated the performance and
>>> latency implications of CPU hotplug on ARM platforms, IIRC.
>>>
>>
>> Hi Srivatsa,
>>
>> I can try to run some of our stress tests on your patches.
>
> Great!
>
>> Have you
>> got a git tree that i can pull ?
>>
>
> Unfortunately, no, none at the moment..  :-(

 You do need to create an externally visible git tree.
>>>
>>> Ok, I'll do that soon.
>>>
  In the meantime,
 I have added your series at rcu/bhat.2013.01.21a on -rcu:

 git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git

 This should appear soon on a kernel.org mirror near you.  ;-)

>>>
>>> Thank you very much, Paul! :-)
>>>
>>> Regards,
>>> Srivatsa S. Bhat
>>>
>
>


Re: [PATCH v5 00/45] CPU hotplug: stop_machine()-free CPU hotplug

2013-02-15 Thread Srivatsa S. Bhat
Hi Vincent,

On 02/15/2013 06:58 PM, Vincent Guittot wrote:
> Hi Srivatsa,
>
> I have run some tests with your branch (thanks, Paul, for the git tree)
> and you will find the results below.
>

Thank you very much for testing this patchset!

> The test conditions are:
> - a 5-CPU system in 2 clusters
> - the test plugs/unplugs CPU2 and increases the system load every 20
> plug/unplug sequences by adding more cyclictest threads
> - the test is done with all CPUs online and with only CPU0 and CPU2
>
> The main conclusion is that there is no difference with and without
> your patches in my stress tests. I'm not sure that this was the
> expected result, but cpu_down is already quite fast: 4-5 ms on
> average.
>

At least my patchset doesn't perform _worse_ than mainline, with respect
to cpu_down duration :-)

So, here is the analysis:
Stop-machine() doesn't really slow down the CPU-down operation if the rest
of the CPUs are mostly running in userspace all the time. Because, the
CPUs running userspace workloads cooperate very eagerly with the stop-machine
dance - they receive the resched IPI, and allow the per-cpu cpu-stopper
thread to monopolize the CPU, almost immediately.

The scenario where stop-machine() takes longer to take effect is when
most of the online CPUs are running in kernelspace, because, then, the
probability that they call preempt_disable() frequently (and hence inhibit
stop-machine) is higher. That's why, in my tests, I ran genload from LTP,
which generated a lot of system-time (system-time in 'top' indicates activity
in kernelspace). Hence my patchset showed significant improvement over
mainline in my tests.
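
(To see why preempt_disable() inhibits stop-machine: the per-CPU stopper
thread can only take over once the CPU is able to reschedule, which a
preempt-disabled section prevents. Schematically - an illustration, not
real kernel code:)

 preempt_disable();
 /* long-running kernel work: the cpu-stopper thread queued on this CPU
  * cannot preempt us, so stop_machine() - and hence cpu_down() - waits */
 preempt_enable();   /* only now can the stopper run */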

However, your test is very useful too, if we measure a different parameter:
the latency impact on the workloads running on the system (cyclictest).
One other important aim of this patchset is to make hotplug as unintrusive
as possible for other workloads running on the system. So if you measure
the cyclictest numbers, I would expect my patchset to show better numbers
than mainline, when you do cpu-hotplug in parallel (same test that you did).
Mainline would run stop-machine and hence interrupt the cyclictest tasks
too often. My patchset wouldn't do that, and hence cyclictest should
ideally show better numbers.

I'd really appreciate it if you could try that out and let me know how it
goes. :-) Thank you very much!
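
A measurement along those lines could look like the sketch below (paths and
parameters are assumptions; cyclictest is from the rt-tests suite, and CPU2
and the 400 iterations match Vincent's stress test):

 # run cyclictest while repeatedly unplugging/replugging CPU2
 cyclictest -t 4 -p 80 -q -D 120 > cyclictest.log &
 for i in $(seq 1 400); do
  echo 0 > /sys/devices/system/cpu/cpu2/online
  echo 1 > /sys/devices/system/cpu/cpu2/online
 done
 wait
 # compare min/avg/max latencies against a run without hotplug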

Regards,
Srivatsa S. Bhat

> 
> 
> On 12 February 2013 04:58, Srivatsa S. Bhat
> <srivatsa.b...@linux.vnet.ibm.com> wrote:
>> On 02/12/2013 12:38 AM, Paul E. McKenney wrote:
>>> On Mon, Feb 11, 2013 at 05:53:41PM +0530, Srivatsa S. Bhat wrote:
 On 02/11/2013 05:28 PM, Vincent Guittot wrote:
> On 8 February 2013 19:09, Srivatsa S. Bhat
> <srivatsa.b...@linux.vnet.ibm.com> wrote:
>>>
>>> [ . . . ]
>>>
>> Adding Vincent to CC, who had previously evaluated the performance and
>> latency implications of CPU hotplug on ARM platforms, IIRC.
>>
>
> Hi Srivatsa,
>
> I can try to run some of our stress tests on your patches.

 Great!

> Have you
> got a git tree that i can pull ?
>

 Unfortunately, no, none at the moment..  :-(
>>>
>>> You do need to create an externally visible git tree.
>>
>> Ok, I'll do that soon.
>>
>>>  In the meantime,
>>> I have added your series at rcu/bhat.2013.01.21a on -rcu:
>>>
>>> git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
>>>
>>> This should appear soon on a kernel.org mirror near you.  ;-)
>>>
>>
>> Thank you very much, Paul! :-)
>>
>> Regards,
>> Srivatsa S. Bhat
>>




Re: [PATCH v5 00/45] CPU hotplug: stop_machine()-free CPU hotplug

2013-02-15 Thread Srivatsa S. Bhat
Hi Vincent,

On 02/15/2013 06:58 PM, Vincent Guittot wrote:
 Hi Srivatsa,
 
 I have run some tests with you branch (thanks Paul for the git tree)
 and you will find results below.


Thank you very much for testing this patchset!
 
 The tests condition are:
 - 5 CPUs system in 2 clusters
 - The test plugs/unplugs CPU2 and it increases the system load each 20
 plug/unplug sequence with either more cyclictests threads
 - The test is done with all CPUs online and with only CPU0 and CPU2
 
 The main conclusion is that there is no differences with and without
 your patches with my stress tests. I'm not sure that it was the
 expected results but the cpu_down is already quite low : 4-5ms in
 average
 

Atleast my patchset doesn't perform _worse_ than mainline, with respect
to cpu_down duration :-)

So, here is the analysis:
Stop-machine() doesn't really slow down CPU-down operation, if the rest
of the CPUs are mostly running in userspace all the time. Because, the
CPUs running userspace workloads cooperate very eagerly with the stop-machine
dance - they receive the resched IPI, and allow the per-cpu cpu-stopper
thread to monopolize the CPU, almost immediately.

The scenario where stop-machine() takes longer to take effect is when
most of the online CPUs are running in kernelspace, because, then the
probability that they call preempt_disable() frequently (and hence inhibit
stop-machine) is higher. That's why, in my tests, I ran genload from LTP
which generated a lot of system-time (system-time in 'top' indicates activity
in kernelspace). Hence my patchset showed significant improvement over
mainline in my tests.

However, your test is very useful too, if we measure a different parameter:
the latency impact on the workloads running on the system (cyclic test).
One other important aim of this patchset is to make hotplug as less intrusive
as possible, for other workloads running on the system. So if you measure
the cyclictest numbers, I would expect my patchset to show better numbers
than mainline, when you do cpu-hotplug in parallel (same test that you did).
Mainline would run stop-machine and hence interrupt the cyclic test tasks
too often. My patchset wouldn't do that, and hence cyclic test should
ideally show better numbers.

I'd really appreciate if you could try that out and let me know how it
goes.. :-) Thank you very much!

Regards,
Srivatsa S. Bhat

 
 



Re: [PATCH v5 00/45] CPU hotplug: stop_machine()-free CPU hotplug

2013-02-11 Thread Srivatsa S. Bhat
On 02/12/2013 12:38 AM, Paul E. McKenney wrote:
> On Mon, Feb 11, 2013 at 05:53:41PM +0530, Srivatsa S. Bhat wrote:
>> On 02/11/2013 05:28 PM, Vincent Guittot wrote:
>>> On 8 February 2013 19:09, Srivatsa S. Bhat
>>>  wrote:
> 
> [ . . . ]
> 
 Adding Vincent to CC, who had previously evaluated the performance and
 latency implications of CPU hotplug on ARM platforms, IIRC.

>>>
>>> Hi Srivatsa,
>>>
>>> I can try to run some of our stress tests on your patches.
>>
>> Great!
>>
>>> Have you
>>> got a git tree that i can pull ?
>>>
>>
>> Unfortunately, no, none at the moment..  :-(
> 
> You do need to create an externally visible git tree.

Ok, I'll do that soon.

>  In the meantime,
> I have added your series at rcu/bhat.2013.01.21a on -rcu:
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
> 
> This should appear soon on a kernel.org mirror near you.  ;-)
>

Thank you very much, Paul! :-)

Regards,
Srivatsa S. Bhat



Re: [PATCH v5 00/45] CPU hotplug: stop_machine()-free CPU hotplug

2013-02-11 Thread Paul E. McKenney
On Mon, Feb 11, 2013 at 05:53:41PM +0530, Srivatsa S. Bhat wrote:
> On 02/11/2013 05:28 PM, Vincent Guittot wrote:
> > On 8 February 2013 19:09, Srivatsa S. Bhat
> >  wrote:

[ . . . ]

> >> Adding Vincent to CC, who had previously evaluated the performance and
> >> latency implications of CPU hotplug on ARM platforms, IIRC.
> >>
> >
> > Hi Srivatsa,
> > 
> > I can try to run some of our stress tests on your patches.
> 
> Great!
> 
> > Have you
> > got a git tree that i can pull ?
> > 
> 
> Unfortunately, no, none at the moment..  :-(

You do need to create an externally visible git tree.  In the meantime,
I have added your series at rcu/bhat.2013.01.21a on -rcu:

git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git

This should appear soon on a kernel.org mirror near you.  ;-)

Thanx, Paul



Re: [PATCH v5 00/45] CPU hotplug: stop_machine()-free CPU hotplug

2013-02-11 Thread Srivatsa S. Bhat
On 02/11/2013 05:28 PM, Vincent Guittot wrote:
> On 8 February 2013 19:09, Srivatsa S. Bhat
>  wrote:
>> On 02/08/2013 10:14 PM, Srivatsa S. Bhat wrote:
>>> On 02/08/2013 09:11 PM, Russell King - ARM Linux wrote:
 On Thu, Feb 07, 2013 at 11:41:34AM +0530, Srivatsa S. Bhat wrote:
> On 02/07/2013 09:44 AM, Rusty Russell wrote:
>> "Srivatsa S. Bhat"  writes:
>>> On 01/22/2013 01:03 PM, Srivatsa S. Bhat wrote:
>>>  Avg. latency of 1 CPU offline (ms) [stop-cpu/stop-m/c latency]
>>>
>>> # online CPUs   Mainline (with stop-m/c)   This patchset (no stop-m/c)
>>>    8                  17.04                       7.73
>>>   16                  18.05                       6.44
>>>   32                  17.31                       7.39
>>>   64                  32.40                       9.28
>>>  128                  98.23                       7.35
>>
>> Nice!
>
> Thank you :-)
>
>>  I wonder how the ARM guys feel with their quad-cpu systems...
>>
>
> That would be definitely interesting to know :-)

 That depends what exactly you'd like tested (and how) and whether you'd
 like it to be a test-chip based quad core, or an OMAP dual-core SoC.

>>>
>>> The effect of stop_machine() doesn't really depend on the CPU architecture
>>> used underneath or the platform. It depends only on the _number_ of
>>> _logical_ CPUs used.
>>>
>>> And stop_machine() has 2 noticeable drawbacks:
>>> 1. It makes the hotplug operation itself slow
>>> 2. and it causes disruptions to the workloads running on the other
>>> CPUs by hijacking the entire machine for significant amounts of time.
>>>
>>> In my experiments (mentioned above), I tried to measure how my patchset
>>> improves (reduces) the duration of hotplug (CPU offline) itself. Which is
>>> also slightly indicative of the impact it has on the rest of the system.
>>>
>>> But what would be nice to test, is a setup where the workloads running on
>>> the rest of the system are latency-sensitive, and measure the impact of
>>> CPU offline on them, with this patchset applied. That would tell us how
>>> far is this useful in making CPU hotplug less disruptive on the system.
>>>
>>> Of course, it would be nice to also see whether we observe any reduction
>>> in hotplug duration itself (point 1 above) on ARM platforms with lot
>>> of CPUs. [This could potentially speed up suspend/resume, which is used
>>> rather heavily on ARM platforms].
>>>
>>> The benefits from this patchset over mainline (both in terms of points
>>> 1 and 2 above) is expected to increase, with increasing number of CPUs in
>>> the system.
>>>
>>
>> Adding Vincent to CC, who had previously evaluated the performance and
>> latency implications of CPU hotplug on ARM platforms, IIRC.
>>
>
> Hi Srivatsa,
> 
> I can try to run some of our stress tests on your patches.

Great!

> Have you
> got a git tree that i can pull ?
> 

Unfortunately, no, none at the moment..  :-(

Regards,
Srivatsa S. Bhat




Re: [PATCH v5 00/45] CPU hotplug: stop_machine()-free CPU hotplug

2013-02-11 Thread Vincent Guittot
Hi Srivatsa,

I can try to run some of our stress tests on your patches. Have you
got a git tree that I can pull?

Regards,
Vincent

On 8 February 2013 19:09, Srivatsa S. Bhat
 wrote:
> On 02/08/2013 10:14 PM, Srivatsa S. Bhat wrote:
>> On 02/08/2013 09:11 PM, Russell King - ARM Linux wrote:
>>> On Thu, Feb 07, 2013 at 11:41:34AM +0530, Srivatsa S. Bhat wrote:
 On 02/07/2013 09:44 AM, Rusty Russell wrote:
> "Srivatsa S. Bhat"  writes:
>> On 01/22/2013 01:03 PM, Srivatsa S. Bhat wrote:
>  Avg. latency of 1 CPU offline (ms) [stop-cpu/stop-m/c latency]
>
> # online CPUs   Mainline (with stop-m/c)   This patchset (no stop-m/c)
>    8                  17.04                       7.73
>   16                  18.05                       6.44
>   32                  17.31                       7.39
>   64                  32.40                       9.28
>  128                  98.23                       7.35
>
> Nice!

 Thank you :-)

>  I wonder how the ARM guys feel with their quad-cpu systems...
>

 That would be definitely interesting to know :-)
>>>
>>> That depends what exactly you'd like tested (and how) and whether you'd
>>> like it to be a test-chip based quad core, or an OMAP dual-core SoC.
>>>
>>
>> The effect of stop_machine() doesn't really depend on the CPU architecture
>> used underneath or the platform. It depends only on the _number_ of
>> _logical_ CPUs used.
>>
>> And stop_machine() has 2 noticeable drawbacks:
>> 1. It makes the hotplug operation itself slow
>> 2. and it causes disruptions to the workloads running on the other
>> CPUs by hijacking the entire machine for significant amounts of time.
>>
>> In my experiments (mentioned above), I tried to measure how my patchset
>> improves (reduces) the duration of hotplug (CPU offline) itself. Which is
>> also slightly indicative of the impact it has on the rest of the system.
>>
>> But what would be nice to test, is a setup where the workloads running on
>> the rest of the system are latency-sensitive, and measure the impact of
>> CPU offline on them, with this patchset applied. That would tell us how
>> far is this useful in making CPU hotplug less disruptive on the system.
>>
>> Of course, it would be nice to also see whether we observe any reduction
>> in hotplug duration itself (point 1 above) on ARM platforms with lot
>> of CPUs. [This could potentially speed up suspend/resume, which is used
>> rather heavily on ARM platforms].
>>
>> The benefits from this patchset over mainline (both in terms of points
>> 1 and 2 above) is expected to increase, with increasing number of CPUs in
>> the system.
>>
>
> Adding Vincent to CC, who had previously evaluated the performance and
> latency implications of CPU hotplug on ARM platforms, IIRC.
>
> Regards,
> Srivatsa S. Bhat
>



Re: [PATCH v5 00/45] CPU hotplug: stop_machine()-free CPU hotplug

2013-02-08 Thread Srivatsa S. Bhat
On 02/08/2013 10:14 PM, Srivatsa S. Bhat wrote:
> On 02/08/2013 09:11 PM, Russell King - ARM Linux wrote:
>> On Thu, Feb 07, 2013 at 11:41:34AM +0530, Srivatsa S. Bhat wrote:
>>> On 02/07/2013 09:44 AM, Rusty Russell wrote:
 "Srivatsa S. Bhat"  writes:
> On 01/22/2013 01:03 PM, Srivatsa S. Bhat wrote:
>  Avg. latency of 1 CPU offline (ms) [stop-cpu/stop-m/c latency]
>
> # online CPUs   Mainline (with stop-m/c)   This patchset (no stop-m/c)
>    8                  17.04                       7.73
>   16                  18.05                       6.44
>   32                  17.31                       7.39
>   64                  32.40                       9.28
>  128                  98.23                       7.35

 Nice!
>>>
>>> Thank you :-)
>>>
  I wonder how the ARM guys feel with their quad-cpu systems...

>>>
>>> That would be definitely interesting to know :-)
>>
>> That depends what exactly you'd like tested (and how) and whether you'd
>> like it to be a test-chip based quad core, or an OMAP dual-core SoC.
>>
> 
> The effect of stop_machine() doesn't really depend on the CPU architecture
> used underneath or the platform. It depends only on the _number_ of
> _logical_ CPUs used.
> 
> And stop_machine() has 2 noticeable drawbacks:
> 1. It makes the hotplug operation itself slow
> 2. and it causes disruptions to the workloads running on the other
> CPUs by hijacking the entire machine for significant amounts of time.
> 
> In my experiments (mentioned above), I tried to measure how my patchset
> improves (reduces) the duration of hotplug (CPU offline) itself. Which is
> also slightly indicative of the impact it has on the rest of the system.
> 
> But what would be nice to test, is a setup where the workloads running on
> the rest of the system are latency-sensitive, and measure the impact of
> CPU offline on them, with this patchset applied. That would tell us how
> far is this useful in making CPU hotplug less disruptive on the system.
> 
> Of course, it would be nice to also see whether we observe any reduction
> in hotplug duration itself (point 1 above) on ARM platforms with lot
> of CPUs. [This could potentially speed up suspend/resume, which is used
> rather heavily on ARM platforms].
> 
> The benefits from this patchset over mainline (both in terms of points
> 1 and 2 above) is expected to increase, with increasing number of CPUs in
> the system.
> 

Adding Vincent to CC, who had previously evaluated the performance and
latency implications of CPU hotplug on ARM platforms, IIRC.

Regards,
Srivatsa S. Bhat



Re: [PATCH v5 00/45] CPU hotplug: stop_machine()-free CPU hotplug

2013-02-08 Thread Srivatsa S. Bhat
On 02/08/2013 09:11 PM, Russell King - ARM Linux wrote:
> On Thu, Feb 07, 2013 at 11:41:34AM +0530, Srivatsa S. Bhat wrote:
>> On 02/07/2013 09:44 AM, Rusty Russell wrote:
>>> "Srivatsa S. Bhat"  writes:
>>>> On 01/22/2013 01:03 PM, Srivatsa S. Bhat wrote:
>>>>  Avg. latency of 1 CPU offline (ms) [stop-cpu/stop-m/c latency]
>>>>
>>>> # online CPUs   Mainline (with stop-m/c)   This patchset (no stop-m/c)
>>>>    8                  17.04                       7.73
>>>>   16                  18.05                       6.44
>>>>   32                  17.31                       7.39
>>>>   64                  32.40                       9.28
>>>>  128                  98.23                       7.35
>>>
>>> Nice!
>>
>> Thank you :-)
>>
>>>  I wonder how the ARM guys feel with their quad-cpu systems...
>>>
>>
>> That would be definitely interesting to know :-)
> 
> That depends what exactly you'd like tested (and how) and whether you'd
> like it to be a test-chip based quad core, or an OMAP dual-core SoC.
> 

The effect of stop_machine() doesn't really depend on the CPU architecture
used underneath or on the platform. It depends only on the _number_ of
_logical_ CPUs used.

And stop_machine() has 2 noticeable drawbacks:
1. It makes the hotplug operation itself slow.
2. It causes disruptions to the workloads running on the other
CPUs by hijacking the entire machine for significant amounts of time.

In my experiments (mentioned above), I tried to measure how my patchset
improves (reduces) the duration of hotplug (CPU offline) itself, which is
also slightly indicative of the impact it has on the rest of the system.

But what would be nice to test is a setup where the workloads running on
the rest of the system are latency-sensitive, and to measure the impact of
CPU offline on them with this patchset applied. That would tell us how
useful this is in making CPU hotplug less disruptive to the system.

Of course, it would be nice to also see whether we observe any reduction
in the hotplug duration itself (point 1 above) on ARM platforms with lots
of CPUs. [This could potentially speed up suspend/resume, which is used
rather heavily on ARM platforms.]

The benefits of this patchset over mainline (in terms of both points
1 and 2 above) are expected to increase with an increasing number of CPUs
in the system.

Regards,
Srivatsa S. Bhat



Re: [PATCH v5 00/45] CPU hotplug: stop_machine()-free CPU hotplug

2013-02-08 Thread Russell King - ARM Linux
On Thu, Feb 07, 2013 at 11:41:34AM +0530, Srivatsa S. Bhat wrote:
> On 02/07/2013 09:44 AM, Rusty Russell wrote:
> > "Srivatsa S. Bhat"  writes:
> >> On 01/22/2013 01:03 PM, Srivatsa S. Bhat wrote:
> >>  Avg. latency of 1 CPU offline (ms) [stop-cpu/stop-m/c latency]
> >>
> >> # online CPUs   Mainline (with stop-m/c)   This patchset (no stop-m/c)
> >>    8                  17.04                       7.73
> >>   16                  18.05                       6.44
> >>   32                  17.31                       7.39
> >>   64                  32.40                       9.28
> >>  128                  98.23                       7.35
> > 
> > Nice!
> 
> Thank you :-)
> 
> >  I wonder how the ARM guys feel with their quad-cpu systems...
> > 
> 
> That would be definitely interesting to know :-)

That depends what exactly you'd like tested (and how) and whether you'd
like it to be a test-chip based quad core, or an OMAP dual-core SoC.





Re: [PATCH v5 00/45] CPU hotplug: stop_machine()-free CPU hotplug

2013-02-06 Thread Srivatsa S. Bhat
On 02/07/2013 09:44 AM, Rusty Russell wrote:
> "Srivatsa S. Bhat"  writes:
>> On 01/22/2013 01:03 PM, Srivatsa S. Bhat wrote:
>>  Avg. latency of 1 CPU offline (ms) [stop-cpu/stop-m/c latency]
>>
>> # online CPUs   Mainline (with stop-m/c)   This patchset (no stop-m/c)
>>    8                  17.04                       7.73
>>   16                  18.05                       6.44
>>   32                  17.31                       7.39
>>   64                  32.40                       9.28
>>  128                  98.23                       7.35
> 
> Nice!

Thank you :-)

>  I wonder how the ARM guys feel with their quad-cpu systems...
> 

That would be definitely interesting to know :-)

Regards,
Srivatsa S. Bhat



Re: [PATCH v5 00/45] CPU hotplug: stop_machine()-free CPU hotplug

2013-02-06 Thread Rusty Russell
"Srivatsa S. Bhat"  writes:
> On 01/22/2013 01:03 PM, Srivatsa S. Bhat wrote:
>  Avg. latency of 1 CPU offline (ms) [stop-cpu/stop-m/c latency]
>
> # online CPUs   Mainline (with stop-m/c)   This patchset (no stop-m/c)
>    8                  17.04                       7.73
>   16                  18.05                       6.44
>   32                  17.31                       7.39
>   64                  32.40                       9.28
>  128                  98.23                       7.35

Nice!  I wonder how the ARM guys feel with their quad-cpu systems...

Thanks!
Rusty.




[PATCH v5 00/45] CPU hotplug: stop_machine()-free CPU hotplug

2013-01-21 Thread Srivatsa S. Bhat
Hi,

This patchset removes CPU hotplug's dependence on stop_machine() from the CPU
offline path and provides an alternative (set of APIs) to preempt_disable() to
prevent CPUs from going offline, which can be invoked from atomic context.
The motivation behind the removal of stop_machine() is to avoid its ill-effects
and thus improve the design of CPU hotplug. (More description regarding this
is available in the patches).

All the users of preempt_disable()/local_irq_disable() who used them to
prevent CPU offline have been converted to the new primitives introduced in
the patchset. Also, the CPU_DYING notifiers have been audited to check
whether they can cope with the removal of stop_machine() or whether they
need to use new locks for synchronization (all CPU_DYING notifiers looked
OK, without the need for any new locks).

Applies on v3.8-rc4. It currently has some locking issues with cpuidle (on
which even lockdep didn't provide any insight, unfortunately). So for now,
it works with CONFIG_CPU_IDLE=n.

Overview of the patches:
---

Patches 1 to 6 introduce a generic, flexible Per-CPU Reader-Writer Locking
scheme.

Patch 7 uses this synchronization mechanism to build the
get/put_online_cpus_atomic() APIs which can be used from atomic context, to
prevent CPUs from going offline.
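
As a rough usage sketch (the get/put_online_cpus_atomic() names are from
this patchset, but the surrounding function and do_work_on() are made up
purely for illustration), a reader that previously relied on
preempt_disable()/preempt_enable() to block CPU offline would now do:

/*
 * Hypothetical atomic-context hotplug reader; do_work_on() is a
 * made-up helper, used only to show where the protected work goes.
 */
static void example_reader(void)
{
        int cpu;

        get_online_cpus_atomic();       /* CPUs can't go offline here... */
        for_each_online_cpu(cpu)
                do_work_on(cpu);        /* ...so 'cpu' stays valid */
        put_online_cpus_atomic();       /* offline is allowed again */
}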

Patch 8 is a cleanup; it converts preprocessor macros to static inline
functions.

Patches 9 to 42 convert various call-sites to use the new APIs.

Patch 43 is the one which actually removes stop_machine() from the CPU
offline path.

Patch 44 decouples stop_machine() and CPU hotplug from Kconfig.

Patch 45 updates the documentation to reflect the new APIs.


Changes in v5:
--
  Exposed a new generic locking scheme: Flexible Per-CPU Reader-Writer locks,
  based on the synchronization schemes already discussed in the previous
  versions, and used it in CPU hotplug, to implement the new APIs.

  Audited the CPU_DYING notifiers in the kernel source tree and replaced
  usages of preempt_disable() with the new get/put_online_cpus_atomic() APIs
  where necessary.


Changes in v4:
--
  The synchronization scheme has been simplified quite a bit, which makes it
  look a lot less complex than before. Some highlights:

* Implicit ACKs:

  The earlier design required the readers to explicitly ACK the writer's
  signal. The new design uses implicit ACKs instead. The reader switching
  over to rwlock implicitly tells the writer to stop waiting for that reader.

* No atomic operations:

  Since we got rid of explicit ACKs, we no longer have the need for a reader
  and a writer to update the same counter. So we can get rid of atomic ops
  too.

Changes in v3:
--
* Dropped the _light() and _full() variants of the APIs. Provided a single
  interface: get/put_online_cpus_atomic().

* Completely redesigned the synchronization mechanism again, to make it
  fast and scalable at the reader-side in the fast-path (when no hotplug
  writers are active). This new scheme also ensures that there is no
  possibility of deadlocks due to circular locking dependency.
  In summary, this provides the scalability and speed of per-cpu rwlocks
  (without actually using them), while avoiding the downside (deadlock
  possibilities) which is inherent in any per-cpu locking scheme that is
  meant to compete with preempt_disable()/enable() in terms of flexibility.

  The problem with using per-cpu locking to replace preempt_disable()/enable()
  was explained here:
  https://lkml.org/lkml/2012/12/6/290

  Basically we use per-cpu counters (for scalability) when no writers are
  active, and then switch to global rwlocks (for lock-safety) when a writer
  becomes active. It is a slightly complex scheme, but it is based on
  standard principles of distributed algorithms.
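
  In pseudo-C, the reader side of that idea looks roughly like this (a
  conceptual sketch only; 'writer_active', 'reader_refcnt' and
  'global_rwlock' are placeholder names, not the identifiers used in
  patches 1-6):

/*
 * Conceptual reader-side sketch of the fast/slow path switch
 * described above; all identifiers here are placeholders.
 */
static void percpu_read_lock_sketch(void)
{
        preempt_disable();              /* pin ourselves to this CPU */
        if (!writer_active)
                this_cpu_inc(reader_refcnt);    /* fast path */
        else
                read_lock(&global_rwlock);      /* slow path */
        /* the matching read-unlock re-enables preemption */
}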

Changes in v2:
-
* Completely redesigned the synchronization scheme to avoid using any extra
  cpumasks.

* Provided APIs for 2 types of atomic hotplug readers: "light" (for
  light-weight) and "full". We wish to have more "light" readers than
  the "full" ones, to avoid indirectly inducing the "stop_machine effect"
  without even actually using stop_machine().

  And the patches show that it _is_ generally true: 5 patches deal with
  "light" readers, whereas only 1 patch deals with a "full" reader.

  Also, the "light" readers happen to be in very hot paths. So it makes a
  lot of sense to have such a distinction and a corresponding light-weight
  API.

Links to previous versions:
v4: https://lkml.org/lkml/2012/12/11/209
v3: https://lkml.org/lkml/2012/12/7/287
v2: https://lkml.org/lkml/2012/12/5/322
v1: https://lkml.org/lkml/2012/12/4/88

--

Paul E. McKenney (1):
  cpu: No more __stop_machine() in _cpu_down()

Srivatsa S. Bhat (44):
  percpu_rwlock: Introduce the global reader-writer lock backend
  percpu_rwlock: Introduce per-CPU variables for the reader and the writer
  percpu_rwlock: Provide a way to define and init percpu-rwlocks at
