On 05/23/2014 03:01 AM, Borislav Petkov wrote:
> On Fri, May 23, 2014 at 02:43:31AM +0530, Srivatsa S. Bhat wrote:
>>>> After you move the cmci_rediscover() call, it is now in a place where we are
>>>> no longer ignoring frozen (i.e. th
On 05/22/2014 06:02 PM, Peter Zijlstra wrote:
> On Thu, May 22, 2014 at 05:24:33PM +0530, Srivatsa S. Bhat wrote:
>> Yeah, it's complicated and perhaps we can do much better than that. But I'll
>> try to explain why there are so many different locks in the existing code.
>>
[..
y it should
be called specifically in the POST_DEAD stage only. So I'm sure we can get rid
of that one way or another easily.
Regards,
Srivatsa S. Bhat
>> So we were working around some interaction between cpu hotplug and frozen.
>> Do we no longer need to do that?
>
> Hm
On 05/22/2014 03:38 PM, Borislav Petkov wrote:
> On Thu, May 22, 2014 at 03:13:46PM +0530, Srivatsa S. Bhat wrote:
>> On 05/22/2014 02:53 PM, Borislav Petkov wrote:
>>> From: Borislav Petkov
>>>
>>> So 009f225ef050 ("powercap, intel-rapl: Fix CPU hotplug
to remove the get/put_online_cpus()
from intel-rapl driver.
Jacob/Srinivas, is the above assumption correct for rapl?
Regards,
Srivatsa S. Bhat
> Let me do what was supposed to be done.
>
> Cc: Srinivas Pandruvada
> Cc: Ingo Molnar
> Cc: Jacob Pan
> Cc: Srivatsa S. Bhat
>
()
from intel-rapl driver.
Jacob/Srinivas, is the above assumption correct for rapl?
Regards,
Srivatsa S. Bhat
Let me do what was supposed to be done.
Cc: Srinivas Pandruvada srinivas.pandruv...@linux.intel.com
Cc: Ingo Molnar mi...@kernel.org
Cc: Jacob Pan jacob.jun@linux.intel.com
Cc
On 05/22/2014 03:38 PM, Borislav Petkov wrote:
On Thu, May 22, 2014 at 03:13:46PM +0530, Srivatsa S. Bhat wrote:
On 05/22/2014 02:53 PM, Borislav Petkov wrote:
From: Borislav Petkov b...@suse.de
So 009f225ef050 (powercap, intel-rapl: Fix CPU hotplug callback
registration) says how get_
..
the original commit 88ccbedd9 didn't explain that.
Either way, cmci_rediscover() doesn't seem to have any reason why it should
be called specifically in the POST_DEAD stage only. So I'm sure we can get rid
of that one way or another easily.
Regards,
Srivatsa S. Bhat
So we were working around some
On 05/22/2014 06:02 PM, Peter Zijlstra wrote:
On Thu, May 22, 2014 at 05:24:33PM +0530, Srivatsa S. Bhat wrote:
Yeah, it's complicated and perhaps we can do much better than that. But I'll
try to explain why there are so many different locks in the existing code.
[...]
So I think we can
On 05/23/2014 03:01 AM, Borislav Petkov wrote:
On Fri, May 23, 2014 at 02:43:31AM +0530, Srivatsa S. Bhat wrote:
After you move the cmci_rediscover() call, it is now in a place where we are
no longer ignoring frozen (i.e. the old placement did the rediscover even if the
CPU_TASKS_FROZEN
banks at the
end of CPU_DEAD.
Signed-off-by: Borislav Petkov b...@suse.de
Reviewed-by: Srivatsa S. Bhat srivatsa.b...@linux.vnet.ibm.com
Regards,
Srivatsa S. Bhat
---
arch/x86/kernel/cpu/mcheck/mce.c | 9 ++++-----
1 file changed, 4 insertions(+), 5 deletions(-)
diff --git a/arch/x86
: Paul Gortmaker paul.gortma...@windriver.com
Cc: Rafael J. Wysocki rafael.j.wyso...@intel.com
Cc: Srivatsa S. Bhat srivatsa.b...@linux.vnet.ibm.com
Cc: Toshi Kani toshi.k...@hp.com
Link: http://lkml.kernel.org/r/53758b12.8060...@cn.fujitsu.com
Signed-off-by: Ingo Molnar mi...@kernel.org
Reviewed
On 05/21/2014 06:31 PM, Rafael J. Wysocki wrote:
> On Wednesday, May 21, 2014 06:02:51 PM Srivatsa S. Bhat wrote:
>> On 05/05/2014 02:39 PM, Srivatsa S. Bhat wrote:
>>> On 05/05/2014 02:23 PM, Meelis Roos wrote:
>>>>>
>>>>> Changes in v3:
>>
On 05/05/2014 02:39 PM, Srivatsa S. Bhat wrote:
> On 05/05/2014 02:23 PM, Meelis Roos wrote:
>>>
>>> Changes in v3:
>>> Expanded the comment in the code to briefly mention why ASYNC_NOTIFICATION
>>> drivers are left out from the check, as suggested by
On 05/05/2014 02:39 PM, Srivatsa S. Bhat wrote:
On 05/05/2014 02:23 PM, Meelis Roos wrote:
Changes in v3:
Expanded the comment in the code to briefly mention why ASYNC_NOTIFICATION
drivers are left out from the check, as suggested by Gautham R. Shenoy.
No code changes in this version.
v2
On 05/21/2014 06:31 PM, Rafael J. Wysocki wrote:
On Wednesday, May 21, 2014 06:02:51 PM Srivatsa S. Bhat wrote:
On 05/05/2014 02:39 PM, Srivatsa S. Bhat wrote:
On 05/05/2014 02:23 PM, Meelis Roos wrote:
Changes in v3:
Expanded the comment in the code to briefly mention why ASYNC_NOTIFICATION
On 05/20/2014 04:08 PM, Srivatsa S. Bhat wrote:
> On 05/20/2014 04:01 PM, Srivatsa S. Bhat wrote:
>> On 05/20/2014 03:55 PM, Peter Zijlstra wrote:
>>> On Tue, May 20, 2014 at 03:39:59PM +0530, Srivatsa S. Bhat wrote:
>>>>> The multi_cpu_stop() path isn't exclus
On 05/20/2014 04:01 PM, Srivatsa S. Bhat wrote:
> On 05/20/2014 03:55 PM, Peter Zijlstra wrote:
>> On Tue, May 20, 2014 at 03:39:59PM +0530, Srivatsa S. Bhat wrote:
>>>> The multi_cpu_stop() path isn't exclusive to hotplug, so your changelog
>>>> is wrong or the p
> However, <linux/smp.h> includes <asm/smp.h> only on SMP configs and hence UP
> builds fail. Fix this by directly including <asm/smp.h> in setup.c
> unconditionally.
>
> Reported-by: Geert Uytterhoeven
> Signed-off-by: Shreyas B. Prabhu
Reviewed-by: Srivatsa S. Bhat
Regards,
Srivatsa S. Bhat
> ---
>
On 05/20/2014 03:55 PM, Peter Zijlstra wrote:
> On Tue, May 20, 2014 at 03:39:59PM +0530, Srivatsa S. Bhat wrote:
>>> The multi_cpu_stop() path isn't exclusive to hotplug, so your changelog
>>> is wrong or the patch is.
>>>
>>
>> Yes, I know that multi_cpu
On 05/20/2014 03:12 PM, Peter Zijlstra wrote:
> On Tue, May 20, 2014 at 01:52:41AM +0530, Srivatsa S. Bhat wrote:
>> From: Srivatsa S. Bhat
>> [PATCH v5 UPDATEDv3 3/3] CPU hotplug, smp: Flush any pending IPI callbacks
>> before CPU offline
>>
>> During CPU offl
On 05/20/2014 03:12 PM, Peter Zijlstra wrote:
On Tue, May 20, 2014 at 01:52:41AM +0530, Srivatsa S. Bhat wrote:
From: Srivatsa S. Bhat srivatsa.b...@linux.vnet.ibm.com
[PATCH v5 UPDATEDv3 3/3] CPU hotplug, smp: Flush any pending IPI callbacks
before CPU offline
During CPU offline
On 05/20/2014 03:55 PM, Peter Zijlstra wrote:
On Tue, May 20, 2014 at 03:39:59PM +0530, Srivatsa S. Bhat wrote:
The multi_cpu_stop() path isn't exclusive to hotplug, so your changelog
is wrong or the patch is.
Yes, I know that multi_cpu_stop() isn't exclusive to hotplug. That's why
I have
, linux/smp.h includes asm/smp.h only on SMP configs and hence UP
builds fail. Fix this by directly including asm/smp.h in setup.c
unconditionally.
Reported-by: Geert Uytterhoeven ge...@linux-m68k.org
Signed-off-by: Shreyas B. Prabhu shre...@linux.vnet.ibm.com
Reviewed-by: Srivatsa S. Bhat
On 05/20/2014 04:01 PM, Srivatsa S. Bhat wrote:
On 05/20/2014 03:55 PM, Peter Zijlstra wrote:
On Tue, May 20, 2014 at 03:39:59PM +0530, Srivatsa S. Bhat wrote:
The multi_cpu_stop() path isn't exclusive to hotplug, so your changelog
is wrong or the patch is.
Yes, I know that multi_cpu_stop
On 05/20/2014 04:08 PM, Srivatsa S. Bhat wrote:
On 05/20/2014 04:01 PM, Srivatsa S. Bhat wrote:
On 05/20/2014 03:55 PM, Peter Zijlstra wrote:
On Tue, May 20, 2014 at 03:39:59PM +0530, Srivatsa S. Bhat wrote:
The multi_cpu_stop() path isn't exclusive to hotplug, so your changelog
is wrong
On 05/20/2014 01:19 AM, Srivatsa S. Bhat wrote:
> On 05/19/2014 09:48 PM, Oleg Nesterov wrote:
>> On 05/19, Srivatsa S. Bhat wrote:
>>>
>>> However, an IPI sent much earlier might arrive late on the target CPU
>>> (possibly _after_ the CPU has gone offline) du
On 05/19/2014 09:48 PM, Oleg Nesterov wrote:
> On 05/19, Srivatsa S. Bhat wrote:
>>
>> However, an IPI sent much earlier might arrive late on the target CPU
>> (possibly _after_ the CPU has gone offline) due to hardware latencies,
>> and due to this, the smp-cal
On 05/16/2014 01:13 AM, Srivatsa S. Bhat wrote:
> ---
>
> From: Srivatsa S. Bhat
> [PATCH v5 UPDATED 3/3] CPU hotplug, smp: Flush any pending IPI callbacks
> before CPU offline
>
[...]
> diff --git a/ke
On 05/16/2014 01:13 AM, Srivatsa S. Bhat wrote:
---
From: Srivatsa S. Bhat srivatsa.b...@linux.vnet.ibm.com
[PATCH v5 UPDATED 3/3] CPU hotplug, smp: Flush any pending IPI callbacks
before CPU offline
[...]
diff --git
On 05/19/2014 09:48 PM, Oleg Nesterov wrote:
On 05/19, Srivatsa S. Bhat wrote:
However, an IPI sent much earlier might arrive late on the target CPU
(possibly _after_ the CPU has gone offline) due to hardware latencies,
and due to this, the smp-call-function callbacks queued on the outgoing
On 05/20/2014 01:19 AM, Srivatsa S. Bhat wrote:
On 05/19/2014 09:48 PM, Oleg Nesterov wrote:
On 05/19, Srivatsa S. Bhat wrote:
However, an IPI sent much earlier might arrive late on the target CPU
(possibly _after_ the CPU has gone offline) due to hardware latencies,
and due to this, the smp
lly different either in terms of quantity or the
order of the allocation, right?
If that's the case, then what happened to the freed memory? Did the
page-cache or other caching mechanism launder most of that so soon, that
we are forced to rely on reclaim to allocate memory during resume? Isn't
that
expect that whatever memory was freed
during suspend would naturally remain available during resume as well.
Thoughts?
Regards,
Srivatsa S. Bhat
---
From: Johannes Weiner han...@cmpxchg.org
Subject: [patch] mm: page_alloc: warn about higher-order allocations during
suspend
Higher-order
On 05/16/2014 12:47 AM, Tejun Heo wrote:
> On Fri, May 16, 2014 at 12:43:44AM +0530, Srivatsa S. Bhat wrote:
>> During CPU offline, stop-machine is used to take control over all the online
>> CPUs (via the per-cpu stopper thread) and then run take_cpu_down() on the CPU
>>
ficient there. Thanks!
-------
From: Srivatsa S. Bhat
[PATCH v5 UPDATED 1/3] smp: Print more useful debug info upon receiving IPI on
an offline CPU
Today the smp-call-function code just prints a warning if we get an IPI on
an offline CPU. This info is suff
On 05/16/2014 01:06 AM, Paul E. McKenney wrote:
> On Fri, May 16, 2014 at 12:56:11AM +0530, Srivatsa S. Bhat wrote:
>> On 05/16/2014 12:49 AM, Tejun Heo wrote:
>>> Hello,
>>>
>>> On Fri, May 16, 2014 at 12:44:13AM +0530, Srivatsa S. Bhat wrote:
>>>
On 05/16/2014 12:49 AM, Joe Perches wrote:
> On Fri, 2014-05-16 at 00:43 +0530, Srivatsa S. Bhat wrote:
>> Today the smp-call-function code just prints a warning if we get an IPI on
>> an offline CPU. This info is sufficient to let us know that something went
>> wrong, but
On 05/16/2014 12:49 AM, Tejun Heo wrote:
> Hello,
>
> On Fri, May 16, 2014 at 12:44:13AM +0530, Srivatsa S. Bhat wrote:
>> /*
>> + * flush_smp_call_function_queue - Flush any pending smp-call-function
>
> Don't we need a blank line here?
>
Hmm? That sentence conti
IPIs can be sent by the other CPUs to the
outgoing CPU at that point, because they will all be executing the stop-machine
code with interrupts disabled.
Suggested-by: Frederic Weisbecker
Signed-off-by: Srivatsa S. Bhat
---
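For illustration only (not part of the patch): a minimal userspace sketch of
the idea of draining a queue of pending callbacks before a CPU goes away, so
that nothing is left behind for a late IPI to find. All names here
(queued_call, call_queue, flush_call_queue, bump_counter) are made up for this
sketch; the real code lives in kernel/smp.c and is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one queued smp-call-function request. */
struct queued_call {
    struct queued_call *next;
    void (*func)(void *info);
    void *info;
};

/* Hypothetical stand-in for the pending-callback queue of one CPU. */
struct call_queue {
    struct queued_call *head;
};

/*
 * Run every callback that is still queued and leave the queue empty.
 * Returns the number of callbacks that were run.
 */
static int flush_call_queue(struct call_queue *q)
{
    int ran = 0;

    assert(q != NULL);

    /* Detach the whole list first, so the queue is empty from here on. */
    struct queued_call *c = q->head;
    q->head = NULL;

    while (c != NULL) {
        /* Read ->next before the callback, which may reuse the entry. */
        struct queued_call *next = c->next;

        c->func(c->info);
        ran++;
        c = next;
    }
    return ran;
}

/* Example callback: increments the int that info points to. */
static void bump_counter(void *info)
{
    ++*(int *)info;
}
```

In this sketch, a second flush_call_queue() call on the same queue runs nothing
and returns 0, which is the property the outgoing CPU would want before it
stops handling interrupts.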
include/linux/smp.h |2 ++
kernel/smp.c | 32
the cpu_online_mask, and hence
future invocations of smp_call_function() and friends will automatically
prune that CPU out. Thus, we can guarantee that no CPU will end up
*inadvertently* sending IPIs to an offline CPU.
Signed-off-by: Srivatsa S. Bhat
---
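For illustration only (not part of the patch): a small userspace sketch of why
clearing a CPU from the online mask is enough to keep later senders away from
it. The 64-bit mask type and the helper names (cpumask_sketch_t, mark_offline,
ipi_targets) are assumptions made for this sketch and are not the kernel's
cpumask API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mask: one bit per CPU, at most 64 CPUs. */
typedef uint64_t cpumask_sketch_t;

/* Clear one CPU from the online mask, as the offline path would. */
static cpumask_sketch_t mark_offline(cpumask_sketch_t online, unsigned int cpu)
{
    assert(cpu < 64);
    return online & ~(UINT64_C(1) << cpu);
}

/*
 * A sender intersects the CPUs it asked for with the online mask, so a
 * CPU that has already been cleared is pruned out automatically.
 */
static cpumask_sketch_t ipi_targets(cpumask_sketch_t requested,
                                    cpumask_sketch_t online)
{
    return requested & online;
}
```

With CPUs 0-3 online (mask 0xF), taking CPU 2 offline leaves 0xB, and a
request aimed only at CPU 2 ends up with an empty target mask.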
kernel/s
linked list and print out the payload (i.e., the name
of the function which was supposed to be executed by the target CPU). This
would give us an insight as to who might have sent the IPI and help us debug
this further.
Signed-off-by: Srivatsa S. Bhat
---
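For illustration only (not part of the patch): a userspace sketch of walking
the pending list and printing each payload. The struct, the stored func_name
string and report_pending_calls() are made up for this sketch; in the kernel
the name would presumably be resolved from the function pointer itself (printk
has the %pS specifier for that) rather than stored next to it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for one queued smp-call-function request. */
struct pending_call {
    struct pending_call *next;
    void (*func)(void *info);
    const char *func_name;  /* assumption: name kept alongside the pointer */
    void *info;
};

/*
 * Walk the pending list of an offline CPU and print one line per
 * callback. Returns the number of entries reported.
 */
static int report_pending_calls(int cpu, const struct pending_call *head,
                                FILE *out)
{
    int n = 0;

    assert(out != NULL);

    for (const struct pending_call *c = head; c != NULL; c = c->next) {
        fprintf(out, "IPI callback %s sent to offline CPU %d\n",
                c->func_name, cpu);
        n++;
    }
    return n;
}
```

The point of the sketch is only that the warning names the function that was
queued, which is usually enough to find the sender.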
kernel/smp.c | 18 +++---
1
this framework to ensure that the CPU going offline always disables
its interrupts last. Suggested by Tejun Heo.
v1 and v2:
https://lkml.org/lkml/2014/5/6/474
Srivatsa S. Bhat (3):
smp: Print more useful debug info upon receiving IPI on an offline CPU
CPU hotplug, stop-machine: Plug race
On 05/13/2014 09:27 PM, Frederic Weisbecker wrote:
> On Tue, May 13, 2014 at 02:32:00PM +0530, Srivatsa S. Bhat wrote:
>>
>> kernel/stop_machine.c | 39 ++-
>> 1 file changed, 34 insertions(+), 5 deletions(-)
>>
>> diff --git
On 05/13/2014 09:08 PM, Frederic Weisbecker wrote:
> On Mon, May 12, 2014 at 02:06:49AM +0530, Srivatsa S. Bhat wrote:
>> Today the smp-call-function code just prints a warning if we get an IPI on
>> an offline CPU. This info is sufficient to let us know that something went
>
On 05/13/2014 09:08 PM, Frederic Weisbecker wrote:
On Mon, May 12, 2014 at 02:06:49AM +0530, Srivatsa S. Bhat wrote:
Today the smp-call-function code just prints a warning if we get an IPI on
an offline CPU. This info is sufficient to let us know that something went
wrong, but often it is very
On 05/13/2014 09:27 PM, Frederic Weisbecker wrote:
On Tue, May 13, 2014 at 02:32:00PM +0530, Srivatsa S. Bhat wrote:
kernel/stop_machine.c | 39 ++-
1 file changed, 34 insertions(+), 5 deletions(-)
diff --git a/kernel/stop_machine.c b/kernel
linked list and print out the payload (i.e., the name
of the function which was supposed to be executed by the target CPU). This
would give us an insight as to who might have sent the IPI and help us debug
this further.
Signed-off-by: Srivatsa S. Bhat srivatsa.b...@linux.vnet.ibm.com
---
kernel
future invocations of smp_call_function() and friends will automatically
prune that CPU out. Thus, we can guarantee that no CPU will end up
*inadvertently* sending IPIs to an offline CPU.
Signed-off-by: Srivatsa S. Bhat srivatsa.b...@linux.vnet.ibm.com
---
kernel/stop_machine.c | 39
IPIs can be sent by the other CPUs to the
outgoing CPU at that point, because they will all be executing the stop-machine
code with interrupts disabled.
Suggested-by: Frederic Weisbecker fweis...@gmail.com
Signed-off-by: Srivatsa S. Bhat srivatsa.b...@linux.vnet.ibm.com
---
include/linux/smp.h
On 05/16/2014 12:49 AM, Tejun Heo wrote:
Hello,
On Fri, May 16, 2014 at 12:44:13AM +0530, Srivatsa S. Bhat wrote:
/*
+ * flush_smp_call_function_queue - Flush any pending smp-call-function
Don't we need a blank line here?
Hmm? That sentence continues on the next line, hence I didn't
On 05/16/2014 12:49 AM, Joe Perches wrote:
On Fri, 2014-05-16 at 00:43 +0530, Srivatsa S. Bhat wrote:
Today the smp-call-function code just prints a warning if we get an IPI on
an offline CPU. This info is sufficient to let us know that something went
wrong, but often it is very hard to debug
On 05/16/2014 01:06 AM, Paul E. McKenney wrote:
On Fri, May 16, 2014 at 12:56:11AM +0530, Srivatsa S. Bhat wrote:
On 05/16/2014 12:49 AM, Tejun Heo wrote:
Hello,
On Fri, May 16, 2014 at 12:44:13AM +0530, Srivatsa S. Bhat wrote:
/*
+ * flush_smp_call_function_queue - Flush any pending smp
!
---
From: Srivatsa S. Bhat srivatsa.b...@linux.vnet.ibm.com
[PATCH v5 UPDATED 1/3] smp: Print more useful debug info upon receiving IPI on
an offline CPU
Today the smp-call-function code just prints a warning if we get an IPI on
an offline CPU. This info
On 05/16/2014 12:47 AM, Tejun Heo wrote:
On Fri, May 16, 2014 at 12:43:44AM +0530, Srivatsa S. Bhat wrote:
During CPU offline, stop-machine is used to take control over all the online
CPUs (via the per-cpu stopper thread) and then run take_cpu_down() on the CPU
that is to be taken offline
On 05/13/2014 02:27 AM, Tejun Heo wrote:
> Hello,
>
> On Mon, May 12, 2014 at 02:07:04AM +0530, Srivatsa S. Bhat wrote:
>> @@ -189,10 +191,27 @@ static int multi_cpu_stop(void *data)
>> do {
>> /* Chill out and ensure we re-read multi_stop_state. *
On 05/13/2014 02:27 AM, Tejun Heo wrote:
Hello,
On Mon, May 12, 2014 at 02:07:04AM +0530, Srivatsa S. Bhat wrote:
@@ -189,10 +191,27 @@ static int multi_cpu_stop(void *data)
do {
/* Chill out and ensure we re-read multi_stop_state. */
cpu_relax
the cpu_online_mask, and hence
future invocations of smp_call_function() and friends will automatically
prune that CPU out. Thus, we can guarantee that no CPU will end up
*inadvertently* sending IPIs to an offline CPU.
Signed-off-by: Srivatsa S. Bhat
---
kernel/stop_machine.c | 25 +
linked list and print out the payload (i.e., the name
of the function which was supposed to be executed by the target CPU). This
would give us an insight as to who might have sent the IPI and help us debug
this further.
Signed-off-by: Srivatsa S. Bhat
---
kernel/smp.c | 18 ++
1
:
MULTI_STOP_DISABLE_IRQ_INACTIVE and MULTI_STOP_DISABLE_IRQ_ACTIVE, and
used this framework to ensure that the CPU going offline always disables
its interrupts last. Suggested by Tejun Heo.
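For illustration only: a sketch of how two separate disable-IRQ states can
guarantee that ordering. The enum and irqs_off_after() are assumptions made
for this sketch (the surrounding PREPARE and RUN states and their order are
assumed as well); the actual state machine is in kernel/stop_machine.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical states, stepped through in this order by every CPU. */
enum stop_state_sketch {
    SKETCH_PREPARE,
    SKETCH_DISABLE_IRQ_INACTIVE, /* cf. MULTI_STOP_DISABLE_IRQ_INACTIVE */
    SKETCH_DISABLE_IRQ_ACTIVE,   /* cf. MULTI_STOP_DISABLE_IRQ_ACTIVE */
    SKETCH_RUN,
};

/*
 * Does a CPU have its interrupts disabled once it has processed 'state'?
 * Inactive CPUs disable them in the INACTIVE state; the active CPU (the
 * one going offline) waits for the later ACTIVE state, so it is last.
 */
static bool irqs_off_after(enum stop_state_sketch state, bool is_active)
{
    assert(state >= SKETCH_PREPARE && state <= SKETCH_RUN);

    if (is_active)
        return state >= SKETCH_DISABLE_IRQ_ACTIVE;
    return state >= SKETCH_DISABLE_IRQ_INACTIVE;
}
```

Because every CPU must finish a state before any CPU enters the next one,
all the other CPUs already have interrupts off by the time the outgoing CPU
turns its own off.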
v1 and v2:
https://lkml.org/lkml/2014/5/6/474
Srivatsa S. Bhat (2):
smp: Print more useful debug info upon receiving
On 05/10/2014 08:36 AM, Tejun Heo wrote:
> On Wed, May 07, 2014 at 03:31:51AM +0530, Srivatsa S. Bhat wrote:
>> diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
>> index 01fbae5..7abb361 100644
>> --- a/kernel/stop_machine.c
>> +++ b/kernel/stop_machine.c
>
On 05/10/2014 08:36 AM, Tejun Heo wrote:
On Wed, May 07, 2014 at 03:31:51AM +0530, Srivatsa S. Bhat wrote:
diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index 01fbae5..7abb361 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -165,12 +165,13 @@ static void
linked list and print out the payload (i.e., the name
of the function which was supposed to be executed by the target CPU). This
would give us an insight as to who might have sent the IPI and help us debug
this further.
Signed-off-by: Srivatsa S. Bhat srivatsa.b...@linux.vnet.ibm.com
---
kernel
future invocations of smp_call_function() and friends will automatically
prune that CPU out. Thus, we can guarantee that no CPU will end up
*inadvertently* sending IPIs to an offline CPU.
Signed-off-by: Srivatsa S. Bhat srivatsa.b...@linux.vnet.ibm.com
---
kernel/stop_machine.c | 25
ane.org/gmane.linux.kernel/1435249
Attempt to upstream that patchset in parts, v3:
http://lwn.net/Articles/556727/
Generic SMP boot/cpu-hotplug framework to consolidate arch/ code:
https://lwn.net/Articles/500185/
But, luckily the recent work to fix the notifier deadlock mess actually
went upstr
CPU hotplug problem
to fix! :-)
https://lkml.org/lkml/2014/3/10/522
Regards,
Srivatsa S. Bhat
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please
>
> Ok, I'll open code it and add an appropriate comment explaining the
> synchronization.
>
How about this?
-------
From: Srivatsa S. Bhat
[PATCH v2 2/2] CPU hotplug, stop-machine: Plug race-window that leads to
"
s an insight as to who might have sent the IPI and help us debug
this further.
Signed-off-by: Srivatsa S. Bhat
---
kernel/smp.c | 18 ++++++++++++++----
1 file changed, 14 insertions(+), 4 deletions(-)
diff --git a/kernel/smp.c b/kernel/smp.c
index 06d574e..f864921 100644
On 05/07/2014 02:12 AM, Tejun Heo wrote:
> On Tue, May 06, 2014 at 01:40:54PM -0700, Andrew Morton wrote:
>> On Tue, 06 May 2014 23:33:03 +0530 "Srivatsa S. Bhat"
>> wrote:
>>
>>> --- a/kernel/stop_machine.c
>>> +++ b/kernel/stop_machine.c
>
On 05/07/2014 02:04 AM, Andrew Morton wrote:
> On Tue, 06 May 2014 23:32:51 +0530 "Srivatsa S. Bhat"
> wrote:
>
>> Today the smp-call-function code just prints a warning if we get an IPI on
>> an offline CPU. This info is sufficient to let us know that som
linked list and print out the payload (i.e., the name
of the function which was supposed to be executed by the target CPU). This
would give us an insight as to who might have sent the IPI and help us debug
this further.
Signed-off-by: Srivatsa S. Bhat
---
kernel/smp.c | 15 ++-
1 file
"holding area" for the CPUs marked
as 'active_cpus', and use this infrastructure to let the other CPUs
progress from one stage to the next, before allowing the active_cpus to
do the same thing.
Signed-off-by: Srivatsa S. Bhat
---
kernel/stop_machine.c | 22 +++---
1 file changed,
that there is
a race-window which makes the IPI _receiver_ the culprit, not the sender.
Patch 2 fixes that race and hence this should put an end to most of the
hard-to-debug IPI-to-offline-CPU issues.
Srivatsa S. Bhat (2):
smp: Print more useful debug info upon receiving IPI on an offline CPU
linked list and print out the payload (i.e., the name
of the function which was supposed to be executed by the target CPU). This
would give us an insight as to who might have sent the IPI and help us debug
this further.
Signed-off-by: Srivatsa S. Bhat srivatsa.b...@linux.vnet.ibm.com
---
kernel
marked
as 'active_cpus', and use this infrastructure to let the other CPUs
progress from one stage to the next, before allowing the active_cpus to
do the same thing.
Signed-off-by: Srivatsa S. Bhat srivatsa.b...@linux.vnet.ibm.com
---
kernel/stop_machine.c | 22 +++---
1 file
On 05/07/2014 02:04 AM, Andrew Morton wrote:
On Tue, 06 May 2014 23:32:51 +0530 Srivatsa S. Bhat
srivatsa.b...@linux.vnet.ibm.com wrote:
Today the smp-call-function code just prints a warning if we get an IPI on
an offline CPU. This info is sufficient to let us know that something went
On 05/07/2014 02:12 AM, Tejun Heo wrote:
On Tue, May 06, 2014 at 01:40:54PM -0700, Andrew Morton wrote:
On Tue, 06 May 2014 23:33:03 +0530 Srivatsa S. Bhat
srivatsa.b...@linux.vnet.ibm.com wrote:
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -165,12 +165,21 @@ static void
to hand-code things. Like this?
Yeah, this version looks better. Sorry for missing this earlier.
I'll incorporate this in my next version of the patchset.
Here is the updated patch:
-
From: Srivatsa S. Bhat srivatsa.b
this?
---
From: Srivatsa S. Bhat srivatsa.b...@linux.vnet.ibm.com
[PATCH v2 2/2] CPU hotplug, stop-machine: Plug race-window that leads to
IPI-to-offline-CPU
During CPU offline, stop-machine is used to take control over all
>> v2: https://lkml.org/lkml/2014/4/29/283
>> v1: https://lkml.org/lkml/2014/4/28/469
>
> Seems to work on VIA EPIA with 3.15-rc2 (no other cpufreq fixes):
>
Great! Thanks a lot for testing this!
Rafael/Viresh, can you please add Meelis' Tested-by while taking
this patch?
for invoking _begin()/_end() and hence there shouldn't
be any conflicts which lead to double invocations. So, we can skip these
drivers, since the probability that such drivers will hit this problem is
extremely low, as outlined above.
Signed-off-by: Srivatsa S. Bhat
Acked-by: Viresh Kumar
for invoking _begin()/_end() and hence there shouldn't
be any conflicts which lead to double invocations. So, we can skip these
drivers, since the probability that such drivers will hit this problem is
extremely low, as outlined above.
Signed-off-by: Srivatsa S. Bhat srivatsa.b
://lkml.org/lkml/2014/4/28/469
Seems to work on VIA EPIA with 3.15-rc2 (no other cpufreq fixes):
Great! Thanks a lot for testing this!
Rafael/Viresh, can you please add Meelis' Tested-by while taking
this patch?
Thank you!
Regards,
Srivatsa S. Bhat
[8.250959] [ cut here
On 05/01/2014 12:48 AM, Srivatsa S. Bhat wrote:
> On 05/01/2014 12:46 AM, Srivatsa S. Bhat wrote:
>> On 04/29/2014 03:29 PM, Srivatsa S. Bhat wrote:
>>> On 04/29/2014 03:55 AM, Linus Torvalds wrote:
>>>> On Mon, Apr 28, 2014 at 3:14 PM, Davidlohr Bueso wrote:
>
On 05/01/2014 12:46 AM, Srivatsa S. Bhat wrote:
> On 04/29/2014 03:29 PM, Srivatsa S. Bhat wrote:
>> On 04/29/2014 03:55 AM, Linus Torvalds wrote:
>>> On Mon, Apr 28, 2014 at 3:14 PM, Davidlohr Bueso wrote:
>>>>
>>>> I think that returning some
On 05/01/2014 12:46 AM, Srivatsa S. Bhat wrote:
On 04/29/2014 03:29 PM, Srivatsa S. Bhat wrote:
On 04/29/2014 03:55 AM, Linus Torvalds wrote:
On Mon, Apr 28, 2014 at 3:14 PM, Davidlohr Bueso davidl...@hp.com wrote:
I think that returning some stale/bogus vma is causing those segfaults
On 05/01/2014 12:48 AM, Srivatsa S. Bhat wrote:
On 05/01/2014 12:46 AM, Srivatsa S. Bhat wrote:
On 04/29/2014 03:29 PM, Srivatsa S. Bhat wrote:
On 04/29/2014 03:55 AM, Linus Torvalds wrote:
On Mon, Apr 28, 2014 at 3:14 PM, Davidlohr Bueso davidl...@hp.com wrote:
I think that returning some
On 04/29/2014 06:39 PM, Meelis Roos wrote:
>>
>> Signed-off-by: Srivatsa S. Bhat
>> ---
>>
>> v2: Removed the coverage of ASYNC_NOTIFICATION drivers, in order to avoid
>> false-positives.
>
> I am confused - on top of what patches should I test it?
On 04/29/2014 06:22 PM, Oleg Nesterov wrote:
> On 04/29, Srivatsa S. Bhat wrote:
>>
>> I guess I'll hold off on testing this fix until I get to reproduce
>> the bug more reliably..
>
> perhaps the
for invoking _begin()/_end() and hence there shouldn't
be any conflicts which lead to double invocations. So, we can skip these
drivers, since the probability that such drivers will hit this problem is
extremely low, as outlined above.
Signed-off-by: Srivatsa S. Bhat
---
v2: Removed the coverage
>> - the mm of a thread changed
>>
>>This is exec, use_mm(), and fork() (and fork really only just
>> because we copy the vmacache).
>>
>>exec and fork do that "vmacache_flush(tsk)", which is why I was
>> looking at use_mm().
>
> Here'
which have a higher likelihood of hitting this (like running multi-
threaded workloads or whatever), I could probably give it a try as well.
Thank you!
Regards,
Srivatsa S. Bhat
On 04/29/2014 05:30 AM, Dave Jones wrote:
> On Tue, Apr 29, 2014 at 12:48:14AM +0530, Srivatsa S. Bhat wrote:
> > Hi,
> >
> > I hit this during boot on v3.15-rc3, just once so far.
> > Subsequent reboots went fine, and a few quick runs of multi-
> > thr
On 04/29/2014 01:34 PM, Viresh Kumar wrote:
> On 29 April 2014 13:05, Srivatsa S. Bhat
> wrote:
>> On 04/29/2014 12:19 PM, Viresh Kumar wrote:
>>> + WARN_ON(!(cpufreq_driver->flags & CPUFREQ_ASYNC_NOTIFICATION)
>>> &
tasks are special" check.
>>
>> Srivatsa, are you doing something peculiar on that system that would
>> trigger this? I see some kdump failures in the log, anything else?
>
> Is this perhaps a KVM guest? fwiw I see CONFIG_KVM_ASYNC_PF=y which is a
> u
ome kdump failures in the log, anything else?
>
No, it was just plain booting. The machine is simply configured with
kdump, that's all. I'm surprised that so many processes got segfaults
during boot. Looks like an mm bug in the kernel.
Regards,
Srivatsa S. Bhat
On 04/29/2014 12:19 PM, Viresh Kumar wrote:
> On 29 April 2014 11:46, Srivatsa S. Bhat
> wrote:
>> Yes, I'm aware that this corner case doesn't work well with my debug
>
> Don't know if it's a corner case, it may be the most obvious case for
> some :)
>
Yeah, it could