dvh...@linux.vnet.ibm.com wrote on 09/02/2010 01:04:28 AM:
Subject: Re: [PATCH][RFC] preempt_count corruption across H_CEDE call with CONFIG_PREEMPT on pseries
With this in place, we no longer see the preempt_count dropping below
zero. However, if I offline/online a CPU about 246 times I
On 09/01/2010 09:06 PM, Steven Rostedt wrote:
On Thu, 2010-09-02 at 11:02 +1000, Michael Neuling wrote:
We need to call smp_startup_cpu on boot when the cpus are still in
FW. smp_startup_cpu does this for us on boot.
I'm wondering if we just need to move the test down a bit to make sure
In message 1283400367.2356.69.ca...@gandalf.stny.rr.com you wrote:
On Thu, 2010-09-02 at 11:02 +1000, Michael Neuling wrote:
We need to call smp_startup_cpu on boot when the cpus are still in
FW. smp_startup_cpu does this for us on boot.
I'm wondering if we just need to move the
On 09/02/2010 04:04 PM, Michael Neuling wrote:
In message 1283400367.2356.69.ca...@gandalf.stny.rr.com you wrote:
On Thu, 2010-09-02 at 11:02 +1000, Michael Neuling wrote:
We need to call smp_startup_cpu on boot when the cpus are still in
FW. smp_startup_cpu does this for us on boot.
On 08/31/2010 10:54 PM, Michael Ellerman wrote:
On Tue, 2010-08-31 at 00:12 -0700, Darren Hart wrote:
..
When running with the function plugin I had to stop the trace
immediately before entering start_secondary after an online or my traces
would not include the pseries_mach_cpu_die function,
On 09/01/2010 08:10 AM, Darren Hart wrote:
On 08/31/2010 10:54 PM, Michael Ellerman wrote:
On Tue, 2010-08-31 at 00:12 -0700, Darren Hart wrote:
..
When running with the function plugin I had to stop the trace
immediately before entering start_secondary after an online or my traces
would
On Wed, 2010-09-01 at 11:47 -0700, Darren Hart wrote:
from tip/rt/2.6.33 causes the preempt_count() to change across the cede
call. This patch appears to prevent the proxy preempt_count assignment
from happening. This non-local-cpu assignment to 0 would cause an
underrun of preempt_count()
On 09/01/2010 12:59 PM, Steven Rostedt wrote:
On Wed, 2010-09-01 at 11:47 -0700, Darren Hart wrote:
from tip/rt/2.6.33 causes the preempt_count() to change across the cede
call. This patch appears to prevent the proxy preempt_count assignment
from happening. This non-local-cpu assignment
In message 4c7ebaa8.7030...@us.ibm.com you wrote:
On 09/01/2010 12:59 PM, Steven Rostedt wrote:
On Wed, 2010-09-01 at 11:47 -0700, Darren Hart wrote:
from tip/rt/2.6.33 causes the preempt_count() to change across the cede
call. This patch appears to prevent the proxy preempt_count
+ /* Check to see if the CPU out of FW already for kexec */
Wow, that comment is shit. The checkin comment in
aef40e87d866355ffd279ab21021de733242d0d5 is much better.
This comment is really confusing to me. I _think_ it is saying that this test
determines if the CPU is done executing
On Thu, 2010-09-02 at 11:02 +1000, Michael Neuling wrote:
We need to call smp_startup_cpu on boot when the cpus are still in
FW. smp_startup_cpu does this for us on boot.
I'm wondering if we just need to move the test down a bit to make sure
the preempt_count is set. I've not been
On 08/19/2010 08:58 AM, Ankita Garg wrote:
Hi Darren,
On Thu, Jul 22, 2010 at 11:24:13AM -0700, Darren Hart wrote:
With some instrumentation we were able to determine that the
preempt_count() appears to change across the extended_cede_processor()
call. Specifically across the
On Tue, 2010-08-31 at 00:12 -0700, Darren Hart wrote:
..
When running with the function plugin I had to stop the trace
immediately before entering start_secondary after an online or my traces
would not include the pseries_mach_cpu_die function, nor the tracing I
added there (possibly buffer
On 08/19/2010 08:58 AM, Ankita Garg wrote:
Hi Darren,
On Thu, Jul 22, 2010 at 11:24:13AM -0700, Darren Hart wrote:
With some instrumentation we were able to determine that the
preempt_count() appears to change across the extended_cede_processor()
call. Specifically across the
Hi Darren,
On Thu, Jul 22, 2010 at 11:24:13AM -0700, Darren Hart wrote:
With some instrumentation we were able to determine that the
preempt_count() appears to change across the extended_cede_processor()
call. Specifically across the plpar_hcall_norets(H_CEDE) call. On
PREEMPT_RT we call
Ankita Garg ank...@in.ibm.com wrote on 08/19/2010 10:58:24 AM:
Subject: Re: [PATCH][RFC] preempt_count corruption across H_CEDE call with CONFIG_PREEMPT on pseries
Hi Darren,
On Thu, Jul 22, 2010 at 11:24:13AM -0700, Darren Hart wrote:
With some instrumentation we were able to determine
On 07/22/2010 11:38 AM, Thomas Gleixner wrote:
On Thu, 22 Jul 2010, Darren Hart wrote:
Also of interest is that this path
cpu_idle()->cpu_die()->pseries_mach_cpu_die() to start_secondary()
enters with a preempt_count=1 if it wasn't corrupted across the hcall.
That triggers the problem as well.
On 08/05/2010 10:09 PM, Vaidyanathan Srinivasan wrote:
* Darren Hart dvh...@us.ibm.com [2010-08-05 19:19:00]:
On 07/22/2010 10:09 PM, Benjamin Herrenschmidt wrote:
On Thu, 2010-07-22 at 21:44 -0700, Darren Hart wrote:
suggestion I updated the instrumentation to display the
* Darren Hart dvh...@us.ibm.com [2010-08-04 21:45:51]:
On 07/23/2010 12:07 AM, Vaidyanathan Srinivasan wrote:
* Benjamin Herrenschmidt b...@kernel.crashing.org [2010-07-23 15:11:00]:
On Fri, 2010-07-23 at 10:38 +0530, Vaidyanathan Srinivasan wrote:
Yes. extended_cede_processor() will
On Thu, 5 Aug 2010, Vaidyanathan Srinivasan wrote:
* Darren Hart dvh...@us.ibm.com [2010-08-04 21:45:51]:
On 07/23/2010 12:07 AM, Vaidyanathan Srinivasan wrote:
* Benjamin Herrenschmidt b...@kernel.crashing.org [2010-07-23 15:11:00]:
On Fri, 2010-07-23 at 10:38 +0530, Vaidyanathan
On 07/22/2010 10:09 PM, Benjamin Herrenschmidt wrote:
On Thu, 2010-07-22 at 21:44 -0700, Darren Hart wrote:
suggestion I updated the instrumentation to display the
local_save_flags and irqs_disabled_flags:
Jul 22 23:36:58 igoort1 kernel: local flags: 0, irqs disabled: 1
Jul 22 23:36:58
* Darren Hart dvh...@us.ibm.com [2010-08-05 19:19:00]:
On 07/22/2010 10:09 PM, Benjamin Herrenschmidt wrote:
On Thu, 2010-07-22 at 21:44 -0700, Darren Hart wrote:
suggestion I updated the instrumentation to display the
local_save_flags and irqs_disabled_flags:
Jul 22 23:36:58
On 07/22/2010 03:25 PM, Benjamin Herrenschmidt wrote:
On Thu, 2010-07-22 at 11:24 -0700, Darren Hart wrote:
1) How can the preempt_count() get mangled across the H_CEDE hcall?
2) Should we call preempt_enable() in cpu_idle() prior to cpu_die() ?
The preempt count is on the thread info at the
On 07/23/2010 12:07 AM, Vaidyanathan Srinivasan wrote:
* Benjamin Herrenschmidt b...@kernel.crashing.org [2010-07-23 15:11:00]:
On Fri, 2010-07-23 at 10:38 +0530, Vaidyanathan Srinivasan wrote:
Yes. extended_cede_processor() will return with interrupts enabled in
the cpu. (This is done by
* Benjamin Herrenschmidt b...@kernel.crashing.org [2010-07-23 15:11:00]:
On Fri, 2010-07-23 at 10:38 +0530, Vaidyanathan Srinivasan wrote:
Yes. extended_cede_processor() will return with interrupts enabled in
the cpu. (This is done by the hypervisor). Under normal cases we
cannot be
dvh...@linux.vnet.ibm.com wrote on 07/22/2010 06:57:18 PM:
Subject: Re: [PATCH][RFC] preempt_count corruption across H_CEDE call with CONFIG_PREEMPT on pseries
On 07/22/2010 03:25 PM, Benjamin Herrenschmidt wrote:
On Thu, 2010-07-22 at 11:24 -0700, Darren Hart wrote:
1) How can
While testing CPU offline/online, we hit various preempt_count-related
bugs. Various hacks have been employed for several theoretical corner
cases. One situation, however, is perfectly repeatable on 2.6.33.6 with
CONFIG_PREEMPT=y.
BUG: scheduling while atomic: swapper/0/0x0065
Modules linked
On 07/22/2010 11:24 AM, Darren Hart wrote:
The following patch is most certainly not correct, but it does eliminate
the situation on mainline 100% of the time (there is still a 25%
reproduction rate on PREEMPT_RT). Can someone comment on:
Apologies. This particular issue is also 100%
On Thu, 22 Jul 2010, Darren Hart wrote:
Also of interest is that this path
cpu_idle()->cpu_die()->pseries_mach_cpu_die() to start_secondary()
enters with a preempt_count=1 if it wasn't corrupted across the hcall.
That triggers the problem as well. preempt_count needs to be 0 when
entering
On Thu, 2010-07-22 at 11:24 -0700, Darren Hart wrote:
1) How can the preempt_count() get mangled across the H_CEDE hcall?
2) Should we call preempt_enable() in cpu_idle() prior to cpu_die() ?
The preempt count is on the thread info at the bottom of the stack.
Can you check the stack pointers
On 07/22/2010 03:25 PM, Benjamin Herrenschmidt wrote:
On Thu, 2010-07-22 at 11:24 -0700, Darren Hart wrote:
1) How can the preempt_count() get mangled across the H_CEDE hcall?
2) Should we call preempt_enable() in cpu_idle() prior to cpu_die() ?
The preempt count is on the thread info at
On 07/22/2010 04:57 PM, Darren Hart wrote:
On 07/22/2010 03:25 PM, Benjamin Herrenschmidt wrote:
On Thu, 2010-07-22 at 11:24 -0700, Darren Hart wrote:
1) How can the preempt_count() get mangled across the H_CEDE hcall?
2) Should we call preempt_enable() in cpu_idle() prior to cpu_die() ?
* Darren Hart dvh...@us.ibm.com [2010-07-22 21:44:04]:
On 07/22/2010 04:57 PM, Darren Hart wrote:
On 07/22/2010 03:25 PM, Benjamin Herrenschmidt wrote:
On Thu, 2010-07-22 at 11:24 -0700, Darren Hart wrote:
1) How can the preempt_count() get mangled across the H_CEDE hcall?
2) Should we
On Thu, 2010-07-22 at 21:44 -0700, Darren Hart wrote:
suggestion I updated the instrumentation to display the
local_save_flags and irqs_disabled_flags:
Jul 22 23:36:58 igoort1 kernel: local flags: 0, irqs disabled: 1
Jul 22 23:36:58 igoort1 kernel: before H_CEDE current-stack:
On Fri, 2010-07-23 at 10:38 +0530, Vaidyanathan Srinivasan wrote:
Yes. extended_cede_processor() will return with interrupts enabled in
the cpu. (This is done by the hypervisor). Under normal cases we
cannot be interrupted because no IO interrupts are routed to us after
xics_teardown_cpu()