Re: [PATCH tip/core/rcu 13/22] rcu: Fix grace-period hangs due to race with CPU offline

2018-06-28 Thread Paul E. McKenney
On Thu, Jun 28, 2018 at 03:06:46PM +0200, Peter Zijlstra wrote: > On Thu, Jun 28, 2018 at 05:38:33AM -0700, Paul E. McKenney wrote: > > Please let me try again. > > > > The approach you are suggesting, clever though it is, disables a check > > >

Re: [PATCH tip/core/rcu 13/22] rcu: Fix grace-period hangs due to race with CPU offline

2018-06-28 Thread Peter Zijlstra
On Thu, Jun 28, 2018 at 05:38:33AM -0700, Paul E. McKenney wrote: > Please let me try again. > > The approach you are suggesting, clever though it is, disables a check https://lkml.kernel.org/r/20180627094633.gg2...@hirez.programming.kicks-ass.net Is the one we're talking about, right? That

Re: [PATCH tip/core/rcu 13/22] rcu: Fix grace-period hangs due to race with CPU offline

2018-06-28 Thread Paul E. McKenney
On Thu, Jun 28, 2018 at 10:26:53AM +0200, Peter Zijlstra wrote: > On Wed, Jun 27, 2018 at 10:13:34PM -0700, Paul E. McKenney wrote: > > On Wed, Jun 27, 2018 at 07:51:34PM +0200, Peter Zijlstra wrote: > > > On Wed, Jun 27, 2018 at 08:57:21AM -0700, Paul E. McKenney wrote: > > > > > Another variant,

Re: [PATCH tip/core/rcu 13/22] rcu: Fix grace-period hangs due to race with CPU offline

2018-06-28 Thread Peter Zijlstra
On Wed, Jun 27, 2018 at 10:13:34PM -0700, Paul E. McKenney wrote: > On Wed, Jun 27, 2018 at 07:51:34PM +0200, Peter Zijlstra wrote: > > On Wed, Jun 27, 2018 at 08:57:21AM -0700, Paul E. McKenney wrote: > > > > Another variant, which simply skips the wakeup whenever ran on an offline > > > > CPU,

Re: [PATCH tip/core/rcu 13/22] rcu: Fix grace-period hangs due to race with CPU offline

2018-06-27 Thread Paul E. McKenney
On Wed, Jun 27, 2018 at 07:51:34PM +0200, Peter Zijlstra wrote: > On Wed, Jun 27, 2018 at 08:57:21AM -0700, Paul E. McKenney wrote: > > > Another variant, which simply skips the wakeup whenever ran on an offline > > > CPU, relying on the wakeup from rcutree_migrate_callbacks() right after > > > the

Re: [PATCH tip/core/rcu 13/22] rcu: Fix grace-period hangs due to race with CPU offline

2018-06-27 Thread Peter Zijlstra
On Wed, Jun 27, 2018 at 08:57:21AM -0700, Paul E. McKenney wrote: > > Another variant, which simply skips the wakeup whenever ran on an offline > > CPU, relying on the wakeup from rcutree_migrate_callbacks() right after > > the CPU really is dead. > > Cute! ;-) > > And a much smaller change. > >

Re: [PATCH tip/core/rcu 13/22] rcu: Fix grace-period hangs due to race with CPU offline

2018-06-27 Thread Paul E. McKenney
On Wed, Jun 27, 2018 at 11:46:33AM +0200, Peter Zijlstra wrote: > On Wed, Jun 27, 2018 at 11:11:06AM +0200, Peter Zijlstra wrote: > > On Tue, Jun 26, 2018 at 04:40:04PM -0700, Paul E. McKenney wrote: > > > The options I have considered are as follows: > > > > > 2. Stick with the

Re: [PATCH tip/core/rcu 13/22] rcu: Fix grace-period hangs due to race with CPU offline

2018-06-27 Thread Paul E. McKenney
On Wed, Jun 27, 2018 at 11:11:06AM +0200, Peter Zijlstra wrote: > On Tue, Jun 26, 2018 at 04:40:04PM -0700, Paul E. McKenney wrote: > > The options I have considered are as follows: > > > 2. Stick with the no-failsafe approach, but rely on RCU's grace-period > > kthread to wake up later due

Re: [PATCH tip/core/rcu 13/22] rcu: Fix grace-period hangs due to race with CPU offline

2018-06-27 Thread Paul E. McKenney
On Wed, Jun 27, 2018 at 10:33:35AM +0200, Peter Zijlstra wrote: > On Tue, Jun 26, 2018 at 04:40:04PM -0700, Paul E. McKenney wrote: > > The options I have considered are as follows: > > > > 1. Stick with the no-failsafe approach, adding the lock as shown > > in this patch. (I obviously

Re: [PATCH tip/core/rcu 13/22] rcu: Fix grace-period hangs due to race with CPU offline

2018-06-27 Thread Peter Zijlstra
On Wed, Jun 27, 2018 at 11:11:06AM +0200, Peter Zijlstra wrote: > On Tue, Jun 26, 2018 at 04:40:04PM -0700, Paul E. McKenney wrote: > > The options I have considered are as follows: > > > 2. Stick with the no-failsafe approach, but rely on RCU's grace-period > > kthread to wake up later due

Re: [PATCH tip/core/rcu 13/22] rcu: Fix grace-period hangs due to race with CPU offline

2018-06-27 Thread Peter Zijlstra
On Tue, Jun 26, 2018 at 04:40:04PM -0700, Paul E. McKenney wrote: > The options I have considered are as follows: > 2. Stick with the no-failsafe approach, but rely on RCU's grace-period > kthread to wake up later due to its timed wait during the > force-quiescent-state process.

Re: [PATCH tip/core/rcu 13/22] rcu: Fix grace-period hangs due to race with CPU offline

2018-06-27 Thread Peter Zijlstra
On Tue, Jun 26, 2018 at 04:40:04PM -0700, Paul E. McKenney wrote: > The options I have considered are as follows: > > 1. Stick with the no-failsafe approach, adding the lock as shown > in this patch. (I obviously prefer this approach.) > > 2. Stick with the no-failsafe approach, but

Re: [PATCH tip/core/rcu 13/22] rcu: Fix grace-period hangs due to race with CPU offline

2018-06-26 Thread Paul E. McKenney
On Tue, Jun 26, 2018 at 10:32:25PM +0200, Peter Zijlstra wrote: > On Tue, Jun 26, 2018 at 01:26:15PM -0700, Paul E. McKenney wrote: > > commit 2e5b2ff4047b138d6b56e4e3ba91bc47503cdebe > > Author: Paul E. McKenney > > Date: Fri May 25 19:23:09 2018 -0700 > > > > rcu: Fix grace-period hangs

Re: [PATCH tip/core/rcu 13/22] rcu: Fix grace-period hangs due to race with CPU offline

2018-06-26 Thread Paul E. McKenney
On Tue, Jun 26, 2018 at 09:48:07PM +0200, Peter Zijlstra wrote: > On Tue, Jun 26, 2018 at 11:19:37AM -0700, Paul E. McKenney wrote: > > The initial reason for cacheline_internodealigned_in_smp was that > > some of the fields can be accessed by random CPUs, while others are > > used more

Re: [PATCH tip/core/rcu 13/22] rcu: Fix grace-period hangs due to race with CPU offline

2018-06-26 Thread Peter Zijlstra
On Tue, Jun 26, 2018 at 01:26:15PM -0700, Paul E. McKenney wrote: > commit 2e5b2ff4047b138d6b56e4e3ba91bc47503cdebe > Author: Paul E. McKenney > Date: Fri May 25 19:23:09 2018 -0700 > > rcu: Fix grace-period hangs due to race with CPU offline > > Without special fail-safe

Re: [PATCH tip/core/rcu 13/22] rcu: Fix grace-period hangs due to race with CPU offline

2018-06-26 Thread Paul E. McKenney
On Tue, Jun 26, 2018 at 11:29:50AM -0700, Paul E. McKenney wrote: > On Tue, Jun 26, 2018 at 07:51:19PM +0200, Peter Zijlstra wrote: > > On Tue, Jun 26, 2018 at 10:10:39AM -0700, Paul E. McKenney wrote: > > > Without special fail-safe quiescent-state-propagation checks, grace-period > > > hangs can

Re: [PATCH tip/core/rcu 13/22] rcu: Fix grace-period hangs due to race with CPU offline

2018-06-26 Thread Peter Zijlstra
On Tue, Jun 26, 2018 at 11:29:50AM -0700, Paul E. McKenney wrote: > On Tue, Jun 26, 2018 at 07:51:19PM +0200, Peter Zijlstra wrote: > > On Tue, Jun 26, 2018 at 10:10:39AM -0700, Paul E. McKenney wrote: > > > Without special fail-safe quiescent-state-propagation checks, grace-period > > > hangs can

Re: [PATCH tip/core/rcu 13/22] rcu: Fix grace-period hangs due to race with CPU offline

2018-06-26 Thread Peter Zijlstra
On Tue, Jun 26, 2018 at 11:19:37AM -0700, Paul E. McKenney wrote: > The initial reason for cacheline_internodealigned_in_smp was that > some of the fields can be accessed by random CPUs, while others are > used more locally, give or take our usual contention over the handling > of CPU numbers.

Re: [PATCH tip/core/rcu 13/22] rcu: Fix grace-period hangs due to race with CPU offline

2018-06-26 Thread Paul E. McKenney
On Tue, Jun 26, 2018 at 07:51:19PM +0200, Peter Zijlstra wrote: > On Tue, Jun 26, 2018 at 10:10:39AM -0700, Paul E. McKenney wrote: > > Without special fail-safe quiescent-state-propagation checks, grace-period > > hangs can result from the following scenario: > > > > 1. CPU 1 goes offline. > >

Re: [PATCH tip/core/rcu 13/22] rcu: Fix grace-period hangs due to race with CPU offline

2018-06-26 Thread Paul E. McKenney
On Tue, Jun 26, 2018 at 07:44:24PM +0200, Peter Zijlstra wrote: > On Tue, Jun 26, 2018 at 10:10:39AM -0700, Paul E. McKenney wrote: > > diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h > > index 3def94fc9c74..6683da6e4ecc 100644 > > --- a/kernel/rcu/tree.h > > +++ b/kernel/rcu/tree.h > > @@

Re: [PATCH tip/core/rcu 13/22] rcu: Fix grace-period hangs due to race with CPU offline

2018-06-26 Thread Peter Zijlstra
On Tue, Jun 26, 2018 at 10:10:39AM -0700, Paul E. McKenney wrote: > Without special fail-safe quiescent-state-propagation checks, grace-period > hangs can result from the following scenario: > > 1. CPU 1 goes offline. > > 2. Because CPU 1 is the only CPU in the system blocking the current

Re: [PATCH tip/core/rcu 13/22] rcu: Fix grace-period hangs due to race with CPU offline

2018-06-26 Thread Peter Zijlstra
On Tue, Jun 26, 2018 at 10:10:39AM -0700, Paul E. McKenney wrote: > diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h > index 3def94fc9c74..6683da6e4ecc 100644 > --- a/kernel/rcu/tree.h > +++ b/kernel/rcu/tree.h > @@ -363,6 +363,10 @@ struct rcu_state { > const char *name;

[PATCH tip/core/rcu 13/22] rcu: Fix grace-period hangs due to race with CPU offline

2018-06-26 Thread Paul E. McKenney
Without special fail-safe quiescent-state-propagation checks, grace-period hangs can result from the following scenario: 1. CPU 1 goes offline. 2. Because CPU 1 is the only CPU in the system blocking the current grace period, as soon as rcu_cleanup_dying_idle_cpu()'s call to
