Re: [PATCHv2] rcu: tree: correctly handle sparse possible CPUs

2016-05-20 Thread kbuild test robot
Hi,

[auto build test ERROR on rcu/rcu/next]
[cannot apply to v4.6 next-20160519]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Mark-Rutland/rcu-tree-correctly-handle-sparse-possible-CPUs/20160517-182533
base:   https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git 
rcu/next
config: x86_64-randconfig-s5-05172218 (attached as .config)
compiler: gcc-6 (Debian 6.1.1-1) 6.1.1 20160430
reproduce:
# save the attached .config to linux build tree
make ARCH=x86_64 

Note: the 
linux-review/Mark-Rutland/rcu-tree-correctly-handle-sparse-possible-CPUs/20160517-182533
 HEAD 63c5ae5d92c6952c029738ca1dd3382ce8b1cf4d builds fine.
  It only hurts bisectability.

All errors (new ones prefixed by >>):

   In file included from kernel/rcu/tree.c:4209:0:
   kernel/rcu/tree_plugin.h: In function 'rcu_boost_kthread_setaffinity':
   kernel/rcu/tree_plugin.h:1168:2: error: implicit declaration of function 
'for_each_leaf_node_cpu_bit' [-Werror=implicit-function-declaration]
 for_each_leaf_node_cpu_bit(rnp, cpu, bit)
 ^~
>> kernel/rcu/tree_plugin.h:1169:3: error: expected ';' before 'if'
  if ((mask & bit) && cpu != outgoingcpu)
  ^~
   kernel/rcu/tree_plugin.h:1159:16: warning: unused variable 'mask' 
[-Wunused-variable]
 unsigned long mask = rcu_rnp_online_cpus(rnp);
   ^~~~
   cc1: some warnings being treated as errors

vim +1169 kernel/rcu/tree_plugin.h

  1162		int cpu;
  1163	
  1164		if (!t)
  1165			return;
  1166		if (!zalloc_cpumask_var(&cm, GFP_KERNEL))
  1167			return;
> 1168		for_each_leaf_node_cpu_bit(rnp, cpu, bit)
> 1169			if ((mask & bit) && cpu != outgoingcpu)
  1170				cpumask_set_cpu(cpu, cm);
  1171		if (cpumask_weight(cm) == 0)
  1172			cpumask_setall(cm);

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation




Re: [PATCHv2] rcu: tree: correctly handle sparse possible CPUs

2016-05-18 Thread Paul E. McKenney
On Wed, May 18, 2016 at 07:15:09PM +0100, Mark Rutland wrote:
> On Wed, May 18, 2016 at 02:02:36PM +0200, Arnd Bergmann wrote:
> > It's the missing "possible_" that Mark mentioned in his reply on Friday.
> 
> Actually, that was this morning. My VM on my laptop had a stale date due to
> suspend/resume of the host. :/
> 
> I should be back at a real computer by Friday, and can respin the patch to fix
> the issue Andrey pointed out.
> 
> Thanks for the fixup, and sorry for the confusion!
> 
> Mark.
> 
> > Please fold the fixup below into the patch if you want to get it to build.
> > 
> > Signed-off-by: Arnd Bergmann
> > diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
> > index fd6b0f701bed..bb137b0ef6f3 100644
> > --- a/kernel/rcu/tree_plugin.h
> > +++ b/kernel/rcu/tree_plugin.h
> > @@ -1165,7 +1165,7 @@ static void rcu_boost_kthread_setaffinity(struct 
> > rcu_node *rnp, int outgoingcpu)
> > return;
> > if (!zalloc_cpumask_var(&cm, GFP_KERNEL))
> > return;
> > -   for_each_leaf_node_cpu_bit(rnp, cpu, bit)
> > +   for_each_leaf_node_possible_cpu_bit(rnp, cpu, bit)
> > if ((mask & bit) && cpu != outgoingcpu)
> > cpumask_set_cpu(cpu, cm);
> > if (cpumask_weight(cm) == 0)
> > 
> 



Re: [PATCHv2] rcu: tree: correctly handle sparse possible CPUs

2016-05-18 Thread Mark Rutland
On Wed, May 18, 2016 at 02:02:36PM +0200, Arnd Bergmann wrote:
> It's the missing "possible_" that Mark mentioned in his reply on Friday.

Actually, that was this morning. My VM on my laptop had a stale date due to
suspend/resume of the host. :/

I should be back at a real computer by Friday, and can respin the patch to fix
the issue Andrey pointed out.

Thanks for the fixup, and sorry for the confusion!

Mark.

> Please fold the fixup below into the patch if you want to get it to build.
> 
> Signed-off-by: Arnd Bergmann  
> diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
> index fd6b0f701bed..bb137b0ef6f3 100644
> --- a/kernel/rcu/tree_plugin.h
> +++ b/kernel/rcu/tree_plugin.h
> @@ -1165,7 +1165,7 @@ static void rcu_boost_kthread_setaffinity(struct 
> rcu_node *rnp, int outgoingcpu)
>   return;
>   if (!zalloc_cpumask_var(&cm, GFP_KERNEL))
>   return;
> - for_each_leaf_node_cpu_bit(rnp, cpu, bit)
> + for_each_leaf_node_possible_cpu_bit(rnp, cpu, bit)
>   if ((mask & bit) && cpu != outgoingcpu)
>   cpumask_set_cpu(cpu, cm);
>   if (cpumask_weight(cm) == 0)
> 


Re: [PATCHv2] rcu: tree: correctly handle sparse possible CPUs

2016-05-18 Thread Arnd Bergmann
On Tuesday 17 May 2016 17:12:51 Paul E. McKenney wrote:
> And some build errors:
> 
> In file included from 
> /home/paulmck/public_git/linux-rcu/kernel/rcu/tree.c:4209:0:
> /home/paulmck/public_git/linux-rcu/kernel/rcu/tree_plugin.h: In function 
> ‘rcu_boost_kthread_setaffinity’:
> /home/paulmck/public_git/linux-rcu/kernel/rcu/tree_plugin.h:1168:2: error: 
> implicit declaration of function ‘for_each_leaf_node_cpu_bit’ 
> [-Werror=implicit-function-declaration]
>   for_each_leaf_node_cpu_bit(rnp, cpu, bit)
>   ^
> /home/paulmck/public_git/linux-rcu/kernel/rcu/tree_plugin.h:1169:3: error: 
> expected ‘;’ before ‘if’
>if ((mask & bit) && cpu != outgoingcpu)
>^
> /home/paulmck/public_git/linux-rcu/kernel/rcu/tree_plugin.h:1159:16: warning: 
> unused variable ‘mask’ [-Wunused-variable]
>   unsigned long mask = rcu_rnp_online_cpus(rnp);
> ^
> 
> Please see below for the .config.
> 
> I have dropped the patch from my tree, looking forward to getting an
> update that fixes the build errors.
> 

It's the missing "possible_" that Mark mentioned in his reply on Friday.

Please fold the fixup below into the patch if you want to get it to build.

Signed-off-by: Arnd Bergmann 

Re: [PATCHv2] rcu: tree: correctly handle sparse possible CPUs

2016-05-17 Thread Mark Rutland
On Tue, May 17, 2016 at 12:01:06PM -0700, Paul E. McKenney wrote:
> On Tue, May 17, 2016 at 11:22:10AM +0100, Mark Rutland wrote:
> >  /*
> > + * Iterate over all possible CPUs in a leaf RCU node.
> > + */
> > +#define for_each_leaf_node_possible_cpu(rnp, cpu) \
> > +   for ((cpu) = rnp->grplo; \
> > +cpu <= rnp->grphi; \
> > +cpu = cpumask_next((cpu), cpu_possible_mask))
> 
> What if the rnp->grplo corresponds to a non-existent CPU?

Good point, I had evidently not considered that.

> Would something like this handle that possibility?
> 
> +#define for_each_leaf_node_possible_cpu(rnp, cpu) \
> + for ((cpu) = cpumask_next(rnp->grplo - 1, cpu_possible_mask); \
> +  cpu <= rnp->grphi; \
> +  cpu = cpumask_next((cpu), cpu_possible_mask))
> 
> Or maybe like this, with less duplicated code but very strange style:
> 
> +#define for_each_leaf_node_possible_cpu(rnp, cpu) \
> + for ((cpu) = rnp->grplo - 1; \
> +  cpu = cpumask_next((cpu), cpu_possible_mask), cpu <= rnp->grphi; 1)
> 
> The first one is probably far better, assuming that it works, but I could
> not resist inflicting the second one on you.  ;-)

:)

Those both look like they should work, I'll fold the former in.

> > +/*
> > + * Iterate over all possible CPUs in a leaf RCU node, at each step 
> > providing a
> > + * bit for comparison against rcu_node bitmasks.
> > + */
> > +#define for_each_leaf_node_possible_cpu_bit(rnp, cpu, bit) \
> > +   for ((cpu) = rnp->grplo, (bit) = 1; \
> > +cpu <= rnp->grphi; \
> > +cpu = cpumask_next((cpu), cpu_possible_mask), \
> > +  (bit) = 1UL << (cpu - rnp->grplo))
> 
> Same question here.

Likewise.

I'll also see about fixing the build issue you spotted in the other reply; that
appears to be a typo (missing 'possible_' in the macro invocation).

I'm away from my development machine at the moment, so that may not appear
until next week.

Thanks,
Mark.


Re: [PATCHv2] rcu: tree: correctly handle sparse possible CPUs

2016-05-17 Thread Paul E. McKenney
On Tue, May 17, 2016 at 12:01:06PM -0700, Paul E. McKenney wrote:
> On Tue, May 17, 2016 at 11:22:10AM +0100, Mark Rutland wrote:
> > In many cases in the RCU tree code, we iterate over the set of CPUs for
> > a leaf node described by rcu_node::grplo and rcu_node::grphi, checking
> > per-cpu data for each CPU in this range. However, if the set of possible
> > CPUs is sparse, some CPUs described in this range are not possible, and
> > thus no per-cpu region will have been allocated (or initialised) for
> > them by the generic percpu code.
> > 
> > Erroneous accesses to a per-cpu area for these !possible CPUs may fault
> > or may hit other data depending on the address generated when the
> > erroneous per cpu offset is applied. In practice, both cases have been
> > observed on arm64 hardware (the former being silent, but detectable with
> > additional patches).
> > 
> > To avoid issues resulting from this, we must iterate over the set of
> > *possible* cpus for a given leaf node. This patch adds new helpers to
> > enable this (also unifying and simplifying some related bitmask
> > manipulation logic), and moves the RCU tree code over to them.
> > 
> > Without this patch, running reboot at a shell can result in an oops
> > like:
> 
> Very good, this one applies cleanly and I have queued it for review
> and testing.
> 
> One question below, though.

And some build errors:

In file included from 
/home/paulmck/public_git/linux-rcu/kernel/rcu/tree.c:4209:0:
/home/paulmck/public_git/linux-rcu/kernel/rcu/tree_plugin.h: In function 
‘rcu_boost_kthread_setaffinity’:
/home/paulmck/public_git/linux-rcu/kernel/rcu/tree_plugin.h:1168:2: error: 
implicit declaration of function ‘for_each_leaf_node_cpu_bit’ 
[-Werror=implicit-function-declaration]
  for_each_leaf_node_cpu_bit(rnp, cpu, bit)
  ^
/home/paulmck/public_git/linux-rcu/kernel/rcu/tree_plugin.h:1169:3: error: 
expected ‘;’ before ‘if’
   if ((mask & bit) && cpu != outgoingcpu)
   ^
/home/paulmck/public_git/linux-rcu/kernel/rcu/tree_plugin.h:1159:16: warning: 
unused variable ‘mask’ [-Wunused-variable]
  unsigned long mask = rcu_rnp_online_cpus(rnp);
^

Please see below for the .config.

I have dropped the patch from my tree, looking forward to getting an
update that fixes the build errors.

Thanx, Paul



#
# Automatically generated file; DO NOT EDIT.
# Linux/x86 4.6.0-rc2 Kernel Configuration
#
CONFIG_64BIT=y
CONFIG_X86_64=y
CONFIG_X86=y
CONFIG_INSTRUCTION_DECODER=y
CONFIG_PERF_EVENTS_INTEL_UNCORE=y
CONFIG_OUTPUT_FORMAT="elf64-x86-64"
CONFIG_ARCH_DEFCONFIG="arch/x86/configs/x86_64_defconfig"
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_MMU=y
CONFIG_ARCH_MMAP_RND_BITS_MIN=28
CONFIG_ARCH_MMAP_RND_BITS_MAX=32
CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MIN=8
CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MAX=16
CONFIG_NEED_DMA_MAP_STATE=y
CONFIG_NEED_SG_DMA_LENGTH=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_BUG_RELATIVE_POINTERS=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_ARCH_HAS_CPU_RELAX=y
CONFIG_ARCH_HAS_CACHE_LINE_SIZE=y
CONFIG_HAVE_SETUP_PER_CPU_AREA=y
CONFIG_NEED_PER_CPU_EMBED_FIRST_CHUNK=y
CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK=y
CONFIG_ARCH_HIBERNATION_POSSIBLE=y
CONFIG_ARCH_SUSPEND_POSSIBLE=y
CONFIG_ARCH_WANT_HUGE_PMD_SHARE=y
CONFIG_ARCH_WANT_GENERAL_HUGETLB=y
CONFIG_ZONE_DMA32=y
CONFIG_AUDIT_ARCH=y
CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING=y
CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y
CONFIG_HAVE_INTEL_TXT=y
CONFIG_X86_64_SMP=y
CONFIG_ARCH_HWEIGHT_CFLAGS="-fcall-saved-rdi -fcall-saved-rsi -fcall-saved-rdx 
-fcall-saved-rcx -fcall-saved-r8 -fcall-saved-r9 -fcall-saved-r10 
-fcall-saved-r11"
CONFIG_ARCH_SUPPORTS_UPROBES=y
CONFIG_FIX_EARLYCON_MEM=y
CONFIG_DEBUG_RODATA=y
CONFIG_PGTABLE_LEVELS=4
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"
CONFIG_IRQ_WORK=y
CONFIG_BUILDTIME_EXTABLE_SORT=y

#
# General setup
#
CONFIG_INIT_ENV_ARG_LIMIT=32
CONFIG_CROSS_COMPILE=""
# CONFIG_COMPILE_TEST is not set
CONFIG_LOCALVERSION=""
# CONFIG_LOCALVERSION_AUTO is not set
CONFIG_HAVE_KERNEL_GZIP=y
CONFIG_HAVE_KERNEL_BZIP2=y
CONFIG_HAVE_KERNEL_LZMA=y
CONFIG_HAVE_KERNEL_XZ=y
CONFIG_HAVE_KERNEL_LZO=y
CONFIG_HAVE_KERNEL_LZ4=y
CONFIG_KERNEL_GZIP=y
# CONFIG_KERNEL_BZIP2 is not set
# CONFIG_KERNEL_LZMA is not set
# CONFIG_KERNEL_XZ is not set
# CONFIG_KERNEL_LZO is not set
# CONFIG_KERNEL_LZ4 is not set
CONFIG_DEFAULT_HOSTNAME="(none)"
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_SYSVIPC_SYSCTL=y
CONFIG_POSIX_MQUEUE=y
CONFIG_POSIX_MQUEUE_SYSCTL=y
CONFIG_CROSS_MEMORY_ATTACH=y
CONFIG_FHANDLE=y
CONFIG_USELIB=y
CONFIG_AUDIT=y
CONFIG_HAVE_ARCH_AUDITSYSCALL=y
CONFIG_AUDITSYSCALL=y
CONFIG_AUDIT_WATCH=y
CONFIG_AUDIT_TREE=y

#
# IRQ subsystem
#
CONFIG_GENERIC_IRQ_PROBE=y
CONFIG_GENERIC_IRQ_SHOW=y
CONFIG_GENERIC_PENDING_IRQ=y
CONFIG_IRQ_DOMAIN=y
CONFI

Re: [PATCHv2] rcu: tree: correctly handle sparse possible CPUs

2016-05-17 Thread Paul E. McKenney
On Tue, May 17, 2016 at 11:22:10AM +0100, Mark Rutland wrote:
> In many cases in the RCU tree code, we iterate over the set of CPUs for
> a leaf node described by rcu_node::grplo and rcu_node::grphi, checking
> per-cpu data for each CPU in this range. However, if the set of possible
> CPUs is sparse, some CPUs described in this range are not possible, and
> thus no per-cpu region will have been allocated (or initialised) for
> them by the generic percpu code.
> 
> Erroneous accesses to a per-cpu area for these !possible CPUs may fault
> or may hit other data depending on the address generated when the
> erroneous per cpu offset is applied. In practice, both cases have been
> observed on arm64 hardware (the former being silent, but detectable with
> additional patches).
> 
> To avoid issues resulting from this, we must iterate over the set of
> *possible* cpus for a given leaf node. This patch adds new helpers to
> enable this (also unifying and simplifying some related bitmask
> manipulation logic), and moves the RCU tree code over to them.
> 
> Without this patch, running reboot at a shell can result in an oops
> like:

Very good, this one applies cleanly and I have queued it for review
and testing.

One question below, though.

Thanx, Paul

> [ 3369.075979] Unable to handle kernel paging request at virtual address 
> ff8008b21b4c
> [ 3369.083881] pgd = ffc3ecdda000
> [ 3369.087270] [ff8008b21b4c] *pgd=0083eca48003, 
> *pud=0083eca48003, *pmd=
> [ 3369.096222] Internal error: Oops: 9607 [#1] PREEMPT SMP
> [ 3369.101781] Modules linked in:
> [ 3369.104825] CPU: 2 PID: 1817 Comm: NetworkManager Tainted: GW  
>  4.6.0+ #3
> [ 3369.121239] task: ffc0fa13e000 ti: ffc3eb94 task.ti: 
> ffc3eb94
> [ 3369.128708] PC is at sync_rcu_exp_select_cpus+0x188/0x510
> [ 3369.134094] LR is at sync_rcu_exp_select_cpus+0x104/0x510
> [ 3369.139479] pc : [] lr : [] pstate: 
> 21c5
> [ 3369.146860] sp : ffc3eb9435a0
> [ 3369.150162] x29: ffc3eb9435a0 x28: ff8008be4f88
> [ 3369.155465] x27: ff8008b66c80 x26: ffc3eceb2600
> [ 3369.160767] x25: 0001 x24: ff8008be4f88
> [ 3369.166070] x23: ff8008b51c3c x22: ff8008b66c80
> [ 3369.171371] x21: 0001 x20: ff8008b21b40
> [ 3369.176673] x19: ff8008b66c80 x18: 
> [ 3369.181975] x17: 007fa951a010 x16: ff80086a30f0
> [ 3369.187278] x15: 007fa9505590 x14: 
> [ 3369.192580] x13: ff8008b51000 x12: ffc3eb94
> [ 3369.197882] x11: 0006 x10: ff8008b51b78
> [ 3369.203184] x9 : 0001 x8 : ff8008be4000
> [ 3369.208486] x7 : ff8008b21b40 x6 : 1003
> [ 3369.213788] x5 :  x4 : ff8008b27280
> [ 3369.219090] x3 : ff8008b21b4c x2 : 0001
> [ 3369.224406] x1 : 0001 x0 : 0140
> ...
> [ 3369.972257] [] sync_rcu_exp_select_cpus+0x188/0x510
> [ 3369.978685] [] synchronize_rcu_expedited+0x64/0xa8
> [ 3369.985026] [] synchronize_net+0x24/0x30
> [ 3369.990499] [] dev_deactivate_many+0x28c/0x298
> [ 3369.996493] [] __dev_close_many+0x60/0xd0
> [ 3370.002052] [] __dev_close+0x28/0x40
> [ 3370.007178] [] __dev_change_flags+0x8c/0x158
> [ 3370.012999] [] dev_change_flags+0x20/0x60
> [ 3370.018558] [] do_setlink+0x288/0x918
> [ 3370.023771] [] rtnl_newlink+0x398/0x6a8
> [ 3370.029158] [] rtnetlink_rcv_msg+0xe4/0x220
> [ 3370.034891] [] netlink_rcv_skb+0xc4/0xf8
> [ 3370.040364] [] rtnetlink_rcv+0x2c/0x40
> [ 3370.045663] [] netlink_unicast+0x160/0x238
> [ 3370.051309] [] netlink_sendmsg+0x2f0/0x358
> [ 3370.056956] [] sock_sendmsg+0x18/0x30
> [ 3370.062168] [] ___sys_sendmsg+0x26c/0x280
> [ 3370.067728] [] __sys_sendmsg+0x44/0x88
> [ 3370.073027] [] SyS_sendmsg+0x10/0x20
> [ 3370.078153] [] el0_svc_naked+0x24/0x28
> 
> Signed-off-by: Mark Rutland 
> Reported-by: Dennis Chen 
> Cc: Paul E. McKenney 
> Cc: Catalin Marinas 
> Cc: Josh Triplett 
> Cc: Lai Jiangshan 
> Cc: Mathieu Desnoyers 
> Cc: Steve Capper 
> Cc: Steven Rostedt 
> Cc: Will Deacon 
> Cc: linux-kernel@vger.kernel.org
> ---
>  kernel/rcu/tree.c        | 19 +--
>  kernel/rcu/tree.h        | 18 ++
>  kernel/rcu/tree_exp.h    | 12
>  kernel/rcu/tree_plugin.h |  5 +++--
>  4 files changed, 34 insertions(+), 20 deletions(-)
> 
> Since v1 [1]:
>  * rebase to the -rcu rcu/dev branch.
>  * replace all occurrences missed by v1.
>  * s/CPUS/CPUs/ in commit message. Gah.
> 
> Paul, I've given this a spin on arm64 and I've build-tested for x86.
> Things look fine, though this hasn't seen a thorough beating.
> 
> Mark.
> 
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index afdcb7b..65ee19e 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -1279,15 +1279,16 @@ static void rcu_check_gp_kthread_starvation(struct 
> rcu_state *rsp)
>  static void rcu_dum

[PATCHv2] rcu: tree: correctly handle sparse possible CPUs

2016-05-17 Thread Mark Rutland
In many cases in the RCU tree code, we iterate over the set of CPUs for
a leaf node described by rcu_node::grplo and rcu_node::grphi, checking
per-cpu data for each CPU in this range. However, if the set of possible
CPUs is sparse, some CPUs described in this range are not possible, and
thus no per-cpu region will have been allocated (or initialised) for
them by the generic percpu code.

Erroneous accesses to a per-cpu area for these !possible CPUs may fault
or may hit other data depending on the address generated when the
erroneous per cpu offset is applied. In practice, both cases have been
observed on arm64 hardware (the former being silent, but detectable with
additional patches).

To avoid issues resulting from this, we must iterate over the set of
*possible* cpus for a given leaf node. This patch adds new helpers to
enable this (also unifying and simplifying some related bitmask
manipulation logic), and moves the RCU tree code over to them.

Without this patch, running reboot at a shell can result in an oops
like:

[ 3369.075979] Unable to handle kernel paging request at virtual address 
ff8008b21b4c
[ 3369.083881] pgd = ffc3ecdda000
[ 3369.087270] [ff8008b21b4c] *pgd=0083eca48003, *pud=0083eca48003, 
*pmd=
[ 3369.096222] Internal error: Oops: 9607 [#1] PREEMPT SMP
[ 3369.101781] Modules linked in:
[ 3369.104825] CPU: 2 PID: 1817 Comm: NetworkManager Tainted: GW   
4.6.0+ #3
[ 3369.121239] task: ffc0fa13e000 ti: ffc3eb94 task.ti: 
ffc3eb94
[ 3369.128708] PC is at sync_rcu_exp_select_cpus+0x188/0x510
[ 3369.134094] LR is at sync_rcu_exp_select_cpus+0x104/0x510
[ 3369.139479] pc : [] lr : [] pstate: 
21c5
[ 3369.146860] sp : ffc3eb9435a0
[ 3369.150162] x29: ffc3eb9435a0 x28: ff8008be4f88
[ 3369.155465] x27: ff8008b66c80 x26: ffc3eceb2600
[ 3369.160767] x25: 0001 x24: ff8008be4f88
[ 3369.166070] x23: ff8008b51c3c x22: ff8008b66c80
[ 3369.171371] x21: 0001 x20: ff8008b21b40
[ 3369.176673] x19: ff8008b66c80 x18: 
[ 3369.181975] x17: 007fa951a010 x16: ff80086a30f0
[ 3369.187278] x15: 007fa9505590 x14: 
[ 3369.192580] x13: ff8008b51000 x12: ffc3eb94
[ 3369.197882] x11: 0006 x10: ff8008b51b78
[ 3369.203184] x9 : 0001 x8 : ff8008be4000
[ 3369.208486] x7 : ff8008b21b40 x6 : 1003
[ 3369.213788] x5 :  x4 : ff8008b27280
[ 3369.219090] x3 : ff8008b21b4c x2 : 0001
[ 3369.224406] x1 : 0001 x0 : 0140
...
[ 3369.972257] [] sync_rcu_exp_select_cpus+0x188/0x510
[ 3369.978685] [] synchronize_rcu_expedited+0x64/0xa8
[ 3369.985026] [] synchronize_net+0x24/0x30
[ 3369.990499] [] dev_deactivate_many+0x28c/0x298
[ 3369.996493] [] __dev_close_many+0x60/0xd0
[ 3370.002052] [] __dev_close+0x28/0x40
[ 3370.007178] [] __dev_change_flags+0x8c/0x158
[ 3370.012999] [] dev_change_flags+0x20/0x60
[ 3370.018558] [] do_setlink+0x288/0x918
[ 3370.023771] [] rtnl_newlink+0x398/0x6a8
[ 3370.029158] [] rtnetlink_rcv_msg+0xe4/0x220
[ 3370.034891] [] netlink_rcv_skb+0xc4/0xf8
[ 3370.040364] [] rtnetlink_rcv+0x2c/0x40
[ 3370.045663] [] netlink_unicast+0x160/0x238
[ 3370.051309] [] netlink_sendmsg+0x2f0/0x358
[ 3370.056956] [] sock_sendmsg+0x18/0x30
[ 3370.062168] [] ___sys_sendmsg+0x26c/0x280
[ 3370.067728] [] __sys_sendmsg+0x44/0x88
[ 3370.073027] [] SyS_sendmsg+0x10/0x20
[ 3370.078153] [] el0_svc_naked+0x24/0x28

Signed-off-by: Mark Rutland 
Reported-by: Dennis Chen 
Cc: Paul E. McKenney 
Cc: Catalin Marinas 
Cc: Josh Triplett 
Cc: Lai Jiangshan 
Cc: Mathieu Desnoyers 
Cc: Steve Capper 
Cc: Steven Rostedt 
Cc: Will Deacon 
Cc: linux-kernel@vger.kernel.org
---
 kernel/rcu/tree.c        | 19 +--
 kernel/rcu/tree.h        | 18 ++
 kernel/rcu/tree_exp.h    | 12
 kernel/rcu/tree_plugin.h |  5 +++--
 4 files changed, 34 insertions(+), 20 deletions(-)

Since v1 [1]:
 * rebase to the -rcu rcu/dev branch.
 * replace all occurrences missed by v1.
 * s/CPUS/CPUs/ in commit message. Gah.

Paul, I've given this a spin on arm64 and I've build-tested for x86.
Things look fine, though this hasn't seen a thorough beating.

Mark.

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index afdcb7b..65ee19e 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -1279,15 +1279,16 @@ static void rcu_check_gp_kthread_starvation(struct rcu_state *rsp)
 static void rcu_dump_cpu_stacks(struct rcu_state *rsp)
 {
 	int cpu;
+	unsigned long bit;
 	unsigned long flags;
 	struct rcu_node *rnp;
 
 	rcu_for_each_leaf_node(rsp, rnp) {
 		raw_spin_lock_irqsave_rcu_node(rnp, flags);
 		if (rnp->qsmask != 0) {
-			for (cpu = 0; cpu <= rnp->grphi - rnp->grplo; cpu++)
-				if (rnp->qsmask & (1UL << cpu))
-