Re: Boot failure on emev2/kzm9d (was: Re: [PATCH v2 11/11] mm/slab: lockless decision to grow cache)

2016-06-30 Thread Paul E. McKenney
On Thu, Jun 30, 2016 at 03:31:57PM +0200, Geert Uytterhoeven wrote:
> [ . . . ]

Re: Boot failure on emev2/kzm9d (was: Re: [PATCH v2 11/11] mm/slab: lockless decision to grow cache)

2016-06-30 Thread Geert Uytterhoeven
Hi Paul,

On Thu, Jun 30, 2016 at 6:52 PM, Paul E. McKenney wrote:
> On Thu, Jun 30, 2016 at 05:53:42PM +0200, Geert Uytterhoeven wrote:
>> +void rcu_dump_rcu_sched_tree(void)
>> +{
>> +	struct rcu_head rh;
>> +	unsigned long flags;
>> +
>> +	rcu_dump_rcu_node_tree(&rcu_sched_state);  /* Initial state. */
>> +	local_irq_save(flags);
>> +	// call_rcu(&rh, do_nothing_cb);
>> +	local_irq_restore(flags);
>> +	// schedule_timeout_uninterruptible(5 * HZ);  /* Or whatever delay. */
>> +	rcu_dump_rcu_node_tree(&rcu_sched_state); /* GP state. */
>> +	//synchronize_sched();  /* Probably hangs. */
>> +	//rcu_barrier();  /* Drop RCU's references to rh before return. */
>> +}

>>
>> When enabling any of the 4 commented-out lines in rcu_dump_rcu_sched_tree(),
>> it will lock up.
>
> OK, but that includes schedule_timeout_uninterruptible(5 * HZ), right?

Yes it does.

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
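For context, a minimal sketch of where this helper is meant to be hooked in, per Paul's
suggestion to invoke it just before the offending synchronize_sched(); the placement
paraphrases Joonsoo's mm/slab.c debug hunk quoted later in this archive and is
illustrative only:

	/* Illustrative hook point in setup_kmem_cache_node() (mm/slab.c):
	 * dump RCU state immediately before the synchronize_sched() call
	 * that never returns on this board. */
	if (force_change) {
		rcu_dump_rcu_sched_tree();
		synchronize_sched();
	}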


Re: Boot failure on emev2/kzm9d (was: Re: [PATCH v2 11/11] mm/slab: lockless decision to grow cache)

2016-06-30 Thread Paul E. McKenney
On Thu, Jun 30, 2016 at 05:53:42PM +0200, Geert Uytterhoeven wrote:
> [ . . . ]

Re: Boot failure on emev2/kzm9d (was: Re: [PATCH v2 11/11] mm/slab: lockless decision to grow cache)

2016-06-30 Thread Geert Uytterhoeven
Hi Paul,

On Thu, Jun 30, 2016 at 3:24 PM, Paul E. McKenney wrote:
> [ . . . ]

Re: Boot failure on emev2/kzm9d (was: Re: [PATCH v2 11/11] mm/slab: lockless decision to grow cache)

2016-06-30 Thread Paul E. McKenney
On Thu, Jun 30, 2016 at 09:58:51AM +0200, Geert Uytterhoeven wrote:
> Hi Joonsoo,
> 
> On Thu, Jun 30, 2016 at 9:47 AM, Joonsoo Kim  wrote:
> > On Wed, Jun 29, 2016 at 11:12:08AM -0700, Paul E. McKenney wrote:
> >> On Wed, Jun 29, 2016 at 07:52:06PM +0200, Geert Uytterhoeven wrote:
> >> > On Wed, Jun 29, 2016 at 6:44 PM, Paul E. McKenney
> >> >  wrote:
> >> > > On Wed, Jun 29, 2016 at 04:54:44PM +0200, Geert Uytterhoeven wrote:
> >> > >> On Thu, Jun 23, 2016 at 4:53 AM, Paul E. McKenney
> >> > >>  wrote:
> >> > >> > On Wed, Jun 22, 2016 at 07:47:42PM -0700, Paul E. McKenney wrote:
> >> > >
> >> > > [ . . . ]
> >> > >
> >> > >> > @@ -4720,11 +4720,18 @@ static void __init 
> >> > >> > rcu_dump_rcu_node_tree(struct rcu_state *rsp)
> >> > >> > pr_info(" ");
> >> > >> > level = rnp->level;
> >> > >> > }
> >> > >> > -   pr_cont("%d:%d ^%d  ", rnp->grplo, rnp->grphi, 
> >> > >> > rnp->grpnum);
> >> > >> > +   pr_cont("%d:%d/%#lx/%#lx ^%d  ", rnp->grplo, 
> >> > >> > rnp->grphi,
> >> > >> > +   rnp->qsmask,
> >> > >> > +   rnp->qsmaskinit | rnp->qsmaskinitnext, 
> >> > >> > rnp->grpnum);
> >> > >> > }
> >> > >> > pr_cont("\n");
> >> > >> >  }
> >> > >>
> >> > >> For me it always crashes during the 37th call of synchronize_sched() 
> >> > >> in
> >> > >> setup_kmem_cache_node(), which is the first call after secondary CPU 
> >> > >> bring up.
> >> > >> With your and my debug code, I get:
> >> > >>
> >> > >>   CPU: Testing write buffer coherency: ok
> >> > >>   CPU0: thread -1, cpu 0, socket 0, mpidr 8000
> >> > >>   Setting up static identity map for 0x4010 - 0x40100058
> >> > >>   cnt = 36, sync
> >> > >>   CPU1: thread -1, cpu 1, socket 0, mpidr 8001
> >> > >>   Brought up 2 CPUs
> >> > >>   SMP: Total of 2 processors activated (2132.00 BogoMIPS).
> >> > >>   CPU: All CPU(s) started in SVC mode.
> >> > >>   rcu_node tree layout dump
> >> > >>0:1/0x0/0x3 ^0
> >> > >
> >> > > Thank you for running this!
> >> > >
> >> > > OK, so RCU knows about both CPUs (the "0x3"), and the previous
> >> > > grace period has seen quiescent states from both of them (the "0x0").
> >> > > That would indicate that your synchronize_sched() showed up when RCU 
> >> > > was
> >> > > idle, so it had to start a new grace period.  It also rules out failure
> >> > > modes where RCU thinks that there are more CPUs than really exist.
> >> > > (Don't laugh, such things have really happened.)
> >> > >
> >> > >>   devtmpfs: initialized
> >> > >>   VFP support v0.3: implementor 41 architecture 3 part 30 variant 9 
> >> > >> rev 1
> >> > >>   clocksource: jiffies: mask: 0x max_cycles: 0x,
> >> > >> max_idle_ns: 1911260446275 ns
> >> > >>
> >> > >> I hope it helps. Thanks!
> >> > >
> >> > > I am going to guess that this was the first grace period since the 
> >> > > second
> >> > > CPU came online.  When there only on CPU online, synchronize_sched()
> >> > > is a no-op.
> >> > >
> >> > > OK, this showed some things that aren't a problem.  What might the
> >> > > problem be?
> >> > >
> >> > > o   The grace-period kthread has not yet started.  It -should- 
> >> > > start
> >> > > at early_initcall() time, but who knows?  Adding code to print
> >> > > out that kthread's task_struct address.
> >> > >
> >> > > o   The grace-period kthread might not be responding to wakeups.
> >> > > Checking this requires that a grace period be in progress,
> >> > > so please put a call_rcu_sched() just before the call to
> >> > > rcu_dump_rcu_node_tree().  (Sample code below.)  Adding code
> >> > > to my patch to print out more GP-kthread state as well.
> >> > >
> >> > > o   One of the CPUs might not be responding to RCU.  That -should-
> >> > > result in an RCU CPU stall warning, so I will ignore this
> >> > > possibility for the moment.
> >> > >
> >> > > That said, do you have some way to determine whether scheduling
> >> > > clock interrupts are really happening?  Without these 
> >> > > interrupts,
> >> > > no RCU CPU stall warnings.
> >> >
> >> > I believe there are no clocksources yet. The jiffies clocksource is the 
> >> > first
> >> > clocksource found, and that happens after the first call to
> >> > synchronize_sched(), cfr. my dmesg snippet above.
> >> >
> >> > In a working boot:
> >> > # cat /sys/bus/clocksource/devices/clocksource0/available_clocksource
> >> > e018.timer jiffies
> >> > # cat /sys/bus/clocksource/devices/clocksource0/current_clocksource
> >> > e018.timer
> >>
> >> Ah!  But if there is no jiffies clocksource, then schedule_timeout()
> >> and friends will never return, correct?  If so, I guarantee you that
> >> synchronize_sched() will unconditionally hang.
> >>
> >> So if I 

Re: Boot failure on emev2/kzm9d (was: Re: [PATCH v2 11/11] mm/slab: lockless decision to grow cache)

2016-06-30 Thread Geert Uytterhoeven
Hi Joonsoo,

On Thu, Jun 30, 2016 at 9:47 AM, Joonsoo Kim wrote:
> [ . . . ]

Re: Boot failure on emev2/kzm9d (was: Re: [PATCH v2 11/11] mm/slab: lockless decision to grow cache)

2016-06-30 Thread Joonsoo Kim
On Wed, Jun 29, 2016 at 11:12:08AM -0700, Paul E. McKenney wrote:
> On Wed, Jun 29, 2016 at 07:52:06PM +0200, Geert Uytterhoeven wrote:
> > Hi Paul,
> > 
> > On Wed, Jun 29, 2016 at 6:44 PM, Paul E. McKenney
> >  wrote:
> > > On Wed, Jun 29, 2016 at 04:54:44PM +0200, Geert Uytterhoeven wrote:
> > >> On Thu, Jun 23, 2016 at 4:53 AM, Paul E. McKenney
> > >>  wrote:
> > >> > On Wed, Jun 22, 2016 at 07:47:42PM -0700, Paul E. McKenney wrote:
> > >
> > > [ . . . ]
> > >
> > >> > @@ -4720,11 +4720,18 @@ static void __init 
> > >> > rcu_dump_rcu_node_tree(struct rcu_state *rsp)
> > >> > pr_info(" ");
> > >> > level = rnp->level;
> > >> > }
> > >> > -   pr_cont("%d:%d ^%d  ", rnp->grplo, rnp->grphi, 
> > >> > rnp->grpnum);
> > >> > +   pr_cont("%d:%d/%#lx/%#lx ^%d  ", rnp->grplo, 
> > >> > rnp->grphi,
> > >> > +   rnp->qsmask,
> > >> > +   rnp->qsmaskinit | rnp->qsmaskinitnext, 
> > >> > rnp->grpnum);
> > >> > }
> > >> > pr_cont("\n");
> > >> >  }
> > >>
> > >> For me it always crashes during the 37th call of synchronize_sched() in
> > >> setup_kmem_cache_node(), which is the first call after secondary CPU 
> > >> bring up.
> > >> With your and my debug code, I get:
> > >>
> > >>   CPU: Testing write buffer coherency: ok
> > >>   CPU0: thread -1, cpu 0, socket 0, mpidr 8000
> > >>   Setting up static identity map for 0x4010 - 0x40100058
> > >>   cnt = 36, sync
> > >>   CPU1: thread -1, cpu 1, socket 0, mpidr 8001
> > >>   Brought up 2 CPUs
> > >>   SMP: Total of 2 processors activated (2132.00 BogoMIPS).
> > >>   CPU: All CPU(s) started in SVC mode.
> > >>   rcu_node tree layout dump
> > >>0:1/0x0/0x3 ^0
> > >
> > > Thank you for running this!
> > >
> > > OK, so RCU knows about both CPUs (the "0x3"), and the previous
> > > grace period has seen quiescent states from both of them (the "0x0").
> > > That would indicate that your synchronize_sched() showed up when RCU was
> > > idle, so it had to start a new grace period.  It also rules out failure
> > > modes where RCU thinks that there are more CPUs than really exist.
> > > (Don't laugh, such things have really happened.)
> > >
> > >>   devtmpfs: initialized
> > >>   VFP support v0.3: implementor 41 architecture 3 part 30 variant 9 rev 1
> > >>   clocksource: jiffies: mask: 0x max_cycles: 0x,
> > >> max_idle_ns: 1911260446275 ns
> > >>
> > >> I hope it helps. Thanks!
> > >
> > > I am going to guess that this was the first grace period since the second
> > > CPU came online.  When there only on CPU online, synchronize_sched()
> > > is a no-op.
> > >
> > > OK, this showed some things that aren't a problem.  What might the
> > > problem be?
> > >
> > > o   The grace-period kthread has not yet started.  It -should- start
> > > at early_initcall() time, but who knows?  Adding code to print
> > > out that kthread's task_struct address.
> > >
> > > o   The grace-period kthread might not be responding to wakeups.
> > > Checking this requires that a grace period be in progress,
> > > so please put a call_rcu_sched() just before the call to
> > > rcu_dump_rcu_node_tree().  (Sample code below.)  Adding code
> > > to my patch to print out more GP-kthread state as well.
> > >
> > > o   One of the CPUs might not be responding to RCU.  That -should-
> > > result in an RCU CPU stall warning, so I will ignore this
> > > possibility for the moment.
> > >
> > > That said, do you have some way to determine whether scheduling
> > > clock interrupts are really happening?  Without these interrupts,
> > > no RCU CPU stall warnings.
> > 
> > I believe there are no clocksources yet. The jiffies clocksource is the 
> > first
> > clocksource found, and that happens after the first call to
> > synchronize_sched(), cfr. my dmesg snippet above.
> > 
> > In a working boot:
> > # cat /sys/bus/clocksource/devices/clocksource0/available_clocksource
> > e018.timer jiffies
> > # cat /sys/bus/clocksource/devices/clocksource0/current_clocksource
> > e018.timer
> 
> Ah!  But if there is no jiffies clocksource, then schedule_timeout()
> and friends will never return, correct?  If so, I guarantee you that
> synchronize_sched() will unconditionally hang.
> 
> So if I understand correctly, the fix is to get the jiffies clocksource
> running before the first call to synchronize_sched().

If so, the following change would be sufficient.

Thanks.

-->8---
diff --git a/kernel/time/jiffies.c b/kernel/time/jiffies.c
index 555e21f..4f6471f 100644
--- a/kernel/time/jiffies.c
+++ b/kernel/time/jiffies.c
@@ -98,7 +98,7 @@ static int __init init_jiffies_clocksource(void)
	return __clocksource_register(&clocksource_jiffies);
 }
 
-core_initcall(init_jiffies_clocksource);
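The archive truncates the diff at the removed line. Going by Paul's remark above that
the jiffies clocksource must be running before the first synchronize_sched(), the
missing replacement line presumably registers it from an earlier initcall level; a
hypothetical completion (the actual line is not preserved here):

	/* Hypothetical: register the jiffies clocksource from an early
	 * initcall, which runs before SMP bring-up and hence before the
	 * first post-SMP synchronize_sched(), instead of core_initcall. */
	early_initcall(init_jiffies_clocksource);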

Re: Boot failure on emev2/kzm9d (was: Re: [PATCH v2 11/11] mm/slab: lockless decision to grow cache)

2016-06-29 Thread Paul E. McKenney
On Wed, Jun 29, 2016 at 07:52:06PM +0200, Geert Uytterhoeven wrote:
> [ . . . ]
> 
> I believe there are no clocksources yet. The jiffies clocksource is the first
> clocksource found, and that happens after the first call to
> synchronize_sched(), cf. my dmesg snippet above.
> 
> In a working boot:
> # cat /sys/bus/clocksource/devices/clocksource0/available_clocksource
> e018.timer jiffies
> # cat /sys/bus/clocksource/devices/clocksource0/current_clocksource
> e018.timer

Ah!  But if there is no jiffies clocksource, then schedule_timeout()
and friends will never return, correct?  If so, I guarantee you that
synchronize_sched() will unconditionally hang.

So if I understand correctly, the fix is to get the jiffies clocksource
running before the first call to synchronize_sched().

Thanx, Paul

> [ . . . ]

Re: Boot failure on emev2/kzm9d (was: Re: [PATCH v2 11/11] mm/slab: lockless decision to grow cache)

2016-06-29 Thread Geert Uytterhoeven
Hi Paul,

On Wed, Jun 29, 2016 at 6:44 PM, Paul E. McKenney wrote:
> [ . . . ]
>
> o   One of the CPUs might not be responding to RCU.  That -should-
> result in an RCU CPU stall warning, so I will ignore this
> possibility for the moment.
>
> That said, do you have some way to determine whether scheduling
> clock interrupts are really happening?  Without these interrupts,
> no RCU CPU stall warnings.

I believe there are no clocksources yet. The jiffies clocksource is the first
clocksource found, and that happens after the first call to
synchronize_sched(), cf. my dmesg snippet above.

In a working boot:
# cat /sys/bus/clocksource/devices/clocksource0/available_clocksource
e018.timer jiffies
# cat /sys/bus/clocksource/devices/clocksource0/current_clocksource
e018.timer

> OK, that should be enough for the next phase, please see the end for the
> patch.  This patch applies on top of my previous one.
>
> Could you please set this up as follows?
>
> struct rcu_head rh;
>
> rcu_dump_rcu_node_tree(&rcu_sched_state);  /* Initial state. */
> call_rcu(&rh, do_nothing_cb);

I added an empty do_nothing_cb() for this:

static void do_nothing_cb(struct rcu_head *rcu_head)
{
}

According to the debugging technique "comment everything out until it boots",
it now hangs in call_rcu().

> schedule_timeout_uninterruptible(5 * HZ);  /* Or whatever delay. */
> rcu_dump_rcu_node_tree(&rcu_sched_state);  /* GP state. */
> synchronize_sched();  /* Probably hangs. */
> rcu_barrier();  /* Drop RCU's references to rh before return. */

Thanks!

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


Re: Boot failure on emev2/kzm9d (was: Re: [PATCH v2 11/11] mm/slab: lockless decision to grow cache)

2016-06-29 Thread Paul E. McKenney
On Wed, Jun 29, 2016 at 04:54:44PM +0200, Geert Uytterhoeven wrote:
> Hi Paul,
> 
> On Thu, Jun 23, 2016 at 4:53 AM, Paul E. McKenney wrote:
> > On Wed, Jun 22, 2016 at 07:47:42PM -0700, Paul E. McKenney wrote:

[ . . . ]

> > @@ -4720,11 +4720,18 @@ static void __init rcu_dump_rcu_node_tree(struct rcu_state *rsp)
> > pr_info(" ");
> > level = rnp->level;
> > }
> > -   pr_cont("%d:%d ^%d  ", rnp->grplo, rnp->grphi, rnp->grpnum);
> > +   pr_cont("%d:%d/%#lx/%#lx ^%d  ", rnp->grplo, rnp->grphi,
> > +   rnp->qsmask,
> > +   rnp->qsmaskinit | rnp->qsmaskinitnext, rnp->grpnum);
> > }
> > pr_cont("\n");
> >  }
> 
> For me it always crashes during the 37th call of synchronize_sched() in
> setup_kmem_cache_node(), which is the first call after secondary CPU bring-up.
> With your and my debug code, I get:
> 
>   CPU: Testing write buffer coherency: ok
>   CPU0: thread -1, cpu 0, socket 0, mpidr 8000
>   Setting up static identity map for 0x4010 - 0x40100058
>   cnt = 36, sync
>   CPU1: thread -1, cpu 1, socket 0, mpidr 8001
>   Brought up 2 CPUs
>   SMP: Total of 2 processors activated (2132.00 BogoMIPS).
>   CPU: All CPU(s) started in SVC mode.
>   rcu_node tree layout dump
>0:1/0x0/0x3 ^0

Thank you for running this!

OK, so RCU knows about both CPUs (the "0x3"), and the previous
grace period has seen quiescent states from both of them (the "0x0").
That would indicate that your synchronize_sched() showed up when RCU was
idle, so it had to start a new grace period.  It also rules out failure
modes where RCU thinks that there are more CPUs than really exist.
(Don't laugh, such things have really happened.)
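
(Decoding that line against the pr_cont() format in the hunk quoted above:
"%d:%d/%#lx/%#lx ^%d" is grplo:grphi/qsmask/qsmaskinit|qsmaskinitnext ^grpnum, so
"0:1/0x0/0x3 ^0" reads as CPUs 0-1, qsmask 0x0, combined init mask 0x3, group
number 0 within the parent node.)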

>   devtmpfs: initialized
>   VFP support v0.3: implementor 41 architecture 3 part 30 variant 9 rev 1
>   clocksource: jiffies: mask: 0x max_cycles: 0x,
> max_idle_ns: 1911260446275 ns
> 
> I hope it helps. Thanks!

I am going to guess that this was the first grace period since the second
CPU came online.  When there is only one CPU online, synchronize_sched()
is a no-op.

OK, this showed some things that aren't a problem.  What might the
problem be?

o   The grace-period kthread has not yet started.  It -should- start
at early_initcall() time, but who knows?  Adding code to print
out that kthread's task_struct address.

o   The grace-period kthread might not be responding to wakeups.
Checking this requires that a grace period be in progress,
so please put a call_rcu_sched() just before the call to
rcu_dump_rcu_node_tree().  (Sample code below.)  Adding code
to my patch to print out more GP-kthread state as well.

o   One of the CPUs might not be responding to RCU.  That -should-
result in an RCU CPU stall warning, so I will ignore this
possibility for the moment.

That said, do you have some way to determine whether scheduling
clock interrupts are really happening?  Without these interrupts,
no RCU CPU stall warnings.

OK, that should be enough for the next phase, please see the end for the
patch.  This patch applies on top of my previous one.

Could you please set this up as follows?

struct rcu_head rh;

rcu_dump_rcu_node_tree(&rcu_sched_state);  /* Initial state. */
call_rcu(&rh, do_nothing_cb);
schedule_timeout_uninterruptible(5 * HZ);  /* Or whatever delay. */
rcu_dump_rcu_node_tree(&rcu_sched_state);  /* GP state. */
synchronize_sched();  /* Probably hangs. */
rcu_barrier();  /* Drop RCU's references to rh before return. */

Thanx, Paul



commit 82829ec76c2c0de18874a60ebd7ff8ee80f244d1
Author: Paul E. McKenney 
Date:   Wed Jun 29 09:42:13 2016 -0700

rcu: More diagnostics

Signed-off-by: Paul E. McKenney 

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 2eda7bece401..ff55c569473c 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -4712,6 +4712,11 @@ static void rcu_dump_rcu_node_tree(struct rcu_state *rsp)
int level = 0;
struct rcu_node *rnp;
 
+   pr_info("RCU: %s GP kthread: %p state: %d flags: %#x g:%ld c:%ld\n",
+   rsp->name, rsp->gp_kthread, rsp->gp_state, rsp->gp_flags,
+   (long)rsp->gpnum, (long)rsp->completed);
+   pr_info(" jiffies: %#lx  GP start: %#lx Last GP activity: %#lx\n",
+   jiffies, rsp->gp_start, rsp->gp_activity);
pr_info("rcu_node tree layout dump\n");
pr_info(" ");
rcu_for_each_node_breadth_first(rsp, rnp) {



Re: Boot failure on emev2/kzm9d (was: Re: [PATCH v2 11/11] mm/slab: lockless decision to grow cache)

2016-06-29 Thread Geert Uytterhoeven
Hi Paul,

On Thu, Jun 23, 2016 at 4:53 AM, Paul E. McKenney wrote:
> On Wed, Jun 22, 2016 at 07:47:42PM -0700, Paul E. McKenney wrote:
>> On Thu, Jun 23, 2016 at 11:37:56AM +0900, Joonsoo Kim wrote:
>> > On Wed, Jun 22, 2016 at 05:49:35PM -0700, Paul E. McKenney wrote:
>> > > On Wed, Jun 22, 2016 at 12:08:59PM -0700, Paul E. McKenney wrote:
>> > > > On Wed, Jun 22, 2016 at 05:01:35PM +0200, Geert Uytterhoeven wrote:
>> > > > > On Wed, Jun 22, 2016 at 2:52 AM, Joonsoo Kim wrote:
>> > > > > > Could you try the patch below to check what causes the hang?
>> > > > > >
>> > > > > > And, if sysrq-t works during the hang, could you get its output? I
>> > > > > > haven't used it before, but Paul could find some culprit in it. :)
>> > > > > >
>> > > > > > Thanks.
>> > > > > >
>> > > > > >
>> > > > > > ->8-
>> > > > > > diff --git a/mm/slab.c b/mm/slab.c
>> > > > > > index 763096a..9652d38 100644
>> > > > > > --- a/mm/slab.c
>> > > > > > +++ b/mm/slab.c
>> > > > > > @@ -964,8 +964,13 @@ static int setup_kmem_cache_node(struct kmem_cache *cachep,
>> > > > > >  * guaranteed to be valid until irq is re-enabled, because it will be
>> > > > > >  * freed after synchronize_sched().
>> > > > > >  */
>> > > > > > -   if (force_change)
>> > > > > > +   if (force_change) {
>> > > > > > +   if (num_online_cpus() > 1)
>> > > > > > +   dump_stack();
>> > > > > > synchronize_sched();
>> > > > > > +   if (num_online_cpus() > 1)
>> > > > > > +   dump_stack();
>> > > > > > +   }
>> > > > >
>> > > > > I've only added the first one, as I would never see the second one. All
>> > > > > of this happens before the serial console is activated, earlycon is not
>> > > > > supported, and I only have remote access.
>> > > > >
>> > > > > Brought up 2 CPUs
>> > > > > SMP: Total of 2 processors activated (2132.00 BogoMIPS).
>> > > > > CPU: All CPU(s) started in SVC mode.
>> > > > > CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.7.0-rc4-kzm9d-00404-g4a235e6dde4404dd-dirty #89
>> > > > > Hardware name: Generic Emma Mobile EV2 (Flattened Device Tree)
>> > > > > [] (unwind_backtrace) from [] (show_stack+0x10/0x14)
>> > > > > [] (show_stack) from [] (dump_stack+0x7c/0x9c)
>> > > > > [] (dump_stack) from [] (setup_kmem_cache_node+0x140/0x170)
>> > > > > [] (setup_kmem_cache_node) from [] (__do_tune_cpucache+0xf4/0x114)
>> > > > > [] (__do_tune_cpucache) from [] (enable_cpucache+0xf8/0x148)
>> > > > > [] (enable_cpucache) from [] (__kmem_cache_create+0x1a8/0x1d0)
>> > > > > [] (__kmem_cache_create) from [] (kmem_cache_create+0xbc/0x190)
>> > > > > [] (kmem_cache_create) from [] (shmem_init+0x34/0xb0)
>> > > > > [] (shmem_init) from [] (kernel_init_freeable+0x98/0x1ec)
>> > > > > [] (kernel_init_freeable) from [] (kernel_init+0x8/0x110)
>> > > > > [] (kernel_init) from [] (ret_from_fork+0x14/0x3c)
>> > > > > devtmpfs: initialized
>> > > >
>> > > > I don't see anything here that would prevent grace periods from completing.
>> > > >
>> > > > The CPUs are using the normal hotplug sequence to come online, correct?
>> > >
>> > > And either way, could you please apply the patch below and then
>> > > invoke rcu_dump_rcu_sched_tree() just before the offending call to
>> > > synchronize_sched()?  That will tell me what CPUs RCU believes exist,
>> > > and perhaps also which CPU is holding it up.
>> >
>> > I can't find rcu_dump_rcu_sched_tree(). Do you mean
>> > rcu_dump_rcu_node_tree()? Anyway, there is no patch below, so I attach
>> > one which does what Paul wants, maybe.
>>
>> One of those days, I guess!  :-/
>>
>> Your patch is exactly what I intended to send, thank you!
>
> Ah, but your telepathy was not sufficient to intuit the additional
> information I need.  Please see the patch at the end.  Your hunk
> in mm/slab.c is needed on top of my patch.

> @@ -4720,11 +4720,18 @@ static void __init rcu_dump_rcu_node_tree(struct rcu_state *rsp)
> pr_info(" ");
> level = rnp->level;
> }
> -   pr_cont("%d:%d ^%d  ", rnp->grplo, rnp->grphi, rnp->grpnum);
> +   pr_cont("%d:%d/%#lx/%#lx ^%d  ", rnp->grplo, rnp->grphi,
> +   rnp->qsmask,
> +   rnp->qsmaskinit | rnp->qsmaskinitnext, rnp->grpnum);
> }
> pr_cont("\n");
>  }

For me it always crashes during the 37th call of synchronize_sched() in
setup_kmem_cache_node(), which is the first call after secondary CPU bring-up.
With your and my debug code, I get:

  CPU: Testing write buffer coherency: ok
  CPU0: thread -1, cpu 0, socket 0, mpidr 8000
  Setting up static identity map for 0x4010 - 

Re: Boot failure on emev2/kzm9d (was: Re: [PATCH v2 11/11] mm/slab: lockless decision to grow cache)

2016-06-28 Thread Joonsoo Kim
On Mon, Jun 27, 2016 at 05:12:43PM -0700, Paul E. McKenney wrote:
> [ . . . ]
> 
> Just following up, any news using my diagnostic patch?

Hello, Paul.

Unfortunately, I have no hardware to reproduce it, so we need to wait for Geert's
feedback.

Thanks.



Re: Boot failure on emev2/kzm9d (was: Re: [PATCH v2 11/11] mm/slab: lockless decision to grow cache)

2016-06-27 Thread Paul E. McKenney
On Wed, Jun 22, 2016 at 07:53:29PM -0700, Paul E. McKenney wrote:
> On Wed, Jun 22, 2016 at 07:47:42PM -0700, Paul E. McKenney wrote:
> > On Thu, Jun 23, 2016 at 11:37:56AM +0900, Joonsoo Kim wrote:
> > > On Wed, Jun 22, 2016 at 05:49:35PM -0700, Paul E. McKenney wrote:
> > > > On Wed, Jun 22, 2016 at 12:08:59PM -0700, Paul E. McKenney wrote:
> > > > > On Wed, Jun 22, 2016 at 05:01:35PM +0200, Geert Uytterhoeven wrote:
> > > > > > On Wed, Jun 22, 2016 at 2:52 AM, Joonsoo Kim 
> > > > > >  wrote:
> > > > > > > Could you try below patch to check who causes the hang?
> > > > > > >
> > > > > > > And, if sysalt-t works when hang, could you get sysalt-t output? 
> > > > > > > I haven't
> > > > > > > used it before but Paul could find some culprit on it. :)
> > > > > > >
> > > > > > > Thanks.
> > > > > > >
> > > > > > >
> > > > > > > ->8-
> > > > > > > diff --git a/mm/slab.c b/mm/slab.c
> > > > > > > index 763096a..9652d38 100644
> > > > > > > --- a/mm/slab.c
> > > > > > > +++ b/mm/slab.c
> > > > > > > @@ -964,8 +964,13 @@ static int setup_kmem_cache_node(struct 
> > > > > > > kmem_cache *cachep,
> > > > > > >  * guaranteed to be valid until irq is re-enabled, 
> > > > > > > because it will be
> > > > > > >  * freed after synchronize_sched().
> > > > > > >  */
> > > > > > > -   if (force_change)
> > > > > > > +   if (force_change) {
> > > > > > > +   if (num_online_cpus() > 1)
> > > > > > > +   dump_stack();
> > > > > > > synchronize_sched();
> > > > > > > +   if (num_online_cpus() > 1)
> > > > > > > +   dump_stack();
> > > > > > > +   }
> > > > > > 
> > > > > > I've only added the first one, as I would never see the second one. 
> > > > > > All of
> > > > > > this happens before the serial console is activated, earlycon is 
> > > > > > not supported,
> > > > > > and I only have remote access.
> > > > > > 
> > > > > > Brought up 2 CPUs
> > > > > > SMP: Total of 2 processors activated (2132.00 BogoMIPS).
> > > > > > CPU: All CPU(s) started in SVC mode.
> > > > > > CPU: 0 PID: 1 Comm: swapper/0 Not tainted
> > > > > > 4.7.0-rc4-kzm9d-00404-g4a235e6dde4404dd-dirty #89
> > > > > > Hardware name: Generic Emma Mobile EV2 (Flattened Device Tree)
> > > > > > [] (unwind_backtrace) from [] 
> > > > > > (show_stack+0x10/0x14)
> > > > > > [] (show_stack) from [] (dump_stack+0x7c/0x9c)
> > > > > > [] (dump_stack) from [] 
> > > > > > (setup_kmem_cache_node+0x140/0x170)
> > > > > > [] (setup_kmem_cache_node) from []
> > > > > > (__do_tune_cpucache+0xf4/0x114)
> > > > > > [] (__do_tune_cpucache) from [] 
> > > > > > (enable_cpucache+0xf8/0x148)
> > > > > > [] (enable_cpucache) from []
> > > > > > (__kmem_cache_create+0x1a8/0x1d0)
> > > > > > [] (__kmem_cache_create) from []
> > > > > > (kmem_cache_create+0xbc/0x190)
> > > > > > [] (kmem_cache_create) from [] 
> > > > > > (shmem_init+0x34/0xb0)
> > > > > > [] (shmem_init) from [] 
> > > > > > (kernel_init_freeable+0x98/0x1ec)
> > > > > > [] (kernel_init_freeable) from [] 
> > > > > > (kernel_init+0x8/0x110)
> > > > > > [] (kernel_init) from [] 
> > > > > > (ret_from_fork+0x14/0x3c)
> > > > > > devtmpfs: initialized
> > > > > 
> > > > > I don't see anything here that would prevent grace periods from 
> > > > > completing.
> > > > > 
> > > > > The CPUs are using the normal hotplug sequence to come online, 
> > > > > correct?
> > > > 
> > > > And either way, could you please apply the patch below and then
> > > > invoke rcu_dump_rcu_sched_tree() just before the offending call to
> > > > synchronize_sched()?  That will tell me what CPUs RCU believes exist,
> > > > and perhaps also which CPU is holding it up.
> > > 
> > > I can't find rcu_dump_rcu_sched_tree(). Do you mean
> > > rcu_dump_rcu_node_tree()? Anyway, there is no patch below so I attach
> > > one which does what Paul wants, maybe.
> > 
> > One of those days, I guess!  :-/
> > 
> > Your patch is exactly what I intended to send, thank you!
> 
> Ah, but your telepathy was not sufficient to intuit the additional
> information I need.  Please see the patch at the end.  Your hunk
> in mm/slab.c is needed on top of my patch.
> 
> So I am clearly having difficulties reading as well as including patches
> today...

Just following up, any news using my diagnostic patch?

Thanx, Paul

> > > Thanks.
> > > 
> > > --->8-
> > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > > index 88d3f95..6b650f0 100644
> > > --- a/kernel/rcu/tree.c
> > > +++ b/kernel/rcu/tree.c
> > > @@ -4171,7 +4171,7 @@ static void __init rcu_init_geometry(void)
> > >   * Dump out the structure of the rcu_node combining tree associated
> > >   * with the rcu_state structure referenced by rsp.
> > >   */
> > > -static void __init rcu_dump_rcu_node_tree(struct rcu_state *rsp)
> > > +static void rcu_dump_rcu_node_tree(struct rcu_state *rsp)
> > >  {

Re: Boot failure on emev2/kzm9d (was: Re: [PATCH v2 11/11] mm/slab: lockless decision to grow cache)

2016-06-22 Thread Paul E. McKenney
On Wed, Jun 22, 2016 at 07:47:42PM -0700, Paul E. McKenney wrote:
> On Thu, Jun 23, 2016 at 11:37:56AM +0900, Joonsoo Kim wrote:
> > On Wed, Jun 22, 2016 at 05:49:35PM -0700, Paul E. McKenney wrote:
> > > On Wed, Jun 22, 2016 at 12:08:59PM -0700, Paul E. McKenney wrote:
> > > > On Wed, Jun 22, 2016 at 05:01:35PM +0200, Geert Uytterhoeven wrote:
> > > > > On Wed, Jun 22, 2016 at 2:52 AM, Joonsoo Kim  
> > > > > wrote:
> > > > > > Could you try the patch below to check who causes the hang?
> > > > > >
> > > > > > And, if sysrq-t works when it hangs, could you get the sysrq-t
> > > > > > output? I haven't used it before, but Paul could find some
> > > > > > culprit in it. :)
> > > > > >
> > > > > > Thanks.
> > > > > >
> > > > > >
> > > > > > ->8-
> > > > > > diff --git a/mm/slab.c b/mm/slab.c
> > > > > > index 763096a..9652d38 100644
> > > > > > --- a/mm/slab.c
> > > > > > +++ b/mm/slab.c
> > > > > > @@ -964,8 +964,13 @@ static int setup_kmem_cache_node(struct 
> > > > > > kmem_cache *cachep,
> > > > > >  * guaranteed to be valid until irq is re-enabled, because 
> > > > > > it will be
> > > > > >  * freed after synchronize_sched().
> > > > > >  */
> > > > > > -   if (force_change)
> > > > > > +   if (force_change) {
> > > > > > +   if (num_online_cpus() > 1)
> > > > > > +   dump_stack();
> > > > > > synchronize_sched();
> > > > > > +   if (num_online_cpus() > 1)
> > > > > > +   dump_stack();
> > > > > > +   }
> > > > > 
> > > > > I've only added the first one, as I would never see the second one. 
> > > > > All of
> > > > > this happens before the serial console is activated, earlycon is not 
> > > > > supported,
> > > > > and I only have remote access.
> > > > > 
> > > > > Brought up 2 CPUs
> > > > > SMP: Total of 2 processors activated (2132.00 BogoMIPS).
> > > > > CPU: All CPU(s) started in SVC mode.
> > > > > CPU: 0 PID: 1 Comm: swapper/0 Not tainted
> > > > > 4.7.0-rc4-kzm9d-00404-g4a235e6dde4404dd-dirty #89
> > > > > Hardware name: Generic Emma Mobile EV2 (Flattened Device Tree)
> > > > > [] (unwind_backtrace) from [] 
> > > > > (show_stack+0x10/0x14)
> > > > > [] (show_stack) from [] (dump_stack+0x7c/0x9c)
> > > > > [] (dump_stack) from [] 
> > > > > (setup_kmem_cache_node+0x140/0x170)
> > > > > [] (setup_kmem_cache_node) from []
> > > > > (__do_tune_cpucache+0xf4/0x114)
> > > > > [] (__do_tune_cpucache) from [] 
> > > > > (enable_cpucache+0xf8/0x148)
> > > > > [] (enable_cpucache) from []
> > > > > (__kmem_cache_create+0x1a8/0x1d0)
> > > > > [] (__kmem_cache_create) from []
> > > > > (kmem_cache_create+0xbc/0x190)
> > > > > [] (kmem_cache_create) from [] 
> > > > > (shmem_init+0x34/0xb0)
> > > > > [] (shmem_init) from [] 
> > > > > (kernel_init_freeable+0x98/0x1ec)
> > > > > [] (kernel_init_freeable) from [] 
> > > > > (kernel_init+0x8/0x110)
> > > > > [] (kernel_init) from [] (ret_from_fork+0x14/0x3c)
> > > > > devtmpfs: initialized
> > > > 
> > > > I don't see anything here that would prevent grace periods from 
> > > > completing.
> > > > 
> > > > The CPUs are using the normal hotplug sequence to come online, correct?
> > > 
> > > And either way, could you please apply the patch below and then
> > > invoke rcu_dump_rcu_sched_tree() just before the offending call to
> > > synchronize_sched()?  That will tell me what CPUs RCU believes exist,
> > > and perhaps also which CPU is holding it up.
> > 
> > I can't find rcu_dump_rcu_sched_tree(). Do you mean
> > rcu_dump_rcu_node_tree()? Anyway, there is no patch below so I attach
> > one which does what Paul wants, maybe.
> 
> One of those days, I guess!  :-/
> 
> Your patch is exactly what I intended to send, thank you!

Ah, but your telepathy was not sufficient to intuit the additional
information I need.  Please see the patch at the end.  Your hunk
in mm/slab.c is needed on top of my patch.

So I am clearly having difficulties reading as well as including patches
today...

Thanx, Paul

> > Thanks.
> > 
> > --->8-
> > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > index 88d3f95..6b650f0 100644
> > --- a/kernel/rcu/tree.c
> > +++ b/kernel/rcu/tree.c
> > @@ -4171,7 +4171,7 @@ static void __init rcu_init_geometry(void)
> >   * Dump out the structure of the rcu_node combining tree associated
> >   * with the rcu_state structure referenced by rsp.
> >   */
> > -static void __init rcu_dump_rcu_node_tree(struct rcu_state *rsp)
> > +static void rcu_dump_rcu_node_tree(struct rcu_state *rsp)
> >  {
> > int level = 0;
> > struct rcu_node *rnp;
> > @@ -4189,6 +4189,11 @@ static void __init rcu_dump_rcu_node_tree(struct 
> > rcu_state *rsp)
> > pr_cont("\n");
> >  }
> > 
> > +void rcu_dump_rcu_sched_tree(void)
> > +{
> > > +   rcu_dump_rcu_node_tree(&rcu_sched_state);
> > +}
> > +
> >  void __init rcu_init(void)
> >  {
> > int cpu;

Re: Boot failure on emev2/kzm9d (was: Re: [PATCH v2 11/11] mm/slab: lockless decision to grow cache)

2016-06-22 Thread Paul E. McKenney
On Thu, Jun 23, 2016 at 11:37:56AM +0900, Joonsoo Kim wrote:
> On Wed, Jun 22, 2016 at 05:49:35PM -0700, Paul E. McKenney wrote:
> > On Wed, Jun 22, 2016 at 12:08:59PM -0700, Paul E. McKenney wrote:
> > > On Wed, Jun 22, 2016 at 05:01:35PM +0200, Geert Uytterhoeven wrote:
> > > > On Wed, Jun 22, 2016 at 2:52 AM, Joonsoo Kim  
> > > > wrote:
> > > > > Could you try the patch below to check who causes the hang?
> > > > >
> > > > > And, if sysrq-t works when it hangs, could you get the sysrq-t
> > > > > output? I haven't used it before, but Paul could find some
> > > > > culprit in it. :)
> > > > >
> > > > > Thanks.
> > > > >
> > > > >
> > > > > ->8-
> > > > > diff --git a/mm/slab.c b/mm/slab.c
> > > > > index 763096a..9652d38 100644
> > > > > --- a/mm/slab.c
> > > > > +++ b/mm/slab.c
> > > > > @@ -964,8 +964,13 @@ static int setup_kmem_cache_node(struct 
> > > > > kmem_cache *cachep,
> > > > >  * guaranteed to be valid until irq is re-enabled, because it 
> > > > > will be
> > > > >  * freed after synchronize_sched().
> > > > >  */
> > > > > -   if (force_change)
> > > > > +   if (force_change) {
> > > > > +   if (num_online_cpus() > 1)
> > > > > +   dump_stack();
> > > > > synchronize_sched();
> > > > > +   if (num_online_cpus() > 1)
> > > > > +   dump_stack();
> > > > > +   }
> > > > 
> > > > I've only added the first one, as I would never see the second one. All 
> > > > of
> > > > this happens before the serial console is activated, earlycon is not 
> > > > supported,
> > > > and I only have remote access.
> > > > 
> > > > Brought up 2 CPUs
> > > > SMP: Total of 2 processors activated (2132.00 BogoMIPS).
> > > > CPU: All CPU(s) started in SVC mode.
> > > > CPU: 0 PID: 1 Comm: swapper/0 Not tainted
> > > > 4.7.0-rc4-kzm9d-00404-g4a235e6dde4404dd-dirty #89
> > > > Hardware name: Generic Emma Mobile EV2 (Flattened Device Tree)
> > > > [] (unwind_backtrace) from [] (show_stack+0x10/0x14)
> > > > [] (show_stack) from [] (dump_stack+0x7c/0x9c)
> > > > [] (dump_stack) from [] 
> > > > (setup_kmem_cache_node+0x140/0x170)
> > > > [] (setup_kmem_cache_node) from []
> > > > (__do_tune_cpucache+0xf4/0x114)
> > > > [] (__do_tune_cpucache) from [] 
> > > > (enable_cpucache+0xf8/0x148)
> > > > [] (enable_cpucache) from []
> > > > (__kmem_cache_create+0x1a8/0x1d0)
> > > > [] (__kmem_cache_create) from []
> > > > (kmem_cache_create+0xbc/0x190)
> > > > [] (kmem_cache_create) from [] 
> > > > (shmem_init+0x34/0xb0)
> > > > [] (shmem_init) from [] 
> > > > (kernel_init_freeable+0x98/0x1ec)
> > > > [] (kernel_init_freeable) from [] 
> > > > (kernel_init+0x8/0x110)
> > > > [] (kernel_init) from [] (ret_from_fork+0x14/0x3c)
> > > > devtmpfs: initialized
> > > 
> > > I don't see anything here that would prevent grace periods from 
> > > completing.
> > > 
> > > The CPUs are using the normal hotplug sequence to come online, correct?
> > 
> > And either way, could you please apply the patch below and then
> > invoke rcu_dump_rcu_sched_tree() just before the offending call to
> > synchronize_sched()?  That will tell me what CPUs RCU believes exist,
> > and perhaps also which CPU is holding it up.
> 
> I can't find rcu_dump_rcu_sched_tree(). Do you mean
> rcu_dump_rcu_node_tree()? Anyway, there is no patch below so I attach
> one which does what Paul wants, maybe.

One of those days, I guess!  :-/

Your patch is exactly what I intended to send, thank you!

Thanx, Paul

> Thanks.
> 
> --->8-
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 88d3f95..6b650f0 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -4171,7 +4171,7 @@ static void __init rcu_init_geometry(void)
>   * Dump out the structure of the rcu_node combining tree associated
>   * with the rcu_state structure referenced by rsp.
>   */
> -static void __init rcu_dump_rcu_node_tree(struct rcu_state *rsp)
> +static void rcu_dump_rcu_node_tree(struct rcu_state *rsp)
>  {
> int level = 0;
> struct rcu_node *rnp;
> @@ -4189,6 +4189,11 @@ static void __init rcu_dump_rcu_node_tree(struct 
> rcu_state *rsp)
> pr_cont("\n");
>  }
> 
> +void rcu_dump_rcu_sched_tree(void)
> +{
> +   rcu_dump_rcu_node_tree(&rcu_sched_state);
> +}
> +
>  void __init rcu_init(void)
>  {
> int cpu;
> diff --git a/mm/slab.c b/mm/slab.c
> index 763096a..d88976c 100644
> --- a/mm/slab.c
> +++ b/mm/slab.c
> @@ -909,6 +909,8 @@ static int init_cache_node_node(int node)
> return 0;
>  }
> 
> +extern void rcu_dump_rcu_sched_tree(void);
> +
>  static int setup_kmem_cache_node(struct kmem_cache *cachep,
> int node, gfp_t gfp, bool force_change)
>  {
> @@ -964,8 +966,10 @@ static int setup_kmem_cache_node(struct kmem_cache 
> *cachep,
>  * guaranteed to be valid until irq is re-enabled, because it will be
>  * freed after synchronize_sched().
>  */
> -   if (force_change)
> +   if (force_change) {
> +   rcu_dump_rcu_sched_tree();
> synchronize_sched();
> +   }
> 
>  fail:
> kfree(old_shared);

Re: Boot failure on emev2/kzm9d (was: Re: [PATCH v2 11/11] mm/slab: lockless decision to grow cache)

2016-06-22 Thread Joonsoo Kim
On Wed, Jun 22, 2016 at 05:49:35PM -0700, Paul E. McKenney wrote:
> On Wed, Jun 22, 2016 at 12:08:59PM -0700, Paul E. McKenney wrote:
> > On Wed, Jun 22, 2016 at 05:01:35PM +0200, Geert Uytterhoeven wrote:
> > > On Wed, Jun 22, 2016 at 2:52 AM, Joonsoo Kim  
> > > wrote:
> > > > Could you try the patch below to check who causes the hang?
> > > >
> > > > And, if sysrq-t works when it hangs, could you get the sysrq-t
> > > > output? I haven't used it before, but Paul could find some
> > > > culprit in it. :)
> > > >
> > > > Thanks.
> > > >
> > > >
> > > > ->8-
> > > > diff --git a/mm/slab.c b/mm/slab.c
> > > > index 763096a..9652d38 100644
> > > > --- a/mm/slab.c
> > > > +++ b/mm/slab.c
> > > > @@ -964,8 +964,13 @@ static int setup_kmem_cache_node(struct kmem_cache 
> > > > *cachep,
> > > >  * guaranteed to be valid until irq is re-enabled, because it 
> > > > will be
> > > >  * freed after synchronize_sched().
> > > >  */
> > > > -   if (force_change)
> > > > +   if (force_change) {
> > > > +   if (num_online_cpus() > 1)
> > > > +   dump_stack();
> > > > synchronize_sched();
> > > > +   if (num_online_cpus() > 1)
> > > > +   dump_stack();
> > > > +   }
> > > 
> > > I've only added the first one, as I would never see the second one. All of
> > > this happens before the serial console is activated, earlycon is not 
> > > supported,
> > > and I only have remote access.
> > > 
> > > Brought up 2 CPUs
> > > SMP: Total of 2 processors activated (2132.00 BogoMIPS).
> > > CPU: All CPU(s) started in SVC mode.
> > > CPU: 0 PID: 1 Comm: swapper/0 Not tainted
> > > 4.7.0-rc4-kzm9d-00404-g4a235e6dde4404dd-dirty #89
> > > Hardware name: Generic Emma Mobile EV2 (Flattened Device Tree)
> > > [] (unwind_backtrace) from [] (show_stack+0x10/0x14)
> > > [] (show_stack) from [] (dump_stack+0x7c/0x9c)
> > > [] (dump_stack) from [] 
> > > (setup_kmem_cache_node+0x140/0x170)
> > > [] (setup_kmem_cache_node) from []
> > > (__do_tune_cpucache+0xf4/0x114)
> > > [] (__do_tune_cpucache) from [] 
> > > (enable_cpucache+0xf8/0x148)
> > > [] (enable_cpucache) from []
> > > (__kmem_cache_create+0x1a8/0x1d0)
> > > [] (__kmem_cache_create) from []
> > > (kmem_cache_create+0xbc/0x190)
> > > [] (kmem_cache_create) from [] (shmem_init+0x34/0xb0)
> > > [] (shmem_init) from [] 
> > > (kernel_init_freeable+0x98/0x1ec)
> > > [] (kernel_init_freeable) from [] 
> > > (kernel_init+0x8/0x110)
> > > [] (kernel_init) from [] (ret_from_fork+0x14/0x3c)
> > > devtmpfs: initialized
> > 
> > I don't see anything here that would prevent grace periods from completing.
> > 
> > The CPUs are using the normal hotplug sequence to come online, correct?
> 
> And either way, could you please apply the patch below and then
> invoke rcu_dump_rcu_sched_tree() just before the offending call to
> synchronize_sched()?  That will tell me what CPUs RCU believes exist,
> and perhaps also which CPU is holding it up.

I can't find rcu_dump_rcu_sched_tree(). Do you mean
rcu_dump_rcu_node_tree()? Anyway, there is no patch below so I attach
one which does what Paul wants, maybe.

Thanks.

--->8-
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 88d3f95..6b650f0 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -4171,7 +4171,7 @@ static void __init rcu_init_geometry(void)
  * Dump out the structure of the rcu_node combining tree associated
  * with the rcu_state structure referenced by rsp.
  */
-static void __init rcu_dump_rcu_node_tree(struct rcu_state *rsp)
+static void rcu_dump_rcu_node_tree(struct rcu_state *rsp)
 {
int level = 0;
struct rcu_node *rnp;
@@ -4189,6 +4189,11 @@ static void __init rcu_dump_rcu_node_tree(struct 
rcu_state *rsp)
pr_cont("\n");
 }
 
+void rcu_dump_rcu_sched_tree(void)
+{
+   rcu_dump_rcu_node_tree(&rcu_sched_state);
+}
+
 void __init rcu_init(void)
 {
int cpu;
diff --git a/mm/slab.c b/mm/slab.c
index 763096a..d88976c 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -909,6 +909,8 @@ static int init_cache_node_node(int node)
return 0;
 }
 
+extern void rcu_dump_rcu_sched_tree(void);
+
 static int setup_kmem_cache_node(struct kmem_cache *cachep,
int node, gfp_t gfp, bool force_change)
 {
@@ -964,8 +966,10 @@ static int setup_kmem_cache_node(struct kmem_cache *cachep,
 * guaranteed to be valid until irq is re-enabled, because it will be
 * freed after synchronize_sched().
 */
-   if (force_change)
+   if (force_change) {
+   rcu_dump_rcu_sched_tree();
synchronize_sched();
+   }
 
 fail:
kfree(old_shared);
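
A side note on the patch above: it declares the debug hook with a bare
extern in mm/slab.c, which is fine for a throwaway diagnostic. A more
conventional arrangement (a hypothetical sketch, not part of the patch)
would put the prototype in a header both files include, with a stub for
non-tree RCU configurations:

/* Hypothetical: in a shared header such as include/linux/rcupdate.h,
 * so mm/slab.c and kernel/rcu/tree.c agree on a single prototype. */
#if defined(CONFIG_TREE_RCU) || defined(CONFIG_PREEMPT_RCU)
void rcu_dump_rcu_sched_tree(void);
#else
static inline void rcu_dump_rcu_sched_tree(void) { } /* Tiny RCU: nothing to dump */
#endif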



Re: Boot failure on emev2/kzm9d (was: Re: [PATCH v2 11/11] mm/slab: lockless decision to grow cache)

2016-06-22 Thread Paul E. McKenney
On Wed, Jun 22, 2016 at 12:08:59PM -0700, Paul E. McKenney wrote:
> On Wed, Jun 22, 2016 at 05:01:35PM +0200, Geert Uytterhoeven wrote:
> > On Wed, Jun 22, 2016 at 2:52 AM, Joonsoo Kim  wrote:
> > > Could you try the patch below to check who causes the hang?
> > >
> > > And, if sysrq-t works when it hangs, could you get the sysrq-t output? I haven't
> > > used it before, but Paul could find some culprit in it. :)
> > >
> > > Thanks.
> > >
> > >
> > > ->8-
> > > diff --git a/mm/slab.c b/mm/slab.c
> > > index 763096a..9652d38 100644
> > > --- a/mm/slab.c
> > > +++ b/mm/slab.c
> > > @@ -964,8 +964,13 @@ static int setup_kmem_cache_node(struct kmem_cache 
> > > *cachep,
> > >  * guaranteed to be valid until irq is re-enabled, because it 
> > > will be
> > >  * freed after synchronize_sched().
> > >  */
> > > -   if (force_change)
> > > +   if (force_change) {
> > > +   if (num_online_cpus() > 1)
> > > +   dump_stack();
> > > synchronize_sched();
> > > +   if (num_online_cpus() > 1)
> > > +   dump_stack();
> > > +   }
> > 
> > I've only added the first one, as I would never see the second one. All of
> > this happens before the serial console is activated, earlycon is not 
> > supported,
> > and I only have remote access.
> > 
> > Brought up 2 CPUs
> > SMP: Total of 2 processors activated (2132.00 BogoMIPS).
> > CPU: All CPU(s) started in SVC mode.
> > CPU: 0 PID: 1 Comm: swapper/0 Not tainted
> > 4.7.0-rc4-kzm9d-00404-g4a235e6dde4404dd-dirty #89
> > Hardware name: Generic Emma Mobile EV2 (Flattened Device Tree)
> > [] (unwind_backtrace) from [] (show_stack+0x10/0x14)
> > [] (show_stack) from [] (dump_stack+0x7c/0x9c)
> > [] (dump_stack) from [] 
> > (setup_kmem_cache_node+0x140/0x170)
> > [] (setup_kmem_cache_node) from []
> > (__do_tune_cpucache+0xf4/0x114)
> > [] (__do_tune_cpucache) from [] 
> > (enable_cpucache+0xf8/0x148)
> > [] (enable_cpucache) from []
> > (__kmem_cache_create+0x1a8/0x1d0)
> > [] (__kmem_cache_create) from []
> > (kmem_cache_create+0xbc/0x190)
> > [] (kmem_cache_create) from [] (shmem_init+0x34/0xb0)
> > [] (shmem_init) from [] 
> > (kernel_init_freeable+0x98/0x1ec)
> > [] (kernel_init_freeable) from [] 
> > (kernel_init+0x8/0x110)
> > [] (kernel_init) from [] (ret_from_fork+0x14/0x3c)
> > devtmpfs: initialized
> 
> I don't see anything here that would prevent grace periods from completing.
> 
> The CPUs are using the normal hotplug sequence to come online, correct?

And either way, could you please apply the patch below and then
invoke rcu_dump_rcu_sched_tree() just before the offending call to
synchronize_sched()?  That will tell me what CPUs RCU believes exist,
and perhaps also which CPU is holding it up.

Thanx, Paul



Re: Boot failure on emev2/kzm9d (was: Re: [PATCH v2 11/11] mm/slab: lockless decision to grow cache)

2016-06-22 Thread Paul E. McKenney
On Wed, Jun 22, 2016 at 05:01:35PM +0200, Geert Uytterhoeven wrote:
> On Wed, Jun 22, 2016 at 2:52 AM, Joonsoo Kim  wrote:
> > Could you try the patch below to check who causes the hang?
> >
> > And, if sysrq-t works when it hangs, could you get the sysrq-t output? I haven't
> > used it before, but Paul could find some culprit in it. :)
> >
> > Thanks.
> >
> >
> > ->8-
> > diff --git a/mm/slab.c b/mm/slab.c
> > index 763096a..9652d38 100644
> > --- a/mm/slab.c
> > +++ b/mm/slab.c
> > @@ -964,8 +964,13 @@ static int setup_kmem_cache_node(struct kmem_cache 
> > *cachep,
> >  * guaranteed to be valid until irq is re-enabled, because it will 
> > be
> >  * freed after synchronize_sched().
> >  */
> > -   if (force_change)
> > +   if (force_change) {
> > +   if (num_online_cpus() > 1)
> > +   dump_stack();
> > synchronize_sched();
> > +   if (num_online_cpus() > 1)
> > +   dump_stack();
> > +   }
> 
> I've only added the first one, as I would never see the second one. All of
> this happens before the serial console is activated, earlycon is not 
> supported,
> and I only have remote access.
> 
> Brought up 2 CPUs
> SMP: Total of 2 processors activated (2132.00 BogoMIPS).
> CPU: All CPU(s) started in SVC mode.
> CPU: 0 PID: 1 Comm: swapper/0 Not tainted
> 4.7.0-rc4-kzm9d-00404-g4a235e6dde4404dd-dirty #89
> Hardware name: Generic Emma Mobile EV2 (Flattened Device Tree)
> [] (unwind_backtrace) from [] (show_stack+0x10/0x14)
> [] (show_stack) from [] (dump_stack+0x7c/0x9c)
> [] (dump_stack) from [] 
> (setup_kmem_cache_node+0x140/0x170)
> [] (setup_kmem_cache_node) from []
> (__do_tune_cpucache+0xf4/0x114)
> [] (__do_tune_cpucache) from [] 
> (enable_cpucache+0xf8/0x148)
> [] (enable_cpucache) from []
> (__kmem_cache_create+0x1a8/0x1d0)
> [] (__kmem_cache_create) from []
> (kmem_cache_create+0xbc/0x190)
> [] (kmem_cache_create) from [] (shmem_init+0x34/0xb0)
> [] (shmem_init) from [] (kernel_init_freeable+0x98/0x1ec)
> [] (kernel_init_freeable) from [] (kernel_init+0x8/0x110)
> [] (kernel_init) from [] (ret_from_fork+0x14/0x3c)
> devtmpfs: initialized

I don't see anything here that would prevent grace periods from completing.

The CPUs are using the normal hotplug sequence to come online, correct?

Thanx, Paul
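
For readers following along: "normal hotplug sequence" here means each
secondary CPU is brought online through the same cpu_up() path used for
runtime hotplug, driven from smp_init(). A simplified sketch of the
v4.7-era kernel/smp.c code (illustrative only; details vary by version):

/* Simplified sketch of smp_init(), v4.7 era. */
void __init smp_init(void)
{
    unsigned int cpu;

    idle_threads_init();            /* an idle task per CPU */

    for_each_present_cpu(cpu) {
        if (num_online_cpus() >= setup_max_cpus)
            break;
        if (!cpu_online(cpu))
            cpu_up(cpu);            /* full hotplug state machine */
    }

    /* Prints the "Brought up 2 CPUs" line seen in the logs above. */
    smp_announce();
    smp_cpus_done(setup_max_cpus);
}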



Re: Boot failure on emev2/kzm9d (was: Re: [PATCH v2 11/11] mm/slab: lockless decision to grow cache)

2016-06-22 Thread Geert Uytterhoeven
On Wed, Jun 22, 2016 at 2:52 AM, Joonsoo Kim  wrote:
> Could you try the patch below to check who causes the hang?
>
> And, if sysrq-t works when it hangs, could you get the sysrq-t output? I haven't
> used it before, but Paul could find some culprit in it. :)
>
> Thanks.
>
>
> ->8-
> diff --git a/mm/slab.c b/mm/slab.c
> index 763096a..9652d38 100644
> --- a/mm/slab.c
> +++ b/mm/slab.c
> @@ -964,8 +964,13 @@ static int setup_kmem_cache_node(struct kmem_cache 
> *cachep,
>  * guaranteed to be valid until irq is re-enabled, because it will be
>  * freed after synchronize_sched().
>  */
> -   if (force_change)
> +   if (force_change) {
> +   if (num_online_cpus() > 1)
> +   dump_stack();
> synchronize_sched();
> +   if (num_online_cpus() > 1)
> +   dump_stack();
> +   }

I've only added the first one, as I would never see the second one. All of
this happens before the serial console is activated, earlycon is not supported,
and I only have remote access.

Brought up 2 CPUs
SMP: Total of 2 processors activated (2132.00 BogoMIPS).
CPU: All CPU(s) started in SVC mode.
CPU: 0 PID: 1 Comm: swapper/0 Not tainted
4.7.0-rc4-kzm9d-00404-g4a235e6dde4404dd-dirty #89
Hardware name: Generic Emma Mobile EV2 (Flattened Device Tree)
[] (unwind_backtrace) from [] (show_stack+0x10/0x14)
[] (show_stack) from [] (dump_stack+0x7c/0x9c)
[] (dump_stack) from [] (setup_kmem_cache_node+0x140/0x170)
[] (setup_kmem_cache_node) from []
(__do_tune_cpucache+0xf4/0x114)
[] (__do_tune_cpucache) from [] (enable_cpucache+0xf8/0x148)
[] (enable_cpucache) from []
(__kmem_cache_create+0x1a8/0x1d0)
[] (__kmem_cache_create) from []
(kmem_cache_create+0xbc/0x190)
[] (kmem_cache_create) from [] (shmem_init+0x34/0xb0)
[] (shmem_init) from [] (kernel_init_freeable+0x98/0x1ec)
[] (kernel_init_freeable) from [] (kernel_init+0x8/0x110)
[] (kernel_init) from [] (ret_from_fork+0x14/0x3c)
devtmpfs: initialized

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


Re: Boot failure on emev2/kzm9d (was: Re: [PATCH v2 11/11] mm/slab: lockless decision to grow cache)

2016-06-21 Thread Paul E. McKenney
On Wed, Jun 22, 2016 at 09:52:08AM +0900, Joonsoo Kim wrote:
> On Tue, Jun 21, 2016 at 05:54:06AM -0700, Paul E. McKenney wrote:
> > On Tue, Jun 21, 2016 at 03:43:02PM +0900, Joonsoo Kim wrote:
> > > On Mon, Jun 20, 2016 at 06:12:54AM -0700, Paul E. McKenney wrote:
> > > > On Mon, Jun 20, 2016 at 03:39:43PM +0900, Joonsoo Kim wrote:
> > > > > CCing Paul to ask some question.
> > > > > 
> > > > > On Wed, Jun 15, 2016 at 10:39:47AM +0200, Geert Uytterhoeven wrote:
> > > > > > Hi Joonsoo,
> > > > > > 
> > > > > > On Wed, Jun 15, 2016 at 4:23 AM, Joonsoo Kim 
> > > > > >  wrote:
> > > > > > > On Tue, Jun 14, 2016 at 12:45:14PM +0200, Geert Uytterhoeven 
> > > > > > > wrote:
> > > > > > >> On Tue, Jun 14, 2016 at 10:11 AM, Joonsoo Kim 
> > > > > > >>  wrote:
> > > > > > >> > On Tue, Jun 14, 2016 at 09:31:23AM +0200, Geert Uytterhoeven 
> > > > > > >> > wrote:
> > > > > > >> >> On Tue, Jun 14, 2016 at 8:24 AM, Joonsoo Kim 
> > > > > > >> >>  wrote:
> > > > > > >> >> > On Mon, Jun 13, 2016 at 09:43:13PM +0200, Geert 
> > > > > > >> >> > Uytterhoeven wrote:
> > > > > > >> >> >> On Tue, Apr 12, 2016 at 6:51 AM,   wrote:
> > > > > > >> >> >> > From: Joonsoo Kim 
> > > > > > >> >> >> > To check whether free objects exist or not precisely, we 
> > > > > > >> >> >> > need to grab a
> > > > > > >> >> >> > lock.  But, accuracy isn't that important because race 
> > > > > > >> >> >> > window would be
> > > > > > >> >> >> > even small and if there is too much free object, cache 
> > > > > > >> >> >> > reaper would reap
> > > > > > >> >> >> > it.  So, this patch makes the check for free object 
> > > > > > >> >> >> > existence not to
> > > > > > >> >> >> > hold a lock.  This will reduce lock contention in 
> > > > > > >> >> >> > heavily allocation case.
> > > > > > 
> > > > > > >> >> >> I've bisected a boot failure (no output at all) in 
> > > > > > >> >> >> v4.7-rc2 on emev2/kzm9d
> > > > > > >> >> >> (Renesas dual Cortex A9) to this patch, which is upstream 
> > > > > > >> >> >> commit
> > > > > > >> >> >> 801faf0db8947e01877920e848a4d338dd7a99e7.
> > > > > > 
> > > > > > > It's curious that synchronize_sched() has some effect in this 
> > > > > > > early
> > > > > > > phase. In synchronize_sched(), rcu_blocking_is_gp() is called and
> > > > > > > it checks num_online_cpus <= 1. If so, synchronize_sched() does 
> > > > > > > nothing.
> > > > > > >
> > > > > > > It would be related to might_sleep() in rcu_blocking_is_gp() but 
> > > > > > > I'm not sure now.
> > > > > > >
> > > > > > > First, I'd like to confirm that num_online_cpus() is correct.
> > > > > > > Could you try following patch and give me a dmesg?
> > > > > > >
> > > > > > > Thanks.
> > > > > > >
> > > > > > > --->8--
> > > > > > > diff --git a/mm/slab.c b/mm/slab.c
> > > > > > > index 763096a..5b7300a 100644
> > > > > > > --- a/mm/slab.c
> > > > > > > +++ b/mm/slab.c
> > > > > > > @@ -964,8 +964,10 @@ static int setup_kmem_cache_node(struct 
> > > > > > > kmem_cache *cachep,
> > > > > > >  * guaranteed to be valid until irq is re-enabled, 
> > > > > > > because it will be
> > > > > > >  * freed after synchronize_sched().
> > > > > > >  */
> > > > > > > -   if (force_change)
> > > > > > > -   synchronize_sched();
> > > > > > > +   if (force_change) {
> > > > > > > +   WARN_ON_ONCE(num_online_cpus() <= 1);
> > > > > > > +   WARN_ON_ONCE(num_online_cpus() > 1);
> > > > > > > +   }
> > > > > > 
> > > > > > Full dmesg output below.
> > > > > > 
> > > > > > I also tested whether it's the call to synchronize_sched() before 
> > > > > > or after
> > > > > > secondary CPU bringup that hangs.
> > > > > > 
> > > > > > if (force_change && num_online_cpus() <= 1)
> > > > > > synchronize_sched();
> > > > > > 
> > > > > > boots.
> > > > > > 
> > > > > > if (force_change && num_online_cpus() > 1)
> > > > > > synchronize_sched();
> > > > > > 
> > > > > > hangs.
> > > > > 
> > > > > Hello, Paul.
> > > > > 
> > > > > I changed slab.c to use synchronize_sched() for full memory barrier. 
> > > > > First
> > > > > call happens on kmem_cache_init_late() and it would not be a problem
> > > > > because, at this time, num_online_cpus() <= 1 and synchronize_sched()
> > > > > would return immediately. Second call site would be shmem_init()
> > > > > and it seems that system hangs on it. Since smp is already initialized
> > > > > at that time, there would be some effect of synchronize_sched() but I
> > > > > can't imagine what's wrong here. Is it an invalid moment to call
> > > > > synchronize_sched()?
> > > > > 
> > > > > Note that my x86 virtual machine works fine even if
> > > > > synchronize_sched() is called in shmem_init(), but some of Geert's ARM
> > > > > machines (not all ARM machines) don't work well with it.
> > > > 
> > > > Color me confused.

Re: Boot failure on emev2/kzm9d (was: Re: [PATCH v2 11/11] mm/slab: lockless decision to grow cache)

2016-06-21 Thread Joonsoo Kim
On Mon, Jun 20, 2016 at 06:12:54AM -0700, Paul E. McKenney wrote:
> On Mon, Jun 20, 2016 at 03:39:43PM +0900, Joonsoo Kim wrote:
> > CCing Paul to ask some question.
> > 
> > On Wed, Jun 15, 2016 at 10:39:47AM +0200, Geert Uytterhoeven wrote:
> > > Hi Joonsoo,
> > > 
> > > On Wed, Jun 15, 2016 at 4:23 AM, Joonsoo Kim  
> > > wrote:
> > > > On Tue, Jun 14, 2016 at 12:45:14PM +0200, Geert Uytterhoeven wrote:
> > > >> On Tue, Jun 14, 2016 at 10:11 AM, Joonsoo Kim  
> > > >> wrote:
> > > >> > On Tue, Jun 14, 2016 at 09:31:23AM +0200, Geert Uytterhoeven wrote:
> > > >> >> On Tue, Jun 14, 2016 at 8:24 AM, Joonsoo Kim 
> > > >> >>  wrote:
> > > >> >> > On Mon, Jun 13, 2016 at 09:43:13PM +0200, Geert Uytterhoeven 
> > > >> >> > wrote:
> > > >> >> >> On Tue, Apr 12, 2016 at 6:51 AM,   wrote:
> > > >> >> >> > From: Joonsoo Kim 
> > > >> >> >> > To check whether free objects exist or not precisely, we need 
> > > >> >> >> > to grab a
> > > >> >> >> > lock.  But, accuracy isn't that important because race window 
> > > >> >> >> > would be
> > > >> >> >> > even small and if there is too much free object, cache reaper 
> > > >> >> >> > would reap
> > > >> >> >> > it.  So, this patch makes the check for free object 
> > > >> >> >> > existence not to
> > > >> >> >> > hold a lock.  This will reduce lock contention in heavily 
> > > >> >> >> > allocation case.
> > > 
> > > >> >> >> I've bisected a boot failure (no output at all) in v4.7-rc2 on 
> > > >> >> >> emev2/kzm9d
> > > >> >> >> (Renesas dual Cortex A9) to this patch, which is upstream commit
> > > >> >> >> 801faf0db8947e01877920e848a4d338dd7a99e7.
> > > 
> > > > It's curious that synchronize_sched() has some effect in this early
> > > > phase. In synchronize_sched(), rcu_blocking_is_gp() is called and
> > > > it checks num_online_cpus <= 1. If so, synchronize_sched() does nothing.
> > > >
> > > > It would be related to might_sleep() in rcu_blocking_is_gp() but I'm 
> > > > not sure now.
> > > >
> > > > First, I'd like to confirm that num_online_cpus() is correct.
> > > > Could you try following patch and give me a dmesg?
> > > >
> > > > Thanks.
> > > >
> > > > --->8--
> > > > diff --git a/mm/slab.c b/mm/slab.c
> > > > index 763096a..5b7300a 100644
> > > > --- a/mm/slab.c
> > > > +++ b/mm/slab.c
> > > > @@ -964,8 +964,10 @@ static int setup_kmem_cache_node(struct kmem_cache 
> > > > *cachep,
> > > >  * guaranteed to be valid until irq is re-enabled, because it 
> > > > will be
> > > >  * freed after synchronize_sched().
> > > >  */
> > > > -   if (force_change)
> > > > -   synchronize_sched();
> > > > +   if (force_change) {
> > > > +   WARN_ON_ONCE(num_online_cpus() <= 1);
> > > > +   WARN_ON_ONCE(num_online_cpus() > 1);
> > > > +   }
> > > 
> > > Full dmesg output below.
> > > 
> > > I also tested whether it's the call to synchronize_sched() before or after
> > > secondary CPU bringup that hangs.
> > > 
> > > if (force_change && num_online_cpus() <= 1)
> > > synchronize_sched();
> > > 
> > > boots.
> > > 
> > > if (force_change && num_online_cpus() > 1)
> > > synchronize_sched();
> > > 
> > > hangs.
> > 
> > Hello, Paul.
> > 
> > I changed slab.c to use synchronize_sched() for full memory barrier. First
> > call happens on kmem_cache_init_late() and it would not be a problem
> > because, at this time, num_online_cpus() <= 1 and synchronize_sched()
> > would return immediately. Second call site would be shmem_init()
> > and it seems that system hangs on it. Since smp is already initialized
> > at that time, there would be some effect of synchronize_sched() but I
> > can't imagine what's wrong here. Is it an invalid moment to call
> > synchronize_sched()?
> > 
> > Note that my x86 virtual machine works fine even if
> > synchronize_sched() is called in shmem_init(), but some of Geert's ARM
> > machines (not all ARM machines) don't work well with it.
> 
> Color me confused.
> 
> Is Geert's ARM system somehow adding the second CPU before
> rcu_spawn_gp_kthread() is called, that is, before or during
> early_initcall() time?

The hang happens in shmem_init(), which is called from do_basic_setup().
do_basic_setup() runs after the early_initcall()s have completed.

Hmm... Is it okay to call synchronize_sched() from a kernel thread?

Thanks.

> 
> RCU does assume that its kthreads are created before the second
> CPU shows up.
> 
>   Thanx, Paul
> 
> > Thanks.
> > 
> > > 
> > > Booting Linux on physical CPU 0x0
> > > Linux version 4.6.0-kzm9d-05060-g801faf0db8947e01-dirty (geert@ramsan)
> > > (gcc version 4.9.0 (GCC) ) #84 SMP Wed Jun 15 10:20:12 CEST 2016
> > > CPU: ARMv7 Processor [411fc093] revision 3 (ARMv7), cr=10c5387d
> > > CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing instruction cache
> > > 

Re: Boot failure on emev2/kzm9d (was: Re: [PATCH v2 11/11] mm/slab: lockless decision to grow cache)

2016-06-20 Thread Paul E. McKenney
On Mon, Jun 20, 2016 at 03:39:43PM +0900, Joonsoo Kim wrote:
> CCing Paul to ask some question.
> 
> On Wed, Jun 15, 2016 at 10:39:47AM +0200, Geert Uytterhoeven wrote:
> > Hi Joonsoo,
> > 
> > On Wed, Jun 15, 2016 at 4:23 AM, Joonsoo Kim  wrote:
> > > On Tue, Jun 14, 2016 at 12:45:14PM +0200, Geert Uytterhoeven wrote:
> > >> On Tue, Jun 14, 2016 at 10:11 AM, Joonsoo Kim  
> > >> wrote:
> > >> > On Tue, Jun 14, 2016 at 09:31:23AM +0200, Geert Uytterhoeven wrote:
> > >> >> On Tue, Jun 14, 2016 at 8:24 AM, Joonsoo Kim  
> > >> >> wrote:
> > >> >> > On Mon, Jun 13, 2016 at 09:43:13PM +0200, Geert Uytterhoeven wrote:
> > >> >> >> On Tue, Apr 12, 2016 at 6:51 AM,   wrote:
> > >> >> >> > From: Joonsoo Kim 
> > >> >> >> > To check whether free objects exist or not precisely, we need to 
> > >> >> >> > grab a
> > >> >> >> > lock.  But, accuracy isn't that important because race window 
> > >> >> >> > would be
> > >> >> >> > even small and if there is too much free object, cache reaper 
> > >> >> >> > would reap
> > >> >> >> > it.  So, this patch makes the check for free object existence 
> > >> >> >> > not to
> > >> >> >> > hold a lock.  This will reduce lock contention in heavily 
> > >> >> >> > allocation case.
> > 
> > >> >> >> I've bisected a boot failure (no output at all) in v4.7-rc2 on 
> > >> >> >> emev2/kzm9d
> > >> >> >> (Renesas dual Cortex A9) to this patch, which is upstream commit
> > >> >> >> 801faf0db8947e01877920e848a4d338dd7a99e7.
> > 
> > > It's curious that synchronize_sched() has some effect in this early
> > > phase. In synchronize_sched(), rcu_blocking_is_gp() is called and
> > > it checks num_online_cpus <= 1. If so, synchronize_sched() does nothing.
> > >
> > > It would be related to might_sleep() in rcu_blocking_is_gp() but I'm not 
> > > sure now.
> > >
> > > First, I'd like to confirm that num_online_cpus() is correct.
> > > Could you try following patch and give me a dmesg?
> > >
> > > Thanks.
> > >
> > > --->8--
> > > diff --git a/mm/slab.c b/mm/slab.c
> > > index 763096a..5b7300a 100644
> > > --- a/mm/slab.c
> > > +++ b/mm/slab.c
> > > @@ -964,8 +964,10 @@ static int setup_kmem_cache_node(struct kmem_cache 
> > > *cachep,
> > >  * guaranteed to be valid until irq is re-enabled, because it 
> > > will be
> > >  * freed after synchronize_sched().
> > >  */
> > > -   if (force_change)
> > > -   synchronize_sched();
> > > +   if (force_change) {
> > > +   WARN_ON_ONCE(num_online_cpus() <= 1);
> > > +   WARN_ON_ONCE(num_online_cpus() > 1);
> > > +   }
> > 
> > Full dmesg output below.
> > 
> > I also tested whether it's the call to synchronize_sched() before or after
> > secondary CPU bringup that hangs.
> > 
> > if (force_change && num_online_cpus() <= 1)
> > synchronize_sched();
> > 
> > boots.
> > 
> > if (force_change && num_online_cpus() > 1)
> > synchronize_sched();
> > 
> > hangs.
> 
> Hello, Paul.
> 
> I changed slab.c to use synchronize_sched() for full memory barrier. First
> call happens on kmem_cache_init_late() and it would not be a problem
> because, at this time, num_online_cpus() <= 1 and synchronize_sched()
> would return immediately. Second call site would be shmem_init()
> and it seems that system hangs on it. Since smp is already initialized
> at that time, there would be some effect of synchronize_sched() but I
> can't imagine what's wrong here. Is it an invalid moment to call
> synchronize_sched()?
> 
> Note that my x86 virtual machine works fine even if
> synchronize_sched() is called in shmem_init(), but some of Geert's ARM
> machines (not all ARM machines) don't work well with it.

Color me confused.

Is Geert's ARM system somehow adding the second CPU before
rcu_spawn_gp_kthread() is called, that is, before or during
early_initcall() time?

RCU does assume that its kthreads are created before the second
CPU shows up.

Thanx, Paul

> Thanks.
> 
> > 
> > Booting Linux on physical CPU 0x0
> > Linux version 4.6.0-kzm9d-05060-g801faf0db8947e01-dirty (geert@ramsan)
> > (gcc version 4.9.0 (GCC) ) #84 SMP Wed Jun 15 10:20:12 CEST 2016
> > CPU: ARMv7 Processor [411fc093] revision 3 (ARMv7), cr=10c5387d
> > CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing instruction cache
> > Machine model: EMEV2 KZM9D Board
> > debug: ignoring loglevel setting.
> > Memory policy: Data cache writealloc
> > On node 0 totalpages: 32768
> > free_area_init_node: node 0, pgdat c09286c0, node_mem_map c7efa000
> >   Normal zone: 256 pages used for memmap
> >   Normal zone: 0 pages reserved
> >   Normal zone: 32768 pages, LIFO batch:7
> > percpu: Embedded 12 pages/cpu @c7ed9000 s19264 r8192 d21696 u49152
> > pcpu-alloc: s19264 r8192 d21696 u49152 alloc=12*4096
> > pcpu-alloc: [0] 0 [0] 1
> > Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 32512
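
The assumption Paul describes is encoded in how the grace-period kthread
is created: it is spawned from an early initcall in kernel/rcu/tree.c,
which is expected to run before smp_init() adds the second CPU. Roughly
(an abbreviated sketch; details vary by version):

/* Abbreviated sketch: spawn one grace-period kthread per RCU flavor.
 * If a second CPU were online before this early initcall ran, nothing
 * would drive grace periods to completion, and synchronize_sched()
 * could wait forever, which is the failure mode being discussed. */
static int __init rcu_spawn_gp_kthread(void)
{
    struct rcu_state *rsp;
    struct task_struct *t;

    for_each_rcu_flavor(rsp) {
        t = kthread_run(rcu_gp_kthread, rsp, "%s", rsp->name);
        BUG_ON(IS_ERR(t));
        rsp->gp_kthread = t;
    }
    return 0;
}
early_initcall(rcu_spawn_gp_kthread);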

Re: Boot failure on emev2/kzm9d (was: Re: [PATCH v2 11/11] mm/slab: lockless decision to grow cache)

2016-06-20 Thread Joonsoo Kim
CCing Paul to ask some question.

On Wed, Jun 15, 2016 at 10:39:47AM +0200, Geert Uytterhoeven wrote:
> Hi Joonsoo,
> 
> On Wed, Jun 15, 2016 at 4:23 AM, Joonsoo Kim  wrote:
> > On Tue, Jun 14, 2016 at 12:45:14PM +0200, Geert Uytterhoeven wrote:
> >> On Tue, Jun 14, 2016 at 10:11 AM, Joonsoo Kim  
> >> wrote:
> >> > On Tue, Jun 14, 2016 at 09:31:23AM +0200, Geert Uytterhoeven wrote:
> >> >> On Tue, Jun 14, 2016 at 8:24 AM, Joonsoo Kim  
> >> >> wrote:
> >> >> > On Mon, Jun 13, 2016 at 09:43:13PM +0200, Geert Uytterhoeven wrote:
> >> >> >> On Tue, Apr 12, 2016 at 6:51 AM,   wrote:
> >> >> >> > From: Joonsoo Kim 
> >> >> >> > To check whether free objects exist or not precisely, we need to 
> >> >> >> > grab a
> >> >> >> > lock.  But, accuracy isn't that important because race window 
> >> >> >> > would be
> >> >> >> > even small and if there is too much free object, cache reaper 
> >> >> >> > would reap
> >> >> >> > it.  So, this patch makes the check for free object existence 
> >> >> >> > not to
> >> >> >> > hold a lock.  This will reduce lock contention in heavily 
> >> >> >> > allocation case.
> 
> >> >> >> I've bisected a boot failure (no output at all) in v4.7-rc2 on 
> >> >> >> emev2/kzm9d
> >> >> >> (Renesas dual Cortex A9) to this patch, which is upstream commit
> >> >> >> 801faf0db8947e01877920e848a4d338dd7a99e7.
> 
> > It's curious that synchronize_sched() has some effect in this early
> > phase. In synchronize_sched(), rcu_blocking_is_gp() is called and
> > it checks num_online_cpus <= 1. If so, synchronize_sched() does nothing.
> >
> > It would be related to might_sleep() in rcu_blocking_is_gp() but I'm not 
> > sure now.
> >
> > First, I'd like to confirm that num_online_cpus() is correct.
> > Could you try following patch and give me a dmesg?
> >
> > Thanks.
> >
> > --->8--
> > diff --git a/mm/slab.c b/mm/slab.c
> > index 763096a..5b7300a 100644
> > --- a/mm/slab.c
> > +++ b/mm/slab.c
> > @@ -964,8 +964,10 @@ static int setup_kmem_cache_node(struct kmem_cache 
> > *cachep,
> >  * guaranteed to be valid until irq is re-enabled, because it will 
> > be
> >  * freed after synchronize_sched().
> >  */
> > -   if (force_change)
> > -   synchronize_sched();
> > +   if (force_change) {
> > +   WARN_ON_ONCE(num_online_cpus() <= 1);
> > +   WARN_ON_ONCE(num_online_cpus() > 1);
> > +   }
> 
> Full dmesg output below.
> 
> I also tested whether it's the call to synchronize_sched() before or after
> secondary CPU bringup that hangs.
> 
> if (force_change && num_online_cpus() <= 1)
> synchronize_sched();
> 
> boots.
> 
> if (force_change && num_online_cpus() > 1)
> synchronize_sched();
> 
> hangs.

Hello, Paul.

I changed slab.c to use synchronize_sched() for full memory barrier. First
call happens on kmem_cache_init_late() and it would not be a problem
because, at this time, num_online_cpus() <= 1 and synchronize_sched()
would return immediately. Second call site would be shmem_init()
and it seems that system hangs on it. Since smp is already initialized
at that time, there would be some effect of synchronize_sched() but I
can't imagine what's wrong here. Is it an invalid moment to call
synchronize_sched()?

Note that my x86 virtual machine works fine even if
synchronize_sched() is called in shmem_init(), but some of Geert's ARM
machines (not all ARM machines) don't work well with it.

Thanks.

> 
> Booting Linux on physical CPU 0x0
> Linux version 4.6.0-kzm9d-05060-g801faf0db8947e01-dirty (geert@ramsan)
> (gcc version 4.9.0 (GCC) ) #84 SMP Wed Jun 15 10:20:12 CEST 2016
> CPU: ARMv7 Processor [411fc093] revision 3 (ARMv7), cr=10c5387d
> CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing instruction cache
> Machine model: EMEV2 KZM9D Board
> debug: ignoring loglevel setting.
> Memory policy: Data cache writealloc
> On node 0 totalpages: 32768
> free_area_init_node: node 0, pgdat c09286c0, node_mem_map c7efa000
>   Normal zone: 256 pages used for memmap
>   Normal zone: 0 pages reserved
>   Normal zone: 32768 pages, LIFO batch:7
> percpu: Embedded 12 pages/cpu @c7ed9000 s19264 r8192 d21696 u49152
> pcpu-alloc: s19264 r8192 d21696 u49152 alloc=12*4096
> pcpu-alloc: [0] 0 [0] 1
> Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 32512
> Kernel command line: console=ttyS1,115200n81 ignore_loglevel
> root=/dev/nfs ip=dhcp
> PID hash table entries: 512 (order: -1, 2048 bytes)
> Dentry cache hash table entries: 16384 (order: 4, 65536 bytes)
> Inode-cache hash table entries: 8192 (order: 3, 32768 bytes)
> Memory: 121144K/131072K available (4243K kernel code, 165K rwdata,
> 1344K rodata, 2048K init, 264K bss, 9928K reserved, 0K cma-reserved,
> 0K highmem)
> Virtual kernel memory layout:
> vector  : 0x - 0x1000   (   4 kB)
> fixmap  : 
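
The short-circuit referred to above looks roughly like this in v4.7-era
kernel/rcu/tree.c (a simplified sketch; the real synchronize_sched()
also handles lockdep checks and the expedited path):

/* With at most one CPU online, any point in time is a grace period. */
static int rcu_blocking_is_gp(void)
{
    might_sleep();  /* Catches calls from within an RCU read-side section. */
    return num_online_cpus() <= 1;
}

void synchronize_sched(void)
{
    if (rcu_blocking_is_gp())
        return;                      /* the kmem_cache_init_late() call exits here */
    wait_rcu_gp(call_rcu_sched);     /* the shmem_init() call must wait here */
}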

Re: Boot failure on emev2/kzm9d (was: Re: [PATCH v2 11/11] mm/slab: lockless decision to grow cache)

2016-06-15 Thread Geert Uytterhoeven
Hi Joonsoo,

On Wed, Jun 15, 2016 at 4:23 AM, Joonsoo Kim  wrote:
> On Tue, Jun 14, 2016 at 12:45:14PM +0200, Geert Uytterhoeven wrote:
>> On Tue, Jun 14, 2016 at 10:11 AM, Joonsoo Kim  wrote:
>> > On Tue, Jun 14, 2016 at 09:31:23AM +0200, Geert Uytterhoeven wrote:
>> >> On Tue, Jun 14, 2016 at 8:24 AM, Joonsoo Kim  
>> >> wrote:
>> >> > On Mon, Jun 13, 2016 at 09:43:13PM +0200, Geert Uytterhoeven wrote:
>> >> >> On Tue, Apr 12, 2016 at 6:51 AM,   wrote:
>> >> >> > From: Joonsoo Kim 
>> >> >> > To check whether free objects exist or not precisely, we need to grab 
>> >> >> > a
>> >> >> > lock.  But, accuracy isn't that important because race window would 
>> >> >> > be
>> >> >> > even small and if there is too much free object, cache reaper would 
>> >> >> > reap
>> >> >> > it.  So, this patch makes the check for free object existence not 
>> >> >> > to
>> >> >> > hold a lock.  This will reduce lock contention in heavily allocation 
>> >> >> > case.

>> >> >> I've bisected a boot failure (no output at all) in v4.7-rc2 on 
>> >> >> emev2/kzm9d
>> >> >> (Renesas dual Cortex A9) to this patch, which is upstream commit
>> >> >> 801faf0db8947e01877920e848a4d338dd7a99e7.

> It's curious that synchronize_sched() has some effect in this early
> phase. In synchronize_sched(), rcu_blocking_is_gp() is called and
> it checks num_online_cpus <= 1. If so, synchronize_sched() does nothing.
>
> It would be related to might_sleep() in rcu_blocking_is_gp() but I'm not sure 
> now.
>
> First, I'd like to confirm that num_online_cpus() is correct.
> Could you try following patch and give me a dmesg?
>
> Thanks.
>
> --->8--
> diff --git a/mm/slab.c b/mm/slab.c
> index 763096a..5b7300a 100644
> --- a/mm/slab.c
> +++ b/mm/slab.c
> @@ -964,8 +964,10 @@ static int setup_kmem_cache_node(struct kmem_cache 
> *cachep,
>  * guaranteed to be valid until irq is re-enabled, because it will be
>  * freed after synchronize_sched().
>  */
> -   if (force_change)
> -   synchronize_sched();
> +   if (force_change) {
> +   WARN_ON_ONCE(num_online_cpus() <= 1);
> +   WARN_ON_ONCE(num_online_cpus() > 1);
> +   }

Full dmesg output below.

I also tested whether it's the call to synchronize_sched() before or after
secondary CPU bringup that hangs.

if (force_change && num_online_cpus() <= 1)
synchronize_sched();

boots.

if (force_change && num_online_cpus() > 1)
synchronize_sched();

hangs.

Booting Linux on physical CPU 0x0
Linux version 4.6.0-kzm9d-05060-g801faf0db8947e01-dirty (geert@ramsan)
(gcc version 4.9.0 (GCC) ) #84 SMP Wed Jun 15 10:20:12 CEST 2016
CPU: ARMv7 Processor [411fc093] revision 3 (ARMv7), cr=10c5387d
CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing instruction cache
Machine model: EMEV2 KZM9D Board
debug: ignoring loglevel setting.
Memory policy: Data cache writealloc
On node 0 totalpages: 32768
free_area_init_node: node 0, pgdat c09286c0, node_mem_map c7efa000
  Normal zone: 256 pages used for memmap
  Normal zone: 0 pages reserved
  Normal zone: 32768 pages, LIFO batch:7
percpu: Embedded 12 pages/cpu @c7ed9000 s19264 r8192 d21696 u49152
pcpu-alloc: s19264 r8192 d21696 u49152 alloc=12*4096
pcpu-alloc: [0] 0 [0] 1
Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 32512
Kernel command line: console=ttyS1,115200n81 ignore_loglevel
root=/dev/nfs ip=dhcp
PID hash table entries: 512 (order: -1, 2048 bytes)
Dentry cache hash table entries: 16384 (order: 4, 65536 bytes)
Inode-cache hash table entries: 8192 (order: 3, 32768 bytes)
Memory: 121144K/131072K available (4243K kernel code, 165K rwdata,
1344K rodata, 2048K init, 264K bss, 9928K reserved, 0K cma-reserved,
0K highmem)
Virtual kernel memory layout:
vector  : 0x - 0x1000   (   4 kB)
fixmap  : 0xffc0 - 0xfff0   (3072 kB)
vmalloc : 0xc880 - 0xff80   ( 880 MB)
lowmem  : 0xc000 - 0xc800   ( 128 MB)
pkmap   : 0xbfe0 - 0xc000   (   2 MB)
modules : 0xbf00 - 0xbfe0   (  14 MB)
  .text : 0xc0008000 - 0xc0674eb8   (6580 kB)
  .init : 0xc070 - 0xc090   (2048 kB)
  .data : 0xc090 - 0xc0929420   ( 166 kB)
   .bss : 0xc092b000 - 0xc096d1e8   ( 265 kB)
Hierarchical RCU implementation.
 Build-time adjustment of leaf fanout to 32.
 RCU restricting CPUs from NR_CPUS=4 to nr_cpu_ids=2.
RCU: Adjusting geometry for rcu_fanout_leaf=32, nr_cpu_ids=2
NR_IRQS:16 nr_irqs:16 16
clocksource_probe: no matching clocksources found
sched_clock: 32 bits at 100 Hz, resolution 1000ns, wraps every
2147483647500ns
------------[ cut here ]------------
WARNING: CPU: 0 PID: 0 at mm/slab.c:975 setup_kmem_cache_node+0x160/0x1c8
Modules linked in:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted
4.6.0-kzm9d-05060-g801faf0db8947e01-dirty #84
Hardware name: Generic Emma Mobile EV2 (Flattened Device Tree)

Re: Boot failure on emev2/kzm9d (was: Re: [PATCH v2 11/11] mm/slab: lockless decision to grow cache)

2016-06-14 Thread Joonsoo Kim
On Tue, Jun 14, 2016 at 12:45:14PM +0200, Geert Uytterhoeven wrote:
> Hi Joonsoo,
> 
> On Tue, Jun 14, 2016 at 10:11 AM, Joonsoo Kim  wrote:
> > On Tue, Jun 14, 2016 at 09:31:23AM +0200, Geert Uytterhoeven wrote:
> >> On Tue, Jun 14, 2016 at 8:24 AM, Joonsoo Kim  
> >> wrote:
> >> > On Mon, Jun 13, 2016 at 09:43:13PM +0200, Geert Uytterhoeven wrote:
> >> >> On Tue, Apr 12, 2016 at 6:51 AM,   wrote:
> >> >> > From: Joonsoo Kim 
> >> >> > To check whether free objects exist or not precisely, we need to grab a
> >> >> > lock.  But, accuracy isn't that important because race window would be
> >> >> > even small and if there is too much free object, cache reaper would 
> >> >> > reap
> >> >> > it.  So, this patch makes the check for free object existence not to
> >> >> > hold a lock.  This will reduce lock contention in heavily allocation 
> >> >> > case.
> >> >> >
> >> >> > Note that until now, n->shared can be freed during the processing by
> >> >> > writing slabinfo, but, with some trick in this patch, we can access it
> >> >> > freely within interrupt disabled period.
> >> >> >
> >> >> > Below is the result of concurrent allocation/free in slab allocation
> >> >> > benchmark made by Christoph a long time ago.  I make the output 
> >> >> > simpler.
> >> >> > The number shows cycle count during alloc/free respectively so less is
> >> >> > better.
> >> >> >
> >> >> > * Before
> >> >> > Kmalloc N*alloc N*free(32): Average=248/966
> >> >> > Kmalloc N*alloc N*free(64): Average=261/949
> >> >> > Kmalloc N*alloc N*free(128): Average=314/1016
> >> >> > Kmalloc N*alloc N*free(256): Average=741/1061
> >> >> > Kmalloc N*alloc N*free(512): Average=1246/1152
> >> >> > Kmalloc N*alloc N*free(1024): Average=2437/1259
> >> >> > Kmalloc N*alloc N*free(2048): Average=4980/1800
> >> >> > Kmalloc N*alloc N*free(4096): Average=9000/2078
> >> >> >
> >> >> > * After
> >> >> > Kmalloc N*alloc N*free(32): Average=344/792
> >> >> > Kmalloc N*alloc N*free(64): Average=347/882
> >> >> > Kmalloc N*alloc N*free(128): Average=390/959
> >> >> > Kmalloc N*alloc N*free(256): Average=393/1067
> >> >> > Kmalloc N*alloc N*free(512): Average=683/1229
> >> >> > Kmalloc N*alloc N*free(1024): Average=1295/1325
> >> >> > Kmalloc N*alloc N*free(2048): Average=2513/1664
> >> >> > Kmalloc N*alloc N*free(4096): Average=4742/2172
> >> >> >
> >> >> > It shows that allocation performance decreases for the object size up 
> >> >> > to
> >> >> > 128 and it may be due to extra checks in cache_alloc_refill().  But, 
> >> >> > with
> >> >> > considering improvement of free performance, net result looks the 
> >> >> > same.
> >> >> > Result for other size class looks very promising, roughly, 50% 
> >> >> > performance
> >> >> > improvement.
> >> >> >
> >> >> > v2: replace kick_all_cpus_sync() with synchronize_sched().
> >> >> >
> >> >> > Signed-off-by: Joonsoo Kim 
> >> >>
> >> >> I've bisected a boot failure (no output at all) in v4.7-rc2 on 
> >> >> emev2/kzm9d
> >> >> (Renesas dual Cortex A9) to this patch, which is upstream commit
> >> >> 801faf0db8947e01877920e848a4d338dd7a99e7.
> >> >>
> >> >> I've attached my .config. I don't know if it also happens with
> >> >> shmobile_defconfig, as something went wrong with my remote access to 
> >> >> the board,
> >> >> preventing further testing. I also couldn't verify if the issue 
> >> >> persists in
> >> >> v4.7-rc3.
> >>
> >> In the mean time, I've verified it also happens with shmobile_defconfig.
> >>
> >> >> Do you have a clue?
> >> >
> >> > I don't have a clue yet. Could you help me to narrow down the problem?
> >> > The following diff is a half-revert, to check whether synchronize_sched()
> >> > itself is the problem.
> >>
> >> Thanks!
> >>
> >> Unfortunately the half revert is not sufficient. The full revert is.
> >
> > Thanks for quick testing!
> >
> > Could I ask one more time to check that synchronize_sched() is root
> > cause of the problem? Testing following two diffs will be helpful to me.
> >
> > Thanks.
> >
> > --->8
> > diff --git a/mm/slab.c b/mm/slab.c
> > index 763096a..d892364 100644
> > --- a/mm/slab.c
> > +++ b/mm/slab.c
> > @@ -965,7 +965,7 @@ static int setup_kmem_cache_node(struct kmem_cache 
> > *cachep,
> >  * freed after synchronize_sched().
> >  */
> > if (force_change)
> > -   synchronize_sched();
> > +   kick_all_cpus_sync();
> >
> >  fail:
> > kfree(old_shared);
> 
> Works.
> 
> > --->8--
> > diff --git a/mm/slab.c b/mm/slab.c
> > index 763096a..38d99c2 100644
> > --- a/mm/slab.c
> > +++ b/mm/slab.c
> > @@ -964,8 +964,6 @@ static int setup_kmem_cache_node(struct kmem_cache 
> > *cachep,
> >  * guaranteed to be valid until irq is re-enabled, because it will 
> > be
> >  * freed after synchronize_sched().
> >  */
> > -   if (force_change)
> > -   synchronize_sched();
> >
> > 
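
The first diff above replaces the grace-period wait with
kick_all_cpus_sync(), which works quite differently: it sends an IPI to
every other online CPU and waits for the handlers to complete, so it
does not depend on the RCU grace-period kthread at all. A hedged usage
sketch (the wrapper name is hypothetical):

#include <linux/smp.h>

/*
 * Hypothetical illustration: after swapping in a new n->shared array,
 * force every CPU through an interrupt. A CPU still using the old
 * pointer loaded it with irqs disabled, and it cannot take the IPI
 * until irqs are re-enabled, so once kick_all_cpus_sync() returns the
 * old array is safe to free.
 */
static void publish_shared_array_change(void)
{
    kick_all_cpus_sync();
}

That independence from the grace-period kthread is consistent with the
"Works." result quoted above.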

Re: Boot failure on emev2/kzm9d (was: Re: [PATCH v2 11/11] mm/slab: lockless decision to grow cache)

2016-06-14 Thread Geert Uytterhoeven
Hi Joonsoo,

On Mon, Jun 13, 2016 at 9:43 PM, Geert Uytterhoeven
 wrote:
> On Tue, Apr 12, 2016 at 6:51 AM,   wrote:
>> From: Joonsoo Kim 
>> To check whether free objects exist or not precisely, we need to grab a
>> lock.  But, accuracy isn't that important because race window would be
>> even small and if there is too much free object, cache reaper would reap
>> it.  So, this patch makes the check for free object existence not to
>> hold a lock.  This will reduce lock contention in heavily allocation case.

>> Signed-off-by: Joonsoo Kim 
>
> I've bisected a boot failure (no output at all) in v4.7-rc2 on emev2/kzm9d
> (Renesas dual Cortex A9) to this patch, which is upstream commit
> 801faf0db8947e01877920e848a4d338dd7a99e7.

BTW, when disabling SMP, the problem goes away.

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


Re: Boot failure on emev2/kzm9d (was: Re: [PATCH v2 11/11] mm/slab: lockless decision to grow cache)

2016-06-14 Thread Geert Uytterhoeven
Hi Joonsoo,

On Tue, Jun 14, 2016 at 10:11 AM, Joonsoo Kim  wrote:
> On Tue, Jun 14, 2016 at 09:31:23AM +0200, Geert Uytterhoeven wrote:
>> On Tue, Jun 14, 2016 at 8:24 AM, Joonsoo Kim  wrote:
>> > On Mon, Jun 13, 2016 at 09:43:13PM +0200, Geert Uytterhoeven wrote:
>> >> On Tue, Apr 12, 2016 at 6:51 AM,   wrote:
>> >> > From: Joonsoo Kim 
>> >> > To check whether free objects exist or not precisely, we need to grab a
>> >> > lock.  But, accuracy isn't that important because race window would be
>> >> > even small and if there is too much free object, cache reaper would reap
>> >> > it.  So, this patch makes the check for free object existence not to
>> >> > hold a lock.  This will reduce lock contention in heavily allocation 
>> >> > case.
>> >> >
>> >> > Note that until now, n->shared can be freed during the processing by
>> >> > writing slabinfo, but, with some trick in this patch, we can access it
>> >> > freely within interrupt disabled period.
>> >> >
>> >> > Below is the result of concurrent allocation/free in slab allocation
>> >> > benchmark made by Christoph a long time ago.  I make the output simpler.
>> >> > The number shows cycle count during alloc/free respectively so less is
>> >> > better.
>> >> >
>> >> > * Before
>> >> > Kmalloc N*alloc N*free(32): Average=248/966
>> >> > Kmalloc N*alloc N*free(64): Average=261/949
>> >> > Kmalloc N*alloc N*free(128): Average=314/1016
>> >> > Kmalloc N*alloc N*free(256): Average=741/1061
>> >> > Kmalloc N*alloc N*free(512): Average=1246/1152
>> >> > Kmalloc N*alloc N*free(1024): Average=2437/1259
>> >> > Kmalloc N*alloc N*free(2048): Average=4980/1800
>> >> > Kmalloc N*alloc N*free(4096): Average=9000/2078
>> >> >
>> >> > * After
>> >> > Kmalloc N*alloc N*free(32): Average=344/792
>> >> > Kmalloc N*alloc N*free(64): Average=347/882
>> >> > Kmalloc N*alloc N*free(128): Average=390/959
>> >> > Kmalloc N*alloc N*free(256): Average=393/1067
>> >> > Kmalloc N*alloc N*free(512): Average=683/1229
>> >> > Kmalloc N*alloc N*free(1024): Average=1295/1325
>> >> > Kmalloc N*alloc N*free(2048): Average=2513/1664
>> >> > Kmalloc N*alloc N*free(4096): Average=4742/2172
>> >> >
>> >> > It shows that allocation performance decreases for the object size up to
>> >> > 128 and it may be due to extra checks in cache_alloc_refill().  But, 
>> >> > with
>> >> > considering improvement of free performance, net result looks the same.
>> >> > Result for other size class looks very promising, roughly, 50% 
>> >> > performance
>> >> > improvement.
>> >> >
>> >> > v2: replace kick_all_cpus_sync() with synchronize_sched().
>> >> >
>> >> > Signed-off-by: Joonsoo Kim 
>> >>
>> >> I've bisected a boot failure (no output at all) in v4.7-rc2 on emev2/kzm9d
>> >> (Renesas dual Cortex A9) to this patch, which is upstream commit
>> >> 801faf0db8947e01877920e848a4d338dd7a99e7.
>> >>
>> >> I've attached my .config. I don't know if it also happens with
>> >> shmobile_defconfig, as something went wrong with my remote access to the 
>> >> board,
>> >> preventing further testing. I also couldn't verify if the issue persists 
>> >> in
>> >> v4.7-rc3.
>>
>> In the mean time, I've verified it also happens with shmobile_defconfig.
>>
>> >> Do you have a clue?
>> >
> I don't have a clue yet. Could you help me to narrow down the problem?
> The following diff is a half-revert, to check whether synchronize_sched()
> itself is the problem.
>>
>> Thanks!
>>
>> Unfortunately the half revert is not sufficient. The full revert is.
>
> Thanks for quick testing!
>
> Could I ask one more time to check that synchronize_sched() is root
> cause of the problem? Testing following two diffs will be helpful to me.
>
> Thanks.
>
> --->8
> diff --git a/mm/slab.c b/mm/slab.c
> index 763096a..d892364 100644
> --- a/mm/slab.c
> +++ b/mm/slab.c
> @@ -965,7 +965,7 @@ static int setup_kmem_cache_node(struct kmem_cache *cachep,
>  	 * freed after synchronize_sched().
>  	 */
>  	if (force_change)
> -		synchronize_sched();
> +		kick_all_cpus_sync();
>
>  fail:
>  	kfree(old_shared);

Works.

> --->8--
> diff --git a/mm/slab.c b/mm/slab.c
> index 763096a..38d99c2 100644
> --- a/mm/slab.c
> +++ b/mm/slab.c
> @@ -964,8 +964,6 @@ static int setup_kmem_cache_node(struct kmem_cache *cachep,
>  	 * guaranteed to be valid until irq is re-enabled, because it will be
>  	 * freed after synchronize_sched().
>  	 */
> -	if (force_change)
> -		synchronize_sched();
>
>  fail:
>  	kfree(old_shared);
>

Also works.

Note that I do not see this problem on any of the other boards I use, one
of which is also a dual Cortex A9.
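
For what it's worth, the two primitives being swapped above make different
guarantees. A minimal sketch of the update-side pattern in
setup_kmem_cache_node() (simplified; new_shared stands in for the freshly
allocated array, and this is not the exact mm/slab.c code):

	/*
	 * Readers peek at n->shared with interrupts disabled, so the
	 * writer must wait until no CPU can still be inside such a
	 * region before freeing the old array.
	 *
	 * synchronize_sched() waits for a full RCU-sched grace period,
	 * i.e. until every CPU has passed through a quiescent state,
	 * and blocks forever if that grace period never completes.
	 * kick_all_cpus_sync() instead IPIs every CPU and returns once
	 * each has handled the interrupt; any irq-disabled region that
	 * began before the call must have ended by then.
	 */
	old_shared = n->shared;
	n->shared = new_shared;		/* publish the replacement */
	synchronize_sched();		/* or kick_all_cpus_sync() */
	kfree(old_shared);		/* no reader can still see it */

That kick_all_cpus_sync() works here while synchronize_sched() hangs
suggests the grace period itself never completes on this board, i.e. RCU
never observes a quiescent state from one of the CPUs.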

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like
that.  -- Linus Torvalds

Re: Boot failure on emev2/kzm9d (was: Re: [PATCH v2 11/11] mm/slab: lockless decision to grow cache)

2016-06-14 Thread Joonsoo Kim
On Tue, Jun 14, 2016 at 09:31:23AM +0200, Geert Uytterhoeven wrote:
> Hi Joonsoo,
> 
> On Tue, Jun 14, 2016 at 8:24 AM, Joonsoo Kim  wrote:
> > On Mon, Jun 13, 2016 at 09:43:13PM +0200, Geert Uytterhoeven wrote:
> >> On Tue, Apr 12, 2016 at 6:51 AM,   wrote:
> >> > From: Joonsoo Kim 
> >> > To check whether free objects exist or not precisely, we need to grab a
> >> > lock.  But, accuracy isn't that important because the race window is
> >> > small and, if there are too many free objects, the cache reaper will
> >> > reap them.  So, this patch makes the check for free object existence
> >> > not hold a lock.  This will reduce lock contention in the heavy
> >> > allocation case.
> >> >
> >> > Note that, until now, n->shared could be freed during processing by a
> >> > write to slabinfo, but, with a trick in this patch, we can access it
> >> > safely within an interrupt-disabled period.
> >> >
> >> > Below are the results of concurrent allocation/free in the slab
> >> > allocation benchmark made by Christoph a long time ago.  I made the
> >> > output simpler.  The numbers show cycle counts for alloc/free
> >> > respectively, so lower is better.
> >> >
> >> > * Before
> >> > Kmalloc N*alloc N*free(32): Average=248/966
> >> > Kmalloc N*alloc N*free(64): Average=261/949
> >> > Kmalloc N*alloc N*free(128): Average=314/1016
> >> > Kmalloc N*alloc N*free(256): Average=741/1061
> >> > Kmalloc N*alloc N*free(512): Average=1246/1152
> >> > Kmalloc N*alloc N*free(1024): Average=2437/1259
> >> > Kmalloc N*alloc N*free(2048): Average=4980/1800
> >> > Kmalloc N*alloc N*free(4096): Average=9000/2078
> >> >
> >> > * After
> >> > Kmalloc N*alloc N*free(32): Average=344/792
> >> > Kmalloc N*alloc N*free(64): Average=347/882
> >> > Kmalloc N*alloc N*free(128): Average=390/959
> >> > Kmalloc N*alloc N*free(256): Average=393/1067
> >> > Kmalloc N*alloc N*free(512): Average=683/1229
> >> > Kmalloc N*alloc N*free(1024): Average=1295/1325
> >> > Kmalloc N*alloc N*free(2048): Average=2513/1664
> >> > Kmalloc N*alloc N*free(4096): Average=4742/2172
> >> >
> >> > It shows that allocation performance decreases for object sizes up to
> >> > 128, which may be due to the extra checks in cache_alloc_refill().  But,
> >> > considering the improvement in free performance, the net result looks
> >> > the same.  Results for the other size classes look very promising:
> >> > roughly a 50% performance improvement.
> >> >
> >> > v2: replace kick_all_cpus_sync() with synchronize_sched().
> >> >
> >> > Signed-off-by: Joonsoo Kim 
> >>
> >> I've bisected a boot failure (no output at all) in v4.7-rc2 on emev2/kzm9d
> >> (Renesas dual Cortex A9) to this patch, which is upstream commit
> >> 801faf0db8947e01877920e848a4d338dd7a99e7.
> >>
> >> I've attached my .config. I don't know if it also happens with
> >> shmobile_defconfig, as something went wrong with my remote access to the
> >> board, preventing further testing. I also couldn't verify if the issue
> >> persists in v4.7-rc3.
> 
> In the meantime, I've verified it also happens with shmobile_defconfig.
> 
> >>
> >> Do you have a clue?
> >
> > I don't have one yet. Could you help me narrow down the problem?
> > The following diff is a half-revert, to check that synchronize_sched()
> > isn't the problem.
> 
> Thanks!
> 
> Unfortunately the half revert is not sufficient. The full revert is.

Thanks for the quick testing!

Could I ask you one more time to check whether synchronize_sched() is the
root cause of the problem? Testing the following two diffs would be
helpful to me.

Thanks.

--->8
diff --git a/mm/slab.c b/mm/slab.c
index 763096a..d892364 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -965,7 +965,7 @@ static int setup_kmem_cache_node(struct kmem_cache *cachep,
 	 * freed after synchronize_sched().
 	 */
 	if (force_change)
-		synchronize_sched();
+		kick_all_cpus_sync();
 
 fail:
 	kfree(old_shared);

--->8--
diff --git a/mm/slab.c b/mm/slab.c
index 763096a..38d99c2 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -964,8 +964,6 @@ static int setup_kmem_cache_node(struct kmem_cache *cachep,
 	 * guaranteed to be valid until irq is re-enabled, because it will be
 	 * freed after synchronize_sched().
 	 */
-	if (force_change)
-		synchronize_sched();
 
 fail:
 	kfree(old_shared);



Re: Boot failure on emev2/kzm9d (was: Re: [PATCH v2 11/11] mm/slab: lockless decision to grow cache)

2016-06-14 Thread Joonsoo Kim
On Mon, Jun 13, 2016 at 09:43:13PM +0200, Geert Uytterhoeven wrote:
> Hi Joonsoo,

Hello,

> 
> On Tue, Apr 12, 2016 at 6:51 AM,   wrote:
> > From: Joonsoo Kim 
> >
> > To check whether free objects exist or not precisely, we need to grab a
> > lock.  But, accuracy isn't that important because the race window is
> > small and, if there are too many free objects, the cache reaper will
> > reap them.  So, this patch makes the check for free object existence
> > not hold a lock.  This will reduce lock contention in the heavy
> > allocation case.
> >
> > Note that, until now, n->shared could be freed during processing by a
> > write to slabinfo, but, with a trick in this patch, we can access it
> > safely within an interrupt-disabled period.
> >
> > Below are the results of concurrent allocation/free in the slab
> > allocation benchmark made by Christoph a long time ago.  I made the
> > output simpler.  The numbers show cycle counts for alloc/free
> > respectively, so lower is better.
> >
> > * Before
> > Kmalloc N*alloc N*free(32): Average=248/966
> > Kmalloc N*alloc N*free(64): Average=261/949
> > Kmalloc N*alloc N*free(128): Average=314/1016
> > Kmalloc N*alloc N*free(256): Average=741/1061
> > Kmalloc N*alloc N*free(512): Average=1246/1152
> > Kmalloc N*alloc N*free(1024): Average=2437/1259
> > Kmalloc N*alloc N*free(2048): Average=4980/1800
> > Kmalloc N*alloc N*free(4096): Average=9000/2078
> >
> > * After
> > Kmalloc N*alloc N*free(32): Average=344/792
> > Kmalloc N*alloc N*free(64): Average=347/882
> > Kmalloc N*alloc N*free(128): Average=390/959
> > Kmalloc N*alloc N*free(256): Average=393/1067
> > Kmalloc N*alloc N*free(512): Average=683/1229
> > Kmalloc N*alloc N*free(1024): Average=1295/1325
> > Kmalloc N*alloc N*free(2048): Average=2513/1664
> > Kmalloc N*alloc N*free(4096): Average=4742/2172
> >
> > It shows that allocation performance decreases for object sizes up to
> > 128, which may be due to the extra checks in cache_alloc_refill().  But,
> > considering the improvement in free performance, the net result looks
> > the same.  Results for the other size classes look very promising:
> > roughly a 50% performance improvement.
> >
> > v2: replace kick_all_cpus_sync() with synchronize_sched().
> >
> > Signed-off-by: Joonsoo Kim 
> 
> I've bisected a boot failure (no output at all) in v4.7-rc2 on emev2/kzm9d
> (Renesas dual Cortex A9) to this patch, which is upstream commit
> 801faf0db8947e01877920e848a4d338dd7a99e7.
> 
> I've attached my .config. I don't know if it also happens with
> shmobile_defconfig, as something went wrong with my remote access to the
> board, preventing further testing. I also couldn't verify if the issue
> persists in v4.7-rc3.
> 
> Do you have a clue?

I don't have one yet. Could you help me narrow down the problem?
The following diff is a half-revert, to check that synchronize_sched()
isn't the problem.

Thanks.
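
For context, the lines removed below are the lockless fast path the patch
added to cache_alloc_refill(): a deliberately racy peek at the free-object
counts that is always re-validated under n->list_lock, so a stale answer
can only cost an unnecessary cache grow, never use of a nonexistent object.
A minimal sketch of the idea (not the exact mm/slab.c code):

	/*
	 * Lockless existence check, re-validated under the lock.
	 * READ_ONCE() keeps the compiler from reloading n->shared,
	 * which a concurrent slabinfo write may replace.
	 */
	shared = READ_ONCE(n->shared);
	if (!n->free_objects && (!shared || !shared->avail))
		goto direct_grow;	/* skip list_lock, grow the cache */

	spin_lock(&n->list_lock);
	shared = READ_ONCE(n->shared);	/* re-read under the lock */
	/* ... refill ac from the shared array / partial lists ... */
	spin_unlock(&n->list_lock);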

->8-
diff --git a/mm/slab.c b/mm/slab.c
index 763096a..257a0eb 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -3016,9 +3016,6 @@ static void *cache_alloc_refill(struct kmem_cache *cachep, gfp_t flags)
 	n = get_node(cachep, node);
 
 	BUG_ON(ac->avail > 0 || !n);
-	shared = READ_ONCE(n->shared);
-	if (!n->free_objects && (!shared || !shared->avail))
-		goto direct_grow;
 
 	spin_lock(&n->list_lock);
 	shared = READ_ONCE(n->shared);
@@ -3047,7 +3044,6 @@ alloc_done:
 	spin_unlock(&n->list_lock);
 	fixup_objfreelist_debug(cachep, &list);
 
-direct_grow:
 	if (unlikely(!ac->avail)) {
 		/* Check if we can use obj in pfmemalloc slab */
 		if (sk_memalloc_socks()) {