Re: Perf lockups / stack overflows on v3.17-rc6, x86_64, arm, arm64

2014-10-06 Thread Vince Weaver
On Mon, 6 Oct 2014, Mark Rutland wrote:

> So far I haven't been able to trigger the above failure on v3.17, so perhaps
> some patch has fixed that.
> 
> With the same seed (1411654897) I can trigger a hw_breakpoint warning
> relatively repeatably (logs for a couple of instances below).

I see those fairly frequently too.  I'm adding some breakpoint people to
the CC; maybe they can help debug it.

Vince


> >8
> [ 3268.694056] [ cut here ]
> [ 3268.694066] WARNING: CPU: 0 PID: 19671 at 
> arch/x86/kernel/hw_breakpoint.c:119 arch_install_hw_breakpoint+0xf0/0x100()
> [ 3268.694068] Can't find any breakpoint slot
> [ 3268.694070] Modules linked in:
> [ 3268.694075] CPU: 0 PID: 19671 Comm: perf_fuzzer Not tainted 
> 3.17.0hark-lockup2-2014-10-06 #4
> [ 3268.694077] Hardware name: LENOVO 7484A3G/LENOVO, BIOS 5CKT54AUS 09/07/2009
> [ 3268.694079]  0009 88019a343c78 8182da5c 
> 88019a343cc0
> [ 3268.694084]  88019a343cb0 8104af38 8801af5ec800 
> 8801a0faae00
> [ 3268.694088]  8801bec16780 8801bec16784 01b8c35e3e18 
> 88019a343d10
> [ 3268.694092] Call Trace:
> [ 3268.694098]  [] dump_stack+0x45/0x56
> [ 3268.694103]  [] warn_slowpath_common+0x78/0xa0
> [ 3268.694106]  [] warn_slowpath_fmt+0x47/0x50
> [ 3268.694110]  [] arch_install_hw_breakpoint+0xf0/0x100
> [ 3268.694114]  [] hw_breakpoint_add+0x3f/0x50
> [ 3268.694117]  [] event_sched_in.isra.80+0x84/0x1b0
> [ 3268.694121]  [] group_sched_in+0x69/0x1e0
> [ 3268.694124]  [] ? perf_event_update_userpage+0xeb/0x160
> [ 3268.694129]  [] ? sched_clock_local+0x1d/0x80
> [ 3268.694132]  [] ctx_sched_in.isra.81+0xd2/0x1a0
> [ 3268.694136]  [] perf_event_sched_in.isra.84+0x4f/0x70
> [ 3268.694139]  [] 
> perf_event_context_sched_in.isra.85+0x73/0xc0
> [ 3268.694142]  [] __perf_event_task_sched_in+0x185/0x1a0
> [ 3268.694147]  [] finish_task_switch+0xb2/0xf0
> [ 3268.694151]  [] __schedule+0x34f/0x810
> [ 3268.694154]  [] schedule+0x24/0x70
> [ 3268.694158]  [] int_careful+0xd/0x14
> [ 3268.694160] ---[ end trace e1f62407a7d7e846 ]---
> >8
> 
> >8
> [ 4016.924076] [ cut here ]
> [ 4016.925039] WARNING: CPU: 1 PID: 14091 at 
> arch/x86/kernel/hw_breakpoint.c:119 arch_install_hw_breakpoint+0xf0/0x100()
> [ 4016.925039] Can't find any breakpoint slot
> [ 4016.925039] Modules linked in:
> [ 4016.925039] CPU: 1 PID: 14091 Comm: perf_fuzzer Not tainted 
> 3.17.0hark-lockup2-2014-10-06 #4
> [ 4016.925039] Hardware name: LENOVO 7484A3G/LENOVO, BIOS 5CKT54AUS 09/07/2009
> [ 4016.925039]  0009 8800cd85f9a8 8182da5c 
> 8800cd85f9f0
> [ 4016.925039]  8800cd85f9e0 8104af38 8800e38bd000 
> 8801a6fb1c00
> [ 4016.925039]  8801bec96780 8801bec96784 027400d480df 
> 8800cd85fa40
> [ 4016.925039] Call Trace:
> [ 4016.925039]  [] dump_stack+0x45/0x56
> [ 4016.925039]  [] warn_slowpath_common+0x78/0xa0
> [ 4016.925039]  [] warn_slowpath_fmt+0x47/0x50
> [ 4016.925039]  [] arch_install_hw_breakpoint+0xf0/0x100
> [ 4016.925039]  [] hw_breakpoint_add+0x3f/0x50
> [ 4016.925039]  [] event_sched_in.isra.80+0x84/0x1b0
> [ 4016.925039]  [] group_sched_in+0x69/0x1e0
> [ 4016.925039]  [] ? perf_event_update_userpage+0xeb/0x160
> [ 4016.925039]  [] ? sched_clock_local+0x1d/0x80
> [ 4016.925039]  [] ctx_sched_in.isra.81+0xd2/0x1a0
> [ 4016.925039]  [] perf_event_sched_in.isra.84+0x4f/0x70
> [ 4016.925039]  [] 
> perf_event_context_sched_in.isra.85+0x73/0xc0
> [ 4016.925039]  [] __perf_event_task_sched_in+0x185/0x1a0
> [ 4016.925039]  [] finish_task_switch+0xb2/0xf0
> [ 4016.925039]  [] __schedule+0x34f/0x810
> [ 4016.925039]  [] schedule+0x24/0x70
> [ 4016.925039]  [] schedule_timeout+0x1b9/0x290
> [ 4016.925039]  [] ? wait_for_completion+0x23/0x100
> [ 4016.925039]  [] wait_for_completion+0x9c/0x100
> [ 4016.925039]  [] ? wake_up_state+0x10/0x10
> [ 4016.925039]  [] ? call_rcu_bh+0x20/0x20
> [ 4016.925039]  [] wait_rcu_gp+0x46/0x50
> [ 4016.925039]  [] ? 
> ftrace_raw_output_rcu_utilization+0x50/0x50
> [ 4016.925039]  [] synchronize_sched+0x33/0x50
> [ 4016.925039]  [] perf_trace_event_unreg.isra.1+0x3b/0x90
> [ 4016.925039]  [] perf_trace_destroy+0x38/0x50
> [ 4016.925039]  [] tp_perf_event_destroy+0x9/0x10
> [ 4016.925039]  [] __free_event+0x23/0x70
> [ 4016.925039]  [] SYSC_perf_event_open+0x397/0xa50
> [ 4016.925039]  [] SyS_perf_event_open+0x9/0x10
> [ 4016.925039]  [] tracesys+0xdd/0xe2
> [ 4016.925039] ---[ end trace a2fe478e9cb5649b ]---
> >8
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Perf lockups / stack overflows on v3.17-rc6, x86_64, arm, arm64

2014-10-06 Thread Mark Rutland
> > > Log 2, x86_64 stack overflow
> > 
> > > [  346.641345] divide error:  [#1] SMP
> > > [  346.642010] Modules linked in:
> > > [  346.642010] CPU: 0 PID: 4076 Comm: perf_fuzzer Not tainted 
> > > 3.17.0-rc6hark-perf-lockup+ #1
> > > [  346.642010] Hardware name: LENOVO 7484A3G/LENOVO, BIOS 5CKT54AUS 
> > > 09/07/2009
> > > [  346.642010] task: 8801ac449a70 ti: 8801ac574000 task.ti: 
> > > 8801ac574000
> > > [  346.642010] RIP: 0010:[]  [] 
> > > find_busiest_group+0x28e/0x8a0
> > > [  346.642010] RSP: 0018:8801ac577760  EFLAGS: 00010006
> > > [  346.642010] RAX: 03ff RBX:  RCX: 
> > > 8801
> > > [  346.642010] RDX:  RSI: 0001 RDI: 
> > > 0001
> > > [  346.642010] RBP: 8801ac577890 R08:  R09: 
> > > 
> > > [  346.704010] [ cut here ]
> > > [  346.704017] WARNING: CPU: 2 PID: 5 at arch/x86/kernel/irq_64.c:70 
> > > handle_irq+0x141/0x150()
> > > [  346.704019] do_IRQ():  has overflown the kernel stack 
> > > (cur:1,sp:8801b653fe88,irq stk 
> > > top-bottom:8801bed00080-8801bed03fc0,exception stk 
> > > top-bottom:8801bed04080-8801bed0a000)
> > 
> > Weird, I have not seen this before, though I was hitting a reboot issue
> > that gave really strange crash messages; it was possibly fixed by
> > a patch that went into 3.17-rc7.
> 
> Interesting. I'll retry with v3.17.

So far I haven't been able to trigger the above failure on v3.17, so perhaps
some patch has fixed that.

With the same seed (1411654897) I can trigger a hw_breakpoint warning
relatively repeatably (logs for a couple of instances below).

Mark.

>8
[ 3268.694056] [ cut here ]
[ 3268.694066] WARNING: CPU: 0 PID: 19671 at 
arch/x86/kernel/hw_breakpoint.c:119 arch_install_hw_breakpoint+0xf0/0x100()
[ 3268.694068] Can't find any breakpoint slot
[ 3268.694070] Modules linked in:
[ 3268.694075] CPU: 0 PID: 19671 Comm: perf_fuzzer Not tainted 
3.17.0hark-lockup2-2014-10-06 #4
[ 3268.694077] Hardware name: LENOVO 7484A3G/LENOVO, BIOS 5CKT54AUS 09/07/2009
[ 3268.694079]  0009 88019a343c78 8182da5c 
88019a343cc0
[ 3268.694084]  88019a343cb0 8104af38 8801af5ec800 
8801a0faae00
[ 3268.694088]  8801bec16780 8801bec16784 01b8c35e3e18 
88019a343d10
[ 3268.694092] Call Trace:
[ 3268.694098]  [] dump_stack+0x45/0x56
[ 3268.694103]  [] warn_slowpath_common+0x78/0xa0
[ 3268.694106]  [] warn_slowpath_fmt+0x47/0x50
[ 3268.694110]  [] arch_install_hw_breakpoint+0xf0/0x100
[ 3268.694114]  [] hw_breakpoint_add+0x3f/0x50
[ 3268.694117]  [] event_sched_in.isra.80+0x84/0x1b0
[ 3268.694121]  [] group_sched_in+0x69/0x1e0
[ 3268.694124]  [] ? perf_event_update_userpage+0xeb/0x160
[ 3268.694129]  [] ? sched_clock_local+0x1d/0x80
[ 3268.694132]  [] ctx_sched_in.isra.81+0xd2/0x1a0
[ 3268.694136]  [] perf_event_sched_in.isra.84+0x4f/0x70
[ 3268.694139]  [] 
perf_event_context_sched_in.isra.85+0x73/0xc0
[ 3268.694142]  [] __perf_event_task_sched_in+0x185/0x1a0
[ 3268.694147]  [] finish_task_switch+0xb2/0xf0
[ 3268.694151]  [] __schedule+0x34f/0x810
[ 3268.694154]  [] schedule+0x24/0x70
[ 3268.694158]  [] int_careful+0xd/0x14
[ 3268.694160] ---[ end trace e1f62407a7d7e846 ]---
>8

>8
[ 4016.924076] [ cut here ]
[ 4016.925039] WARNING: CPU: 1 PID: 14091 at 
arch/x86/kernel/hw_breakpoint.c:119 arch_install_hw_breakpoint+0xf0/0x100()
[ 4016.925039] Can't find any breakpoint slot
[ 4016.925039] Modules linked in:
[ 4016.925039] CPU: 1 PID: 14091 Comm: perf_fuzzer Not tainted 
3.17.0hark-lockup2-2014-10-06 #4
[ 4016.925039] Hardware name: LENOVO 7484A3G/LENOVO, BIOS 5CKT54AUS 09/07/2009
[ 4016.925039]  0009 8800cd85f9a8 8182da5c 
8800cd85f9f0
[ 4016.925039]  8800cd85f9e0 8104af38 8800e38bd000 
8801a6fb1c00
[ 4016.925039]  8801bec96780 8801bec96784 027400d480df 
8800cd85fa40
[ 4016.925039] Call Trace:
[ 4016.925039]  [] dump_stack+0x45/0x56
[ 4016.925039]  [] warn_slowpath_common+0x78/0xa0
[ 4016.925039]  [] warn_slowpath_fmt+0x47/0x50
[ 4016.925039]  [] arch_install_hw_breakpoint+0xf0/0x100
[ 4016.925039]  [] hw_breakpoint_add+0x3f/0x50
[ 4016.925039]  [] event_sched_in.isra.80+0x84/0x1b0
[ 4016.925039]  [] group_sched_in+0x69/0x1e0
[ 4016.925039]  [] ? perf_event_update_userpage+0xeb/0x160
[ 4016.925039]  [] ? sched_clock_local+0x1d/0x80
[ 4016.925039]  [] ctx_sched_in.isra.81+0xd2/0x1a0
[ 4016.925039]  [] perf_event_sched_in.isra.84+0x4f/0x70
[ 4016.925039]  [] 
perf_event_context_sched_in.isra.85+0x73/0xc0
[ 4016.925039]  [] __perf_event_task_sched_in+0x185/0x1a0
[ 4016.925039]  [] finish_task_switch+0xb2/0xf0
[ 4016.925039]  [] __schedule+0x34f/0x810
[ 4016.925039]  [] schedule+0x24/0x70
[ 4016.925039]  [] schedule_timeout+0x1b9/0x290
[ 4016.925039]  [] ? wait_for_completion+0x23/0x100
[ 4016.925039]  

Re: Perf lockups / stack overflows on v3.17-rc6, x86_64, arm, arm64

2014-10-06 Thread Mark Rutland
On Sun, Oct 05, 2014 at 06:13:24AM +0100, Vince Weaver wrote:
> On Thu, 25 Sep 2014, Mark Rutland wrote:
> 
> > Log 1, x86_64 lockup
> > [  223.007005]  [] ? 
> > poll_select_copy_remaining+0x130/0x130
> > [  223.007005]  [] ? getname_flags+0x4a/0x1a0
> > [  223.007005]  [] ? final_putname+0x1d/0x40
> > [  223.007005]  [] ? putname+0x24/0x40
> > [  223.007005]  [] ? user_path_at_empty+0x5a/0x90
> > [  223.007005]  [] ? wake_up_state+0x10/0x10
> > [  223.007005]  [] ? eventfd_read+0x38/0x60
> > [  223.007005]  [] ? ktime_get_ts64+0x45/0xf0
> > [  223.007005]  [] SyS_poll+0x60/0xf0
> 
> I have seen issues similar to this before, where the problem appeared
> to be in poll/hrtimer, but I never managed to track down anything useful
> about the bug.

Ok. 

> > Log 2, x86_64 stack overflow
> 
> > [  346.641345] divide error:  [#1] SMP
> > [  346.642010] Modules linked in:
> > [  346.642010] CPU: 0 PID: 4076 Comm: perf_fuzzer Not tainted 
> > 3.17.0-rc6hark-perf-lockup+ #1
> > [  346.642010] Hardware name: LENOVO 7484A3G/LENOVO, BIOS 5CKT54AUS 
> > 09/07/2009
> > [  346.642010] task: 8801ac449a70 ti: 8801ac574000 task.ti: 
> > 8801ac574000
> > [  346.642010] RIP: 0010:[]  [] 
> > find_busiest_group+0x28e/0x8a0
> > [  346.642010] RSP: 0018:8801ac577760  EFLAGS: 00010006
> > [  346.642010] RAX: 03ff RBX:  RCX: 
> > 8801
> > [  346.642010] RDX:  RSI: 0001 RDI: 
> > 0001
> > [  346.642010] RBP: 8801ac577890 R08:  R09: 
> > 
> > [  346.704010] [ cut here ]
> > [  346.704017] WARNING: CPU: 2 PID: 5 at arch/x86/kernel/irq_64.c:70 
> > handle_irq+0x141/0x150()
> > [  346.704019] do_IRQ():  has overflown the kernel stack 
> > (cur:1,sp:8801b653fe88,irq stk 
> > top-bottom:8801bed00080-8801bed03fc0,exception stk 
> > top-bottom:8801bed04080-8801bed0a000)
> 
> Weird, I have not seen this before, though I was hitting a reboot issue
> that gave really strange crash messages; it was possibly fixed by
> a patch that went into 3.17-rc7.

Interesting. I'll retry with v3.17.

> > Log 3, arm64 lockup
> > >8
> 
> > Seeding random number generator with 1411488270
> > /proc/sys/kernel/perf_event_max_sample_rate currently: 285518974/s
> > /proc/sys/kernel/perf_event_paranoid currently: 1142898651
> 
> Those last two lines are suspect.  Is my fuzzer broken on arm64 somehow?

Good point. I'd mainly paid attention to the stack dump and hadn't
noticed. I'll take a look shortly and see what's going on.

> Sorry that I don't have good answers for these bugs, but I will stick them 
> in my perf_fuzzer outstanding bugs list.

Cheers anyhow. I'll see if I can figure out anything further.

Mark.
Re: Perf lockups / stack overflows on v3.17-rc6, x86_64, arm, arm64

2014-10-04 Thread Vince Weaver
On Thu, 25 Sep 2014, Mark Rutland wrote:

> Log 1, x86_64 lockup
> [  223.007005]  [] ? poll_select_copy_remaining+0x130/0x130
> [  223.007005]  [] ? getname_flags+0x4a/0x1a0
> [  223.007005]  [] ? final_putname+0x1d/0x40
> [  223.007005]  [] ? putname+0x24/0x40
> [  223.007005]  [] ? user_path_at_empty+0x5a/0x90
> [  223.007005]  [] ? wake_up_state+0x10/0x10
> [  223.007005]  [] ? eventfd_read+0x38/0x60
> [  223.007005]  [] ? ktime_get_ts64+0x45/0xf0
> [  223.007005]  [] SyS_poll+0x60/0xf0

I have seen issues similar to this before, where the problem appeared
to be in poll/hrtimer, but I never managed to track down anything useful
about the bug.

> Log 2, x86_64 stack overflow

> [  346.641345] divide error:  [#1] SMP
> [  346.642010] Modules linked in:
> [  346.642010] CPU: 0 PID: 4076 Comm: perf_fuzzer Not tainted 
> 3.17.0-rc6hark-perf-lockup+ #1
> [  346.642010] Hardware name: LENOVO 7484A3G/LENOVO, BIOS 5CKT54AUS 09/07/2009
> [  346.642010] task: 8801ac449a70 ti: 8801ac574000 task.ti: 
> 8801ac574000
> [  346.642010] RIP: 0010:[]  [] 
> find_busiest_group+0x28e/0x8a0
> [  346.642010] RSP: 0018:8801ac577760  EFLAGS: 00010006
> [  346.642010] RAX: 03ff RBX:  RCX: 
> 8801
> [  346.642010] RDX:  RSI: 0001 RDI: 
> 0001
> [  346.642010] RBP: 8801ac577890 R08:  R09: 
> 
> [  346.704010] [ cut here ]
> [  346.704017] WARNING: CPU: 2 PID: 5 at arch/x86/kernel/irq_64.c:70 
> handle_irq+0x141/0x150()
> [  346.704019] do_IRQ():  has overflown the kernel stack 
> (cur:1,sp:8801b653fe88,irq stk 
> top-bottom:8801bed00080-8801bed03fc0,exception stk 
> top-bottom:8801bed04080-8801bed0a000)

Weird, I have not seen this before, though I was hitting a reboot issue
that gave really strange crash messages; it was possibly fixed by
a patch that went into 3.17-rc7.

> Log 3, arm64 lockup
> >8

> Seeding random number generator with 1411488270
> /proc/sys/kernel/perf_event_max_sample_rate currently: 285518974/s
> /proc/sys/kernel/perf_event_paranoid currently: 1142898651

Those last two lines are suspect.  Is my fuzzer broken on arm64 somehow?
I do try to test on arm occasionally, but my pandaboard suffered a massive 
SD card failure and I haven't had a chance to get it running again yet.
I also have trouble getting any of my other flock of arm machines to run 
recent upstream kernels.


Sorry that I don't have good answers for these bugs, but I will stick them 
in my perf_fuzzer outstanding bugs list.

Vince


Perf lockups / stack overflows on v3.17-rc6, x86_64, arm, arm64

2014-09-25 Thread Mark Rutland
Hi all,

I've been running Vince's fuzzer (latest HEAD, 8338ecf4a6fb6892) to test some
local perf patches, and in doing so I've encountered a number of lockups on
vanilla v3.17-rc6, on arm, arm64, and x86_64 (I've not tested 32-bit x86).

The two x86_64 logs below show the typical lockup case and a rare stack
overflow that may or may not be related, both recorded from a Core 2 system.
I'm able to trigger lockups on a Haswell system, but as that's my development
machine, acquiring logs is a little tricky. Both are possible to trigger within
a minute or two.

On arm64 I'm able to trigger lockups after a number (10+ usually) of minutes,
and I'm able to dump the stack on the RCU-stalling CPU so long as it isn't
CPU0. From a prior dump I spotted that the stalled CPU was somewhere in
do_softirq_own_stack, and my hacked-in softirq accounting shows it doesn't seem
to be making any progress with softirqs. I haven't yet figured out what it's
doing.

Has anyone seen anything like this, or does anyone have any ideas as to what
might be going on? I didn't spot anything in tip/perf/urgent so I assume these
aren't known issues.

Thanks,
Mark.

Log 1, x86_64 lockup
>8
*** perf_fuzzer 0.29-pre *** by Vince Weaver

Linux version 3.17.0-rc6hark-perf-lockup+ x86_64
Processor: Intel 6/23/10

Seeding random number generator with 1411655463
/proc/sys/kernel/perf_event_max_sample_rate currently: 10/s
/proc/sys/kernel/perf_event_paranoid currently: 1
Logging perf_event_open() failures: no
Running fsync after every syscall: no
To reproduce, try: ./perf_fuzzer -r 1411655463

Pid=2723, sleeping 1s
==
Fuzzing the following syscalls:
mmap perf_event_open close read write ioctl fork prctl poll
*NOT* Fuzzing the following syscalls:

Also attempting the following:
signal-handler-on-overflow busy-instruction-loop 
accessing-perf-proc-and-sys-files trashing-the-mmap-page
*NOT* attempting the following:

==

... userspace output removed ...

[  157.280682] NOHZ: local_softirq_pending 100
[  223.007005] INFO: rcu_sched detected stalls on CPUs/tasks: { 1} (detected by 
3, t=21006 jiffies, g=25497, c=25496, q=3)
[  223.007005] Task dump for CPU 1:
[  223.007005] accounts-daemon R  running task13712  2237  1 0x1000
[  223.007005]  8801ae8d3a60 0082 8800e3df08d0 
8801ae8d3fd8
[  223.007005]  00012900 00012900 8801b64d4f50 
8800e3df08d0
[  223.007005]  8801ae8d3b98 003cf95e  
8801ae8d3bc4
[  223.007005] Call Trace:
[  223.007005]  [] schedule+0x24/0x70
[  223.007005]  [] schedule_hrtimeout_range_clock+0xfc/0x140
[  223.007005]  [] ? hrtimer_get_res+0x40/0x40
[  223.007005]  [] ? schedule_hrtimeout_range_clock+0x92/0x140
[  223.007005]  [] schedule_hrtimeout_range+0xe/0x10
[  223.007005]  [] poll_schedule_timeout+0x44/0x60
[  223.007005]  [] do_sys_poll+0x422/0x540
[  223.007005]  [] ? unix_stream_sendmsg+0x3e6/0x420
[  223.007005]  [] ? selinux_inode_permission+0x9b/0x150
[  223.007005]  [] ? poll_select_copy_remaining+0x130/0x130
[  223.007005]  [] ? poll_select_copy_remaining+0x130/0x130
[  223.007005]  [] ? poll_select_copy_remaining+0x130/0x130
[  223.007005]  [] ? getname_flags+0x4a/0x1a0
[  223.007005]  [] ? final_putname+0x1d/0x40
[  223.007005]  [] ? putname+0x24/0x40
[  223.007005]  [] ? user_path_at_empty+0x5a/0x90
[  223.007005]  [] ? wake_up_state+0x10/0x10
[  223.007005]  [] ? eventfd_read+0x38/0x60
[  223.007005]  [] ? ktime_get_ts64+0x45/0xf0
[  223.007005]  [] SyS_poll+0x60/0xf0
[  223.007005]  [] system_call_fastpath+0x16/0x1b
[  286.012004] INFO: rcu_sched detected stalls on CPUs/tasks: { 1} (detected by 
3, t=84007 jiffies, g=25497, c=25496, q=2003)
[  286.012004] Task dump for CPU 1:
[  286.012004] accounts-daemon R  running task13712  2237  1 0x1000
[  286.012004]  8801ae8d3a60 0082 8800e3df08d0 
8801ae8d3fd8
[  286.012004]  00012900 00012900 8801b64d4f50 
8800e3df08d0
[  286.012004]  8801ae8d3b98 003cf95e  
8801ae8d3bc4
[  286.012004] Call Trace:
[  286.012004]  [] schedule+0x24/0x70
[  286.012004]  [] schedule_hrtimeout_range_clock+0xfc/0x140
[  286.012004]  [] ? hrtimer_get_res+0x40/0x40
[  286.012004]  [] ? schedule_hrtimeout_range_clock+0x92/0x140
[  286.012004]  [] schedule_hrtimeout_range+0xe/0x10
[  286.012004]  [] poll_schedule_timeout+0x44/0x60
[  286.012004]  [] do_sys_poll+0x422/0x540
[  286.012004]  [] ? unix_stream_sendmsg+0x3e6/0x420
[  286.012004]  [] ? selinux_inode_permission+0x9b/0x150
[  286.012004]  [] ? poll_select_copy_remaining+0x130/0x130
[  286.012004]  [] ? poll_select_copy_remaining+0x130/0x130
[  286.012004]  [] ? poll_select_copy_remaining+0x130/0x130
[  286.012004]  [] ? getname_flags+0x4a/0x1a0
[  286.012004]  [] ? 

Perf lockups / stack overflows on v3.17-rc6, x86_64, arm, arm64

2014-09-25 Thread Mark Rutland
Hi all,

I've been running Vince's fuzzer (latest HEAD, 8338ecf4a6fb6892) to test some
local perf patches, and in doing so I've encountered a number of lockups on
vanilla v3.17-rc6, on arm, arm64, and x86_64 (I've not tested 32-bit x86).

The two x86_64 logs below show the typical lockup case and a rare stack
overflow that may or may not be related, both recorded from a Core 2 system.
I'm able to trigger lockups on a Haswell system, but as that's my development
machine acquiring logs is a little tricky. Both are possible to trigger within
a minute or two.

On arm64 I'm able to trigger lockups after a number of minutes (usually 10+),
and I'm able to dump the stack on the RCU-stalling CPU so long as it isn't
CPU0. From a prior dump I spotted that the stalled CPU was somewhere in
do_softirq_own_stack, and my hacked-in softirq accounting shows it doesn't
seem to be making any progress with softirqs. I haven't yet figured out what
it's doing.

Has anyone seen anything like this, or does anyone have any ideas as to what
might be going on? I didn't spot anything in tip/perf/urgent so I assume these
aren't known issues.

Thanks,
Mark.

Log 1, x86_64 lockup
>8
*** perf_fuzzer 0.29-pre *** by Vince Weaver

Linux version 3.17.0-rc6hark-perf-lockup+ x86_64
Processor: Intel 6/23/10

Seeding random number generator with 1411655463
/proc/sys/kernel/perf_event_max_sample_rate currently: 10/s
/proc/sys/kernel/perf_event_paranoid currently: 1
Logging perf_event_open() failures: no
Running fsync after every syscall: no
To reproduce, try: ./perf_fuzzer -r 1411655463

Pid=2723, sleeping 1s
==
Fuzzing the following syscalls:
mmap perf_event_open close read write ioctl fork prctl poll
*NOT* Fuzzing the following syscalls:

Also attempting the following:
signal-handler-on-overflow busy-instruction-loop 
accessing-perf-proc-and-sys-files trashing-the-mmap-page
*NOT* attempting the following:

==

... userspace output removed ...

[  157.280682] NOHZ: local_softirq_pending 100
[  223.007005] INFO: rcu_sched detected stalls on CPUs/tasks: { 1} (detected by 
3, t=21006 jiffies, g=25497, c=25496, q=3)
[  223.007005] Task dump for CPU 1:
[  223.007005] accounts-daemon R  running task13712  2237  1 0x1000
[  223.007005]  8801ae8d3a60 0082 8800e3df08d0 
8801ae8d3fd8
[  223.007005]  00012900 00012900 8801b64d4f50 
8800e3df08d0
[  223.007005]  8801ae8d3b98 003cf95e  
8801ae8d3bc4
[  223.007005] Call Trace:
[  223.007005]  [81911f34] schedule+0x24/0x70
[  223.007005]  [81914ebc] schedule_hrtimeout_range_clock+0xfc/0x140
[  223.007005]  [8109cc40] ? hrtimer_get_res+0x40/0x40
[  223.007005]  [81914e52] ? schedule_hrtimeout_range_clock+0x92/0x140
[  223.007005]  [81914f0e] schedule_hrtimeout_range+0xe/0x10
[  223.007005]  [81168794] poll_schedule_timeout+0x44/0x60
[  223.007005]  [81169d22] do_sys_poll+0x422/0x540
[  223.007005]  [8180e6b6] ? unix_stream_sendmsg+0x3e6/0x420
[  223.007005]  [812908fb] ? selinux_inode_permission+0x9b/0x150
[  223.007005]  [81168910] ? poll_select_copy_remaining+0x130/0x130
[  223.007005]  [81168910] ? poll_select_copy_remaining+0x130/0x130
[  223.007005]  [81168910] ? poll_select_copy_remaining+0x130/0x130
[  223.007005]  [811600ea] ? getname_flags+0x4a/0x1a0
[  223.007005]  [8116007d] ? final_putname+0x1d/0x40
[  223.007005]  [811602f4] ? putname+0x24/0x40
[  223.007005]  [8116581a] ? user_path_at_empty+0x5a/0x90
[  223.007005]  [810701c0] ? wake_up_state+0x10/0x10
[  223.007005]  [81198078] ? eventfd_read+0x38/0x60
[  223.007005]  [810a1e75] ? ktime_get_ts64+0x45/0xf0
[  223.007005]  [81169f00] SyS_poll+0x60/0xf0
[  223.007005]  [81915bd2] system_call_fastpath+0x16/0x1b
[  286.012004] INFO: rcu_sched detected stalls on CPUs/tasks: { 1} (detected by 
3, t=84007 jiffies, g=25497, c=25496, q=2003)
[  286.012004] Task dump for CPU 1:
[  286.012004] accounts-daemon R  running task13712  2237  1 0x1000
[  286.012004]  8801ae8d3a60 0082 8800e3df08d0 
8801ae8d3fd8
[  286.012004]  00012900 00012900 8801b64d4f50 
8800e3df08d0
[  286.012004]  8801ae8d3b98 003cf95e  
8801ae8d3bc4
[  286.012004] Call Trace:
[  286.012004]  [81911f34] schedule+0x24/0x70
[  286.012004]  [81914ebc] schedule_hrtimeout_range_clock+0xfc/0x140
[  286.012004]  [8109cc40] ? hrtimer_get_res+0x40/0x40
[  286.012004]  [81914e52] ? schedule_hrtimeout_range_clock+0x92/0x140
[  286.012004]  [81914f0e] schedule_hrtimeout_range+0xe/0x10
[  286.012004]  [81168794]