Re: Occasionally losing the tick_sched_timer

2018-04-10 Thread Thomas Gleixner
On Tue, 10 Apr 2018, Nicholas Piggin wrote:
> On Tue, 10 Apr 2018 09:42:29 +0200 (CEST)
> Thomas Gleixner wrote:
> > > Thomas do you have any ideas on what we might look for, or if we can add
> > > some BUG_ON()s to catch this at its source?
> >
> > Not really. Tracing might be a more efficient

Re: Occasionally losing the tick_sched_timer

2018-04-10 Thread Nicholas Piggin
On Tue, 10 Apr 2018 09:42:29 +0200 (CEST)
Thomas Gleixner wrote:
> Nick,
>
> On Tue, 10 Apr 2018, Nicholas Piggin wrote:
> > We are seeing rare hard lockup watchdog timeouts, a CPU seems to have no
> > more timers scheduled, despite hard and soft lockup watchdogs should have
> > their heart beat

Re: Occasionally losing the tick_sched_timer

2018-04-10 Thread Thomas Gleixner
Nick,
On Tue, 10 Apr 2018, Nicholas Piggin wrote:
> We are seeing rare hard lockup watchdog timeouts, a CPU seems to have no
> more timers scheduled, despite hard and soft lockup watchdogs should have
> their heart beat timers and probably many others.
>
> The reproducer we have is running a KVM w

Occasionally losing the tick_sched_timer

2018-04-09 Thread Nicholas Piggin
We are seeing rare hard lockup watchdog timeouts: a CPU seems to have no more timers scheduled, even though the hard and soft lockup watchdogs should have their heartbeat timers, and probably many others. The reproducer we have is running a KVM workload. The lockup is in the host kernel, quite rare but we