On Sun, Mar 14, 2021 at 9:40 AM Johnny Billquist <b...@softjar.se> wrote:
> On 2021-03-14 17:36, Xiang Xiao wrote:
> > On Sun, Mar 14, 2021 at 8:27 AM Fotis Panagiotopoulos <f.j.pa...@gmail.com>
> > wrote:
> >
> >>> Why not to use the hardware watchdog timer which is more reliable and
> >>> simple than the pure software solution?
> >>
> >> I do use it, but a hardware watchdog can monitor only one thing (in my case
> >> the kernel itself).
> >>
> >
> > If you want to catch some task/thread in an infinite loop, the hardware
> > watchdog monitor in nuttx can do it for you.
>
> Of course. But it will not be easy to do if you want to watch multiple
> threads. Because the hardware watchdog is very binary. If any thread
> were to kick the watchdog, it will not do a reset. So if one thread is
> hung, but others still run, your hardware watchdog will not do what you
> want, possibly.
>

You don't need to kick the watchdog from userspace. The watchdog monitor
does it for you, using one of these strategies:

1. Kick the watchdog from the idle thread (WATCHDOG_AUTOMONITOR_BY_IDLE=y)
2. Kick the watchdog from the work thread (WATCHDOG_AUTOMONITOR_BY_WORKER=y)
3. Kick the watchdog from the timer interrupt (WATCHDOG_AUTOMONITOR_BY_TIMER=y)
4. Kick the watchdog from the watchdog interrupt (WATCHDOG_AUTOMONITOR_BY_CAPTURE=y)

Each kick strategy catches a different problem:

1. WATCHDOG_AUTOMONITOR_BY_IDLE catches any busy loop that runs for a long time
2. WATCHDOG_AUTOMONITOR_BY_WORKER catches a worker that blocks the work queue
3. WATCHDOG_AUTOMONITOR_BY_TIMER catches interrupts being masked for a long time

You can find more detail in the code:
https://github.com/apache/incubator-nuttx/blob/master/drivers/timers/watchdog.c

> Not to mention that from a diagnostics point of view, it could be nice
> to be informed which thread was hung as well, as a part of the
> handling/restarting.
>
> Johnny
>
> --
> Johnny Billquist                  || "I'm on a bus
>                                   ||  on a psychedelic trip
> email: b...@softjar.se             ||  Reading murder books
> pdp is alive!                     ||  tryin' to stay hip" - B. Idol
>
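
For the multi-thread case Johnny describes (one thread hung while the others
keep running, plus wanting to know which one), a rough sketch of a per-thread
check-in layer on top of a single hardware watchdog might look like the code
below. This is only an illustration, not code from NuttX: every name in it
(wd_monitor, wd_register, wd_checkin, wd_poll, wd_supervise, demo_kick) is
made up for the example, and the kick callback stands in for whatever actually
services the hardware watchdog on your board.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

#define WD_MAX_CLIENTS 8u  /* hypothetical limit for this sketch */

struct wd_monitor
{
  unsigned    registered;  /* bit i set: thread i is monitored */
  atomic_uint alive;       /* bit i set: thread i checked in this period */
};

static void wd_init(struct wd_monitor *m)
{
  m->registered = 0;
  atomic_init(&m->alive, 0);
}

/* Register a thread id. Call this before the monitored threads start,
 * because "registered" is a plain (non-atomic) field in this sketch.
 */
static void wd_register(struct wd_monitor *m, unsigned id)
{
  assert(id < WD_MAX_CLIENTS);
  m->registered |= 1u << id;
}

/* Called by each monitored thread from its main loop. */
static void wd_checkin(struct wd_monitor *m, unsigned id)
{
  assert(id < WD_MAX_CLIENTS);
  atomic_fetch_or(&m->alive, 1u << id);
}

/* Return a bitmask of registered threads that did not check in during
 * the period that just ended, and start a new period.
 */
static unsigned wd_poll(struct wd_monitor *m)
{
  unsigned seen = atomic_exchange(&m->alive, 0);
  return m->registered & ~seen;
}

/* Supervisor step, run once per period. The hardware watchdog is kicked
 * only if every registered thread checked in; otherwise the hung thread
 * ids are reported and the kick is skipped so the hardware can reset.
 */
static bool wd_supervise(struct wd_monitor *m, void (*kick)(void))
{
  unsigned hung = wd_poll(m);

  if (hung == 0)
    {
      kick();
      return true;
    }

  for (unsigned id = 0; id < WD_MAX_CLIENTS; id++)
    {
      if (hung & (1u << id))
        {
          fprintf(stderr, "watchdog: thread %u missed its check-in\n", id);
        }
    }

  return false;
}

/* Stand-in for the real hardware kick; it only counts calls. */
static unsigned g_demo_kicks;

static void demo_kick(void)
{
  g_demo_kicks++;
}
```

Each monitored thread would call wd_checkin() with its own id, and a
supervisor thread would call wd_supervise() at a period shorter than the
hardware timeout. The supervisor itself is then the single thing the hardware
watchdog monitors.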