Hi all,
I am having an issue where work scheduled on the HPWORK queue stops being
processed.
The work queue thread is stuck inside nxsig_timedwait(), specifically at
line 370 of sig_timedwait.c. The watchdog that is supposed to wake the
thread back up never fires.
I know this because I traced the stack once the queue had locked up, and the
thread was blocked waiting for a signal at line 370 of sig_timedwait.c.
Additionally, if I set a breakpoint as shown below and run a command from
nsh> that queues work on the HPWORK queue, I can manually wake the thread,
and it then starts processing again.

    /* Start the watchdog */

    wd_start(rtcb->waitdog, waitticks,
             nxsig_timeout, 1, wdparm.pvarg);

    /* Now wait for either the signal or the watchdog, but
     * first, make sure this is not the idle task,
     * descheduling that isn't going to end well.
     */

    DEBUGASSERT(NULL != rtcb->flink);
    up_block_task(rtcb, TSTATE_WAIT_SIG);   <-- waiting here

    /* We no longer need the watchdog */

    wd_delete(rtcb->waitdog);               <-- breakpoint here
    rtcb->waitdog = NULL;

waitticks has the expected value of 6700 ticks, which is 6.7 ms at 1 us/tick.
I am running an STM32F765 with a tickless, alarm-based system clock at
1 us/tick with 64-bit support. I suspect that this particular combination
may be the cause of the issue, but I don't know yet.
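As a sanity check on the tick arithmetic, here is a minimal sketch of the conversion. The helper name ticks_to_usec() is hypothetical and is not a NuttX API; the 1 us tick period is taken from my configuration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of NuttX): convert a tick count to
 * microseconds for a given tick period in microseconds.
 */

static uint64_t ticks_to_usec(uint64_t ticks, uint32_t usec_per_tick)
{
  /* A zero tick period would make the conversion meaningless */

  assert(usec_per_tick > 0);
  return ticks * (uint64_t)usec_per_tick;
}
```

With a 1 us tick, ticks_to_usec(6700, 1) gives 6700 us, i.e. the 6.7 ms delay seen in waitticks.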
Another data point: when I disable CONFIG_SCHED_TICKLESS_ALARM, I start
hitting the assertion below, at line 413 of wd_start.c:

    #ifndef CONFIG_SCHED_TICKLESS_ALARM
          /* There is logic to handle the case where ticks is greater than
           * the watchdog lag, but if the scheduling is working properly
           * that should never happen.
           */

          DEBUGASSERT(ticks <= wdog->lag);
    #endif

I'll keep digging and report back if there is a core bug, but if anyone
else has any suggestions, please let me know.
Thanks,
Anthony