Fix-Point commented on code in PR #17640:
URL: https://github.com/apache/nuttx/pull/17640#discussion_r2642626103


##########
sched/wdog/wd_cancel.c:
##########
@@ -61,34 +61,33 @@
 int wd_cancel(FAR struct wdog_s *wdog)
 {
   irqstate_t flags;
-  bool head;
+  bool       reassess = false;
+  int        ret      = -EINVAL;
 
-  flags = spin_lock_irqsave(&g_wdspinlock);
+  flags = enter_critical_section();

Review Comment:
   > **Explanation of other related points:** **Why not improve the state 
machine based on the new fine-grained spin-lock?** After introducing the third 
state `WDOG_INACTIVE | shared`, we could enhance the state machine to achieve 
the following:
   > 
   > * On single-core platforms, it would be possible to implement semantics 
identical to the original `wd_cancel`.
   > * On SMP platforms, hazard pointers could be used to avoid the issue where 
periodic WDOGs cannot be canceled. However, this would require waiting for 
remote threads to complete, meaning we cannot achieve semantics fully identical 
to the original `wd_cancel`. Consequently, we would **have to reimplement all 
code that uses WDOGs**.
   > 
   > For example, under the asynchronous `wd_cancel` semantics the following 
problematic thread interleaving could still occur:
   > 
   > ```
   >    The previous `nxsem_timeout` callback function caused the second 
semaphore wait to fail.
   >     
   >         Core 0 [nxsem_clockwait]         Core 1
   >         enter_critical_section()       |  ...
   >         wd_start(nxsem_timeout)        |  ...
   >         nxsem_wait(sem)                |  wd_expiration() --> nxsem_timeout
   >         waked up by others             |
   >         wd_cancel(&rtcb->waitdog)      |  try enter_critical_section()
   >         leave_critical_section()       |  Failed retry...(Blocking by Core 
n)
   >         ... call nxsem_clockwait again |  Failed retry...
   >         enter_critical_section()       |  Failed retry...
   >         wd_start(nxsem_timeout)        |  Failed retry...
   >         nxsem_wait(sem)                |  Core 1 enter the critical section
   >                                        |  nxsem_wait_irq(wtcb, ETIMEDOUT) 
-> incorrectly wake-up the rtcb.
   > ```
   > 
   > * According to statistics, WDOG usage appears in **219/189 files** across 
**400/427 locations** where `wd_start`/`wd_cancel` are called. Reviewing and 
rewriting all related logic would clearly be impractical.
   > 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to