xiaoxiang781216 commented on code in PR #15929:
URL: https://github.com/apache/nuttx/pull/15929#discussion_r1984406661


##########
arch/risc-v/src/common/riscv_mtimer.c:
##########
@@ -222,8 +221,11 @@ static int riscv_mtimer_start(struct oneshot_lowerhalf_s *lower,
   flags = up_irq_save();
 
   mtime = riscv_mtimer_get_mtime(priv);
-  alarm = mtime + ts->tv_sec * priv->freq +
-          ts->tv_nsec * priv->freq / NSEC_PER_SEC;
+
+  /* Align each tick start to cycle_per_tick boundary to avoid clock drift */
+
+  alarm = mtime / priv->cycle_per_tick * priv->cycle_per_tick +

Review Comment:
   > This is wrong, and would add drift to the timer. The "mtime" here is not the value where the compare was set previously. It is some arbitrary time after that. The time between is unknown. The drift will eventually cause random errors in wdog functionality (e.g. sleeping for 1 tick may randomly timeout in less than 1 tick time), which then again breaks all the HW drivers using the tick based sleep as a timeout timer.
   > 
   
   here is a possible fix:
   ```
   alarm = mtime + (tick * priv->freq + TICK_PER_SEC - 1) / TICK_PER_SEC;
   ```
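   
   To see why the round-up matters when priv->freq is not an even multiple of TICK_PER_SEC, here is a quick worked example (the 32768 Hz clock and 100 ticks/s are just assumed numbers, not taken from this PR):
   ```
   /* Worked example with assumed numbers (32768 Hz mtime clock, 100 ticks/s). */
   
   #include <assert.h>
   #include <stdint.h>
   
   int main(void)
   {
     uint64_t freq         = 32768;  /* assumed mtime frequency in Hz */
     uint64_t tick_per_sec = 100;    /* assumed TICK_PER_SEC */
     uint64_t tick         = 1;      /* request a one tick delay */
   
     uint64_t truncated = tick * freq / tick_per_sec;                       /* 327 */
     uint64_t rounded   = (tick * freq + tick_per_sec - 1) / tick_per_sec;  /* 328 */
   
     /* 327 cycles is ~9.979 ms, i.e. the alarm would fire before one full
      * 10 ms tick has elapsed; 328 cycles is ~10.009 ms and never fires early.
      */
   
     assert(truncated * tick_per_sec <  tick * freq);
     assert(rounded   * tick_per_sec >= tick * freq);
     return 0;
   }
   ```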
   
   > This is the main reason why it is a smart idea to only allow tick times which are even multiples of the underlying hw timer; you know the previous tick start by always aligning the current mtime to the tick time boundary.
   > 
   
   even multiples would be good, but it may not be possible with every real chip design.
   
   > There is a way around this by using priv->freq as well, by specifically storing the previously set compare time, and always calculating the needed amount of ticks to progress by using the current mtime and the previously set compare value....
   > 
   > I didn't do all of this, as I still don't understand why you can't just allow the inaccuracy in a single tick cycle time.
   > 
   
   It is impossible to achieve accuracy within 1 tick for any sleep/timeout/wdog; that's why wdog always adds 1 tick:
   https://github.com/apache/nuttx/blob/master/sched/wdog/wd_start.c#L295
   so I don't think a single tick of error is a problem.
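   
   (Just to illustrate the point; this is a rough sketch of the idea, not the actual code at that line, and the helper name is made up:)
   ```
   #include <stdint.h>
   
   /* Rough sketch only: because a request may arrive anywhere inside the
    * current tick, one extra tick is added so the watchdog never expires
    * before the requested delay.
    */
   
   static inline uint32_t usec_to_wdog_ticks(uint32_t usec,
                                             uint32_t usec_per_tick)
   {
     return usec / usec_per_tick + 1;
   }
   ```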
   
   The suggestion I made here is only to improve performance.
   
   > What is the practical problem you are trying to solve??
   
   What I really care about is the drift and accuracy of clock_gettime (riscv_mtimer_current).
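   
   i.e. the current time should be derived directly from the free-running counter on every call, roughly like this (a simplified sketch, not the exact riscv_mtimer_current code):
   ```
   #include <stdint.h>
   #include <time.h>
   
   #define NSEC_PER_SEC 1000000000ull
   
   /* Simplified sketch: compute the time directly from the 64-bit mtime
    * value on each call, so no rounding error accumulates from tick to
    * tick.  freq is the mtime frequency in Hz.
    */
   
   static void mtime_to_timespec(uint64_t mtime, uint64_t freq,
                                 struct timespec *ts)
   {
     ts->tv_sec  = (time_t)(mtime / freq);
     ts->tv_nsec = (long)((mtime % freq) * NSEC_PER_SEC / freq);
   }
   ```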
   
   > I can, however, make a suggestion tomorrow following that path if you insist.
   > 
   > I see that there are again changes to the arm64 tick timer. I hope this time drift was not added _again_ there. It is painful to debug because it causes very random failures in HW drivers due to timeout timers firing too early randomly.
   > 
   
   the early wakeup can be fixed by adding 1 tick and rounding up the hardware count.
   
   > But again, adding this drift to the timer is a horrible mistake, which is repeated over and over again with these timers.
   
   Yes, I agree that the timing error is hard to fix, so let's come up with a common solution that fixes both the arm and riscv platforms.
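   
   For example, something along these lines could be shared by both ports (the helper name and placement are hypothetical, just to show the shape of it):
   ```
   #include <stdint.h>
   #include <nuttx/clock.h>  /* for TICK_PER_SEC */
   
   /* Hypothetical shared helper: convert a relative tick count into an
    * absolute compare value, rounding the cycle count up so the timer can
    * never fire before the requested number of ticks has elapsed.
    */
   
   static inline uint64_t tick_to_compare(uint64_t count, uint64_t freq,
                                          uint64_t tick)
   {
     return count + (tick * freq + TICK_PER_SEC - 1) / TICK_PER_SEC;
   }
   ```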



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@nuttx.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
