pussuw commented on code in PR #9103:
URL: https://github.com/apache/nuttx/pull/9103#discussion_r1193895403


##########
arch/risc-v/src/common/riscv_macros.S:
##########
@@ -227,8 +222,15 @@
   REGLOAD      t0, REG_INT_CTX(\out)
   li           t1, MSTATUS_FS
   and          t2, t0, t1
-  li           t1, MSTATUS_FS_INIT
-  ble          t2, t1, 1f
+  li           t1, MSTATUS_FS_DIRTY
+  bne          t2, t1, 1f
+
+  /* Reset FS bit to MSTATUS_FS_CLEAN */
+  li           t1, MSTATUS_FS_CLEAN

Review Comment:
   > It's strange that almost every thread uses the FPU. Does the code use FPU 
instructions explicitly, or are they generated implicitly by the compiler?
   
   We use the FPU explicitly in our application via the standard C float/double 
data types, and we force the HW FPU block into use, so the build generates 
actual HW FPU instructions (instead of LIBM). Almost every thread we run uses 
floating-point arithmetic (and thus the FPU).
   
   > It's fine to add a new option, but I am afraid the new option won't help 
as much as you expect if FPU instructions are inserted by the compiler, as gcc 
does for arm32/arm64.
   
   Of course I will profile this before submitting anything. An initial test 
showed about a 5% decrease in CPU load, but that was compared against the 
current implementation, which does not work. If we remove the current 
non-working lazy-FPU implementation, the CPU load should increase a bit because 
the FPU context would then always be saved and restored.
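
   For reference, the check in the hunk above can be sketched in C roughly as 
follows. This is only an illustration, not the code in the PR: the helper names 
`fs_is_dirty` and `fs_mark_clean` are hypothetical, and the macros are 
redefined locally (using the FS field encoding from the RISC-V privileged 
specification, bits 14:13 of the status register) so that the snippet is 
self-contained. The old code (`ble` against `MSTATUS_FS_INIT`) skipped the save 
only for Off/Initial, whereas the new code (`bne` against `MSTATUS_FS_DIRTY`) 
saves only when the state is Dirty.

```c
#include <stdbool.h>

/* Local redefinitions for illustration; FS is a 2-bit field at bits 14:13. */

#define MSTATUS_FS        (3UL << 13)  /* Mask for the FS field */
#define MSTATUS_FS_INIT   (1UL << 13)  /* Initial */
#define MSTATUS_FS_CLEAN  (2UL << 13)  /* Clean */
#define MSTATUS_FS_DIRTY  (3UL << 13)  /* Dirty */

/* Hypothetical helper: true only if the saved status word says the FPU
 * registers were modified, i.e. the context must be saved.
 */

static bool fs_is_dirty(unsigned long status)
{
  return (status & MSTATUS_FS) == MSTATUS_FS_DIRTY;
}

/* Hypothetical helper: return the status word with FS set to Clean and
 * all other bits left untouched.
 */

static unsigned long fs_mark_clean(unsigned long status)
{
  return (status & ~MSTATUS_FS) | MSTATUS_FS_CLEAN;
}
```

   With this, a thread whose FS field is Clean or Initial skips the save 
entirely, and a Dirty thread is saved once and then marked Clean until it 
touches the FPU again.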


