pussuw commented on code in PR #9103: URL: https://github.com/apache/nuttx/pull/9103#discussion_r1193895403
##########
arch/risc-v/src/common/riscv_macros.S:
##########
@@ -227,8 +222,15 @@
   REGLOAD      t0, REG_INT_CTX(\out)
   li           t1, MSTATUS_FS
   and          t2, t0, t1
-  li           t1, MSTATUS_FS_INIT
-  ble          t2, t1, 1f
+  li           t1, MSTATUS_FS_DIRTY
+  bne          t2, t1, 1f
+
+  /* Reset FS bit to MSTATUS_FS_CLEAN */
+  li           t1, MSTATUS_FS_CLEAN

Review Comment:
   > It's strange that almost every thread uses the FPU. Does the code use FPU instructions explicitly, or are they generated implicitly by the compiler?

   We use the FPU explicitly in our application through the standard C float/double data types, and we force the hardware FPU block into use, so the resulting code contains actual hardware FPU instructions (instead of LIBM). Almost every thread we run uses floating-point arithmetic, and therefore also uses the FPU.

   > It's fine to add a new option, but I am afraid the new option can't help as much as you expect if FPU instructions are inserted by the compiler, as gcc does for arm32/arm64.

   Of course I will profile this before submitting anything. I did an initial test and it yielded about a 5% decrease in CPU load, but the baseline for that comparison was the current implementation, which does not work. If we remove the current non-working lazy-FPU implementation, the CPU load should increase a bit, because the FPU context would then always be saved and restored.
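
For readers following the diff, the changed FS check can be modelled in C. This is only a minimal sketch, not the NuttX code: the macro names come from the hunk above, their values are assumed from the RISC-V privileged specification (FS occupies mstatus bits 14:13, with Initial = 1, Clean = 2, Dirty = 3) rather than from the NuttX headers, and the helpers `fpu_needs_save` and `fpu_mark_clean` are hypothetical names introduced for illustration. The way the hunk finishes resetting FS is not visible in the diff, so the clear-then-set in `fpu_mark_clean` is an assumption as well.

```c
#include <stdbool.h>
#include <stdint.h>

/* FS field of mstatus, bits 14:13. Values assumed from the RISC-V
 * privileged specification, not copied from the NuttX headers.
 */

#define MSTATUS_FS        (UINT64_C(3) << 13)
#define MSTATUS_FS_INIT   (UINT64_C(1) << 13)
#define MSTATUS_FS_CLEAN  (UINT64_C(2) << 13)
#define MSTATUS_FS_DIRTY  (UINT64_C(3) << 13)

/* Hypothetical helper mirroring the new "bne t2, t1, 1f" test:
 * the FPU registers need saving only when FS is exactly Dirty.
 */

static bool fpu_needs_save(uint64_t status)
{
  return (status & MSTATUS_FS) == MSTATUS_FS_DIRTY;
}

/* Hypothetical helper mirroring "Reset FS bit to MSTATUS_FS_CLEAN":
 * clear the two FS bits, then set the Clean encoding (assumed sequence).
 */

static uint64_t fpu_mark_clean(uint64_t status)
{
  return (status & ~MSTATUS_FS) | MSTATUS_FS_CLEAN;
}
```

With the removed `ble` against `MSTATUS_FS_INIT`, a context whose FS was Clean would also have been saved; the `bne` against `MSTATUS_FS_DIRTY` skips the save in every state except Dirty.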
