On Thu, Aug 14, 2025 at 2:49 PM Vineet Gupta <vine...@rivosinc.com> wrote: > > On 8/14/25 13:36, Jeff Law wrote: > > On 8/14/25 8:23 AM, Vineet Gupta wrote: > >> __builtin_round() fails to save/restore FP exception flags around the FP > >> compare insn which can potentially clobber the same. > >> > >> Worth noting that the fflags restore bracketing is slightly different > >> than the glibc implementation. gcc implementation w/ this patch > >> generates following, where fsflags is called even if fcvt* were not > >> executed because the prior flt can also clobber the flags. glibc > >> implementation due to early NaN check (before the flt.s) only needs > >> inside of the branch. > >> > >> | convert_float_to_float_round > >> | ... > >> | frflags a5 > >> | fabs.s fa5,fa0 > >> | flt.s a4,fa5,fa4 <--- can clobber fflags > >> | beq a4,zero,.L3 > >> | fcvt.w.s a4,fa0,rmm <--- also > >> | fcvt.s.w fa5,a4 > >> | fsgnj.s fa0,fa5,fa0 > >> | .L3: > >> | fsflags a5 <-- both code paths > >> > >> PR target/121534 > >> > >> gcc/ChangeLog: > >> > >> * config/riscv/riscv.md (round_pattern): save/restore fflags. > >> > >> gcc/testsuite/ChangeLog: > >> > >> * gcc.target/riscv/round_64.c: Scan for frflags and fsflags. > > So I'm guessing the issue at hand is in that rounding path that does not > > use Zfa we've introduced a flt instruction where we didn't have one > > before (previously we likey called into glibc). > > No flt exists even in glibc - it just happens after NaN check (look at the > snippet above which is mor eor less same for both cases) > > > And so now we have to > > manage the accrued exception state? Essentially saving/restoring state > > around the round() case? > > glibc already saves/restores arounf ftl which gcc code doesn't - which is the > bug. > > > > > My recollection is that sequence was dramatically better than what we > > had before, so even with a bit mucking around with FP CSRs it's likely > > still profitable. > > For !zfa gcc is slightly worse off as the flag restore happens unconditionally > vs. glibc not doing this for NaN, granted we don't want to optimize that. > So we will be at par with glibc, not better. The better off was with zfa > AFAIKR. > > > It failed CI, presumably a scan test: > > > >> https://github.com/ewlu/gcc-precommit-ci/issues/3774#issuecomment-3188777455 > > Assuming the scan test just needs the count updated, this is OK for the > > trunk. > > Yes it needed adjustment due to additional save/restore. Fixed that and will > push shortly.
While you're at it, does it make sense to skip the flag save/restore if !flag_trapping_math? It's an admittedly minor point, since the non-Zfa case is more about correctness than it is about performance, but if it's easy to do, you might as well. > > Thx, > -Vineet