On 8/14/25 13:36, Jeff Law wrote:
> On 8/14/25 8:23 AM, Vineet Gupta wrote:
>> __builtin_round() fails to save/restore FP exception flags around the FP
>> compare insn which can potentially clobber the same.
>>
>> Worth noting that the fflags restore bracketing is slightly different
>> than the glibc implementation. gcc implementation w/ this patch
>> generates following, where fsflags is called even if fcvt* were not
>> executed because the prior flt can also clobber the flags. glibc
>> implementation due to early NaN check (before the flt.s) only needs
>> inside of the branch.
>>
>> | convert_float_to_float_round
>> | ...
>> |   frflags  a5
>> |   fabs.s   fa5,fa0
>> |   flt.s    a4,fa5,fa4    <--- can clobber fflags
>> |   beq      a4,zero,.L3
>> |     fcvt.w.s a4,fa0,rmm     <--- also
>> |     fcvt.s.w       fa5,a4
>> |     fsgnj.s        fa0,fa5,fa0
>> | .L3:
>> |    fsflags a5            <-- both code paths
>>
>>      PR target/121534
>>
>> gcc/ChangeLog:
>>
>>      * config/riscv/riscv.md (round_pattern): save/restore fflags.
>>
>> gcc/testsuite/ChangeLog:
>>
>>      * gcc.target/riscv/round_64.c: Scan for frflags and fsflags.
> So I'm guessing the issue at hand is in that rounding path that does not 
> use Zfa we've introduced a flt instruction where we didn't have one 
> before (previously we likey called into glibc). 

No flt exists even in glibc - it just happens after NaN check (look at the
snippet above which is mor eor less same for both cases)

> And so now we have to 
> manage the accrued exception state?  Essentially saving/restoring state 
> around the round() case?

glibc already saves/restores arounf ftl which gcc code doesn't - which is the 
bug.

>
> My recollection is that sequence was dramatically better than what we 
> had before, so even with a bit mucking around with FP CSRs it's likely 
> still profitable.

For !zfa gcc is slightly worse off as the flag restore happens unconditionally
vs. glibc not doing this for NaN, granted we don't want to optimize that.
So we will be at par with glibc, not better. The better off was with zfa AFAIKR.

> It failed CI, presumably a scan test:
>
>> https://github.com/ewlu/gcc-precommit-ci/issues/3774#issuecomment-3188777455
> Assuming the scan test just needs the count updated, this is OK for the 
> trunk.

Yes it needed adjustment due to additional save/restore. Fixed that and will
push shortly.

Thx,
-Vineet

Reply via email to