On Thu, Aug 14, 2025 at 2:49 PM Vineet Gupta <vine...@rivosinc.com> wrote:
>
> On 8/14/25 13:36, Jeff Law wrote:
> > On 8/14/25 8:23 AM, Vineet Gupta wrote:
> >> __builtin_round() fails to save/restore FP exception flags around the FP
> >> compare insn which can potentially clobber the same.
> >>
> >> Worth noting that the fflags restore bracketing is slightly different
> >> than the glibc implementation. gcc implementation w/ this patch
> >> generates following, where fsflags is called even if fcvt* were not
> >> executed because the prior flt can also clobber the flags. glibc
> >> implementation due to early NaN check (before the flt.s) only needs
> >> inside of the branch.
> >>
> >> | convert_float_to_float_round
> >> | ...
> >> |   frflags  a5
> >> |   fabs.s   fa5,fa0
> >> |   flt.s    a4,fa5,fa4    <--- can clobber fflags
> >> |   beq      a4,zero,.L3
> >> |     fcvt.w.s a4,fa0,rmm     <--- also
> >> |     fcvt.s.w       fa5,a4
> >> |     fsgnj.s        fa0,fa5,fa0
> >> | .L3:
> >> |    fsflags a5            <-- both code paths
> >>
> >>      PR target/121534
> >>
> >> gcc/ChangeLog:
> >>
> >>      * config/riscv/riscv.md (round_pattern): save/restore fflags.
> >>
> >> gcc/testsuite/ChangeLog:
> >>
> >>      * gcc.target/riscv/round_64.c: Scan for frflags and fsflags.
> > So I'm guessing the issue at hand is in that rounding path that does not
> > use Zfa we've introduced a flt instruction where we didn't have one
> > before (previously we likey called into glibc).
>
> No flt exists even in glibc - it just happens after NaN check (look at the
> snippet above which is mor eor less same for both cases)
>
> > And so now we have to
> > manage the accrued exception state?  Essentially saving/restoring state
> > around the round() case?
>
> glibc already saves/restores arounf ftl which gcc code doesn't - which is the 
> bug.
>
> >
> > My recollection is that sequence was dramatically better than what we
> > had before, so even with a bit mucking around with FP CSRs it's likely
> > still profitable.
>
> For !zfa gcc is slightly worse off as the flag restore happens unconditionally
> vs. glibc not doing this for NaN, granted we don't want to optimize that.
> So we will be at par with glibc, not better. The better off was with zfa 
> AFAIKR.
>
> > It failed CI, presumably a scan test:
> >
> >> https://github.com/ewlu/gcc-precommit-ci/issues/3774#issuecomment-3188777455
> > Assuming the scan test just needs the count updated, this is OK for the
> > trunk.
>
> Yes it needed adjustment due to additional save/restore. Fixed that and will
> push shortly.

While you're at it, does it make sense to skip the flag save/restore
if !flag_trapping_math?  It's an admittedly minor point, since the
non-Zfa case is more about correctness than it is about performance,
but if it's easy to do, you might as well.

>
> Thx,
> -Vineet

Reply via email to