[Bug target/121095] [15 Regression] Possibly unnecessary PRE pass on aarch64 for fpmr

2025-07-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121095

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|ASSIGNED                    |RESOLVED

--- Comment #10 from Andrew Pinski  ---
Fixed.

[Bug target/121095] [15 Regression] Possibly unnecessary PRE pass on aarch64 for fpmr

2025-07-26 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121095

--- Comment #9 from GCC Commits  ---
The releases/gcc-15 branch has been updated by Andrew Pinski:

https://gcc.gnu.org/g:b26588f0cb42a6a37ad7c303e61b23f471758e0c

commit r15-10073-gb26588f0cb42a6a37ad7c303e61b23f471758e0c
Author: Andrew Pinski 
Date:   Wed Jul 16 09:31:35 2025 -0700

gcse: Skip hardreg pre when the hardreg is never live [PR121095]

r15-6789-ge7f98d9603808b added a new RTL pass for hardreg PRE for the hard
register of FPM_REGNUM, this pass could get expensive if you have a large
number of basic blocks and the hard register was never live so it does
nothing in the end.
In the aarch64 case, FPM_REGNUM is only used for FP8 related code so it has
a high probability of not being used. So skipping the pass for that register
can improve both compile time and memory usage.

Build and tested for aarch64-linux-gnu.

PR middle-end/121095
gcc/ChangeLog:

* gcse.cc (execute_hardreg_pre): Skip if the hardreg which is never
live.

Signed-off-by: Andrew Pinski 
(cherry picked from commit 6916639b48357334579cf94717a3e51dd003e940)
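
The early exit described in the commit message can be illustrated with a
small, self-contained C++ model. This is only a sketch under stated
assumptions: BasicBlock, Function, reg_ever_live and run_hardreg_pre are
hypothetical names made up for illustration. They are not GCC's internal
data structures and not the actual change to execute_hardreg_pre. The model
scans the blocks to answer "is this register ever live?"; a real compiler
would presumably answer that from information it already tracks, so the
check itself stays cheap.

```cpp
// Simplified model of "skip the pass when the hard register is never live".
// All types and functions here are hypothetical, not GCC internals.
#include <cassert>
#include <cstddef>
#include <vector>

// A basic block, reduced to the hard register numbers its insns mention.
struct BasicBlock {
  std::vector<unsigned> hard_regs_used;
};

// A function, reduced to its list of basic blocks.
struct Function {
  std::vector<BasicBlock> blocks;
};

// Returns true if any block mentions REGNO.
bool reg_ever_live(const Function &fn, unsigned regno) {
  for (const BasicBlock &bb : fn.blocks)
    for (unsigned r : bb.hard_regs_used)
      if (r == regno)
        return true;
  return false;
}

// What the modelled pass did: whether it ran, and how many blocks it visited.
struct PreStats {
  bool ran;
  std::size_t blocks_visited;
};

// Stand-in for an expensive per-register pass with an early exit.
PreStats run_hardreg_pre(const Function &fn, unsigned regno) {
  // Early exit: nothing can be optimised for a register that is never live,
  // so avoid the per-block work (and its memory) entirely.
  if (!reg_ever_live(fn, regno))
    return {false, 0};

  // Placeholder for the real analysis, which would do work for every block.
  std::size_t visited = 0;
  for (const BasicBlock &bb : fn.blocks) {
    (void)bb;
    ++visited;
  }
  assert(visited == fn.blocks.size());
  return {true, visited};
}
```

With a Function whose blocks never mention the register, run_hardreg_pre
returns immediately with ran == false; as soon as one block mentions it,
the full per-block walk happens.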

[Bug target/121095] [15 Regression] Possibly unnecessary PRE pass on aarch64 for fpmr

2025-07-18 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121095

Segher Boessenkool  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |segher at gcc dot gnu.org

--- Comment #8 from Segher Boessenkool  ---
Hi!

(In reply to lucier from comment #2)
> My take, without having any serious knowledge of what's going on, is:
> 
> If a C function doesn't involve (set, read, manipulate, ...) FP8 values or
> the fpmr register in any way, then this PRE pass shouldn't be run, because
> this PRE pass only deals with the fpmr register, and I don't see how any
> relevant code can be affected by, or can affect, the fpmr register if no FP8
> manipulations are involved in a routine.

Yup.  Andrew's patch does pretty much this, right?

> And I suggest that this should be true even if the architecture supports FP8
> arithmetic in general.
> 
> I am ignorant of the details of what's going on here, so my expectation may
> very well be incorrect.
> 
> This PRE pass wasn't run on this example function because more than 128MB of
> memory would have been needed, but when building the Gambit Scheme system
> there are many relatively large routines where this PRE pass is run.

The hardreg PRE pass makes a lot of garbage, just because of how it works
(nested loops, a loop per reg, it could be coded a bit smarter).

> In another large file where this PRE pass was run, the dump file contained:
> 
> PRE GCSE of ___H___num, 31567 basic blocks, 2524104 bytes needed, 0 substs,
> 0 insns created
> 
> Does "0 substs, 0 insns created" mean that running the pass had no effect on
> the code?

Pretty much, yes.
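
To find such no-op runs across many dump files, a summary line like the one
quoted above can be parsed mechanically. The following is a minimal sketch,
assuming the summary appears on a single line in the dump (it is only
wrapped here by the mail quoting) and has exactly the shape shown in the
quote. PreSummary, parse_pre_summary and pass_was_noop are hypothetical
helper names, not part of GCC.

```cpp
// Parse a "PRE GCSE of ..." summary line and report whether the pass
// changed anything. The line format is assumed from the quoted example.
#include <cstdio>
#include <string>

struct PreSummary {
  std::string function;          // name printed after "PRE GCSE of"
  unsigned long basic_blocks = 0;
  unsigned long bytes_needed = 0;
  unsigned long substs = 0;
  unsigned long insns_created = 0;
};

// Returns true and fills OUT if LINE matches the assumed format.
// OUT is left untouched on failure. The text after the last number
// is not verified.
bool parse_pre_summary(const std::string &line, PreSummary &out) {
  char name[256];
  PreSummary tmp;
  int n = std::sscanf(line.c_str(),
                      "PRE GCSE of %255[^,], %lu basic blocks, "
                      "%lu bytes needed, %lu substs, %lu insns created",
                      name, &tmp.basic_blocks, &tmp.bytes_needed,
                      &tmp.substs, &tmp.insns_created);
  if (n != 5)
    return false;
  tmp.function = name;
  out = tmp;
  return true;
}

// A run with no substitutions and no new insns left the code unchanged.
bool pass_was_noop(const PreSummary &s) {
  return s.substs == 0 && s.insns_created == 0;
}
```

Feeding each line of a dump file through parse_pre_summary and counting the
entries for which pass_was_noop is true gives a rough measure of how often
the pass ran without effect.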

[Bug target/121095] [15 Regression] Possibly unnecessary PRE pass on aarch64 for fpmr

2025-07-17 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121095

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[15/16 Regression] Possibly |[15 Regression] Possibly
                   |unnecessary PRE pass on     |unnecessary PRE pass on
                   |aarch64 for fpmr            |aarch64 for fpmr
      Known to work|                            |14.3.0, 16.0
      Known to fail|                            |15.1.0

--- Comment #7 from Andrew Pinski  ---
Fixed on the trunk. I am going to wait until after 15.2.0 is released to
backport it there. I want some soaking time on the trunk for the patch.