https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114860
Jakub Jelinek changed:
What|Removed |Added
Target Milestone|14.2|14.3
--- Comment #10 from Jakub Jelinek
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114860
--- Comment #9 from Tamar Christina ---
(In reply to prathamesh3492 from comment #8)
> Hi Tamar,
> Using -falign-loops=5 indeed brings back the performance.
> The adrp instruction has the same address (0x4ae784) by setting -falign-loops=5
> (which r
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114860
--- Comment #8 from prathamesh3492 at gcc dot gnu.org ---
Hi Tamar,
Using -falign-loops=5 indeed brings back the performance.
The adrp instruction has the same address (0x4ae784) by setting -falign-loops=5
(which reduces misalignment to 4) with/witho
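
For reference, a minimal sketch (not from the thread) of how the offset of an address within a power-of-two boundary can be checked. The helper name `misalignment` and the list of boundaries are assumptions for illustration; the snippet above does not say which boundary the "misalignment to 4" figure refers to.

```python
def misalignment(addr: int, boundary: int) -> int:
    """Return how many bytes addr lies past the previous multiple of boundary.

    Hypothetical helper for illustration; boundary must be a power of two.
    """
    if boundary <= 0 or boundary & (boundary - 1):
        raise ValueError("boundary must be a positive power of two")
    # For a power-of-two boundary, the low bits of the address are the offset.
    return addr & (boundary - 1)


# Address of the adrp instruction quoted in the comment above.
ADRP_ADDR = 0x4AE784

# The boundaries below are assumed candidates, not values taken from the thread.
for boundary in (8, 16, 32, 64):
    print(f"{ADRP_ADDR:#x} mod {boundary}: {misalignment(ADRP_ADDR, boundary)}")
```

An offset of 0 would mean the address sits exactly on that boundary.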
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114860
--- Comment #7 from Tamar Christina ---
Yeah, it's most likely an alignment issue, especially as there are no code
changes.
We run our benchmarking with different flags, so that may be why we don't see it.
The loop seems misaligned, you can try incr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114860
Richard Biener changed:
What|Removed |Added
Target Milestone|14.0|14.2
--- Comment #6 from Richard Biene
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114860
--- Comment #5 from Andrew Pinski ---
(In reply to prathamesh3492 from comment #4)
> To check for any
> possible icache misses I used L1I_CACHE_REFILL counter, and turns out that
> there are 64% more L1 icache misses for above adrp instruction w
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114860
--- Comment #4 from prathamesh3492 at gcc dot gnu.org ---
Hi Tamar,
Sorry for the late response.
perf profile for povray with LTO:
Compiled with 82d6d385f97 (commit before a2f4be3dae0):
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114860
--- Comment #3 from Tamar Christina ---
I cannot reproduce this even recompiling libc.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114860
Tamar Christina changed:
What|Removed |Added
CC||tnfchris at gcc dot gnu.org
--- Comme
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114860
--- Comment #1 from Tamar Christina ---
Hmm
I am unable to reproduce this with -O3 -flto -mcpu=neoverse-v2 on a
neoverse-v2 machine.
Is any other option required?
Also that code was new in gcc 14 and was partially reverted due to register
al
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114860
Richard Biener changed:
What|Removed |Added
Target Milestone|--- |14.0