[Bug middle-end/110757] [14 Regression] 7% parest regression on zen3 -Ofast -march=native -flto between g:4dbb3af1efe55174 (2023-07-14 00:54) and g:a5088dc3f5ef73c8 (2023-07-17 03:24)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110757

Jeffrey A. Law changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P2
                 CC|                            |law at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110757

--- Comment #4 from Jan Hubicka ---
Most of the profile-based regression is gone between g:1c6231c05bdccab3 (2023-07-21 03:06) and g:f33fdf9e7c038639 (2023-07-23 00:17).

This should be:

commit a31ef26b056d0c4f0a9f08b6eb81456ea257298e
Author: Jan Hubicka
Date:   Fri Jul 21 19:38:26 2023 +0200

    Avoid scaling flat loop profiles of vectorized loops

which "fixes" the overactive scaling of scale_profile_for_vect_loop for static profiles.

I am still not sure why propagating the profile later causes a regression; I will take a look.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110757

Martin Jambor changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |lili.cui at intel dot com,
                   |                            |rguenth at gcc dot gnu.org
   Last reconfirmed|                            |2023-07-22
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1

--- Comment #3 from Martin Jambor ---
And while I am at it, the 2.5% slowdown in April was caused by Richi's r14-332-g24905a4bd1375c (Adjust costing of emulated vectorized gather/scatter), and the 2.8% regression in May was caused by r14-1371-ge5405f065bace0 (Handle FMA friendly in reassoc pass). Both are small and so may not warrant their own bug reports, but together they make up almost 6%, and we are now 13% slower than GCC 13 on zen 3 and zen 2 (on the Intel machine in LNT the slowdown is just 2.7%, and I see no regression on the AArch64 one).
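
For readers who want to check how the individual slowdowns add up, here is a minimal sketch. It assumes each percentage is a run-time increase relative to the previous state, so the changes combine multiplicatively; that is an assumption about how the figures relate, not something stated in the report. The function name `compound_slowdown_percent` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Combine per-change slowdowns (given in percent) multiplicatively.
 * Assumption: each entry is a run-time increase relative to the state
 * just before that change, so the factors (1 + p/100) multiply. */
double compound_slowdown_percent(const double *pct, size_t n)
{
    double factor = 1.0;

    assert(pct != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        factor *= 1.0 + pct[i] / 100.0;

    /* Convert the combined factor back to a percentage. */
    return (factor - 1.0) * 100.0;
}
```

Calling it with `{2.5, 2.8, 7.0}` (the April and May regressions plus the 7% from the bug title) gives a total a little under 13%, in line with the figure quoted above.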
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110757

--- Comment #2 from Martin Jambor ---
The second slow-down of 4.5% was caused by r14-2546-g061f74c06735e1:

061f74c06735e1fa35b910ae0bcf01b61a74ec23 is the first bad commit
commit 061f74c06735e1fa35b910ae0bcf01b61a74ec23
Author: Jan Hubicka
Date:   Sun Jul 16 23:56:59 2023 +0200

    Fix profile update in scale_profile_for_vect_loop

    When vectorizing 4 times, we sometimes do

      for
        <4x vectorized body>
      for
        <2x vectorized body>
      for
        <1x vectorized body>

    Here the second two fors, which handle the epilogue, never iterate. Currently the vectorizer thinks that the middle for iterates twice. This turns out to be scale_profile_for_vect_loop, which uses niter_for_unrolled_loop.
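
The loop shape described in that commit message can be imitated in plain scalar C. This is only an illustrative sketch, not what the vectorizer emits: it reads "never iterate" as "the epilogue bodies run at most once", and the names `trip_counts` and `sum_vectorized_shape` are hypothetical. The counters make it visible that after a 4x main loop at most three elements remain, so neither epilogue loop can run its body twice.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical counters recording how often each loop body runs. */
struct trip_counts {
    size_t main4; /* executions of the 4x body */
    size_t epi2;  /* executions of the 2x epilogue body */
    size_t epi1;  /* executions of the 1x epilogue body */
};

/* Sum n ints using the three-loop shape from the commit message. */
long sum_vectorized_shape(const int *a, size_t n, struct trip_counts *tc)
{
    long sum = 0;
    size_t i = 0;

    assert(a != NULL || n == 0);
    assert(tc != NULL);
    tc->main4 = tc->epi2 = tc->epi1 = 0;

    /* Main loop: consumes 4 elements per iteration. */
    for (; i + 4 <= n; i += 4) {
        sum += (long)a[i] + a[i + 1] + a[i + 2] + a[i + 3];
        tc->main4++;
    }
    /* First epilogue: at most 3 elements remain, so the body runs 0 or 1 times. */
    for (; i + 2 <= n; i += 2) {
        sum += (long)a[i] + a[i + 1];
        tc->epi2++;
    }
    /* Second epilogue: at most 1 element remains, so the body runs 0 or 1 times. */
    for (; i < n; i++) {
        sum += a[i];
        tc->epi1++;
    }
    return sum;
}
```

A profile estimate that assigns two iterations to the middle loop would therefore overstate its weight for any input length.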
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110757

Martin Jambor changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jamborm at gcc dot gnu.org

--- Comment #1 from Martin Jambor ---
The first (2%) slowdown seems to be due to r14-2524-gaa6741ef2e0c31 (Turn TODO_rebuild_frequencies to a pass); I'm now bisecting the bigger one.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110757

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
            Summary|7% parest regression on     |[14 Regression] 7% parest
                   |zen3 -Ofast -march=native   |regression on zen3 -Ofast
                   |-flto between               |-march=native -flto between
                   |g:4dbb3af1efe55174          |g:4dbb3af1efe55174
                   |(2023-07-14 00:54) and      |(2023-07-14 00:54) and
                   |g:a5088dc3f5ef73c8          |g:a5088dc3f5ef73c8
                   |(2023-07-17 03:24)          |(2023-07-17 03:24)
   Target Milestone|---                         |14.0
                 CC|                            |pinskia at gcc dot gnu.org
            Version|13.1.0                      |14.0