https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116083
--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
The last change gets us to

 tree DSE                           :   0.23 ( 16%)  2048  (  0%)
 tree slp vectorization             :   0.65 ( 47%)  2494k ( 15%)

and there's another pending improvement.

The quadratic behavior with respect to SLP discovery depth and number of lanes has been present forever. The difference is that the 13 branch succeeds with full-lane SLP discovery by immediately building the store fed from scalars: the call makes the first lane mismatch, which indicates a fatal failure, because the 13 branch still has the "bug" of treating CFN_LAST as internal_fn_p. GCC 14 corrects this mistake to support OpenMP SIMD and other vectorizable calls.

A more meaningful testcase would still run into the quadratic behavior even with older GCC.