[Bug lto/65950] Loop is not vectorized with lto.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65950

Gian-Carlo Pascutto changed:

  What    |Removed |Added
  CC      |        |gcp at sjeng dot org

--- Comment #6 from Gian-Carlo Pascutto ---
I'm seeing similar behavior in gcc-4.9.2 (Debian stable) and gcc-5.3.1 (Ubuntu 16.04 prerelease). Adding -flto causes certain loops not to be vectorized; adding -fno-lto to the affected files fixes it. Unfortunately, producing a reduced testcase is not easy, but the program source is available to GCC developers on request.
[Bug lto/65950] Loop is not vectorized with lto.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65950

Andrew Pinski changed:

  What             |Removed     |Added
  Keywords         |            |missed-optimization
  Status           |UNCONFIRMED |NEW
  Last reconfirmed |            |2015-12-24
  Ever confirmed   |0           |1

--- Comment #5 from Andrew Pinski ---
Confirmed.

Right before the vectorizer:

Without LTO:

  :
  # k_36 = PHI
  # ivtmp_88 = PHI <16(5), ivtmp_40(7)>
  _11 = (long unsigned int) k_36;
  _12 = _11 * 4;
  _14 = v_13(D) + _12;
  _15 = *_14;
  _16 = k_36 + -1;
  _17 = (float) _16;
  _23 = pretmp_83 + _12;
  _24 = *_23;
  _25 = _17 * _24;
  _26 = _15 + _25;
  *_14 = _26;
  k_28 = k_36 + 1;
  ivtmp_40 = ivtmp_88 - 1;
  if (ivtmp_40 != 0)
    goto ;
  else
    goto ;

  :
  goto ;

With LTO:

  :
  _11 = (long unsigned int) k_2;
  _12 = _11 * 4;
  _14 = v_13(D) + _12;
  _15 = *_14;
  _16 = k_2 + -1;
  _17 = (float) _16;
  _22 = *_21;
  _23 = _22 + _12;
  _24 = *_23;
  _25 = _17 * _24;
  _26 = _15 + _25;
  *_14 = _26;
  k_28 = k_2 + 1;

  :
  # k_2 = PHI
  # ivtmp_36 = PHI <17(3), ivtmp_35(4)>
  _10 = K_9 + 16;
  ivtmp_35 = ivtmp_36 - 1;
  if (ivtmp_35 != 0)
    goto ;
  else
    goto ;

Notice the difference in the CFG.
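For readers who do not have the attachment at hand, the loop body in the dumps above corresponds roughly to the C below. This is only a sketch inferred from the GIMPLE, not the attached test case: the names update, wp and N are hypothetical, and the extra pointer indirection is an assumption made to mirror the `_22 = *_21` load that stays inside the loop in the LTO dump.

```c
#define N 16  /* trip count, taken from the ivtmp initial value in the non-LTO dump */

/*
 * Hypothetical reconstruction of the loop:
 *   v[k] += (float)(k - 1) * w[k]   for N consecutive values of k,
 * where w is reached through a pointer stored in memory (*wp).
 * In the non-LTO dump the pointer load is hoisted out of the loop
 * (pretmp_83); in the LTO dump it is reloaded on every iteration.
 */
void update(float *v, float **wp, int K)
{
    for (int k = K; k < K + N; k++)
        v[k] += (float)(k - 1) * (*wp)[k];
}
```

Compiling a file like this with and without -flto (plus the -Ofast and -fopenmp options named in comment #1) and adding -fopt-info-vec should show whether the loop is reported as vectorized, although this sketch is not guaranteed to reproduce the problem.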
[Bug lto/65950] Loop is not vectorized with lto.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65950

--- Comment #4 from Yuri Rumyantsev ysrumyan at gmail dot com ---
The function containing the given loop is marked as:

foo/24 (foo) @0x7f39f4b84620
  Type: function definition analyzed
  Visibility: prevailing_def_ironly
  References:
  Referring:
  Read from file: /tmp/ccKAP5Mo.o
  Availability: local
  First run: 0
  Function flags: local nonfreeing_fn unlikely_executed
  Called by: main/23
[Bug lto/65950] Loop is not vectorized with lto.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65950 --- Comment #3 from Richard Biener rguenth at gcc dot gnu.org --- Do we eventually think the loop is cold?
[Bug lto/65950] Loop is not vectorized with lto.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65950 --- Comment #1 from Yuri Rumyantsev ysrumyan at gmail dot com --- Created attachment 35432 -- https://gcc.gnu.org/bugzilla/attachment.cgi?id=35432&action=edit Test case to reproduce. Must be compiled with the -Ofast and -fopenmp options.
[Bug lto/65950] Loop is not vectorized with lto.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65950 --- Comment #2 from Andrew Pinski pinskia at gcc dot gnu.org --- This vectorized for me with GCC 5.1.0 on aarch64-linux-gnu.