[Bug target/98302] [11 Regression] Wrong code on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98302

--- Comment #13 from CVS Commits ---
The master branch has been updated by Richard Sandiford:

https://gcc.gnu.org/g:58a12b0eadac62e691fcf7325ab2bc2c93d46b61

commit r11-6381-g58a12b0eadac62e691fcf7325ab2bc2c93d46b61
Author: Richard Sandiford
Date:   Thu Dec 31 16:51:34 2020 +

    vect: Avoid generating out-of-range shifts [PR98302]

    In this testcase we end up with:

      unsigned long long x = ...;
      char y = (char) (x << 37);

    The overwidening pattern realised that only the low 8 bits
    of x << 37 are needed, but then tried to turn that into:

      unsigned long long x = ...;
      char y = (char) x << 37;

    which gives an out-of-range shift.  In this case y can simply
    be replaced by zero, but as the comment in the patch says,
    it's kind-of awkward to do that in the middle of vectorisation.

    Most of the overwidening stuff is about keeping operations
    as narrow as possible, which is important for vectorisation
    but could be counter-productive for scalars (especially on
    RISC targets).  In contrast, optimising y to zero in the above
    feels like an independent optimisation that would benefit scalar
    code and that should happen before vectorisation.

    gcc/
            PR tree-optimization/98302
            * tree-vect-patterns.c (vect_determine_precisions_from_users):
            Make sure that the precision remains greater than the shift
            count.

    gcc/testsuite/
            PR tree-optimization/98302
            * gcc.dg/vect/pr98302.c: New test.
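The ChangeLog entry above ("make sure that the precision remains greater than the shift count") can be pictured with a small standalone C sketch. This is not the code from tree-vect-patterns.c: the helper name `shift_precision` and its parameters are hypothetical and only illustrate the constraint, assuming a left shift by a constant amount.

```c
/* Hypothetical helper, not taken from GCC: choose the width (in bits)
   at which to perform a left shift by a constant amount.  needed_bits
   is the number of low result bits that the users of the shift
   require.  */
static unsigned int shift_precision(unsigned int needed_bits,
                                    unsigned int shift_count)
{
    /* A shift is only in range when the count is less than the operand
       width, so the width must be at least shift_count + 1.  */
    unsigned int min_width = shift_count + 1;
    return needed_bits > min_width ? needed_bits : min_width;
}
```

For the values in the testcase, `shift_precision(8, 37)` gives 38, so the shift is no longer narrowed to 8 bits. How such a width is then mapped to an actual vector element type is outside the scope of this sketch.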
[Bug target/98302] [11 Regression] Wrong code on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98302

rsandifo at gcc dot gnu.org changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
             Status|NEW                           |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org |rsandifo at gcc dot gnu.org

--- Comment #12 from rsandifo at gcc dot gnu.org ---
This is caused by the overwidening pattern recognisers.
They correctly realise that in:

  unsigned long long x = ...;
  char y = x << 37;

only the low 8 bits of x << 37 are needed.  But they then
take it too far and try to do an 8-bit shift by 37, which is
undefined in gimple.

The optimal fix would be to get the vectoriser to replace the
shift result with zero instead, but that's a bit awkward to do
and should really happen before vectorisation.  The simplest fix
is to make sure that we don't narrow further than the shift
amount allows.

Testing a patch.
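As a minimal illustration of why the shift result could in principle be replaced with zero: a left shift by 37 at full 64-bit width clears bits 0 to 36, so the low 8 bits are zero for every input. The helper name `low8_of_shl37` below is hypothetical and does not appear in the bug or the patch.

```c
#include <stdint.h>

/* Hypothetical helper: the low 8 bits of x << 37, computed at full
   64-bit width, where a shift by 37 is well defined.  */
static unsigned char low8_of_shl37(uint64_t x)
{
    return (unsigned char)(x << 37);
}
```

The shift has to be done before the truncation; shifting an 8-bit value by 37 is the out-of-range operation described in the comment above.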
[Bug target/98302] [11 Regression] Wrong code on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98302

--- Comment #11 from Martin Liška ---
> which is miscompiled at -O2 -ftree-vectorize or -O3.

What a great reduction! Can you please share how you achieved that?
[Bug target/98302] [11 Regression] Wrong code on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98302

--- Comment #10 from Alex Coplan ---
Reduced to:

int c = 1705;
char a;
long f = 50887638;
unsigned long long *h(unsigned long long *k, unsigned long long *l) {
  return *k ? k : l;
}
void aa() {}
int main() {
  long d = f;
  for (char g = 0; g < (char)c - 10; g += 2) {
    unsigned long long i = d, j = 4;
    a = *h(&i, &j) << ((d ? 169392992 : 0) - 169392955LL);
  }
  if (a)
    __builtin_abort();
}

which is miscompiled at -O2 -ftree-vectorize or -O3.
[Bug target/98302] [11 Regression] Wrong code on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98302

--- Comment #9 from Martin Liška ---
Btw, how powerful a machine do you use for reduction? What's the wall time of
the interestingness test you're going to use?
[Bug target/98302] [11 Regression] Wrong code on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98302

Martin Liška changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2020-12-16
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW

--- Comment #8 from Martin Liška ---
(In reply to Alex Coplan from comment #7)
> Thanks, I can reproduce it now.

Great. I tried to reduce that with:

gcc10: -Werror -fsanitize=address,undefined -fno-sanitize-recover=all - OK
gcc11: -O2 - OK
gcc11: -O3 - Assert happens
[Bug target/98302] [11 Regression] Wrong code on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98302 --- Comment #7 from Alex Coplan --- Thanks, I can reproduce it now.
[Bug target/98302] [11 Regression] Wrong code on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98302

--- Comment #6 from Martin Liška ---
Created attachment 49777
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49777&action=edit
Packed yarpgen test-case
[Bug target/98302] [11 Regression] Wrong code on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98302

--- Comment #5 from Alex Coplan ---
Can't repro with that seed (at least on aarch64-elf-gcc). I expect we're seeing
different source files. I see:

$ md5sum src/*
72fdf911a2c5f9cc21df5af3ffb4726e  src/driver.cpp
b8fdebf50f579fa5d7c93de5d42ae217  src/func.cpp
ce2afb8e50893400329f5fd1d3015caf  src/init.h
$ yarpgen --version
yarpgen version 2.0 (build 9cb35d3 on 2020:10:30)

I guess we need (at least) matching (yarpgen commit, seed) to end up with the
same source files? Might be easiest to just post the cpp files if possible.
[Bug target/98302] [11 Regression] Wrong code on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98302

--- Comment #4 from Martin Liška ---
(In reply to Alex Coplan from comment #3)
> Hi Martin, can you post the yarpgen seed (or the original cpp files) and I
> will have a go at reproducing/reducing?

Sure, it's 202213617. Thank you for your help.
[Bug target/98302] [11 Regression] Wrong code on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98302

Alex Coplan changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |acoplan at gcc dot gnu.org

--- Comment #3 from Alex Coplan ---
Hi Martin, can you post the yarpgen seed (or the original cpp files) and I will
have a go at reproducing/reducing?
[Bug target/98302] [11 Regression] Wrong code on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98302

--- Comment #2 from Martin Liška ---
Created attachment 49772
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49772&action=edit
test case 2
[Bug target/98302] [11 Regression] Wrong code on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98302

--- Comment #1 from Martin Liška ---
Created attachment 49771
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49771&action=edit
test-case 1