https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113965
Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|tree-optimization           |testsuite
             Blocks|53947                       |
   Target Milestone|14.0                        |---
            Summary|[14 Regression]             |gcc.target/aarch64/sve/mask
                   |gcc.target/aarch64/sve/mask |_struct_load_3_run.c still
                   |_struct_load_3_run.c still  |fails with qemu due to
                   |fails                       |_Float16 rounding error

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
I think this might be a rounding issue with qemu and _Float16.

_Float16 has 10 stored significand bits (11 bits of precision counting the
implicit leading bit), which means it can represent every integer in
0 - 2047 exactly; outside of that range it needs to drop some low bits.

In the testcase, the first failure is:
i: 40 mask: 1 out[i]: 2906.000000 if_true: 2904.000000 if_false: 140.000000 expected[i]: 2904.000000

As you can see, there is a one-bit difference at the end: out[i] is 2906
where 2904 was expected.

I have not looked into why we get the correct value when SVE is not used,
though. But I can say for sure that the vectorizer is doing the correct
thing: using "unsigned short" instead produces the exact same .optimized
output for the function (test_f16_f16_i8_4).

So I am removing the regression markers and moving this to testsuite, as I
think this is not an issue with the code generation.

Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations
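
The precision argument in comment #2 can be checked numerically. The sketch below is only an illustration, not part of the testcase: it assumes Python's struct "e" format (IEEE 754 binary16) behaves like _Float16 for these values, and the helper names to_f16 and f16_spacing are hypothetical. It does not reproduce the qemu or SVE behaviour; it only shows that 2904 and 2906 are adjacent half-precision values.

```python
import math
import struct


def to_f16(x: float) -> float:
    """Round x to the nearest IEEE 754 binary16 value and widen it back."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def f16_spacing(x: float) -> float:
    """Gap between adjacent binary16 values near a normal, nonzero x.

    binary16 carries 11 significant bits, so for abs(x) in
    [2**(e-1), 2**e) the spacing is 2**(e - 11).
    """
    _, e = math.frexp(abs(x))  # abs(x) = m * 2**e with 0.5 <= m < 1
    return math.ldexp(1.0, e - 11)


if __name__ == "__main__":
    # Integers up to 2047 survive the round trip unchanged.
    print(to_f16(2047.0) == 2047.0)
    # Odd integers above 2048 are not representable.
    print(to_f16(2049.0) == 2049.0)
    # Both values from the failing iteration are representable ...
    print(to_f16(2904.0), to_f16(2906.0))
    # ... and they differ by exactly one unit in the last place.
    print((2906.0 - 2904.0) / f16_spacing(2904.0))
```

Since the two values are one step apart in half precision, a difference in how one intermediate result is rounded is enough to produce 2906 instead of 2904.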