https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113965

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|tree-optimization           |testsuite
             Blocks|53947                       |
   Target Milestone|14.0                        |---
            Summary|[14 Regression]             |gcc.target/aarch64/sve/mask
                   |gcc.target/aarch64/sve/mask |_struct_load_3_run.c still
                   |_struct_load_3_run.c still  |fails with qemu due to
                   |fails                       |_Float16 rounding error

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
I think this might be a rounding issue with qemu and _Float16.
_Float16 has 11 bits of significand precision (10 stored bits plus the
implicit leading bit), which means it can represent every integer from 0 to
2048 exactly; beyond that it has to drop some low-order bits.
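
The precision limit described above can be checked without any AArch64
hardware. The following is only a rough sketch, not part of the testcase: it
assumes Python's struct format "e" (IEEE 754 binary16) is an acceptable
stand-in for _Float16, and the helper name to_half is made up for this
illustration.

```python
import struct


def to_half(x: float) -> float:
    """Round x to IEEE 754 binary16 and convert the result back to a float."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


# Every integer that survives the round trip unchanged is exactly representable.
exact_up_to_2048 = all(to_half(float(n)) == float(n) for n in range(2049))

# The first non-negative integer that binary16 cannot hold exactly.
first_inexact = next(n for n in range(4096) if to_half(float(n)) != float(n))

print(exact_up_to_2048, first_inexact)
```

Between 2048 and 4096 only even integers are representable, so the values
reported by the failing test sit in the range where adjacent _Float16 values
are 2 apart.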

In the testcase the first time where we get a failure is:
i: 40
mask: 1
out[i]: 2906.000000
if_true: 2904.000000
if_false: 140.000000
expected[i]: 2904.000000

As you can see, there is a one-bit difference at the end. I have not looked
into why we get the correct value when not using SVE, though.
But I can say for sure that the vectorizer is doing the correct thing: using
"unsigned short" instead produces the exact same .optimized dump for the
function (test_f16_f16_i8_4).
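
To make the "one bit" observation concrete, here is a small sketch that
compares the binary16 encodings of the expected and actual values from the
failure above. It again assumes Python's struct format "e" matches the
_Float16 encoding; half_bits is a hypothetical helper, not something from the
testsuite.

```python
import struct


def half_bits(x: float) -> int:
    """Return the 16-bit IEEE 754 binary16 encoding of x as an integer."""
    return struct.unpack("<H", struct.pack("<e", x))[0]


expected = 2904.0  # expected[i] / if_true from the failing iteration
actual = 2906.0    # out[i] from the failing iteration

# For positive finite values, adjacent encodings differ by exactly one.
ulp_distance = half_bits(actual) - half_bits(expected)

print(hex(half_bits(expected)), hex(half_bits(actual)), ulp_distance)
```

A distance of one in the encoding means the two results are neighbouring
_Float16 values, which is what a difference in a single rounding step would
produce.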

So I am removing the regression markers and moving this to testsuite, as I
think this is not an issue with the code generation.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations
