https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111156

--- Comment #9 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Note that I think GCC should be able to vectorize this loop, but the vectorization goes wrong.


With SVE, the 7 part (the comparison against 7) gets lost:

```
  vect__3.12_54 = .MASK_LOAD (_48, 16B, loop_mask_52);
  _32 = cond_17(D) + POLY_INT_CST [16, 16];
  _25 = &MEM <vector([8,8]) short unsigned int> [(uint16_t *)_32 + ivtmp_77 * 2];
  vect__3.13_56 = .MASK_LOAD (_25, 16B, loop_mask_53);
  _1 = &MEM <vector([16,16]) signed char> [(int8_t *)src_18(D) + ivtmp_77 * 1];
  vect_pretmp_29.16_60 = .MASK_LOAD (_1, 8B, loop_mask_59);
  mask__14.19_66 = vect__3.12_54 > { 2, ... };
  mask__14.19_67 = vect__3.13_56 > { 2, ... };
  mask_patt_4.20_68 = VEC_PACK_TRUNC_EXPR <mask__14.19_66, mask__14.19_67>;
  vect_array.23 ={v} {CLOBBER};
  vect_array.23[0] = vect_pretmp_29.16_60;
  vect_array.23[1] = vect_pretmp_29.16_60;
  vec_mask_and_74 = loop_mask_59 & mask_patt_4.20_68;
  _2 = ivtmp_77 * 2;
  _3 = &MEM <vector([16,16]) signed char[2]> [(int8_t *)dest_19(D) + _2 * 1];
```


But RISC-V is able to vectorize it correctly:
```
  vect__3.12_52 = .MASK_LEN_LOAD (vectp_cond.10_13, 16B, { -1, ... }, _72, 0);
  vect_pretmp_29.15_56 = .MASK_LEN_LOAD (vectp_src.13_54, 8B, { -1, ... }, _72, 0);
  mask__27.16_58 = vect__3.12_52 <= { 7, ... };
  .MASK_LEN_SCATTER_STORE (vectp_dest.17_60, { 0, 2, 4, ... }, 1, vect_pretmp_29.15_56, mask__27.16_58, _72, 0);
  mask__14.19_64 = vect__3.12_52 > { 2, ... };
  .MASK_LEN_SCATTER_STORE (vectp_dest.20_67, { 0, 2, 4, ... }, 1, vect_pretmp_29.15_56, mask__14.19_64, _72, 0);
```

RISC-V does this by using two masked scatter stores here, one per condition.
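For readers without the test case at hand, here is a rough C sketch of what the RISC-V dump appears to compute. The original loop is not quoted in this comment, so `scalar_ref` is only inferred from the masks (`<= 7` and `> 2`) and the stride-2 offsets `{ 0, 2, 4, ... }`. Both function names are hypothetical, and the assumption that the second store is based at `dest + 1` is a guess, since the dump does not show how `vectp_dest.20` is set up.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar loop, inferred from the dump (not quoted from the PR).
   Parameter names follow the SSA names dest, src and cond.  */
void
scalar_ref (int8_t *restrict dest, const int8_t *restrict src,
	    const uint16_t *restrict cond, size_t n)
{
  for (size_t i = 0; i < n; ++i)
    {
      if (cond[i] <= 7)
	dest[i * 2] = src[i];
      if (cond[i] > 2)
	dest[i * 2 + 1] = src[i];
    }
}

/* Scalar model of the two-scatter form: each pass stands in for one
   masked scatter store with element offsets 0, 2, 4, ...  */
void
two_scatter_model (int8_t *restrict dest, const int8_t *restrict src,
		   const uint16_t *restrict cond, size_t n)
{
  int8_t *base0 = dest;      /* stands in for vectp_dest.17 */
  int8_t *base1 = dest + 1;  /* assumed base for vectp_dest.20 */

  /* First scatter store, mask: cond <= 7.  */
  for (size_t i = 0; i < n; ++i)
    if (cond[i] <= 7)
      base0[i * 2] = src[i];

  /* Second scatter store, mask: cond > 2.  */
  for (size_t i = 0; i < n; ++i)
    if (cond[i] > 2)
      base1[i * 2] = src[i];
}
```

Under these assumptions the two passes write disjoint elements (even and odd), so splitting the loop into two scatter stores leaves the same bytes in `dest` as the fused scalar loop. The point for the SVE case is that both masks must survive, since elements with `cond` between 3 and 7 are stored by both conditions.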
