https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63594

--- Comment #4 from Kirill Yukhin <kyukhin at gcc dot gnu.org> ---
(In reply to Jakub Jelinek from comment #3)
> Created attachment 33761 [details]
> WIP patch for discussions
> 
> From what I see, if TARGET_AVX512BW is not defined, then we obviously can't
> use
> ix86_vector_duplicate_value, but need two instructions (either it can be
> QI->V32QI / HI->V16HI broadcast followed by concat of the two parts, or
> QI->V16QI / HI->V8HI broadcast followed by concat of the 4 parts together).
> But, it seems even for -mavx2 or -mavx we actually generate terrible code,
> for -mavx2 there is no point in using 2 instructions when in theory
> vpbroadcast{b,w} should handle it alone just fine.
Right!

> The patch enables all of that, but unfortunately we generate perhaps not so
> good code with it, e.g. for -mavx2 in testchar32, we spill the argument
> always to memory, and then broadcast it from memory, even when vmovd +
> broadcast from register could have been used.
> And in testchar16, for some reason we spill into memory, and broadcast from
> vmovd result (so the spill is totally useless).
I think this is because of subreg:QI of reg:SI.
Before reload we have (for testchar32):
(insn 2 5 3 2 (set (reg:SI 86 [ c ])
        (reg:SI 5 di [ c ])) 1.c:22 90 {*movsi_internal}
     (expr_list:REG_DEAD (reg:SI 5 di [ c ])
        (nil)))
(insn 7 4 12 2 (set (reg:V32QI 88 [ v ])
        (vec_duplicate:V32QI (subreg:QI (reg:SI 86 [ c ]) 0))) 1.c:22 4112
{vec_dupv32qi}
     (expr_list:REG_DEAD (reg:SI 86 [ c ])
        (nil)))

After reload we need to get rid off subreg:
(insn 2 5 3 2 (set (mem/c:SI (plus:DI (reg/f:DI 6 bp)
                (const_int -20 [0xffffffffffffffec])) [8 %sfp+-4 S4 A32])
        (reg:SI 5 di [ c ])) 1.c:22 90 {*movsi_internal}
     (nil))
(insn 7 4 12 2 (set (reg:V32QI 21 xmm0 [orig:88 v ] [88])
        (vec_duplicate:V32QI (mem/c:QI (plus:DI (reg/f:DI 6 bp)
                    (const_int -20 [0xffffffffffffffec])) [8 %sfp+-4 S1 A32])))
1.c:22 4112 {vec_dupv32qi}
     (nil))

> Uros/Kyrill, any thoughts on this?
I like the patch.

Reply via email to