On Tue, Sep 27, 2022 at 10:46 AM Robin Dapp via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> > I did bootstrapping and ran the testsuite on x86(-64), aarch64, Power9
> > and s390.  Everything looks good except two additional fails on x86
> > where code actually looks worse.
> >
> > gcc.target/i386/keylocker-encodekey128.c
> >
> > 17c17,18
> > <       movaps  %xmm4, k2(%rip)
> > ---
> >>       pxor    %xmm0, %xmm0
> >>       movaps  %xmm0, k2(%rip)
> >
> > gcc.target/i386/keylocker-encodekey256.c:
> >
> > 19c19,20
> > <       movaps  %xmm4, k3(%rip)
> > ---
> >>       pxor    %xmm0, %xmm0
> >>       movaps  %xmm0, k3(%rip)
>
> Before the patch and after postreload we have:
>
> (insn (set (reg:V2DI xmm0)
>         (reg:V2DI xmm4))
>      (expr_list:REG_DEAD (reg:V2DI 24 xmm4)
>         (expr_list:REG_EQUIV (const_vector:V2DI [
>                     (const_int 0 [0]) repeated x2
>                 ])))))
> (insn (set (mem/c:V2DI (symbol_ref:DI ("k2"))
>         (reg:V2DI xmm0))))
>
> which is converted by cprop_hardreg to:
>
> (insn (set (mem/c:V2DI (symbol_ref:DI ("k2")))
>         (reg:V2DI xmm4))))
>
> With the change there is:
>
> (insn (set (reg:V2DI xmm0)
>         (const_vector:V2DI [
>                 (const_int 0 [0]) repeated x2
>             ])))
> (insn (set (mem/c:V2DI (symbol_ref:DI ("k2")))
>         (reg:V2DI xmm0))))
>
> which is not simplified further because xmm0 needs to be explicitly
> zeroed while xmm4 is assumed to be zeroed by encodekey128.  I'm not
> familiar with this so I'm supposing this is correct even though I found
> "XMM4 through XMM6 are reserved for future usages and software should
> not rely upon them being zeroed." online.

I opened:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107061

> Even if xmm4 were zeroed explicitly, I guess in this case the simple
> costing of mov reg,reg vs mov reg,imm (with the latter not being more
> expensive) falls short?  cprop_hardreg can actually propagate the zeroed
> xmm4 into the next move.
> The same mechanism could possibly even elide many such moves which would
> mean we'd unnecessarily emit many mov reg,0?  Hmm...

This sounds like an issue.

-- 
H.J.
