https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90883

--- Comment #7 from rguenther at suse dot de <rguenther at suse dot de> ---
On Tue, 18 Jun 2019, law at redhat dot com wrote:

> slow ()
> {
>   struct C D.25898;
>   struct C D.29462;
> 
> ;;   basic block 2, loop depth 0, count 1073741824 (estimated locally), maybe hot
> ;;    prev block 0, next block 1, flags: (NEW, REACHABLE, VISITED)
> ;;    pred:       ENTRY [always]  count:1073741824 (estimated locally) (FALLTHRU,EXECUTABLE)
>   D.25898.a = {};
>   D.29462 = D.25898;
>   D.25898 ={v} {CLOBBER};
>   return D.29462;
> ;;    succ:       EXIT [always]  count:1073741824 (estimated locally)
> 
> }
> 
> Which still isn't sufficient to get good code.
> 
> I'm not really sure what you want DSE to do here Richi :-)

I observed that in

  D.26322 = {};
  D.26322.a = {};

the later store looks dead (a C testcase showing the actual layout
would be nice here).  Of course DSE doesn't work this way around, but
trimming might be able to trim the second store instead of the first
(to nothing)?  I also noticed that in

  MEM[(struct C *)&D.26322 + 7B] = {};
  D.26322.a = {};

the first store is at offset 7, which will result in unaligned and/or
small stores.  DSE doesn't seem to exploit the fact that we do not
need to preserve the stores into the padding (in fact we do not
expand that way, I think).
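
A minimal C sketch of the kind of layout being discussed.  The
definition of struct C is an assumption, chosen only because it is
consistent with the store at offset 7 in the dump (a 7-byte member a
followed by an int); the actual testcase in the PR may differ.  The
helper names end_of_a and padding_after_a are hypothetical, and the
memset calls merely model the two GIMPLE stores.

```c
#include <stddef.h>
#include <string.h>

/* Assumed layout, not taken from the PR: 7 chars followed by an int. */
struct C {
    char a[7];
    int b;
};

/* Offset of the first byte past member a (7 for the layout above). */
static size_t end_of_a(void)
{
    return offsetof(struct C, a) + sizeof(((struct C *)0)->a);
}

/* Number of padding bytes the ABI inserts between a and b. */
static size_t padding_after_a(void)
{
    return offsetof(struct C, b) - end_of_a();
}

/* Models the two stores from the dump with explicit memset calls. */
struct C slow(void)
{
    struct C c;
    memset(&c, 0, sizeof c);     /* D.26322 = {};    whole object   */
    memset(c.a, 0, sizeof c.a);  /* D.26322.a = {};  already zeroed */
    return c;
}
```

On common ABIs where int is 4-byte aligned, b sits at offset 8, so
byte 7 is padding.  Trimming the head of the whole-object store then
leaves a store that starts on that padding byte, which is where the
"+ 7B" comes from.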

Given that -fno-tree-dse produces nearly optimal code
(well, RTL manages to clean up all the useless stuff), some of
the above might help.
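
To make the padding point concrete, here is a toy byte-liveness model
in C.  It is only an illustration and not GCC's DSE implementation:
the bytemask type and the functions range_mask, live_bytes,
first_live and live_count are hypothetical names, and the 12-byte
object with one padding byte at offset 7 is an assumed layout.

```c
#include <stdint.h>

/* Toy model: one bit per byte of an object of at most 32 bytes. */
typedef uint32_t bytemask;

/* Mask with bits [start, start + len) set; requires start + len <= 32
   and start < 32. */
static bytemask range_mask(unsigned start, unsigned len)
{
    bytemask m = (len >= 32) ? UINT32_MAX : ((UINT32_C(1) << len) - 1);
    return m << start;
}

/* Bytes of an earlier store that must still be written after a later
   store overwrites `killed`, ignoring any bytes marked as padding. */
static bytemask live_bytes(bytemask earlier, bytemask killed,
                           bytemask padding)
{
    return earlier & ~killed & ~padding;
}

/* Offset of the first live byte, or -1 if no byte is live. */
static int first_live(bytemask m)
{
    for (int i = 0; i < 32; i++)
        if (m & (UINT32_C(1) << i))
            return i;
    return -1;
}

/* Number of live bytes. */
static int live_count(bytemask m)
{
    int n = 0;
    for (; m != 0; m >>= 1)
        n += (int)(m & 1);
    return n;
}
```

With a 12-byte whole-object store and a later 7-byte store to a,
live_bytes(range_mask(0, 12), range_mask(0, 7), 0) leaves bytes 7
through 11, a 5-byte store starting at an odd offset.  Passing
range_mask(7, 1) as the padding mask leaves bytes 8 through 11, which
under the assumed layout is just the aligned 4-byte int.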
