https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121894

            Bug ID: 121894
           Summary: SRA (and DSE) vs. -ftrivial-auto-var-init=
           Product: gcc
           Version: 16.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: jakub at gcc dot gnu.org
  Target Milestone: ---

Consider
struct S { int a, b, c, d; };
void bar (int, int, int, int);

void
foo ()
{
  S s;
  s.a = 1;
  s.c = 2;
  s.d = 3;
  s.a++;
  s.c++;
  s.d++;
  bar (s.a, s.b, s.c, s.d);
}
with -O2 -fno-tree-fre -fno-tree-pre -ftrivial-auto-var-init=pattern
-Wuninitialized
With s/s.b/0/ SRA nicely optimizes it into
  s$a_21 = .DEFERRED_INIT (4, 1, &"s"[0]);
  s$c_22 = .DEFERRED_INIT (4, 1, &"s"[0]);
  s$d_23 = .DEFERRED_INIT (4, 1, &"s"[0]);
  s = .DEFERRED_INIT (16, 1, &"s"[0]);
  s$a_24 = 1;
  s$c_25 = 2;
  s$d_26 = 3;
  _1 = s$a_24;
  _2 = _1 + 1;
  s$a_27 = _2;
  _3 = s$c_25;
  _4 = _3 + 1;
  s$c_28 = _4;
  _5 = s$d_26;
  _6 = _5 + 1;
  s$d_29 = _6;
  _7 = s$d_29;
  _8 = s$c_28;
  _9 = s$a_27;
  bar (_9, 0, _8, _7);
during esra, and in the optimized dump that is just
  bar (2, 0, 3, 4); [tail call]
and nothing else.
But with s.b in there we correctly get the -Wuninitialized warning, yet SRA
doesn't decide to scalarize s.b, so we end up with
  s = .DEFERRED_INIT (16, 1, &"s"[0]);
  _1 = s.b;
  bar (2, _1, 3, 4); [tail call]
  s ={v} {CLOBBER(eos)};
So we uselessly clear or pattern-fill 16 bytes only to load 4 bytes from them.
Why can't it be handled like the others and turned into 
  s$b_41 = .DEFERRED_INIT (4, 1, &"s"[0]);
  bar (2, s$b_41, 3, 4);
?
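For illustration only, here is a hypothetical source-level sketch (not GCC's actual IL) of what full scalarization would correspond to: each field becomes its own scalar, and s.b is a single 4-byte pattern-filled scalar instead of forcing the whole-aggregate .DEFERRED_INIT. The 0xFE fill byte is an assumption about the pattern value, which is an implementation detail.

```cpp
#include <array>
#include <cassert>

// Stand-in for the per-byte fill of -ftrivial-auto-var-init=pattern
// (assumed value; the exact pattern is an implementation detail).
constexpr int kPattern = static_cast<int>(0xFEFEFEFEu);

// What foo() would look like with every field of `s` scalarized,
// including s.b, so no whole-struct .DEFERRED_INIT is needed.
std::array<int, 4> foo_scalarized()
{
  int s_a = kPattern, s_b = kPattern, s_c = kPattern, s_d = kPattern;
  s_a = 1;
  s_c = 2;
  s_d = 3;
  s_a++;
  s_c++;
  s_d++;
  // What bar () would receive: 2, <pattern>, 3, 4.
  return {s_a, s_b, s_c, s_d};
}
```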

I guess we should look at other optimizations too.  Say, for memset, tree DSE
at -O2 can optimize the following:
struct S { int a, b, c, d, e, f, g, h; };
void bar (S &);

void
foo (void)
{
  S s;
  __builtin_memset (&s, 0, sizeof (s));
  s.a = 1;
  s.b = 2;
  s.h = 3;
  bar (s);
}
Here DSE turns
  __builtin_memset (&s, 0, 32);
into
  __builtin_memset (&MEM <char> [(void *)&s + 8B], 0, 20);
but without the __builtin_memset call in the source and with -O2
-ftrivial-auto-var-init=zero,
  s = .DEFERRED_INIT (32, 2, &"s"[0]);
is not optimized into
  MEM <char> [(void *)&s + 8B] = .DEFERRED_INIT (20, 2, &"s"[0]);
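A quick byte-level check of the trim above (assuming 4-byte int and a padding-free struct S, so the first 8 bytes are s.a and s.b and the last 4 are s.h): those bytes of the 32-byte memset are overwritten before any read, so only the middle 20 bytes of zeroing are live.

```cpp
#include <cstring>

struct S { int a, b, c, d, e, f, g, h; };
static_assert(sizeof(S) == 32, "layout assumption: 4-byte int, no padding");

// Compares the original sequence against the DSE-trimmed one starting
// from identical garbage contents; they must produce identical bytes.
bool trimmed_memset_equivalent()
{
  S x, y;
  std::memset(&x, 0xAB, sizeof x);  // arbitrary garbage start
  std::memset(&y, 0xAB, sizeof y);
  // Original sequence.
  std::memset(&x, 0, sizeof x);
  x.a = 1; x.b = 2; x.h = 3;
  // DSE-trimmed sequence: zero only bytes [8, 28).
  std::memset(reinterpret_cast<char *>(&y) + 8, 0, 20);
  y.a = 1; y.b = 2; y.h = 3;
  return std::memcmp(&x, &y, sizeof x) == 0;
}
```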

Yet another thing: for -ftrivial-auto-var-init=zero, it would be nice, after
the late uninit pass, to change the remaining .DEFERRED_INIT calls into memset
or = {} and let some later pass remove the subsequent zero initializations of
parts of the region.
Say in
struct S { int a, b, c, d, e, f, g, h; };
void bar (S &);

void
foo (void)
{
  S s;
  s.c = 0;
  s.d = 0;
  s.f = 0;
  bar (s);
}
it is desirable for -Wuninitialized purposes to keep the explicit stores of
zero into parts of the structure (or memsets or = {} of parts of it) until the
late uninit pass, but if .DEFERRED_INIT already zero-initializes the whole
struct, there is no point in overwriting it again.
So
  s = .DEFERRED_INIT (32, 2, &"s"[0]);
  MEM <vector(2) int> [(int *)&s + 8B] = { 0, 0 };
  s.f = 0;
  bar (&s);
should be just turned into
  s = {};
  bar (&s);
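The folding relies on a simple semantic fact, which can be sketched as follows (assuming the eight-int struct has no padding): once the whole object is zero-filled, as .DEFERRED_INIT with =zero guarantees, the later partial zero stores change nothing, so `s = {};` alone is byte-for-byte equivalent.

```cpp
#include <cstring>

struct S2 { int a, b, c, d, e, f, g, h; };

// Zero-initializes the whole object, then performs the partial zero
// stores kept around for -Wuninitialized, and checks the result equals
// plain zero initialization with no further stores.
bool zero_init_absorbs_partial_stores()
{
  S2 x = {};                  // stand-in for s = .DEFERRED_INIT (32, 2, ...)
  x.c = 0; x.d = 0; x.f = 0;  // the partial stores of zero
  S2 y = {};                  // the desired final form: just s = {}
  return std::memcmp(&x, &y, sizeof x) == 0;
}
```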
