https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90883
Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed |Added
----------------------------------------------------------------------------
                 CC|        |jamborm at gcc dot gnu.org,
                   |        |law at gcc dot gnu.org,
                   |        |rguenth at gcc dot gnu.org
          Component|c++     |tree-optimization

--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
So before SRA we have

slow ()
{
  struct C D.26322;
  struct C D.29886;

  <bb 2> :
  D.26322 = {};
  D.26322.a = {};
  D.26322.b = 0;
  D.29886 = D.26322;
  D.26322 ={v} {CLOBBER};
  return D.29886;
}

fast ()
{
  struct C c;
  struct C D.29889;

  <bb 2> :
  c = {};
  D.29889 = c;
  c ={v} {CLOBBER};
  return D.29889;
}

and then SRA does nothing to fast () but half-way optimizes slow () to

  <bb 2> :
  D.26322 = {};
  D.26322.a = {};
  D.29886 = D.26322;
  D.26322 ={v} {CLOBBER};
  return D.29886;

where DSE then probably wrecks things by trimming the earlier store:

  <bb 2> :
  MEM[(struct C *)&D.26322 + 7B] = {};
  D.26322.a = {};
  D.29886 = D.26322;
  D.26322 ={v} {CLOBBER};
  return D.29886;

which we later do not recover from (store-merging?). I also wonder why SRA
does not elide the aggregate copy.

-O2 -fno-tree-dse generates

_Z4slowv:
.LFB1099:
        .cfi_startproc
        movq    $0, -32(%rsp)
        xorl    %eax, %eax
        xorl    %edx, %edx
        ret
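For readers following along: the member stores in the dump pin down the object layout (`.a` covers bytes 0..6 and `.b` is the single byte at offset 7, hence DSE's trim to `MEM[... + 7B]`). Below is a minimal source sketch consistent with that GIMPLE shape; it is inferred from the dump, not the PR's actual testcase (which is attached to the bug and not quoted in this comment). The NSDMI pattern is an assumption: value-initialization `C()` of a class whose default constructor is non-trivial but not user-provided first zero-fills the object and then runs the constructor, which re-stores the member initializers, matching the redundant `D.26322 = {}; D.26322.a = {}; D.26322.b = 0;` sequence.

```cpp
#include <cstddef>

// Layout inferred from the GIMPLE: 'a' occupies bytes 0..6, 'b' is the byte
// at offset 7, so the whole object is one 8-byte store.
struct C {
    char a[7] {};  // NSDMI -> "D.26322.a = {}" in the dump
    char b = 0;    // NSDMI -> "D.26322.b = 0"
};

static_assert(sizeof(C) == 8, "whole object fits one 8-byte store");
static_assert(offsetof(C, b) == 7, "matches MEM[(struct C *)&D + 7B]");

// C() value-initializes: zero-fill the object, then run the non-trivial
// default constructor, which stores the NSDMIs again -- the redundant
// sequence SRA only half-way optimizes.
C slow() { return C(); }

// C{} performs aggregate initialization directly: a single "c = {}".
C fast() { return C{}; }
```

Both functions must return an all-zero 8-byte object; the bug is purely about the redundant store sequence the front end emits for `slow` and the passes' failure to clean it up.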