https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122274
Bug ID: 122274
Summary: Prevent copy propagation from a non-frame related insn to a frame related insn
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: jskumari at gcc dot gnu.org
Target Milestone: ---
Created attachment 62552
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=62552&action=edit
RTL before pro_and_epilogue
For a test with the following gimple:
void * foo (struct n * const this, const size_t size)
{
struct n * const D.10636;
int (*__vtbl_ptr_type) () * _1;
int (*__vtbl_ptr_type) () _2;
void * _6;
void * _9;
void * PROF_13;
void * _15;
void * _16;
struct n * const _19;
;; basic block 2, loop depth 0
;; pred: ENTRY
_1 = this_4(D)->D.3556._vptr.n;
_2 = MEM[(int (*__vtbl_ptr_type) () *)_1 + 40B];
_6 = __builtin_return_address (0);
PROF_13 = [obj_type_ref] OBJ_TYPE_REF(_2;(struct n)this_4(D)->5B);
if (PROF_13 == _ZThn8_N3t9a8barEmPKv)
goto <bb 3>; [80.00%]
else
goto <bb 4>; [20.00%]
;; succ: 3
;; 4
;; basic block 3, loop depth 0
;; pred: 2
_19 = this_4(D) + 18446744073709551608;
_16 = *.LTHUNK9 (_19, size_7(D), _6); [tail call]
goto <bb 5>; [100.00%]
;; succ: 5
;; basic block 4, loop depth 0
;; pred: 2
_15 = OBJ_TYPE_REF(_2;(struct n)this_4(D)->5B) (this_4(D), size_7(D), _6); [tail call]
;; succ: 5
;; basic block 5, loop depth 0
;; pred: 4
;; 3
# _9 = PHI <_15(4), _16(3)>
return _9;
;; succ: EXIT
}
The RTL before pro_and_epilogue pass is attached. In brief, it is as follows:
(Note that insn 38 maps back to the call to __builtin_return_address(0)).
BB2:
...
...
(insn 38 2 7 2 (set (reg:DI 5 5 [131])
(reg:DI 96 lr)) 694 {*movdi_internal64}
(nil))
...
...
(insn 11 8 12 2 (set (reg:CC 100 0 [128])
(compare:CC (reg/f:DI 12 12 [orig:118 _2 ] [118])
(reg/f:DI 10 10 [127]))) 874 {*cmpdi_signed}
(nil))
(jump_insn 12 11 13 2 (set (pc)
(if_then_else (ne (reg:CC 100 0 [128])
(const_int 0 [0]))
(label_ref 21)
(pc))) 963 {*cbranch}
BB3:
...
...
(call_insn/j 18 17 19 3 (parallel [
(set (reg:DI 3 3)
(call (mem:SI (symbol_ref:DI ("*.LTHUNK9.lto_priv.0") [flags
0x3] <function_decl 0x7fff802cd200 *.LTHUNK9>) [0 *.LTHUNK9 S4 A8])
(const_int 0 [0])))
(simple_return)
]) "foo.cpp":24:20 discrim 1 863 {*sibcall_value_aixdi}
(expr_list:REG_CALL_DECL (symbol_ref:DI ("*.LTHUNK9.lto_priv.0") [flags
0x3] <function_decl 0x7fff802cd200 *.LTHUNK9>)
(nil))
(expr_list (use (reg:DI 2 2))
(expr_list:DI (use (reg:DI 3 3))
(expr_list:DI (use (reg:DI 4 4))
(expr_list:DI (use (reg:DI 5 5))
(nil))))))
BB4:
code_label 21
...
...
(call_insn 28 27 35 4 (parallel [
(set (reg:DI 3 3)
(call (mem:SI (reg:DI 97 ctr) [0
*OBJ_TYPE_REF(_2;this_4(D)->5B) S4 A8])
(const_int 0 [0])))
(use (const_int 0 [0]))
(set (reg:DI 2 2)
(unspec:DI [
(const_int 24 [0x18])
] UNSPEC_TOCSLOT))
(clobber (reg:DI 96 lr))
]) "foo.cpp":24:20 discrim 1 845 {*call_value_indirect_elfv2di}
(expr_list:REG_CALL_DECL (nil)
(nil))
(expr_list (use (reg:DI 12 12))
(expr_list:DI (use (reg:DI 3 3))
(expr_list:DI (use (reg:DI 4 4))
(expr_list:DI (use (reg:DI 5 5))
(nil))))))
...
...
In the pro_and_epilogue pass, shrink wrapping is attempted and it is determined
that BB4 needs a prolog due to insn 28.
RTL after the pro_and_epilogue pass:
BB2:
...
...
(insn 38 43 7 2 (set (reg:DI 5 5 [131])
(reg:DI 96 lr)) 694 {*movdi_internal64}
(nil))
...
...
BB3:
...
...
(call_insn/j 18 17 57 3 (parallel [
(set (reg:DI 3 3)
(call (mem:SI (symbol_ref:DI ("*.LTHUNK9.lto_priv.0") [flags
0x3] <function_decl 0x7fff802cd200 *.LTHUNK9>) [0 *.LTHUNK9 S4 A8])
(const_int 0 [0])))
(simple_return)
]) "foo.cpp":24:20 discrim 1 863 {*sibcall_value_aixdi}
(expr_list:REG_CALL_DECL (symbol_ref:DI ("*.LTHUNK9.lto_priv.0") [flags
0x3] <function_decl 0x7fff802cd200 *.LTHUNK9>)
(nil))
(expr_list (use (reg:DI 2 2))
(expr_list:DI (use (reg:DI 3 3))
(expr_list:DI (use (reg:DI 4 4))
(expr_list:DI (use (reg:DI 5 5))
(nil))))))
BB4:
(insn/f 44 22 45 4 (set (reg:DI 0 0)
(reg:DI 96 lr)) "foo.cpp":23:1 -1
(nil))
(insn/f 45 44 46 4 (set (mem/c:DI (plus:DI (reg/f:DI 1 1)
(const_int 16 [0x10])) [18 S8 A8])
(reg:DI 0 0)) "foo.cpp":23:1 -1
(nil))
...
...
(call_insn 28 27 35 4 (parallel [
(set (reg:DI 3 3)
(call (mem:SI (reg:DI 97 ctr) [0
*OBJ_TYPE_REF(_2;this_4(D)->5B) S4 A8])
(const_int 0 [0])))
(use (const_int 0 [0]))
(set (reg:DI 2 2)
(unspec:DI [
(const_int 24 [0x18])
] UNSPEC_TOCSLOT))
(clobber (reg:DI 96 lr))
]) "foo.cpp":24:20 discrim 1 845 {*call_value_indirect_elfv2di}
(expr_list:REG_CALL_DECL (nil)
(nil))
(expr_list (use (reg:DI 12 12))
(expr_list:DI (use (reg:DI 3 3))
(expr_list:DI (use (reg:DI 4 4))
(expr_list:DI (use (reg:DI 5 5))
(nil))))))
And after cprop_hardreg, we have:
BB4:
(insn/f 44 22 45 4 (set (reg:DI 0 0)(reg:DI 96 lr)) "foo.cpp":23:1 694
{*movdi_internal64}
(expr_list:REG_DEAD (reg:DI 96 lr)
(expr_list:REG_UNUSED (reg:DI 0 0)
(nil))))
(insn/f 45 44 46 4 (set (mem/c:DI (plus:DI (reg/f:DI 1 1)
(const_int 16 [0x10])) [18 S8 A8])
(reg:DI 5 5 [0])) "foo.cpp":23:1 694 {*movdi_internal64}
(nil))
In insn 45, r0 has been replaced with r5: cprop_hardreg determined that r0 at
insn 45 (set by insn 44) holds the same value as r5 (set by insn 38).
After the replacement, r0 is REG_UNUSED in insn 44, so the DCE pass deletes
insn 44.
This results in the following Call Frame Instructions being generated:
DW_CFA_advance_loc: 60 to 000000000001a7dc
DW_CFA_def_cfa_offset: 32
DW_CFA_offset_extended_sf: r5 at cfa+16
DW_CFA_advance_loc: 16 to 000000000001a7ec
DW_CFA_def_cfa_offset: 0
DW_CFA_advance_loc: 8 to 000000000001a7f4
DW_CFA_restore_extended: r65
DW_CFA_nop
DW_CFA_nop
DW_CFA_nop
This records r5, not 'lr' (r65), as saved at cfa+16, so nothing tells the
unwinder that 'lr' is at cfa+16. As a result, we are seeing issues during
stack unwinding.
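To make the effect on the unwind info concrete, here is a toy C++ model of how
a CFI generator could pair a frame-related register copy with a later
frame-related store. It is only a sketch: FrameInsn and derive_saves are
hypothetical names, registers are plain strings, and none of this is GCC's
actual dwarf2out code.

```cpp
#include <map>
#include <string>
#include <vector>

// Toy model only; FrameInsn and derive_saves are hypothetical names and do
// not correspond to GCC's dwarf2out data structures.
struct FrameInsn {
    enum Kind { CopyReg, StoreToCfa } kind;
    std::string src;   // register read by the insn
    std::string dest;  // register written (CopyReg only)
    int offset;        // offset from the CFA (StoreToCfa only)
};

// Walks the frame-related insns in order and reports, for each store, which
// register's entry value ends up in the stack slot.
std::vector<std::string> derive_saves(const std::vector<FrameInsn>& insns)
{
    // Maps a temporary register to the register whose value it was copied
    // from by an earlier frame-related insn.
    std::map<std::string, std::string> holds;
    std::vector<std::string> saves;
    for (const FrameInsn& insn : insns) {
        if (insn.kind == FrameInsn::CopyReg) {
            holds[insn.dest] = insn.src;
        } else {
            // If the stored register is a known copy, the save is attributed
            // to the original register; otherwise to the stored register.
            auto it = holds.find(insn.src);
            std::string saved = (it != holds.end()) ? it->second : insn.src;
            saves.push_back(saved + " at cfa+" + std::to_string(insn.offset));
        }
    }
    return saves;
}
```

Feeding this model insn/f 44 (r0 = lr) followed by insn/f 45 (store r0 at
offset 16) yields "lr at cfa+16". Feeding it only insn 45 with r5 as the
source, which is what remains after cprop_hardreg and DCE, yields
"r5 at cfa+16", matching the DW_CFA_offset_extended_sf line above.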
What cprop_hardreg didn't take into account is that the setup of r5 (insn 38)
is not frame related but the setup of r0 (insn 44) is, and that it is
substituting something _into_ a frame-related insn. So the connection between
insn 44 and insn 45 (via the frame-relatedness) that dwarf2out needs to figure
out the unwind info is lost. Insn 44 is then also deleted, but that in itself
isn't the problem; the problem is the loss of connection between insn 38
(ex-44) and insn 45.
cprop_hardreg should not deem a non-frame-related insn equivalent to
a frame-related one, at least not when the target of propagation is
a use in a frame-related insn itself.
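The restriction proposed above can be sketched with a small standalone C++
model. This is not a patch against regcprop: Member, pick_replacement and
chain_from_report are hypothetical names, and the model assumes that 'lr'
itself is not accepted as the source of the store in insn 45 (which would
explain why r5 was picked rather than 'lr').

```cpp
#include <vector>

// Toy model only; these names are hypothetical and are not taken from
// regcprop.
struct Member {
    int regno;                       // hard register number as in the RTL dump
    bool set_by_frame_related_insn;  // models the /f flag on the setting insn
    bool valid_in_use;               // assumption: the use insn accepts this reg
};

// `chain` lists registers known to hold the same value, oldest first.
// Returns the register to use in place of `used_regno`, or `used_regno`
// itself when no older member qualifies.
int pick_replacement(const std::vector<Member>& chain, int used_regno,
                     bool use_is_frame_related, bool guard_enabled)
{
    for (const Member& m : chain) {
        if (m.regno == used_regno)
            break;  // reached the register already in use
        if (!m.valid_in_use)
            continue;
        // The proposed guard: never feed a value set by a non-frame-related
        // insn into a use inside a frame-related insn.
        if (guard_enabled && use_is_frame_related
            && !m.set_by_frame_related_insn)
            continue;
        return m.regno;
    }
    return used_regno;
}

// The value chain for 'lr' at insn 45, as described in this report.
std::vector<Member> chain_from_report()
{
    return {
        {96, false, false},  // lr, live on entry; assumed invalid in the store
        {5, false, true},    // r5, set by insn 38 (not frame related)
        {0, true, true},     // r0, set by insn/f 44
    };
}
```

With the guard disabled, pick_replacement(chain_from_report(), 0, true, false)
returns 5, which is the substitution seen in insn 45. With the guard enabled it
returns 0, so insn 45 keeps using r0 and insn 44 stays live. Uses in
non-frame-related insns are unaffected by the guard.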
This looks like an age-old problem in regcprop that doesn't trigger very often,
because it requires two things at once: frame-related instructions that come
after non-frame-related instructions (which is what shrink wrapping enables),
_and_ earlier non-frame-related insns that do the same thing as the prologue
insns (which is what __builtin_return_address enables).