https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102436

            Bug ID: 102436
           Summary: [11/12 Regression] Lost Load/Store Motion
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: law at gcc dot gnu.org
  Target Milestone: ---

Created attachment 51492
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51492&action=edit
Testcase

So consider this loop (-O2, lim2 dump, trunk, x86_64):

;;   basic block 3, loop depth 1
;;    pred:       2
;;                10
  # target_8 = PHI <target_13(D)(2), target_17(10)>
  _4 = board[target_8];
  if (_4 == 13)
    goto <bb 4>; [94.50%]
  else
    goto <bb 7>; [5.50%]
;;    succ:       4
;;                7

;;   basic block 4, loop depth 1
;;    pred:       3
  if (captures.32_5 == 0)
    goto <bb 5>; [33.00%]
  else
    goto <bb 6>; [67.00%]
;;    succ:       5
;;                6

;;   basic block 5, loop depth 1
;;    pred:       4
  numb_moves.1_21 = numb_moves;
  _22 = (long unsigned int) numb_moves.1_21;
  _23 = _22 * 24;
  _24 = (struct move_s *) _23;
  _24->from = gfrom.30_1;
  _24->target = target_8;
  _24->captured = 13;
  _24->castled = 0;
  _24->promoted = 0;
  _24->ep = 0;
  _26 = numb_moves.1_21 + 1;
  numb_moves = _26;
;;    succ:       6

;;   basic block 6, loop depth 1
;;    pred:       4
;;                5
  target_17 = target_8 + offset_14;
  _7 = board[target_17];
  if (_7 != 0)
    goto <bb 10>; [94.50%]
  else
    goto <bb 9>; [5.50%]
;;    succ:       10
;;                9

;;   basic block 10, loop depth 1
;;    pred:       6
  goto <bb 3>; [100.00%]
;;    succ:       3

In particular note the load from and store to numb_moves in block #5 within
the loop. I don't immediately see an aliasing issue that would prevent LSM.
The bigger problem is control flow, obviously the load/store may not be
executed, but I thought our LIM/LSM code handled that correctly.

If we look at gcc-10 we get something like this:

;;   basic block 3, loop depth 1
;;    pred:       2
;;                10
  # target_9 = PHI <target_14(D)(2), target_19(10)>
  # numb_moves_lsm.43_6 = PHI <numb_moves_lsm.43_34(2), numb_moves_lsm.43_2(10)>
  # numb_moves_lsm_flag.44_20 = PHI <numb_moves_lsm_flag.44_35(2), numb_moves_lsm_flag.44_18(10)>
  _4 = board[target_9];
  if (_4 == 13)
    goto <bb 4>; [94.50%]
  else
    goto <bb 7>; [5.50%]
;;    succ:       4
;;                7

;;   basic block 4, loop depth 1
;;    pred:       3
  if (captures.32_5 == 0)
    goto <bb 5>; [33.00%]
  else
    goto <bb 6>; [67.00%]
;;    succ:       5
;;                6

;;   basic block 5, loop depth 1
;;    pred:       4
  numb_moves.1_21 = numb_moves_lsm.43_6;
  _22 = (long unsigned int) numb_moves.1_21;
  _23 = _22 * 24;
  _24 = (struct move_s *) _23;
  _24->from = gfrom.30_1;
  _24->target = target_9;
  _24->captured = 13;
  _24->castled = 0;
  _24->promoted = 0;
  _24->ep = 0;
  _26 = numb_moves.1_21 + 1;
  numb_moves_lsm.43_37 = _26;
  numb_moves_lsm_flag.44_38 = 1;
;;    succ:       6

;;   basic block 6, loop depth 1
;;    pred:       4
;;                5
  # numb_moves_lsm.43_2 = PHI <numb_moves_lsm.43_6(4), numb_moves_lsm.43_37(5)>
  # numb_moves_lsm_flag.44_18 = PHI <numb_moves_lsm_flag.44_20(4), numb_moves_lsm_flag.44_38(5)>
  target_19 = target_9 + offset_15;
  _8 = board[target_19];
  if (_8 != 0)
    goto <bb 10>; [94.50%]
  else
    goto <bb 11>; [5.50%]
;;    succ:       10
;;                11

[ ... ]

;;   basic block 10, loop depth 1
;;    pred:       6
  goto <bb 3>; [100.00%]
;;    succ:       3

Obviously with the load before the loop and the store after.
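
For readers who don't want to decode the dumps, here is a rough C-level sketch of the two loop shapes. It is not the attached testcase: the function names, the board contents, and the simplified loop body (the stores through _24 are left out) are made up for illustration. The gcc-10 shape is paraphrased from the numb_moves_lsm / numb_moves_lsm_flag temporaries in the dump, on the assumption that the sunk store is guarded by the flag.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the globals seen in the dumps. */
int numb_moves;
int board[16] = {13, 13, 13, 0};

/* Shape seen on trunk: numb_moves is loaded and stored inside the loop,
   on the conditionally executed path (block 5 in the dump). */
void scan_in_loop(int target, int offset, int captures)
{
    while (board[target] == 13) {
        if (captures == 0)
            numb_moves = numb_moves + 1;   /* load + store every time */
        target += offset;
        if (board[target] == 0)
            break;
    }
}

/* Shape seen with gcc-10: the value lives in a local across the loop.
   The flag records whether the original loop would have stored, so the
   store after the loop only happens if one happened before. */
void scan_lsm(int target, int offset, int captures)
{
    int lsm = numb_moves;      /* load before the loop */
    bool lsm_flag = false;     /* no store has been needed yet */

    while (board[target] == 13) {
        if (captures == 0) {
            lsm = lsm + 1;     /* register update, no memory traffic */
            lsm_flag = true;
        }
        target += offset;
        if (board[target] == 0)
            break;
    }

    if (lsm_flag)
        numb_moves = lsm;      /* store after the loop */
}
```

Both functions leave numb_moves with the same value for the same inputs. The second one touches numb_moves in memory at most twice no matter how many iterations run, and never writes it if the conditional path was not taken.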