https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110327
--- Comment #3 from Jeffrey A. Law <law at gcc dot gnu.org> ---
Two block copies isn't fatal when the second one is the one with the actual
jump thread. But costing does get more complex.

Basically we copy 8 so that we can isolate its two incoming paths which thread
differently in bb10. That's pretty standard stuff.

It looks like that particular threading possibility is hidden until after DOM3
is complete. Prior to and during DOM3, there's another block in the way.

  # c.3_16 = PHI <0(3), 2(5)>
  _9 = h;
  _10 = *_9;
  _11 = *_10;
  _12 = *_11;
  _13 = (char) _12;
  if (_13 != -28)
    goto <bb 8>; [0.00%]
  else
    goto <bb 9>; [100.00%]
;;    succ:       8
;;                9

;;   basic block 8, loop depth 0
;;    pred:       7
  __builtin_unreachable ();
;;    succ:

;;   basic block 9, loop depth 1
;;    pred:       7
  if (_12 <= 4)
    goto <bb 10>; [50.00%]
  else
    goto <bb 15>; [50.00%]
;;    succ:       10
;;                15

;;   basic block 10, loop depth 1
;;    pred:       9
  _15 = d;
  *_10 = _15;
  if (c.3_16 <= 0)
    goto <bb 11>; [5.50%]
  else
    goto <bb 12>; [94.50%]

Note bb9.
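
For readers following along, here is a rough hand-written C sketch of the shape in the dump above. It is not the testcase from this PR and not what the threader literally emits. The function and parameter names (unthreaded, threaded, from_bb3, v, d) are made up for illustration. The variable c stands in for c.3_16, which is 0 or 2 depending on the incoming edge, the test on v stands in for the intervening bb9, and the final test on c stands in for the condition in bb10. The threaded version shows why isolating the two incoming paths means copying both the intervening block and the block with the test.

```c
#include <assert.h>

/* Before threading: 'c' is merged at a join point (like a PHI), another
   conditional sits in the way, and only then is 'c' tested. */
static int unthreaded(int from_bb3, int v, int d)
{
    int c = from_bb3 ? 0 : 2;   /* stands in for PHI <0(3), 2(5)> */
    if (v > 4)                  /* stands in for the intervening block */
        return -1;
    int r = d + 1;              /* work shared by both paths */
    if (c <= 0)                 /* outcome is fixed per incoming edge */
        return r * 2;
    return r - 3;
}

/* After threading by hand: each incoming path gets its own copy of the
   intervening test and of the shared work, so the test on 'c' is gone. */
static int threaded(int from_bb3, int v, int d)
{
    if (from_bb3) {             /* path where c == 0, so c <= 0 holds */
        if (v > 4)
            return -1;
        int r = d + 1;
        return r * 2;
    }
    if (v > 4)                  /* path where c == 2, so c <= 0 fails */
        return -1;
    int r = d + 1;
    return r - 3;
}

/* Checks that both versions agree on a small range of inputs. */
static void check_equivalence(void)
{
    for (int from_bb3 = 0; from_bb3 <= 1; from_bb3++)
        for (int v = 0; v < 10; v++)
            for (int d = -5; d <= 5; d++)
                assert(unthreaded(from_bb3, v, d) == threaded(from_bb3, v, d));
}
```

The duplicated test on v is the price of the thread here, which is one way to see why costing gets more complex once two blocks have to be copied.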