I was curious why LRA needed to handle noncanonical rtl in addresses:

  if (CONSTANT_P (arg0) || code1 == PLUS || code1 == MULT || code1 == ASHIFT)
    {
      tloc = arg1_loc;
      arg1_loc = arg0_loc;
      arg0_loc = tloc;
      arg0 = *arg0_loc;
      code0 = GET_CODE (arg0);
      arg1 = *arg1_loc;
      code1 = GET_CODE (arg1);
    }
and the culprit in all the cases I could see was emit_block_move_via_loop,
which generated (plus (const...) (reg ...)) rather than the canonical
(plus (reg ...) (const...)).

Tested on x86_64-linux-gnu and applied as obvious.

Richard


gcc/
	* expr.c (emit_block_move_via_loop): Use simplify_gen_binary
	rather than gen_rtx_PLUS.

Index: gcc/expr.c
===================================================================
--- gcc/expr.c	2012-10-23 19:37:49.000000000 +0100
+++ gcc/expr.c	2012-10-25 09:36:33.071286674 +0100
@@ -1464,11 +1464,11 @@ emit_block_move_via_loop (rtx x, rtx y,
   emit_label (top_label);
 
   tmp = convert_modes (x_addr_mode, iter_mode, iter, true);
-  x_addr = gen_rtx_PLUS (x_addr_mode, x_addr, tmp);
+  x_addr = simplify_gen_binary (PLUS, x_addr_mode, x_addr, tmp);
 
   if (x_addr_mode != y_addr_mode)
     tmp = convert_modes (y_addr_mode, iter_mode, iter, true);
-  y_addr = gen_rtx_PLUS (y_addr_mode, y_addr, tmp);
+  y_addr = simplify_gen_binary (PLUS, y_addr_mode, y_addr, tmp);
 
   x = change_address (x, QImode, x_addr);
   y = change_address (y, QImode, y_addr);