Sorry for the delay in replying. Richard's e-mails from googlemail appear
to end
up in my spam folder. Another reason for the delay is PR 120144/124313,
which
I'm a little surprised anyone working on MIPS wouldn't have noticed.
But I agree with Richard that my MIPS change did result in a performance
regression, indeed the sequence of two shifts by 32 was shown in one of
the examples of my post and the commit message. The solution that I've
been investigating looks like:
diff --git a/gcc/config/mips/mips.md b/gcc/config/mips/mips.md
index 6a1b63b00d2e..b84e671ee537 100644
--- a/gcc/config/mips/mips.md
+++ b/gcc/config/mips/mips.md
@@ -3946,13 +3946,12 @@
[(set_attr "move_type" "signext,load")
(set_attr "mode" "SI")])
-;; Combiner patterns for truncate/sign_extend combinations. The SI
versions
-;; use the shift/truncate patterns.
+;; Combiner patterns for truncate/sign_extend combinations.
(define_insn_and_split "*extenddi_truncate<mode>"
[(set (match_operand:DI 0 "register_operand" "=d")
(sign_extend:DI
- (truncate:SUBDI (match_operand:DI 1 "register_operand" "d"))))]
+ (truncate:SHORT (match_operand:DI 1 "register_operand" "d"))))]
"TARGET_64BIT && !TARGET_MIPS16 && !ISA_HAS_EXTS"
"#"
"&& reload_completed"
@@ -3969,6 +3968,16 @@
[(set_attr "move_type" "shift_shift")
(set_attr "mode" "DI")])
+;; Like truncdisi2, but with semantics
+(define_insn "*extenddi_truncatesi"
+ [(set (match_operand:DI 0 "register_operand" "=d")
+ (sign_extend:DI
+ (truncate:SI (match_operand:DI 1 "register_operand" "d"))))]
+ "TARGET_64BIT"
+ "sll\t%0,%1,0"
+ [(set_attr "move_type" "sll0")
+ (set_attr "mode" "SI")])
+
(define_insn_and_split "*extendsi_truncate<mode>"
[(set (match_operand:SI 0 "register_operand" "=d")
(sign_extend:SI
Notice that this solution contains the explicit sign-extension, allowing it
to
be correctly optimized and folded by the middle-end. The alternative, the
current situation is that the sign-extension semantics are hidden in the RTL
and exposed purely as !TRULY_NOOP_TRUNCATION. Targets such as nvptx
are also non-noop truncation, but they don't perform sign or zero extension.
I'm also investigating whether the mode_rep_extended change is required...
The problem is that the ABI/target requires/guarantees some invariants, but
they are not automatically provided by the hardware, but instead must be
maintained by the RTL optimizers. Some passes can take advantage of these
invariants, but other passes that change the RTL need to be aware of these
invariants and ensure that they are preserved/restored.
But I'm happy to go along with whatever Richard Sandford recommends,
access to real hardware trumps my fights with libgloss, combined trees and
cross-builds [all of which are much less stable now than when I wrote the
original correctness fix last year].
Cheers,
Roger
--
> -----Original Message-----
> From: Richard Sandiford <[email protected]>
> Sent: 08 March 2026 23:52
> To: [email protected]
> Cc: [email protected]; [email protected];
> [email protected]; [email protected]
> Subject: ***SPAM*** [PATCH] Revert two changes in r16-7265-ga9e48eca3a6eef
> [PR118608]
>
> Sorry to be awkward, but I'd like to revert the rtlanal.cc and
config/mips/mips.md
> parts of r16-7265-ga9e48eca3a6eef. I think the expr.cc part of that patch
is
> enough to fix the bug. The other parts seem unnecessary and are likely to
regress
> code quality on MIPS compared to previous releases. (See the testing
below for
> examples.)
>
> The rtlanal.cc part added the following code to truncated_to_mode:
>
> /* This explicit TRUNCATE may be needed on targets that require
> MODE to be suitably extended when stored in X. Targets such as
> mips64 use (sign_extend:DI (truncate:SI (reg:DI x))) to perform
> an explicit extension, avoiding use of (subreg:SI (reg:DI x))
> which is assumed to already be extended. */
> scalar_int_mode imode, omode;
> if (is_a <scalar_int_mode> (mode, &imode)
> && is_a <scalar_int_mode> (GET_MODE (x), &omode)
> && targetm.mode_rep_extended (imode, omode) != UNKNOWN)
> return false;
>
> I think this has two problems. The first is that mode_rep_extended
describes a
> canonical form that is obtained by correctly honouring
> TARGET_TRULY_NOOP_TRUNCATION. It is not an independent restriction on
> what RTL optimisers can do. If we need to disable an optimisation on
MIPS-like
> targets, the restrictions should be based on TARGET_TRULY_NOOP_TRUNCATION
> instead.
>
> The second problem is that, although the comment treats MIPS-like
> DI->SI truncation as a special case, truncated_to_mode is specifically
> written for such cases. The comment above the function says:
>
> /* Suppose that truncation from the machine mode of X to MODE is not a
> no-op. See if there is anything special about X so that we can
> assume it already contains a truncated value of MODE. */
>
> Thus we're already in the realm of MIPS-like truncations that need
TRUNCATE
> rather than SUBREG (and that in turn guarantee sign-extension in some
cases).
> It's the caller that checks for that condition:
>
> && (TRULY_NOOP_TRUNCATION_MODES_P (mode, GET_MODE (op))
> || truncated_to_mode (mode, op)))
>
> So I think the patch has the effect of disabling exactly the kind of
optimisation that
> truncated_to_mode is supposed to provide.
>
> truncated_to_mode makes an implicit assumption that sign-extension is
enough
> to allow a SUBREG to be used in place of a TRUNCATE. This is true for
MIPS and
> was true for the old SH64 port. I don't know whether it's true for gcn
and nvptx,
> although I assume that it must be, since no-one seems to have complained.
> However, it would not be true for a port that required zero rather than
sign
> extension (which AFAIK we've never had).
>
> It's probably worth noting that this assumption is in the opposite
direction from
> what mode_rep_extended describes. mode_rep_extended says that "proper"
> truncation leads to a guarantee of sign extension.
> truncated_for_mode assumes that sign extension avoids the need for
"proper"
> truncation. On MIPS, the former is only true for truncation from 64 bits
to 32 bits,
> whereas the latter is true for all cases (such as 64 bits to 16 bits).
>
> And that feeds into the mips.md change in r16-7265-ga9e48eca3a6eef.
> The change was:
>
> (define_insn_and_split "*extenddi_truncate<mode>"
> [(set (match_operand:DI 0 "register_operand" "=d")
> (sign_extend:DI
> - (truncate:SHORT (match_operand:DI 1 "register_operand"
"d"))))]
> + (truncate:SUBDI (match_operand:DI 1 "register_operand"
> + "d"))))]
> "TARGET_64BIT && !TARGET_MIPS16 && !ISA_HAS_EXTS"
>
> The old :SHORT pattern existed because QI and HI values are only
guaranteed to
> be sign-extensions of bit 31 of the register, not bits
> 7 or 15 (respectively). Thus we have the worst of both worlds:
>
> (1) truncation from DI is not a nop. It requires a left shift by
> at least 32 bits and a right shift by the same amount.
>
> (2) sign extension to DI is not a nop. It requires a left shift and
> a right shift in the normal way (by 56 bits for QI and 48 bits
> for HI).
>
> So a separate truncation and extension would yield four shifts.
> The pattern above exists to reduce this to two shifts, since (2) subsumes
(1).
>
> But the :SI case is different:
>
> (1) truncation from DI is not a nop. It requires a left shift by 32
> and a right shift by 32, as above.
>
> (2) sign extension from SI to DI is a nop.
>
> (2) is implemented by:
>
> ;; When TARGET_64BIT, all SImode integer and accumulator registers ;;
should
> already be in sign-extended form (see TARGET_TRULY_NOOP_TRUNCATION ;;
> and truncdisi2). We can therefore get rid of register->register ;;
instructions if we
> constrain the source to be in the same register as ;; the destination.
> ;;
> ;; Only the pre-reload scheduler sees the type of the register
alternatives; ;; we
> split them into nothing before the post-reload scheduler runs.
> ;; These alternatives therefore have type "move" in order to reflect ;;
what
> happens if the two pre-reload operands cannot be tied, and are ;; instead
> allocated two separate GPRs. We don't distinguish between ;; the GPR and
LO
> cases because we don't usually know during pre-reload ;; scheduling
whether an
> operand will be LO or not.
> (define_insn_and_split "extendsidi2"
> [(set (match_operand:DI 0 "register_operand" "=d,l,d")
> (sign_extend:DI (match_operand:SI 1 "nonimmediate_operand"
"0,0,m")))]
> "TARGET_64BIT"
> "@
> #
> #
> lw\t%0,%1"
> "&& reload_completed && register_operand (operands[1], VOIDmode)"
> [(const_int 0)]
> {
> emit_note (NOTE_INSN_DELETED);
> DONE;
> }
> [(set_attr "move_type" "move,move,load")
> (set_attr "mode" "DI")])
>
> So extending the first pattern above from :SHORT to :SUBDI is not really
an
> optimisation, in the sense that it doesn't add new information.
> Not providing the combination allows the truncation or sign-extension to
be
> optimised with surrounding code.
>
> I suppose the argument in favour of going from :SHORT to :SUBDI is that it
might
> avoid a move in some cases. But (a) I think that would need to be
measured
> further, (b) it might instead mean that the
> extendsidi2 pattern needs to be tweaked for modern RA choices, and (c) it
doesn't
> really feel like stage 4 material.
>
> I can understand where the changes came from. The output of combine was
> clearly wrong before r16-7265-ga9e48eca3a6eef. And what combine did
looked
> bad. But I don't think combine itself did anything wrong.
> IMO, all it did was expose the problems in the existing RTL. Expand
dropped a
> necessary sign-extension and the rest flowed from there.
>
> In particular, the old decisions based on truncated_to_mode seemed
correct. The
> thing that the truncated_to_mode patch changed was the assumption that a
64-
> bit register containing a "u16 lower" parameter could be truncated with a
> SUBREG. And that's true, since it's guaranteed by the ABI. The parameter
is
> zero-extended from bit 16 and so the register contains a sign extension of
bit 16
> (i.e. 0).
> And that was the information that truncated_to_mode was using.
>
> I tested the patch on mips64-linux-gnu (all 3 ABIs). The patch fixes
regressions in:
>
> - gcc.target/mips/octeon-exts-7.c (n32 & 64)
> - gcc.target/mips/truncate-1.c (n32 & 64)
> - gcc.target/mips/truncate-2.c (n32)
> - gcc.target/mips/truncate-6.c (64)
>
> OK to install?
>
> Thanks,
> Richard
>
>
> gcc/
> Revert:
>
> 2026-02-03 Roger Sayle <[email protected]>
>
> PR middle-end/118608
> * rtlanal.cc (truncated_to_mode): Call targetm.mode_rep_extended
> to check whether an explicit TRUNCATE is required (i.e. performs
> an extension) on this target.
> * config/mips/mips.md (*extenddi_truncate<mode>): Handle all
> SUBDI modes, not just SHORT modes.
> ---
> gcc/config/mips/mips.md | 2 +-
> gcc/rtlanal.cc | 11 -----------
> 2 files changed, 1 insertion(+), 12 deletions(-)
>
> diff --git a/gcc/config/mips/mips.md b/gcc/config/mips/mips.md index
> 6a1b63b00d2..85e7d67901f 100644
> --- a/gcc/config/mips/mips.md
> +++ b/gcc/config/mips/mips.md
> @@ -3952,7 +3952,7 @@
> (define_insn_and_split "*extenddi_truncate<mode>"
> [(set (match_operand:DI 0 "register_operand" "=d")
> (sign_extend:DI
> - (truncate:SUBDI (match_operand:DI 1 "register_operand" "d"))))]
> + (truncate:SHORT (match_operand:DI 1 "register_operand" "d"))))]
> "TARGET_64BIT && !TARGET_MIPS16 && !ISA_HAS_EXTS"
> "#"
> "&& reload_completed"
> diff --git a/gcc/rtlanal.cc b/gcc/rtlanal.cc index
c5062ab7715..88561a54e5a
> 100644
> --- a/gcc/rtlanal.cc
> +++ b/gcc/rtlanal.cc
> @@ -6201,17 +6201,6 @@ truncated_to_mode (machine_mode mode, const_rtx
> x)
> if (REG_P (x) && rtl_hooks.reg_truncated_to_mode (mode, x))
> return true;
>
> - /* This explicit TRUNCATE may be needed on targets that require
> - MODE to be suitably extended when stored in X. Targets such as
> - mips64 use (sign_extend:DI (truncate:SI (reg:DI x))) to perform
> - an explicit extension, avoiding use of (subreg:SI (reg:DI x))
> - which is assumed to already be extended. */
> - scalar_int_mode imode, omode;
> - if (is_a <scalar_int_mode> (mode, &imode)
> - && is_a <scalar_int_mode> (GET_MODE (x), &omode)
> - && targetm.mode_rep_extended (imode, omode) != UNKNOWN)
> - return false;
> -
> /* See if we already satisfy the requirements of MODE. If yes we
> can just switch to MODE. */
> if (num_sign_bit_copies_in_rep[GET_MODE (x)][mode]
> --
> 2.53.0