On 12/28/23 12:35, Roger Sayle wrote:

Hi Jeff,
Thanks for the speedy review.

On 12/28/23 07:59, Roger Sayle wrote:
This patch fixes PR rtl-optmization/104914 by tweaking/improving the
way that fields are written into a pseudo register that needs to be
kept sign extended.
Well, I think "fixes" is a bit of a stretch.  We're avoiding the issue by 
changing the
early RTL generation, but if I understand what's going on in the RTL optimizers
and MIPS backend correctly, the core bug still remains.  Admittedly I haven't 
put it
under a debugger, but that MIPS definition of NOOP_TRUNCATION just seems
badly wrong and is just waiting to pop it's ugly head up again.

I think this really is the/a correct fix. The MIPS backend defines 
NOOP_TRUNCATION
to false, so it's not correct to use a SUBREG to convert from DImode to SImode.
The problem then is where in the compiler (middle-end or backend) is this 
invalid
SUBREG being created and how can it be fixed.  In this particular case, the 
fault
is in RTL expansion.  There may be other places where a SUBREG is 
inappropriately
used instead of a TRUNCATE, but this is the place where things go wrong for
PR rtl-optimization/104914.
Maybe a better way to put it is I think you're patch one piece of the solution, but we still have the potential for bugs due to the seemingly bogus defintion of TRULY_NOOP_TRUNCATION in the mips port.




Once an inappropriate SImode SUBREG is in the RTL stream, it can remain
harmlessly latent (most of the time), unless it gets split, simplified or 
spilled.
Copying this SImode expression into it's own pseudo, results in incorrect code.
One approach might be to use an UNSPEC for places where backend
invariants are temporarily invalid, but in this case it's machine independent
middle-end code that's using SUBREGs as though the target was an x86/pdp11.

So I agree that on the surface, both of these appear to be identical:
(set (reg:DI) (sign_extend:DI (truncate:SI (reg:DI))))
(set (reg:DI) (sign_extend:DI (subreg:SI (reg:DI))))

But should they get split or spilled by reload:
Even if they're not spilled, on a target like mips they're not equivalent. It highlights the poor design around TRULY_NOOP_TRUNCATION. Essentially it's out of band information on how RTL need to be interpreted. We have similar problems with SHIFT_COUNT_TRUNCATED and other macros.




2023-12-28  Roger Sayle  <ro...@nextmovesoftware.com>

gcc/ChangeLog
          PR rtl-optimization/104914
          * expr.cc (expand_assignment): When target is
SUBREG_PROMOTED_VAR_P
          a sign or zero extension is only required if the modified field
          overlaps the SUBREG's most significant bit.  On MODE_REP_EXTENDED
          targets, don't refer to the temporarily incorrectly extended value
          using a SUBREG, but instead generate an explicit TRUNCATE rtx.
[ ... ]


+             /* Check if the field overlaps the MSB, requiring extension.  */
+             else if (known_eq (bitpos + bitsize,
+                                GET_MODE_BITSIZE (GET_MODE (to_rtx))))
Do you need to look at the size of the field as well?  ie, the starting 
position might
be before the sign bit, but the width of the field might cover the mode's sign 
bit?

I'm not real good in the RTL expansion code, so if I'm offbase on this, just 
let me
know.

There are two things that help here.  The first is that the most significant
bit never appears in the middle of a field, so we don't have to worry about
overlapping, nor writes to the paradoxical bits of the SUBREG.  And secondly,
bits are numbered from zero for least significant, to MODE_BITSIZE (mode) - 1
for most significant, irrespective of the endian-ness.  So the code only needs
to check the highest value bitpos + bitsize is the maximum value for the mode.
The above logic stays the same, but which byte insert requires extension will
change between mips64be and mips64le.  i.e. we test that the most significant
bit of the field/byte being written in the most significant bit of the SUBREG
target. [That's my understanding/rationalization, I could wrong].
Yea, if the type is mode M, then we won't have a field that exceeds the size of M. So we just need to know if the position + size lands squarely on the MSB. Thanks for walking me what should have been fairly obvious in retrospect.



One thing I could be more cautious about is using maybe_eq instead of
known_eq, but the rest of the code (including truly_noop_truncation) assumes
scalar integer modes, so variable length vectors aren't (yet) a concern.
Would using maybe_eq be better coding style?
It's probably better to use maybe_eq for future proofing. OK with that change after the usual testing.

Thanks diving into this.

jeff

Reply via email to