[Bug testsuite/115262] [15 regression] gcc.target/powerpc/pr66144-3.c fails after r15-831-g05daf617ea22e1

2024-06-12 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115262

Peter Bergner  changed:

            What|Removed                        |Added

             URL|                               |https://gcc.gnu.org/pipermail/gcc-patches/2024-June/654397.html
          Status|NEW                            |ASSIGNED
        Assignee|unassigned at gcc dot gnu.org  |bergner at gcc dot gnu.org

[Bug target/115389] Invalid ROP hashst offset is emitted when using -mabi=no-altivec

2024-06-11 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115389

--- Comment #4 from Peter Bergner  ---
(In reply to Segher Boessenkool from comment #2)
> So, what value do we output? And why?
The invalid offset is zero, so: hashst r0,0(r1)
As the assembler mentions, the range of valid offsets is [-512,-8] and the
offset must be a multiple of 8.

The "bug" is that we initialize rop_hash_save_offset to zero very early, before
any option processing.  Later, we compute the actual offset, but only in the
case where Altivec is enabled (TARGET_ALTIVEC_ABI is true).  If Altivec is
disabled as in this test case, we end up using rop_hash_save_offset's invalid
initial zero value.
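
The failure mode described above can be modelled in a few lines of standalone C.  This is only a sketch of the pattern, not the rs6000 source: struct frame_info, compute_layout(), the altivec_abi flag and the value -16 are hypothetical names and values picked for illustration.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the frame layout data; not GCC's structure.  */
struct frame_info
{
  int rop_hash_save_offset;
};

/* Models the flaw: the offset starts out as zero and only receives a
   real value on the path where the Altivec ABI is enabled.  */
static struct frame_info
compute_layout (bool altivec_abi)
{
  struct frame_info info = { 0 };      /* early default, not a usable offset */

  if (altivec_abi)
    info.rop_hash_save_offset = -16;   /* illustrative value only */

  return info;
}
```

With altivec_abi false, the caller is left with the zero default, which is the value that ends up in the hashst instruction.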

[Bug testsuite/115262] [15 regression] gcc.target/powerpc/pr66144-3.c fails after r15-831-g05daf617ea22e1

2024-06-11 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115262

--- Comment #2 from Peter Bergner  ---
(In reply to Jeffrey A. Law from comment #1)
> It looks like the test wants to see xxsel, but after that change we get
> xxlor and  what looks like a slight difference in register allocation.  I
> can't really judge if the new code is better, worse or equivalent.

xxsel XT,XA,XB,XC computes XT = (XA & ~XC) | (XB & XC).  Given XB == XC, that
becomes (XA & ~XB) | XB, which by the distributive law simplifies to
XT = XA | XB, which is what you're producing.  An xxlor (a simple logical or)
is not going to be slower than an xxsel and is probably faster.  I agree with
Bill that this looks like an
example of needing to update the expected results of the test case.  I'll let
Segher and/or Ke Wen comment though.
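
The identity can be checked exhaustively on a small bitwise model.  The sketch below is not the ISA definition or GCC code; xxsel_model() simply applies the formula quoted above to one byte, and both function names are made up for this example.

```c
#include <stdint.h>

/* Bitwise model of xxsel on one byte: take bits from xb where the mask
   xc is set, and from xa elsewhere.  */
static uint8_t
xxsel_model (uint8_t xa, uint8_t xb, uint8_t xc)
{
  return (uint8_t) ((xa & ~xc) | (xb & xc));
}

/* Returns 1 if xxsel with XB == XC matches a plain OR for every pair
   of byte values, 0 otherwise.  */
static int
xxsel_equals_or_when_xb_is_xc (void)
{
  for (unsigned a = 0; a < 256; a++)
    for (unsigned b = 0; b < 256; b++)
      if (xxsel_model ((uint8_t) a, (uint8_t) b, (uint8_t) b)
          != (uint8_t) (a | b))
        return 0;
  return 1;
}
```

Since the operation is purely bitwise, agreement on all byte pairs carries over to wider vectors.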

[Bug target/115389] Invalid ROP hashst offset is emitted when using -mabi=no-altivec

2024-06-07 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115389

Peter Bergner  changed:

            What|Removed                        |Added

        Assignee|unassigned at gcc dot gnu.org  |bergner at gcc dot gnu.org
          Target|                               |powerpc64le-linux
  Ever confirmed|0                              |1
Last reconfirmed|                               |2024-06-07
              CC|                               |linkw at gcc dot gnu.org,
                |                               |segher at gcc dot gnu.org
          Status|UNCONFIRMED                    |ASSIGNED

--- Comment #1 from Peter Bergner  ---
I have a patch I'm testing.

[Bug target/115389] New: Invalid ROP hashst offset is emitted when using -mabi=no-altivec

2024-06-07 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115389

Bug ID: 115389
   Summary: Invalid ROP hashst offset is emitted when using
-mabi=no-altivec
   Product: gcc
   Version: 15.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: bergner at gcc dot gnu.org
  Target Milestone: ---

We emit a hashst instruction with an invalid offset when compiling with
-mabi=no-altivec.

bergner@ltcd97-lp3:~/ROP$ cat bug.c 
extern void foo (void);
long
bar (void)
{
  foo ();
  return 0;
}
bergner@ltcd97-lp3:~/ROP$ gcc -c -O2 -mcpu=power10 -mrop-protect -mno-vsx -mno-altivec -mabi=altivec bug.c
bergner@ltcd97-lp3:~/ROP$ gcc -c -O2 -mcpu=power10 -mrop-protect -mno-vsx -mno-altivec -mabi=no-altivec bug.c
/tmp/ccSzxbv5.s: Assembler messages:
/tmp/ccSzxbv5.s:15: Error: invalid offset: must be in the range [-512, -8] and be a multiple of 8
/tmp/ccSzxbv5.s:25: Error: invalid offset: must be in the range [-512, -8] and be a multiple of 8

The bug is that we only compute the ROP hash save slot offset when
TARGET_ALTIVEC_ABI is true. If TARGET_ALTIVEC_ABI is false and we enable ROP
mitigation, then we use the initialized value of zero which is an illegal
offset value for hashst and hashchk.
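
The constraint from the assembler message is easy to state as a predicate.  A minimal sketch follows, assuming nothing beyond the range and alignment quoted in the error text; the name hash_offset_valid_p() is hypothetical and does not refer to an existing GCC or binutils function.

```c
#include <stdbool.h>

/* True if OFFSET satisfies the constraint quoted in the assembler
   message: within [-512, -8] and a multiple of 8.  */
static bool
hash_offset_valid_p (long offset)
{
  return offset >= -512 && offset <= -8 && offset % 8 == 0;
}
```

The zero default fails the range test, which matches the two errors reported above.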

This has been broken since the rs6000 ROP mitigation code was first added, so
not a regression.

[Bug target/115355] [12/13/14/15 Regression] PPCLE: Auto-vectorization creates wrong code for Power9

2024-06-05 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115355

--- Comment #7 from Peter Bergner  ---
The test fails when setToIdentityBAD's index var is unsigned int.  It passes
when using unsigned long long, unsigned long, unsigned short and unsigned char.
When using unsigned long long/unsigned long, we do not vectorize the loop.  We
vectorize the loop when using unsigned int/short/char.  The vectorized code is
a little strange, in that the smaller the integer type we use for the index
var, the more code we generate.  

The vectorized code for unsigned char is truly huge!  ...although it does seem
to work correctly.  I'm attaching the "unsigned char i" code gen for
setToIdentityBAD for people to examine.  Even though it gives "correct"
results, it can't really be the code we want to generate, correct???

[Bug target/115355] [12/13/14/15 Regression] PPCLE: Auto-vectorization creates wrong code for Power9

2024-06-05 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115355

--- Comment #6 from Peter Bergner  ---
Created attachment 58361
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58361&action=edit
setToIdentityBAD-char.s

Code generated for setToIdentityBAD.c when using unsigned char for the index
variable.

[Bug target/115355] PPCLE: Auto-vectorization creates wrong code for Power9

2024-06-05 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115355

--- Comment #5 from Peter Bergner  ---
FYI, fails for me with gcc 12 and later and works with gcc 11.  It also fails
with -O3 -mcpu=power10.

[Bug target/115355] PPCLE: Auto-vectorization creates wrong code for Power9

2024-06-05 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115355

Peter Bergner  changed:

   What|Removed |Added

 CC||bergner at gcc dot gnu.org,
   ||dje at gcc dot gnu.org,
   ||linkw at gcc dot gnu.org,
   ||meissner at gcc dot gnu.org,
   ||segher at gcc dot gnu.org

--- Comment #3 from Peter Bergner  ---
I'll find someone to look into this.  Thanks for the test case!

[Bug target/114846] powerpc: epilogue in _Unwind_RaiseException corrupts return value due to __builtin_eh_return

2024-05-29 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114846

--- Comment #9 from Peter Bergner  ---
(In reply to Kewen Lin from comment #8)
> Should be fixed on trunk, it's not a regression, but we probably want
> backporting this?

For code correctness bugs, yes, we want them backported.

[Bug target/113652] [14/15 regression] Failed bootstrap on ppc unrecognized opcode: `lfiwzx' with -mcpu=7450

2024-05-08 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113652

--- Comment #25 from Peter Bergner  ---
(In reply to Michael Meissner from comment #23)
> 3) Only build the IEEE 128-bit libgcc bits if the user configured the
> compiler with --with-cpu=power7, --with-cpu=power8, --with-cpu=power9,
> --with-cpu=power10 (and in the future --with-cpu=power11 or
> --with-cpu=future).  This could be code that if __VSX__ is not defined, the
> libgcc support functions won't get built.  We would then remove the -mvsx
> option from the library support functions.

I think this is the solution we want, meaning if the target we're building
supports VSX, then we'll build the IEEE128 bits, otherwise, we won't build
them.  I think that is the only sane answer.

I also believe that if the user specifies a -mcpu= option (either implicitly or
explicitly) that doesn't support VSX (eg, power4, or 7450, etc.) and they also
explicitly use -mvsx, then we should emit an error message saying the -mcpu=
option doesn't support using -mvsx at the same time.  Ditto for -maltivec,
-mmma, etc.  We should not be silently enabling instruction support over and
above their -mcpu= selection just because it's needed for VSX/Altivec/MMA/etc.
support.  Currently we don't emit an error and instead silently enable
generating instructions not supported by their -mcpu= option.

[Bug target/112868] GCC passes -many to the assembler for --enable-checking=release builds

2024-05-06 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112868

--- Comment #14 from Peter Bergner  ---
(In reply to Niels Möller from comment #13)
> I'm not that familiar with gcc development procedures. Do I understand you
> correctly, that a fix for this bug will not be included in gcc-14 (according
> to https://gcc.gnu.org/develop.html#timeline, gcc-14 stage1 ended several
> months ago), it will have to wait for gcc-15?

Correct, I meant waiting for GCC 15 stage1.  I want it to burn-in on trunk for
a long while, because it had the potential to disrupt distro package builds. 
It seems clean so far with the practice Gentoo builds, but I'll feel more
comfortable when other distros start using it too. 

That said, Jeevitha, now that we're in stage1, can you please post your patch?

[Bug target/101865] _ARCH_PWR8 is not defined when using -mcpu=power8

2024-05-02 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101865

Peter Bergner  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #28 from Peter Bergner  ---
Fixed everywhere.

[Bug target/101345] wrong code at -O1 with vector modulo

2024-05-01 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101345

Peter Bergner  changed:

   What|Removed |Added

 Depends on||101129

--- Comment #4 from Peter Bergner  ---
(In reply to Jeevitha from comment #3)
> The commit that resolved the incorrect code was
> ad5f8ac1d2f2dc92d43663243b52f9e9eb3cf7c0, where Bill disabled the swap for
> mult with subreg. This addressed the issue.

Ok, so that was the fix for PR101129.

Thanks for tracking that down Jeevitha!


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101129
[Bug 101129] [11/12 Regression] wrong code at -O1 since r11-5839

[Bug target/101345] wrong code at -O1 with vector modulo

2024-04-18 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101345

--- Comment #2 from Peter Bergner  ---
(In reply to Peter Bergner from comment #1)
> Jeevitha, can you please do a git bisect from the two commits above to
> identify the commit that fixes this for posterity's sake?  Thanks.

I'll note I used -O1 -mcpu=power8 for my compiles.

[Bug target/101345] wrong code at -O1 with vector modulo

2024-04-18 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101345

Peter Bergner  changed:

   What|Removed |Added

 Resolution|--- |FIXED
  Known to work||13.0, 14.0
 CC||jeevitha at gcc dot gnu.org
 Status|UNCONFIRMED |RESOLVED
   Last reconfirmed||2024-4-18

--- Comment #1 from Peter Bergner  ---
I can confirm the checkout used at the time
(b019b28ebd65462a092c96d95e9e356c8bb39b78) does produce "subfic rX,rX,4".  That
said, with commit b85e79dce149df68b92ef63ca2a40ff1dfa61396 (about the time
gcc13 branches), it is fixed to "subfic rX,rX,2", so I'm marking this as
RESOLVED/FIXED.  It remains fixed since that commit too.

Jeevitha, can you please do a git bisect from the two commits above to identify
the commit that fixes this for posterity's sake?  Thanks.

[Bug target/114759] Power: multiple issues with -mrop-protect

2024-04-17 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114759

--- Comment #5 from Peter Bergner  ---
(In reply to Peter Bergner from comment #4)
> If instead we want to just silently ignore (or with a warning), we'd just
> need to modify the rs6000.cc hunk to disable rs6000_rop_protect instead of
> calling error().

Like so:

-  /* If we are inserting ROP-protect instructions, disable shrink wrap.  */
+
   if (rs6000_rop_protect)
-    flag_shrink_wrap = 0;
+    {
+      if (!TARGET_POWER8 || DEFAULT_ABI != ABI_ELFv2)
+        rs6000_rop_protect = 0;
+      else
+        /* If we are inserting ROP-protect instructions, disable shrink wrap.  */
+        flag_shrink_wrap = 0;
+    }

[Bug target/114759] Power: multiple issues with -mrop-protect

2024-04-17 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114759

--- Comment #4 from Peter Bergner  ---
Created attachment 57977
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57977&action=edit
Patch that emits an error for invalid ROP option combinations.

Here's a patch that treats invalid ROP option combinations (currently assuming
P7 and earlier are invalid) as an error.

If instead we want to just silently ignore (or with a warning), we'd just need
to modify the rs6000.cc hunk to disable rs6000_rop_protect instead of calling
error().

[Bug target/114759] Power: multiple issues with -mrop-protect

2024-04-17 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114759

--- Comment #3 from Peter Bergner  ---
(In reply to Segher Boessenkool from comment #2)
>> 1. We always define the __ROP_PROTECT__ predefined macro [snip]
> 
> No.  Whenever the -mrop-protect option is in effect, we should do that
> predefine.
What we do now is:

  gcc -dM -E -mcpu=power10 test.c | grep ROP
  gcc -dM -E -mcpu=power10 -mrop-protect test.c | grep ROP
  #define __ROP_PROTECT__ 1
  gcc -dM -E -mcpu=power4 -mrop-protect test.c | grep ROP
  #define __ROP_PROTECT__ 1

...and that last compile is a wrong code bug.  If we want to continue to
silently disable ROP protection with -mcpu=power4, then we also need to not
define __ROP_PROTECT__.  If we decide we want an error for this, which I
mentioned later as a solution, then this is fixed automatically by doing that.


> If you want to refuse the option without a -mcpu= that can generate useful
> code for it, that's fine, but that is not what we do.  Instead, we generate
> code that will do the ROP-protection boogaloo on CPUs that implement support
> for that, and does nothing otherwise.
We do not currently do "nothing" when we see -mcpu=power4 -mrop-protect.
Yes, we do not emit the hashst and hashchk insns, but we *do* emit the
__ROP_PROTECT__ predefined macro and that is bad.  The most common usage
of that macro is in .S assembler files and if we define __ROP_PROTECT__
in the wrong cases, they can end up with assembler errors.  Again, if
we decide to emit an error for -mcpu=power4 -mrop-protect, then this is
just fixed automatically.
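
For reference, this is roughly how source code keys off the macro.  It is a hedged sketch in C rather than a .S file, and rop_protect_requested() is a made-up helper; only the __ROP_PROTECT__ name comes from the discussion above.

```c
/* Reports whether the compiler predefined __ROP_PROTECT__ for this
   translation unit.  Code that emits hashst/hashchk by hand would
   typically be wrapped in the same #ifdef.  */
static int
rop_protect_requested (void)
{
#ifdef __ROP_PROTECT__
  return 1;
#else
  return 0;
#endif
}
```

If the macro is defined while the compiler has silently dropped ROP protection, code guarded like this takes the wrong branch, which is the problem described above.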



>> 2.  We always disable shrink-wrapping when -mrop-protect is used, [...]
> 
> Yes, this is problematic, and seems to be completely unnecessary.  
For the case where we silently ignore -mcpu=power4 -mrop-protect, it is
completely unnecessary.  If we decide to emit an error for this instead,
then like the above, this is just automatically fixed.



> By exactly the same argument we should *also* do ROP-protection in all
> leaf functions, btw!
I'm not 100% convinced we need to "protect" leaf functions, since the return
address of the leaf function never makes it onto the stack to be potentially
corrupted.  Can you explain how a leaf-function could be attacked if we
never save its return address to the stack?



>> 3.  We silently disable ROP protection for everything other than
>> -mcpu=power10.  The binutils assembler accepts the ROP insns back
>> to Power8, so we should emit them for Power8 and later.
> 
> The ISA claims it will work for anything after ISA 2.04, even.
True, but given the binutils assembler doesn't accept hashst and hashchk
for anything before Power8, it seemed convenient to match that behavior.
If we enable it for ISA 2.04 and later, then we have to fix
binutils to do the same (which we can do), but we still run the risk of
some compiles failing because the user is using an older unfixed assembler.


>> 4.  We give an error when -mrop-protect is used with any -mabi=ABI
>> value not equal to ELFv2, [...]
> 
> Yes, we should make it work everywhere.  Even on -m32.  But it requires
> adjusting the ABI as well!
That's a nice goal, but I'd like to fix the present issues before tackling
expanding its use to other ABIs.


So the first question to ask is, do we want to silently disable (maybe with
a warning) emitting ROP instructions if used with -mcpu=CPU or -mabi=ABI that
we don't want or can't emit them for?  ...or do we want to produce an error?
The answer to this question will help guide us on how to fix the other
issues or whether we even have to do anything for them.

[Bug target/114759] Power: multiple issues with -mrop-protect

2024-04-17 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114759

Peter Bergner  changed:

            What|Removed                        |Added

              CC|                               |dje at gcc dot gnu.org,
                |                               |linkw at gcc dot gnu.org,
                |                               |segher at gcc dot gnu.org
        Assignee|unassigned at gcc dot gnu.org  |bergner at gcc dot gnu.org
          Target|                               |powerpc64le-linux
Last reconfirmed|                               |2024-04-17
          Status|UNCONFIRMED                    |ASSIGNED
  Ever confirmed|0                              |1

--- Comment #1 from Peter Bergner  ---
Confirmed.

[Bug target/114759] New: Power: multiple issues with -mrop-protect

2024-04-17 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114759

Bug ID: 114759
   Summary: Power: multiple issues with -mrop-protect
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: bergner at gcc dot gnu.org
  Target Milestone: ---

There are multiple issues with the -mrop-protect option which are all
inter-related.

1. We always define the __ROP_PROTECT__ predefined macro when using
-mrop-protect, even when we've silently disabled ROP protection because of a
too old -mcpu=CPU value.  We should only emit __ROP_PROTECT__ when it's legal
to emit the ROP insns.

2. We always disable shrink-wrapping when -mrop-protect is used, even when
we've silently disabled ROP protection because of a too old -mcpu=CPU value. 
We should not disable shrink-wrapping if we've disabled ROP protection.

3. We silently disable ROP protection for everything other than -mcpu=power10. 
The binutils assembler accepts the ROP insns back to Power8, so we should emit
them for Power8 and later.

4. We give an error when -mrop-protect is used with any -mabi=ABI value not
equal to ELFv2, whereas a too old -mcpu=CPU value only causes us to silently
disable ROP protection.  I think both scenarios should behave similarly, so
either we silently disable ROP protection for both or we give an error for
both.

This is not a regression.  I consider 1. to be a correctness/wrong code bug.
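
To make the four points concrete, here is a small standalone model of the behaviour being asked for.  It is a sketch under stated assumptions, not rs6000.cc code: the enums, struct rop_state, resolve_rop() and the integer power_level are all hypothetical, and the Power8/ELFv2 cut-off is the one proposed in item 3 and item 4.

```c
#include <stdbool.h>

enum abi { ABI_V1, ABI_V2 };                  /* hypothetical, not GCC's enum */
enum policy { POLICY_ERROR, POLICY_SILENT };  /* the two options in item 4 */

struct rop_state
{
  bool rop_protect;    /* emit hashst/hashchk */
  bool define_macro;   /* predefine __ROP_PROTECT__ */
  bool shrink_wrap;    /* shrink-wrapping left enabled */
  bool error;          /* diagnose the option combination */
};

/* Resolve -mrop-protect against the CPU level and ABI.  The macro and
   the shrink-wrap change follow the final rop_protect decision, and an
   unsupported CPU is treated the same way as an unsupported ABI.  */
static struct rop_state
resolve_rop (bool requested, int power_level, enum abi abi, enum policy policy)
{
  struct rop_state s = { false, false, true, false };

  if (!requested)
    return s;

  if (power_level < 8 || abi != ABI_V2)
    {
      s.error = (policy == POLICY_ERROR);
      return s;
    }

  s.rop_protect = true;
  s.define_macro = true;
  s.shrink_wrap = false;
  return s;
}
```

Whichever policy is picked, the macro and the shrink-wrap setting can no longer disagree with whether the ROP insns are emitted.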

[Bug rtl-optimization/85099] [meta-bug] selective scheduling issues

2024-04-17 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85099
Bug 85099 depends on bug 69031, which changed state.

Bug 69031 Summary: ICE: in hash_rtx_cb, at cse.c:2533 with -fPIC 
-fselective-scheduling and __builtin_setjmp()
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69031

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |WORKSFORME

[Bug target/69031] ICE: in hash_rtx_cb, at cse.c:2533 with -fPIC -fselective-scheduling and __builtin_setjmp()

2024-04-17 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69031

Peter Bergner  changed:

   What|Removed |Added

 CC||bergner at gcc dot gnu.org
 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |WORKSFORME

--- Comment #3 from Peter Bergner  ---
Maybe already fixed?  Marking as resolved for now and we can reopen if someone
can actually recreate the ICE.  I could not.

[Bug rtl-optimization/96865] ICE in hash_rtx_cb, at cse.c:2548

2024-04-17 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96865

Peter Bergner  changed:

   What|Removed |Added

  Known to fail||12.0, 13.0, 14.0

--- Comment #2 from Peter Bergner  ---
Fails on trunk and basically all earlier versions.

[Bug rtl-optimization/96865] ICE in hash_rtx_cb, at cse.c:2548

2024-04-17 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96865

Peter Bergner  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1

[Bug testsuite/114518] [15 regression] gcc.target/powerpc/combine-2-2.c fails after r14-9692-g839bc42772ba7a

2024-04-15 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114518

Peter Bergner  changed:

            What|Removed                        |Added

        Assignee|unassigned at gcc dot gnu.org  |segher at gcc dot gnu.org
          Status|NEW                            |ASSIGNED

[Bug target/101865] _ARCH_PWR8 is not defined when using -mcpu=power8

2024-04-12 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101865

--- Comment #21 from Peter Bergner  ---
Fixed on trunk.  I'll let it burn-in there for a bit before backporting to the
release branches.

[Bug target/101865] _ARCH_PWR8 is not defined when using -mcpu=power8

2024-04-11 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101865

Peter Bergner  changed:

            What|Removed                                                               |Added

             URL|https://gcc.gnu.org/pipermail/gcc-patches/2022-September/601825.html  |https://gcc.gnu.org/pipermail/gcc-patches/2024-April/649329.html

--- Comment #19 from Peter Bergner  ---
New patch submitted as an update to Will's patch.

[Bug ipa/114698] [12/13/14 regression] dcfldd produces wrong sha256 sum on ppc64le and -O3 since r12-5244-g64f3e71c302b4a

2024-04-11 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114698

Peter Bergner  changed:

   What|Removed |Added

 Status|WAITING |RESOLVED
 Resolution|--- |INVALID

--- Comment #9 from Peter Bergner  ---
(In reply to Andrew Pinski from comment #6)
> Note this implementation of sha2.c is shared all over the place it seems and
> has this known issue ...

Confirmed that the patch fixes the error.  It's too bad the "fix" hasn't been
as widely shared. :-(

Closing this as INVALID.

[Bug ipa/114698] [12/13/14 regression] dcfldd produces wrong sha256 sum on ppc64le and -O3 since r12-5244-g64f3e71c302b4a

2024-04-11 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114698

--- Comment #8 from Peter Bergner  ---
(In reply to Andrew Pinski from comment #6)
> Note this implementation of sha2.c is shared all over the place it seems and
> has this known issue ...

(In reply to Andrew Pinski from comment #4)
> (In reply to Andrew Pinski from comment #3)
> > uint8_t buffer[SHA256_BLOCK_LENGTH];
> > 
> > W256 = (sha2_word32*)context->buffer;
> > 
> > This is starting to smell like the code is violating strict aliasing rules
> > ...
> 
> The patch in  https://github.com/NetBSD/pkgsrc/issues/122  applies directly
> here too.

Thanks for the pointer, I'll try the patch and report back.  Jan's commit does
seem to make a change in the alias handling, so it very well could have exposed
that type of problem in the sha2 routine.
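
For anyone hitting the same pattern elsewhere, one common way to avoid the aliasing problem is to go through memcpy instead of casting the byte buffer.  The sketch below is generic and is not the patch from the pkgsrc issue; BLOCK_LENGTH, load_word32() and store_word32() are hypothetical names.

```c
#include <stdint.h>
#include <string.h>

#define BLOCK_LENGTH 64   /* hypothetical stand-in for SHA256_BLOCK_LENGTH */

/* Reads the INDEX-th 32-bit word from a byte buffer without casting
   the buffer to uint32_t *, so the uint8_t array is never accessed
   through an incompatible lvalue type.  */
static uint32_t
load_word32 (const uint8_t *buffer, size_t index)
{
  uint32_t word;
  memcpy (&word, buffer + index * sizeof word, sizeof word);
  return word;
}

/* Writes WORD as the INDEX-th 32-bit word of the byte buffer.  */
static void
store_word32 (uint8_t *buffer, size_t index, uint32_t word)
{
  memcpy (buffer + index * sizeof word, &word, sizeof word);
}
```

Compilers generally turn these fixed-size memcpy calls into plain loads and stores, so the cost should be small, though I have not measured it for this code.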

[Bug ipa/114698] dcfldd produces wrong sha256 sum on ppc64le and -O3

2024-04-11 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114698

Peter Bergner  changed:

   What|Removed |Added

  Component|target  |ipa
 CC||hubicka at gcc dot gnu.org

--- Comment #2 from Peter Bergner  ---
I've confirmed that the function being miscompiled is
src/sha2.c:SHA256_Transform() on line 440.  I can configure dcfldd with a
normal -O2 and add a __attribute__((optimize (3))) to this function and I see
bad output.  I can also configure dcfldd with -O3 and add a
__attribute__((optimize (2))) to this function and I see good output.
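
For anyone wanting to repeat the experiment, the per-function override looks like this.  The function below is a placeholder written for this example, not code from dcfldd; only the attribute syntax is the point.

```c
#include <stdint.h>

/* Per-function optimization level override, a GCC extension.  The body
   is a placeholder rotate, unrelated to the real SHA256_Transform().  */
__attribute__ ((optimize (3)))
static uint32_t
rotate_right (uint32_t x, unsigned n)
{
  n &= 31;
  return n ? (x >> n) | (x << (32 - n)) : x;
}
```

Swapping the 3 for a 2 gives the reverse experiment, where the rest of the file is built with -O3.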


A git bisect identified the following GCC commit as causing the bug:

64f3e71c302b4a13e61656ee509e7050b9bce978 is the first bad commit
commit 64f3e71c302b4a13e61656ee509e7050b9bce978
Author: Jan Hubicka 
Date:   Sun Nov 14 18:49:15 2021 +0100

Extend modref to track kills

This patch adds kill tracking to ipa-modref.  This is representd by array
of accesses to memory locations that are known to be overwritten by the
function.

gcc/ChangeLog:

2021-11-14  Jan Hubicka  

* ipa-modref-tree.c (modref_access_node::update_for_kills): New
member function.
(modref_access_node::merge_for_kills): Likewise.
(modref_access_node::insert_kill): Likewise.
* ipa-modref-tree.h (modref_access_node::update_for_kills,
modref_access_node::merge_for_kills,
modref_access_node::insert_kill):
Declare.
(modref_access_node::useful_for_kill): New member function.
* ipa-modref.c (modref_summary::useful_p): Release useless kills.
(lto_modref_summary): Add kills.
(modref_summary::dump): Dump kills.
(record_access): Add mdoref_access_node parameter.
(record_access_lto): Likewise.
(merge_call_side_effects): Merge kills.
(analyze_call): Add ALWAYS_EXECUTED param and pass it around.
(struct summary_ptrs): Add always_executed filed.
(analyze_load): Update.
(analyze_store): Update; record kills.
(analyze_stmt): Add always_executed; record kills in clobbers.
(analyze_function): Track always_executed.
(modref_summaries::duplicate): Duplicate kills.
(update_signature): Release kills.
* ipa-modref.h (struct modref_summary): Add kills.
* tree-ssa-alias.c (alias_stats): Add kill stats.
(dump_alias_stats): Dump kill stats.
(store_kills_ref_p): Break out from ...
(stmt_kills_ref_p): Use it; handle modref info based kills.

gcc/testsuite/ChangeLog:

2021-11-14  Jan Hubicka  

* gcc.dg/tree-ssa/modref-dse-3.c: New test.

 gcc/ipa-modref-tree.c                        | 179 +++
 gcc/ipa-modref-tree.h                        |  15 ++
 gcc/ipa-modref.c                             | 126 +---
 gcc/ipa-modref.h                             |   1 +
 gcc/testsuite/gcc.dg/tree-ssa/modref-dse-3.c |  22 +++
 gcc/tree-ssa-alias.c                         | 207 +++
 6 files changed, 471 insertions(+), 79 deletions(-)

[Bug target/114698] dcfldd produces wrong sha256 sum on ppc64le and -O3

2024-04-11 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114698

Peter Bergner  changed:

   What|Removed |Added

  Known to work||11.0
 CC||dje at gcc dot gnu.org,
   ||linkw at gcc dot gnu.org,
   ||segher at gcc dot gnu.org
 Target||powerpc64le-linux
   Last reconfirmed||2024-04-11
  Known to fail||12.0, 13.0, 14.0
 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW
   Keywords||wrong-code

--- Comment #1 from Peter Bergner  ---
Confirmed.

[Bug target/114698] New: dcfldd produces wrong sha256 sum on ppc64le and -O3

2024-04-11 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114698

Bug ID: 114698
   Summary: dcfldd produces wrong sha256 sum on ppc64le and -O3
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: bergner at gcc dot gnu.org
  Target Milestone: ---

Building the dcfldd v1.9.1 package on powerpc64le-linux when configured to use
-O3 produces an incorrect sha256 sum for GCC trunk, 13 and 12.  GCC 11 and
earlier produces correct output.  For example (-O3 trunk build):

bergner@ltcden2-lp1:dcfldd [v1.9.1]$ echo TestInput | ./src/dcfldd hash=sha256
TestInput

Total (sha256):
d627605bdee37e388a5c232dc407cb5cd287d27187d6787999ad3bb59d383e9a

0+1 records in
0+1 records out

...versus expected output from an -O2 trunk build:

bergner@ltcden2-lp1:dcfldd [v1.9.1]$ echo TestInput | ./src/dcfldd hash=sha256
TestInput

Total (sha256):
8021973df8498a650e444fd84c705d9168639a246bc6024066e4091b2b450da6

0+1 records in
0+1 records out

...and from sha256sum:

bergner@ltcden2-lp1:dcfldd-git [v1.9.1]$ echo TestInput | /usr/bin/sha256sum 
8021973df8498a650e444fd84c705d9168639a246bc6024066e4091b2b450da6  -


Current steps to recreate:

git clone https://github.com/resurrecting-open-source-projects/dcfldd.git
cd dcfldd/
git checkout v1.9.1 -b v1.9.1
./autogen.sh
./configure CFLAGS="-O3"
make
echo TestInput | ./src/dcfldd hash=sha256

[Bug rtl-optimization/114664] -fno-omit-frame-pointer causes an ICE during the build of the greenlet package

2024-04-10 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114664

Peter Bergner  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |INVALID

--- Comment #15 from Peter Bergner  ---
(In reply to Richard Sandiford from comment #14)
> Yeah, I think so.

Ok, then marking as INVALID and greenlet will need to come up with some other
solution than the one they're using.

[Bug rtl-optimization/114664] -fno-omit-frame-pointer causes an ICE during the build of the greenlet package

2024-04-10 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114664

--- Comment #13 from Peter Bergner  ---
So I think the conclusion is we should close this as INVALID, correct?

[Bug rtl-optimization/114664] -fno-omit-frame-pointer causes an ICE during the build of the greenlet package

2024-04-10 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114664

--- Comment #11 from Peter Bergner  ---
(In reply to Richard Sandiford from comment #10)
> Yeah, I agree it's an error.  The PR says “ICE”, but is there an internal
> error?  The “cannot be used in ‘asm’ here” is a normal user-facing error,
> albeit with bad error recovery, leading us to report the same thing multiple
> times.

My bad for calling it an ICE.  Clearly it's not an ICE but a normal error as
you say.



> > but how are users supposed to know whether
> > -fno-omit-frame-pointer is in effect or not?  I've looked and there is no
> > pre-defined macro a user could check.
> That might be a useful thing to have, but if the programmer has no control
> over the build flags (i.e. cannot require/force -fomit-frame-pointer) then I
> think the asm has to take care to save and restore the frame pointer itself.
> 
> Dropping "31" from the asm means that the asm must preserve the register. 
> Things will go badly if the asm doesn't do that.

So r31 which we use as our frame-pointer reg is a non-volatile/callee saved
register, so it must be saved, but I guess they (greenlet) cannot use the
method of mentioning it in the asm clobber list to perform that.

[Bug rtl-optimization/114664] -fno-omit-frame-pointer causes an ICE during the build of the greenlet package

2024-04-10 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114664

--- Comment #9 from Peter Bergner  ---
(In reply to Kewen Lin from comment #8)
> I noticed even without -fno-omit-frame-pointer, the test case still fails
> with the same symptom (with error msg rather than ICE), did I miss something?

With no option, we default to -fomit-frame-pointer and that option does not
guarantee we actually will omit the frame pointer.

[Bug rtl-optimization/114664] -fno-omit-frame-pointer causes an ICE during the build of the greenlet package

2024-04-09 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114664

--- Comment #7 from Peter Bergner  ---
(In reply to Andrew Pinski from comment #6)
> Pre-IRA fix was done to specifically reject this:
> https://inbox.sourceware.org/gcc-patches/
> ab3a61990702021658w4dc049cap53de8010a7d86...@mail.gmail.com/

Then that would seem to indicate that mentioning the frame pointer reg in the
asm clobber list is an error, but how are users supposed to know whether
-fno-omit-frame-pointer is in effect or not?  I've looked and there is no
pre-defined macro a user could check.

[Bug rtl-optimization/114664] -fno-omit-frame-pointer causes an ICE during the build of the greenlet package

2024-04-09 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114664

--- Comment #4 from Peter Bergner  ---
(In reply to Andrew Pinski from comment #3)
> Well I am going to say this about the code in that repo, the inline-asm in
> slp_switch looks very broken anyways.

100% agree, but broken for other reasons.  I think it's still TBD whether the
minimal test case here is supposed to work or not.

[Bug rtl-optimization/114664] -fno-omit-frame-pointer causes an ICE during the build of the greenlet package

2024-04-09 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114664

Peter Bergner  changed:

   What|Removed |Added

 CC||doko at gcc dot gnu.org,
   ||law at gcc dot gnu.org,
   ||linkw at gcc dot gnu.org,
   ||meissner at gcc dot gnu.org,
   ||rsandifo at gcc dot gnu.org,
   ||segher at gcc dot gnu.org,
   ||vmakarov at gcc dot gnu.org
   Last reconfirmed||2024-04-09
 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1

[Bug rtl-optimization/114664] -fno-omit-frame-pointer causes an ICE during the build of the greenlet package

2024-04-09 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114664

--- Comment #2 from Peter Bergner  ---
CC'ing some architecture and RA experts for their input.

I believe the riscv64 test showing the same issue would be:

void
bug (void)
{
  __asm__ volatile ("" : : : "s0");
}

...but I don't have a cross compiler right now to verify.

Interestingly, I tried what I thought would be the aarch64 test case
(clobbering x29), but it did not ICE.  Did I use the wrong hard frame pointer
register or is aarch64 doing something different here?

[Bug rtl-optimization/114664] New: -fno-omit-frame-pointer causes an ICE during the build of the greenlet package

2024-04-09 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114664

Bug ID: 114664
   Summary: -fno-omit-frame-pointer causes an ICE during the build
of the greenlet package
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: bergner at gcc dot gnu.org
  Target Milestone: ---

Current builds of the greenlet package on one specific distro are seeing an
ICE on multiple architectures (ppc64le & riscv64) when being built with
-fno-omit-frame-pointer.  The upstream github issue is here:

  https://github.com/python-greenlet/greenlet/issues/395

A minimized test case on Power is:

bergner@ltcden2-lp1:$ cat bug.c 
void
bug (void)
{
  __asm__ volatile ("" : : : "r31");
}
bergner@ltcden2-lp1:$ /opt/gcc-nightly/trunk/bin/gcc -S -fno-omit-frame-pointer bug.c
bug.c: In function ‘bug’:
bug.c:5:1: error: 31 cannot be used in ‘asm’ here
5 | }
  | ^
bug.c:5:1: error: 31 cannot be used in ‘asm’ here

This is not a regression, as all gcc's I have easy access to (back to gcc v8)
ICE the same way.

The code that is ICEing here is in ira.c:ira_setup_eliminable_regset():

  /* Build the regset of all eliminable registers and show we can't
     use those that we already know won't be eliminated.  */
  for (i = 0; i < (int) ARRAY_SIZE (eliminables); i++)
    {
      bool cannot_elim
        = (! targetm.can_eliminate (eliminables[i].from, eliminables[i].to)
           || (eliminables[i].to == STACK_POINTER_REGNUM
               && frame_pointer_needed));

      if (!TEST_HARD_REG_BIT (crtl->asm_clobbers, eliminables[i].from))
        {
          SET_HARD_REG_BIT (eliminable_regset, eliminables[i].from);

          if (cannot_elim)
            SET_HARD_REG_BIT (ira_no_alloc_regs, eliminables[i].from);
        }
      else if (cannot_elim)
        error ("%s cannot be used in %<asm%> here",
               reg_names[eliminables[i].from]);
      else
        df_set_regs_ever_live (eliminables[i].from, true);
    }

On Power, targetm.can_eliminate(r31,r1) returns true (ie, the port will allow
us to eliminate r31 into r1) even in the face of -fno-omit-frame-pointer, but
it's the RA specific test (eliminables[i].to == STACK_POINTER_REGNUM &&
frame_pointer_needed) that is catching us here.

The question I have is, is it legal to mention the hard frame pointer register
in an asm clobber list when using -fno-omit-frame-pointer?  Ie, is this user
error or should the compiler be able to handle this?
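
To make the decision in that loop easier to follow, here is a minimal standalone sketch of the same branching.  It is not GCC code: the struct, the enum and the function name classify are hypothetical stand-ins, and each boolean field replaces the GCC query named in its comment.

```c
#include <stdbool.h>

/* Hypothetical model of one iteration of the loop quoted above.
   None of these names exist in GCC; they only mirror its tests.  */
enum elim_outcome
{
  ELIM_ELIMINABLE,   /* added to eliminable_regset only */
  ELIM_NO_ALLOC,     /* also added to ira_no_alloc_regs */
  ELIM_ERROR,        /* "cannot be used in asm here" */
  ELIM_MARK_LIVE     /* df_set_regs_ever_live (from, true) */
};

struct elim_query
{
  bool target_can_eliminate; /* targetm.can_eliminate (from, to) */
  bool to_is_stack_pointer;  /* eliminables[i].to == STACK_POINTER_REGNUM */
  bool frame_pointer_needed; /* frame_pointer_needed */
  bool in_asm_clobbers;      /* TEST_HARD_REG_BIT (crtl->asm_clobbers, from) */
};

static enum elim_outcome
classify (struct elim_query q)
{
  bool cannot_elim = (!q.target_can_eliminate
                      || (q.to_is_stack_pointer && q.frame_pointer_needed));

  if (!q.in_asm_clobbers)
    return cannot_elim ? ELIM_NO_ALLOC : ELIM_ELIMINABLE;
  if (cannot_elim)
    return ELIM_ERROR;
  return ELIM_MARK_LIVE;
}
```

In this model the Power test case corresponds to all four fields being true (the port allows the elimination, but the frame pointer is needed and r31 is in the clobber list), which lands in the error branch.  Clearing only frame_pointer_needed moves the same clobber to the "mark live" branch.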

[Bug target/112868] GCC passes -many to the assembler for --enable-checking=release builds

2024-04-08 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112868

--- Comment #11 from Peter Bergner  ---
(In reply to Sam James from comment #10)
> No problems reported yet and we have several people testing on ppc w/ gcc 14.

Thanks for the testing!  This is clearly a stage1 patch, so we'll wait until
then before submitting it.

[Bug target/101865] _ARCH_PWR8 is not defined when using -mcpu=power8

2024-04-03 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101865

Peter Bergner  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|willschm at gcc dot gnu.org|bergner at gcc dot 
gnu.org
URL||https://gcc.gnu.org/piperma
   ||il/gcc-patches/2022-Septemb
   ||er/601825.html

--- Comment #17 from Peter Bergner  ---
I'm working on updating the patch Will submitted to take into consideration the
patch reviews plus trunk changes since it was submitted.  Mine now.

[Bug target/113652] [14 regression] Failed bootstrap on ppc unrecognized opcode: `lfiwzx' with -mcpu=7450

2024-03-29 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113652

Peter Bergner  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED

--- Comment #22 from Peter Bergner  ---
I kicked off a 7450 build on one of our BE systems and I can confirm we hit this
error.  I also saw us generating the 2 operand form of the mfcr instruction
which also leads to an assembler error because the 7450 doesn't support that
either.

Where we are in the build is compiling libgcc and the routines for handling
KFmode values.  The build machinery is explicitly adding -mvsx -mfloat128 to
the compiler options when building those source files and that seems bogus to
me, since the 7450 does not have VSX hardware.  It's the explicit addition of
the -mvsx option to the command line that is causing the lfiwzx and 2 operand
mfcr instructions to be generated, not some internal mishandling of the
-mcpu=7450 option.

My $0.02 worth is we should be generating an error when trying to use -mvsx with
-mcpu=7450 or any other cpu that doesn't have VSX hardware. I also don't think
we should be building these KFmode files which require VSX when the underlying
cpu we're targeting doesn't support it.

[Bug target/113950] PowerPC, ICE with -O1 or higher compiling __builtin_vsx_splat_2di test case

2024-03-15 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113950

--- Comment #4 from Peter Bergner  ---
The bogus vsx_splat_ code goes all the way back to GCC 8, so we need
backports to the open release branches (GCC 13, 12, 11).

[Bug target/97367] powerpc64 g5 and cell optimizations result in .machine power7

2024-03-08 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97367

Peter Bergner  changed:

   What|Removed |Added

 CC||linkw at gcc dot gnu.org

--- Comment #7 from Peter Bergner  ---
(In reply to Sam James from comment #6)
> Please send it to the ML with git-send-email.

...and CC our port maintainers, Segher, Ke Wen and David who are all on CC
here.

[Bug target/54284] -mabi=ieeelongdouble problems

2024-03-04 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54284

Peter Bergner  changed:

   What|Removed |Added

 CC|bergner at vnet dot ibm.com,   |bergner at gcc dot 
gnu.org,
   |dje.gcc at gmail dot com   |dje at gcc dot gnu.org
 Resolution|--- |FIXED
 Status|UNCONFIRMED |RESOLVED

--- Comment #1 from Peter Bergner  ---
I'm pretty sure this was fixed long ago, so I'm going to close this as
FIXED.

[Bug target/50329] [PowerPC] Unnecessary stack frame set up

2024-03-04 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=50329

Peter Bergner  changed:

   What|Removed |Added

 CC||bergner at gcc dot gnu.org
 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #3 from Peter Bergner  ---
(In reply to Segher Boessenkool from comment #2)
> Current trunk (to be GCC 6) optimises "c" perfectly.  Not the other
> two, alas.

Current trunk (to be GCC 14) optimizes all of them now.  Marking as FIXED.

a:
li 9,-1
rldicr 9,9,0,0
std 9,0(3)
blr
b:
li 9,-1
rldicr 9,9,0,0
std 9,0(3)
blr
c:
li 9,0
li 10,-1
rldimi 9,10,63,0
std 9,0(3)
blr

[Bug target/36557] -m32 -mpowerpc64 produces better code than -m64 for a!=0

2024-03-04 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=36557

Peter Bergner  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 CC||bergner at gcc dot gnu.org
 Resolution|--- |FIXED

--- Comment #5 from Peter Bergner  ---
(In reply to Segher Boessenkool from comment #4)
> We now do
> 
> cntlzw 3,3
> srwi 3,3,5
> xori 3,3,0x1
> blr
> 
> which is still not optimal (and not what -m32 / -m32 -mpowerpc64 do).

My GCC 10 and later compiles show we now generate:

addic 9,3,-1
subfe 3,9,3
blr

Marking as FIXED.
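
For readers unfamiliar with the carry trick in that two-instruction sequence, here is a small C model of it.  This is only a sketch: it assumes 64-bit registers, the function name nonzero_via_carry is made up for illustration, and the CA bit is modelled as an ordinary variable.

```c
#include <stdint.h>

/* Hypothetical model of:  addic 9,3,-1 ; subfe 3,9,3
   a is the incoming value in r3; the return value is the final r3.  */
static uint64_t
nonzero_via_carry (uint64_t a)
{
  /* addic 9,3,-1: t = a + 0xffff...ffff, and CA is the carry out of
     that addition, which is 1 exactly when a is nonzero.  */
  uint64_t t = a + UINT64_MAX;
  uint64_t ca = (t < a) ? 1 : 0;

  /* subfe 3,9,3: r3 = ~t + a + CA.  Since ~t == -a (mod 2^64),
     ~t + a cancels to 0 and only CA is left.  */
  return ~t + a + ca;
}
```

So the result is 1 for any nonzero input and 0 for zero, which is a != 0 without a compare or a count-leading-zeros.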

[Bug target/33236] -mminimal-toc register should be psedu-register

2024-03-04 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=33236

Peter Bergner  changed:

   What|Removed |Added

 Resolution|--- |WONTFIX
 Status|NEW |RESOLVED
 CC||bergner at gcc dot gnu.org

--- Comment #5 from Peter Bergner  ---
(In reply to Segher Boessenkool from comment #4)
> Still happens.

I'm marking this as WONTFIX since -mminimal-toc is basically never used now
that we have -mcmodel=medium (which is the default), and -mcmodel=medium
results in ideal code for this testcase.

[Bug target/31557] return 0x80000000UL code gen can be improved

2024-03-04 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=31557

Peter Bergner  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 CC||bergner at gcc dot gnu.org
 Status|REOPENED|RESOLVED

--- Comment #7 from Peter Bergner  ---
(In reply to Segher Boessenkool from comment #6)
> Actually, huh, *not* fixed on trunk yet.

This was fixed in GCC 13.  Marking it as FIXED.

[Bug target/101893] There is no vgbbd on p7

2024-03-03 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101893

Peter Bergner  changed:

   What|Removed |Added

 CC||bergner at gcc dot gnu.org

--- Comment #3 from Peter Bergner  ---
So this looks fixed and we can mark it RESOLVED / FIXED?

[Bug target/105522] [powerpc-darwin] ICE: in decode_addr_const, at varasm.c:3059

2024-03-03 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105522

Peter Bergner  changed:

   What|Removed |Added

 CC||bergner at gcc dot gnu.org

--- Comment #12 from Peter Bergner  ---
(In reply to Sergey Fedorov from comment #11)
> (In reply to GCC Commits from comment #10)
> > The master branch has been updated by Iain D Sandoe :
> 
> Iain, thank you very much for addressing this!

If this is fixed for you, can you please move this to RESOLVED / FIXED?

[Bug target/100799] Stackoverflow in optimized code on PPC

2024-03-01 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100799

Peter Bergner  changed:

   What|Removed |Added

 CC||aagarwa at gcc dot gnu.org

--- Comment #32 from Peter Bergner  ---
(In reply to Peter Bergner from comment #31)
> Ok, I think that gives us some idea what needs to be done.  I'll look for
> someone in the team to have a look at implementing this workaround.  Thanks.

Ajit has agreed to try and implement the workaround.

[Bug target/112868] GCC passes -many to the assembler for --enable-checking=release builds

2024-02-27 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112868

Peter Bergner  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |jeevitha at gcc dot 
gnu.org

--- Comment #6 from Peter Bergner  ---
(In reply to Sam James from comment #4)
> I was quite surprised by this behaviour. It should really be documented if
> we're going to stick with it, but I don't think we should at all..

I have asked Jeevitha to prepare a patch to remove the -many assembler option
usage on --enable-checking=release builds.

It would be nice if we can get some distro help to do some practice distro
builds using the patch to verify whether it causes any fallout on distro
builds to help decide whether we should push the patch or leave things as they
are.

[Bug target/100799] Stackoverflow in optimized code on PPC

2024-02-27 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100799

--- Comment #31 from Peter Bergner  ---
(In reply to Jakub Jelinek from comment #30)
> Either tree parmdef = ssa_default_def (cfun, parm) is NULL, or has_zero_uses
> (parmdef).
> Not sure if has_zero_uses will work properly after some bbs are converted
> from GIMPLE to RTL, but maybe it will, I think the expansion generally
> doesn't gsi_remove statements it expands nor calls update_stmt on them.  One
> could always also just compute in generic code at the start of expansion the
> number of unused DECL_HIDDEN_STRING_LENGTH PARM_DECLs at the end of the
> argument list, save that as a flag in struct function or where and let the
> backends use it from there.

Ok, I think that gives us some idea what needs to be done.  I'll look for
someone in the team to have a look at implementing this workaround.  Thanks.

[Bug sanitizer/113284] [14 regression] many failures in asan after r14-6946-ge66dc37b299cac

2024-02-26 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113284

--- Comment #9 from Peter Bergner  ---
(In reply to GCC Commits from comment #8)
> The master branch has been updated by Ilya Leoshkevich :

Bill, can you double check our testsuite results and close this if it's now
fixed?

[Bug sanitizer/113728] libasan uses incorrect prctl prototype

2024-02-26 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113728

--- Comment #3 from Peter Bergner  ---
(In reply to Florian Weimer from comment #2)
> This has been worked around in glibc. Should we close this issue?

As the bug reporter and given glibc now has a workaround, I think you're fine
to close this if you think there's nothing to be done in GCC.

[Bug target/100799] Stackoverflow in optimized code on PPC

2024-02-26 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100799

--- Comment #29 from Peter Bergner  ---
(In reply to Jakub Jelinek from comment #28)
> Yes, so it is the backend that told function.cc that there is a parameter
> save area and it should be adding REG_EQUIV notes.  So, the idea would be
> that for the case we talk about (<= 8 normal arguments, then only unused
> DECL_HIDDEN_STRING_LENGTH ones) that the backend would also say that there
> is no parameter save area, basically pretend there are <= 8 arguments.

How can we know there are no uses of the hidden arg(s)?  That backend function
is being called at expand time, so we haven't yet run any RTL dataflow
information to tell us.  Is there some tree attribute for the arg that can tell
us whether it's used or not?  ...or is there some SSA data for that arg that
can show it has no use?  ...and if so, would that still work for -O0 compiles?

[Bug target/100799] Stackoverflow in optimized code on PPC

2024-02-24 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100799

--- Comment #27 from Peter Bergner  ---
(In reply to Jakub Jelinek from comment #26)
> But I still think the workaround is possible on the callee side.
> Sure, if the DECL_HIDDEN_STRING_LENGTH argument(s) is(are) used in the
> function, then there is no easy way but expect the parameter save area (ok,
> sure, it could just load from the assumed parameter location and don't
> assume the rest is there, nor allow storing to the slots it loaded them
> from).
> But that is actually not what BLAS etc. suffers from.
[snip]
> So, the workaround could be for the case of unused DECL_HIDDEN_STRING_LENGTH
> arguments at the end of PARM_DECLs don't try to load those at all and don't
> assume there is parameter save area unless the non-DECL_HIDDEN_STRING_LENGTH
> or used DECL_HIDDEN_STRING_LENGTH arguments actually require it.

So I looked closer at what the failure mode was in this PR (versus the one
you're seeing with flexiblas).  As in your case, there is a mismatch in the
number of parameters the C caller thinks there are (8 args, so no param save
area needed) versus what the Fortran callee thinks there are (9 params which
include the one hidden arg, so there is a param save area).  The Fortran
function doesn't actually access the hidden argument in our test case above, in
fact the character argument is never used either.  What I see in the rtl dumps
is that *all* incoming args have a REG_EQUIV generated that points to the param
save area (this doesn't happen when there are 8 or fewer formal params), even
for the first 8 args that are passed in registers:

(insn 2 12 3 2 (set (reg/v/f:DI 117 [ r3 ])
(reg:DI 3 3 [ r3 ])) "callee-3.c":6:1 685 {*movdi_internal64}
 (expr_list:REG_EQUIV (mem/f/c:DI (plus:DI (reg/f:DI 99 ap)
(const_int 32 [0x20])) [1 r3+0 S8 A64])
(nil)))
(insn 3 2 4 2 (set (reg/v:DI 118 [ r4 ])
(reg:DI 4 4 [ r4 ])) "callee-3.c":6:1 685 {*movdi_internal64}
 (expr_list:REG_EQUIV (mem/c:DI (plus:DI (reg/f:DI 99 ap)
(const_int 40 [0x28])) [2 r4+0 S8 A64])
(nil)))
...

We then get to RA and we end up spilling one of the pseudos associated with one
of the other parameters (not the character param JOB).  LRA then uses that
REG_EQUIV note and rather than allocating a new stack slot to spill to, it uses
the parameter save memory location for that parameter for the spill slot.  When
we store to that memory location and the C caller has not allocated the param
save area, we end up clobbering an important part of the C caller's stack,
causing a crash.

If we were to try and do a callee workaround, we would need to disable setting
those REG_EQUIV notes for the parameters... if that's even possible.  Since
Fortran uses call-by-name parameter passing, isn't the updated param value from
the callee returned in the parameter save area itself???


> Doing the workaround on the caller side is impossible, this is for calls
> from C/C++ to Fortran code, directly or indirectly called and there is
> nothing the compiler could use to guess that it actually calls Fortran code
> with hidden Fortran character arguments.

As a HUGE hammer, every caller could always allocate a param save area.  That
would "fix" the problem from this bug, but would that also fix the bug you're
seeing in flexiblas?

I'm not advocating this though.  I was thinking maybe making callers (under an
option?) conservatively assume the callee is a Fortran function and for those C
arguments that could map to a Fortran parameter with a hidden argument, bump
the number of counted args by 1.  For example, a C function with 2 char/char *
args and 6 int args would think there are 8 normal args and 2 hidden args, so
it needs to allocate a param save area.  Is that not feasible?  ...or does that
not even address the issue you're seeing in your bug?
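
As a rough sketch of the counting I have in mind, the following models only the simplest case.  It assumes every argument is an integer-class doubleword passed in one of 8 GPR slots, and it ignores varargs, floating point and vector arguments.  Both function names are hypothetical, not rs6000 backend functions.

```c
#include <stdbool.h>

/* Assumption for this sketch: 8 doubleword slots are passed in GPRs.  */
enum { GPR_ARG_SLOTS = 8 };

/* What the C caller computes today in this simplified model: a param
   save area is only needed once the visible args overflow the GPRs.  */
static bool
needs_save_area (int visible_args)
{
  return visible_args > GPR_ARG_SLOTS;
}

/* The conservative variant: assume the callee might be Fortran and
   count one possible hidden length arg per char/char * argument.  */
static bool
needs_save_area_conservative (int visible_args, int char_ptr_args)
{
  return visible_args + char_ptr_args > GPR_ARG_SLOTS;
}
```

With the example above (2 char * args plus 6 int args), the first function sees 8 args and allocates nothing, while the conservative one counts 10 and allocates the param save area.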

[Bug target/113950] PowerPC, ICE with -O1 or higher compiling __builtin_vsx_splat_2di test case

2024-02-22 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113950

Peter Bergner  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |jeevitha at gcc dot 
gnu.org
 CC||dje at gcc dot gnu.org,
   ||linkw at gcc dot gnu.org,
   ||meissner at gcc dot gnu.org,
   ||segher at gcc dot gnu.org

--- Comment #2 from Peter Bergner  ---
Jeevitha is looking into this for us.

[Bug target/100799] Stackoverflow in optimized code on PPC

2024-02-22 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100799

Peter Bergner  changed:

   What|Removed |Added

 CC||dje at gcc dot gnu.org,
   ||meissner at gcc dot gnu.org

--- Comment #25 from Peter Bergner  ---
CCing Mike and David for possible comments about the possible workarounds
mentioned in Comment 23 and Comment 24.

[Bug target/100799] Stackoverflow in optimized code on PPC

2024-02-21 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100799

--- Comment #24 from Peter Bergner  ---
(In reply to Jakub Jelinek from comment #23)
> if the PowerPC backend maintainers wanted, there could be a similar workaround
> on the rs6000 backend side, in the decisions whether the callee can use
> the parameter save area or not ignore counting DECL_HIDDEN_STRING_LENGTH
> PARM_DECLs, so if e.g. 9 arguments are passed but one of them is
> DECL_HIDDEN_STRING_LENGTH, assume parameter save area is not there.

If the callee has 9 arguments, even if one is a hidden str len arg, then there
MUST be a parameter save area, since that is where the callee is supposed to
load the 9th argument from.  There is simply no other location that 9th
argument exists at.

I think the only viable rs6000 workaround is for the caller to allocate a
parameter save area in some cases where it doesn't think it needs one.  Ie, the
caller is calling a function which it thinks has 8 parameters and there might
be a hidden one (maybe one param is a string or whatever the Fortran CHARACTER
with len greater than 1 maps to) because the callee might be a Fortran routine.
That would solve the problem of the callee scribbling data into the caller's
frame, but wouldn't solve the issue that the caller didn't actually place a valid
value for the missing hidden parameter.  Thoughts on that?

[Bug target/112103] [14 regression] gcc.target/powerpc/rlwinm-0.c fails after r14-4941-gd1bb9569d70304

2024-02-20 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112103

Peter Bergner  changed:

   What|Removed |Added

   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=114004
 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #9 from Peter Bergner  ---
Fixed.


(In reply to Peter Bergner from comment #5)
> We still want to remove the superfluous instruction, but that should be
> covered in a separate bug.

The fixing of the superfluous insn is being tracked in PR114004.

[Bug target/114004] New: GCC emits a superfluous instruction for simple test case on ppc

2024-02-19 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114004

Bug ID: 114004
   Summary: GCC emits a superfluous instruction for simple test
case on ppc
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: bergner at gcc dot gnu.org
  Target Milestone: ---

We emit a superfluous rldicl insn for the following test case.  The rlwinm is
all that is needed/required.  This is not a regression.

bergner@ltcden2-lp1:PR112103$ cat bug.c 
unsigned int
rot (unsigned int x)
{
 return x & 0xbfff;
}
bergner@ltcden2-lp1:PR112103$ /opt/gcc-nightly/trunk/bin/gcc -S -O2 bug.c 
bergner@ltcden2-lp1:PR112103$ cat bug.s 
.file   "bug.c"
.machine power10
.abiversion 2
.section        ".text"
.align 2
.p2align 4,,15
.globl rot
.type   rot, @function
rot:
.LFB0:
.cfi_startproc
.localentry rot,1
rlwinm 3,3,0,2,0
rldicl 3,3,0,32
blr
.long 0
.byte 0,0,0,0,0,0,0,0
.cfi_endproc
.LFE0:
.size   rot,.-rot
.ident  "GCC: (GNU) 14.0.1 20240219 (experimental) [remotes/origin/HEAD r14-9074-gd70facd54a]"
.section        .note.GNU-stack,"",@progbits

[Bug target/112103] [14 regression] gcc.target/powerpc/rlwinm-0.c fails after r14-4941-gd1bb9569d70304

2024-02-19 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112103

Peter Bergner  changed:

   What|Removed |Added

URL||https://gcc.gnu.org/piperma
   ||il/gcc-patches/2024-Februar
   ||y/646008.html

--- Comment #7 from Peter Bergner  ---
Testing was clean, so submitted.

[Bug target/112103] [14 regression] gcc.target/powerpc/rlwinm-0.c fails after r14-4941-gd1bb9569d70304

2024-02-19 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112103

Peter Bergner  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |bergner at gcc dot 
gnu.org
 Status|NEW |ASSIGNED

--- Comment #6 from Peter Bergner  ---
Testing the obvious patch on both LE and BE to ensure it works everywhere.

[Bug target/112103] [14 regression] gcc.target/powerpc/rlwinm-0.c fails after r14-4941-gd1bb9569d70304

2024-02-19 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112103

--- Comment #5 from Peter Bergner  ---
(In reply to Jakub Jelinek from comment #4)
> So, let's just adjust the testcase then?

We still want to remove the superfluous instruction, but that should be covered
in a separate bug.  So yeah, I think this just needs a testsuite update.

Should we also drop the priority down too?  A P1 seems a little high for a
simple test case update.

[Bug target/113950] PowerPC, ICE with -O1 or higher compiling __builtin_vsx_splat_2di test case

2024-02-15 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113950

Peter Bergner  changed:

   What|Removed |Added

   Last reconfirmed||2024-02-16
 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW

--- Comment #1 from Peter Bergner  ---
Confirmed.

[Bug target/113652] [14 regression] Failed bootstrap on ppc unrecognized opcode: `lfiwzx' with -mcpu=7450

2024-02-08 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113652

Peter Bergner  changed:

   What|Removed |Added

 CC||meissner at gcc dot gnu.org

--- Comment #15 from Peter Bergner  ---
(In reply to Kewen Lin from comment #11)
> In gcc, lfiwzx is guarded with TARGET_LFIWZX => TARGET_POPCNTD (ISA2.06),
> while -mvsx will guarantee TARGET_POPCNTD (ISA_2_6_MASKS_SERVER) set, so it
> considers lfiwzx is supported. IMHO the underlying philosophy is that having
> the capability of vsx the supported ISA level is at least 2.06, lfiwzx is
> supported from 2.06, so it's supported.
> 
> But binutils seems not to follow it:
> {"xvadddp", XX3(60,96), XX3_MASK,PPCVSX,PPCVLE,
> {XT6, XA6, XB6}},
> {"lfiwzx",  X(31,887),  X_MASK,   POWER7|PPCA2, 0, 
> {FRT, RA0, RB}},
> Both are guarded with different masks and apparently PPCVSX doesn't enable
> POWER7.

That's because xvadddp is a VSX instruction (ie, mentioned in the VSX section
of the ISA), while lfiwzx is a floating point instruction and part of the base
ISA (for Power7 and above).  To me, that means the -mvsx assembler option is
correct to not enable lfiwzx.  ...and as Alan mentioned, even changing the
assembler to have -mvsx enable lfiwzx isn't a solution, since old already
released assemblers would still be broken.

The problem seems to be that the GCC option -mvsx enables some base (ie,
non-vsx) instructions not included in the 7450 which seems dangerous to me.  If
the vsx support in the compiler really needs those base power7 instructions to
function correctly, then we should be emitting an error when the user does
-mcpu=CPU -mvsx and CPU is something less than power7.  If the vsx support
doesn't really need those base power7 instructions to operate, then we
shouldn't be enabling them.   

Mike, can you confirm whether our -mvsx VSX support requires those base power7
instructions or not?

[Bug target/112103] [14 regression] gcc.target/powerpc/rlwinm-0.c fails after r14-4941-gd1bb9569d70304

2024-01-19 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112103

--- Comment #3 from Peter Bergner  ---
(In reply to Segher Boessenkool from comment #2)
> In all those cases the code is perfectly fine, but also in all of those
> cases the
> code is still suboptimal: the rldicl is just as superfluous as the second
> rlwinm
> was!  :-)

So the superfluous second instruction is not really a regression, correct?  All
that changed with Roger's patch is we replaced a superfluous rlwinm with a
superfluous rldicl, correct?

...which is what caused the testcase to FAIL given it was looking for the old
mnemonic and found the new one.

[Bug other/113317] New test case libgomp.c++/ind-base-2.C fails with ICE

2024-01-19 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113317

--- Comment #8 from Peter Bergner  ---
...unless the other P9 systems that were tested built with those "broken"
versions of the compilers.  If that's the case, then it points to something
else wrong on that system.

[Bug testsuite/109705] [14 regression] gcc.dg/vect/pr25413a.c fails after r14-333-g6d4b59a9356ac4

2024-01-19 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109705

--- Comment #6 from Peter Bergner  ---
(In reply to GCC Commits from comment #5)
> commit r14-7270-g39fa71a0882928a25bd170580e3e9e89a69dce36
> Author: Kewen Lin 
> Date:   Mon Jan 15 20:55:40 2024 -0600
> 
> testsuite: Fix vect_long_mult on Power [PR109705]
> 
> As pointed out by the discussion in PR109705, the current
> vect_long_mult effective target check on Power is broken.
> This patch is to fix it accordingly.

Does this need backporting?

[Bug other/113317] New test case libgomp.c++/ind-base-2.C fails with ICE

2024-01-19 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113317

--- Comment #7 from Peter Bergner  ---
(In reply to seurer from comment #6)
> I tried an older compiler (8.4) and it worked ok.
> 
> I just experimented a bit and it fails with the current gcc 11 and 12 used
> as the build compiler as well.  It works when I use gcc 13.

When you say current gcc 11 and 12, you mean the FSF release branches?  ...or
builds you had around?  If the current FSF release branches, then we'll want to
git bisect to figure out when it broke and when it got fixed (or just went
latent?).  What about gcc 9 and gcc 10?

[Bug other/113317] New test case libgomp.c++/ind-base-2.C fails with ICE

2024-01-10 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113317

--- Comment #1 from Peter Bergner  ---
(In reply to seurer from comment #0)
> g:1413af02d62182bc1e19698aaa4dae406f8f13bf, r14-7033-g1413af02d62182
> 
> Note I only saw this failure on one powerpc64 LE system.  It works OK on
> others.

You tend to build using --with-cpu=.  Is this build different from your
other builds wrt --with-cpu=?   If so, then

[Bug target/112886] We need a new print_operand output modifier for vector double

2024-01-09 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112886

Peter Bergner  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2024-01-09
 Ever confirmed|0   |1

--- Comment #1 from Peter Bergner  ---
(In reply to Michael Meissner from comment #0)
> I've been working with vector double support to provide faster memory

Typo: s/vector double/vector pair/, otherwise confirmed.

[Bug target/113115] [14 Regression] ICE In extract_constrain_insn_cached recog.cc with ppc64le-linux-gnu crosscompiler from r14-3592-g9ea1248604d7b6

2024-01-08 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113115

--- Comment #5 from Peter Bergner  ---
(In reply to Kewen Lin from comment #4)
> Yes, I agree it's duplicated of PR109987, Jeevitha's commit just exposed
> this known issue, since we are in stage 3, I wonder if we can go with
> power9-vector guarding first
> (https://gcc.gnu.org/pipermail/gcc-patches/2021-December/587310.html) since
> power9-vector still exists in this release, and we can try to remove these
> workaround options in next release. (Sorry that I missed to follow up the
> power{8,9}-vector removal)

I really dislike the -mpower{8,9}-vector options, but maybe it's too late to
remove them for this release?  I'm not sure how involved/invasive that patch
would be.  Segher, do you have a preference: remove them now, or use the
workaround above and remove them in the next release?

[Bug target/113115] [14 Regression] ICE In extract_constrain_insn_cached recog.cc with ppc64le-linux-gnu crosscompiler from r14-3592-g9ea1248604d7b6

2024-01-08 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113115

Peter Bergner  changed:

   What|Removed |Added

 CC||bergner at gcc dot gnu.org,
   ||dje at gcc dot gnu.org,
   ||linkw at gcc dot gnu.org,
   ||segher at gcc dot gnu.org

--- Comment #3 from Peter Bergner  ---
Ke Wen, is this just a duplicate of PR109987 and PR103627?  I know it was
bisected to Jeevitha's commit, but it seems more like her commit exposed the
same latent issue as those other PRs, rather than causing it.  Your thoughts?

[Bug tree-optimization/113026] New: Bogus -Wstringop-overflow warning on simple memcpy type loop

2023-12-14 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113026

Bug ID: 113026
   Summary: Bogus -Wstringop-overflow warning on simple memcpy
type loop
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: bergner at gcc dot gnu.org
  Target Milestone: ---

The following testcase has a bogus warning on trunk back to at least gcc 11.

bergner@ltcden2-lp1:LTC193379$ cat bug.c 
char dst[16];
long n = 16;

void
foo (char *src)
{
  for (long i = 0; i < n; i++)
    dst[i] = src[i];
}

bergner@ltcden2-lp1:LTC193379$ /opt/gcc-nightly/trunk/bin/gcc -S -O3
-mcpu=power8 bug.c 
bug.c: In function ‘foo’:
bug.c:8:12: warning: writing 1 byte into a region of size 0
[-Wstringop-overflow=]
8 | dst[i] = src[i];
  | ~~~^~~~
bug.c:1:6: note: at offset 16 into destination object ‘dst’ of size 16
1 | char dst[16];
  |  ^~~
bug.c:8:12: warning: writing 1 byte into a region of size 0
[-Wstringop-overflow=]
8 | dst[i] = src[i];
  | ~~~^~~~
bug.c:1:6: note: at offset 17 into destination object ‘dst’ of size 16
1 | char dst[16];
  |  ^~~
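
As a rough illustration of why the offsets named in the notes look wrong, the
sketch below replays the loop from bug.c and records the highest index it
stores to.  The helper name highest_index_written is hypothetical and not part
of the report, and the sketch assumes n still holds its initial value of 16.

```c
char dst[16];
long n = 16;

/* Same loop as foo () in bug.c, but it also returns the highest index
   that was stored to, or -1 if the loop body never ran.
   Hypothetical helper, used only for illustration.  */
long
highest_index_written (const char *src)
{
  long last = -1;
  for (long i = 0; i < n; i++)
    {
      dst[i] = src[i];
      last = i;
    }
  return last;
}
```

With n equal to 16 the last store lands at index 15, so offsets 16 and 17 are
never written on this path.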

[Bug tree-optimization/112822] [14 regression] ICE: invalid RHS for gimple memory store after r14-5831-gaae723d360ca26

2023-12-12 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112822

Peter Bergner  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #12 from Peter Bergner  ---
This should be fixed now.

[Bug middle-end/112822] [14 regression] ICE: invalid RHS for gimple memory store after r14-5831-gaae723d360ca26

2023-12-11 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112822

--- Comment #8 from Peter Bergner  ---
(In reply to Peter Bergner from comment #7)
> This fixes the ICE on the large original test case and the smaller test
> cases.  I'll bootstrap and regtest it and report back on the results.

I did a normal bootstrap and regtest on powerpc64le-linux and a
--with-cpu=power10 powerpc64le-linux bootstrap and regtest and both were clean
with no regressions.

[Bug middle-end/112822] [14 regression] ICE: invalid RHS for gimple memory store after r14-5831-gaae723d360ca26

2023-12-11 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112822

--- Comment #7 from Peter Bergner  ---
(In reply to Martin Jambor from comment #5)
> diff --git a/gcc/tree-sra.cc b/gcc/tree-sra.cc
> index 3bd0c7a9af0..99a1b0a6d17 100644
> --- a/gcc/tree-sra.cc
> +++ b/gcc/tree-sra.cc
> @@ -4219,11 +4219,15 @@ load_assign_lhs_subreplacements (struct access *lacc,
>   if (racc && racc->grp_to_be_replaced)
> { 
>   rhs = get_access_replacement (racc);
> + bool vce = false;
>   if (!useless_type_conversion_p (lacc->type, racc->type))
> -   rhs = fold_build1_loc (sad->loc, VIEW_CONVERT_EXPR,
> -  lacc->type, rhs);
> +   {
> + rhs = fold_build1_loc (sad->loc, VIEW_CONVERT_EXPR,
> +lacc->type, rhs);
> + vce = true;
> +   }
> 
> - if (racc->grp_partial_lhs && lacc->grp_partial_lhs)
> + if (lacc->grp_partial_lhs && (vce || racc->grp_partial_lhs))
> rhs = force_gimple_operand_gsi (&sad->old_gsi, rhs, true,
> NULL_TREE, true,
> GSI_SAME_STMT);
> }

This fixes the ICE on the large original test case and the smaller test cases. 
I'll bootstrap and regtest it and report back on the results.
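
For readers skimming the quoted diff, the change to the final condition can be
modelled as two plain predicates.  This is only a sketch: the function names
are hypothetical, and the three booleans stand in for lacc->grp_partial_lhs,
racc->grp_partial_lhs and the new vce flag from the patch.

```c
#include <stdbool.h>

/* Condition before the patch: force gimplification only when both
   accesses are partial.  The vce argument is ignored because the old
   code did not track whether a VIEW_CONVERT_EXPR had been built.  */
bool
old_condition (bool lacc_partial_lhs, bool racc_partial_lhs, bool vce)
{
  (void) vce;
  return racc_partial_lhs && lacc_partial_lhs;
}

/* Condition after the patch: a freshly built VIEW_CONVERT_EXPR is
   enough on the right-hand side, as long as the left side is partial.  */
bool
new_condition (bool lacc_partial_lhs, bool racc_partial_lhs, bool vce)
{
  return lacc_partial_lhs && (vce || racc_partial_lhs);
}
```

The two predicates differ in exactly one case: the left access is partial, the
right access is not, and a VIEW_CONVERT_EXPR was built.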

[Bug middle-end/112822] [14 regression] ICE: invalid RHS for gimple memory store after r14-5831-gaae723d360ca26

2023-12-11 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112822

--- Comment #6 from Peter Bergner  ---
(In reply to Martin Jambor from comment #5)
> The following should fix it.  I'll try a bit more to come up with a testcase
> that would not require __builtin_vec_vsx_st but so far my simple attempts
> failed. 

This patch to the small test case I attached still ICEs for me using the same
compiler options:

@@ -84,7 +84,7 @@
 template  cj cp;
 template  void cl(bu *cr, cj cs) { ct(cr, cs);
}
 typedef __attribute__((altivec(vector__))) double co;
-void ct(double *cr, co cs) { __builtin_vec_vsx_st(cs, 0, cr); }
+void ct(double *cr, co cs) { *(co *)cr = cs; }
 struct cq {
   co q;
 };

I'll give your patch a try.

[Bug target/112868] GCC passes -many to the assembler for --enable-checking=release builds

2023-12-07 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112868

--- Comment #5 from Peter Bergner  ---
(In reply to Richard Biener from comment #3)
> Can't we make sure to pass -mno-any (if that exists...) during bootstrap
> and testsuite instead?

-mno-any does not exist.

[Bug target/112868] GCC passes -many to the assembler for --enable-checking=release builds

2023-12-05 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112868

Peter Bergner  changed:

   What|Removed |Added

   Last reconfirmed||2023-12-05
 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW
 CC||amodra at gcc dot gnu.org,
   ||dje at gcc dot gnu.org,
   ||linkw at gcc dot gnu.org,
   ||meissner at gcc dot gnu.org,
   ||npiggin at gmail dot com,
   ||segher at gcc dot gnu.org

--- Comment #1 from Peter Bergner  ---
CCing interested parties for their input.

[Bug target/112868] New: GCC passes -many to the assembler for --enable-checking=release builds

2023-12-05 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112868

Bug ID: 112868
   Summary: GCC passes -many to the assembler for
--enable-checking=release builds
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: bergner at gcc dot gnu.org
  Target Milestone: ---

Since commit r10-580-ge154242724b084 gcc no longer passes -many to the
assembler for --enable-checking=yes builds.  However, we still pass -many to
the assembler for --enable-checking=release builds.  This can hide wrong code
bugs like in PR112707.

This bugzilla is to discuss whether we should stop passing -many to the
assembler under all conditions or leave things as they are.

Let the bikeshedding begin! :-)

[Bug target/112707] [14 regression] gcc 14 outputs invalid assembly on ppc: Error: unrecognized opcode: `fctid'

2023-12-05 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112707

--- Comment #14 from Peter Bergner  ---
(In reply to Segher Boessenkool from comment #13)
> (In reply to Peter Bergner from comment #12)
> > I'll note that you don't always
> > get an assembler error, since gcc still passes -many to the assembler for
> > non --enable-checking gcc builds, which causes it to accept the fctid insn.
> 
> Hrm.  Was that an oversight?  Should we always do that now?  Can you prepare
> a patch (and test on some common configs) please?

I was as surprised as you that we were still passing -many to the assembler under
some circumstances.  That said, removing all -many usage is orthogonal to this
bug.  I'll open another bug where we can discuss what to do wrt that.

[Bug target/112707] [14 regression] gcc 14 outputs invalid assembly on ppc: Error: unrecognized opcode: `fctid'

2023-12-04 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112707

--- Comment #11 from Peter Bergner  ---
(In reply to Kewen Lin from comment #10)
> (In reply to HaoChen Gui from comment #9)
>
>> My question is: can "fctid" be executed on powerpc7450 such a 32bit
>> processor? If it's supported, should the assembler be changed also (replace
>> the PPC64 with PPC for fctid)?
> 
> Good question, I think it's no, the assembler implementation looks to match
> the documentation, as I can't find insn fctid in powerpc7450 doc:
> https://www.nxp.com.cn/docs/en/reference-manual/MPC7450UM.pdf

I believe the only 32-bit cpu that supports fctid is the 476 which has
explicitly enabled it here.

[Bug target/112707] [14 regression] gcc 14 outputs invalid assembly on ppc: Error: unrecognized opcode: `fctid'

2023-12-04 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112707

Peter Bergner  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1
   Last reconfirmed||2023-12-05

--- Comment #12 from Peter Bergner  ---
Confirmed.

A simpler test case with minimal rtl insns.  I'll note that you don't always
get an assembler error, since gcc still passes -many to the assembler for non
--enable-checking gcc builds, which causes it to accept the fctid insn.

extern double rint (double);
void
foo (long long *dst, double a)
{
  *dst = rint (a);
}

[Bug middle-end/112822] [14 regression] ICE: invalid RHS for gimple memory store after r14-5831-gaae723d360ca26

2023-12-03 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112822

--- Comment #3 from Peter Bergner  ---
Created attachment 56784
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56784&action=edit
creduce minimized test case

Attached creduce minimized test case.  Use -O3 -mcpu=power10 to recreate.

[Bug middle-end/112822] [14 regression] ICE: invalid RHS for gimple memory store after r14-5831-gaae723d360ca26

2023-12-02 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112822

Peter Bergner  changed:

   What|Removed |Added

   Last reconfirmed||2023-12-02
 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1

--- Comment #2 from Peter Bergner  ---
(In reply to Andrew Pinski from comment #1)
> >This is a huge C++ program that I have not cut down yet.
> 
> I think it didn't attach because it was too big, maybe compress and attach
> that.

I have a creduce running to minimize it.  Looks like it'll take a while to run.
 I'll attach it when it's done.

I also note Bill's build was configured with --with-cpu=power10, so -O3
-mcpu=power10 are the options required to hit the ICE.

Confirmed.

[Bug target/110606] ICE output_operand: '%&' used without any local dynamic TLS references on powerpc64le-linux-gnu

2023-11-16 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110606

Peter Bergner  changed:

   What|Removed |Added

 CC||bergner at gcc dot gnu.org,
   ||dje at gcc dot gnu.org,
   ||jeevitha at gcc dot gnu.org,
   ||linkw at gcc dot gnu.org,
   ||segher at gcc dot gnu.org

--- Comment #3 from Peter Bergner  ---
I've asked Jeevitha to have a look at this one.

[Bug bootstrap/111601] [14 Regression] bootstrap fails in stagestrain in libcody on x86_64-linux-gnu and powerpc64le-linux-gnu

2023-10-18 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111601

--- Comment #6 from Peter Bergner  ---
(In reply to Andrew Pinski from comment #5)
> (In reply to Peter Bergner from comment #4) 
> > CCing richi and jakub to see if they've seen anything like this before?
> 
> I suspect we are miscompiling the final compiler somehow. I linked 2 other
> reports which reported that PGO is causing wrong code; I have not looked
> into confirming them yet though.

Thanks and yes, I agree.  Luckily those test cases are MUCH smaller than gcc
itself.  Hopefully the bug is the same!

[Bug bootstrap/111601] [14 Regression] bootstrap fails in stagestrain in libcody on x86_64-linux-gnu and powerpc64le-linux-gnu

2023-10-18 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111601

Peter Bergner  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org,
   ||rguenth at gcc dot gnu.org

--- Comment #4 from Peter Bergner  ---
(In reply to Peter Bergner from comment #3)
> I'll try and see if I can reduce the test case.

cvise reduced this down to:


bergner@ltcden2-lp1:$ cat pr111601.ii 
struct param_type {
  param_type() : param_type(0.5) { }
  param_type(double);
};

bergner@ltcden2-lp1:$
/home/bergner/gcc/build/gcc-fsf-mainline-pr111601-regtest/./gcc/xgcc
-B/home/bergner/gcc/build/gcc-fsf-mainline-pr111601-regtest/./gcc
-shared-libgcc -fno-checking -x c++-header -nostdinc++ -O2 -S pr111601.ii
pr111601.ii: In constructor ‘param_type::param_type()’:
pr111601.ii:2:32: internal compiler error: tree check: expected tree that
contains ‘decl common’ structure, have ‘’ in
build_new_method_call, at cp/call.cc:11630
2 |   param_type() : param_type(0.5) { }
  |^
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
See <https://gcc.gnu.org/bugs/> for instructions.

I'll note that a xgcc built without using "make profiledbootstrap-lean" does
not ICE.

CCing richi and jakub to see if they've seen anything like this before?

[Bug bootstrap/111601] [14 Regression] bootstrap fails in stagestrain in libcody on x86_64-linux-gnu and powerpc64le-linux-gnu

2023-10-18 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111601

Peter Bergner  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2023-10-18

--- Comment #3 from Peter Bergner  ---
(In reply to Matthias Klose from comment #2)
> this seems to be fixed on x86_64-linux-gnu with trunk 20231017.
> powerpc64le-linux now fails in a different way, trying to build the
> libstdc++ pch headers. 
> 
> Full build log at
> https://buildd.debian.org/status/fetch.php?pkg=gcc-snapshot&arch=ppc64el&ver=1%3A20231017-1&stamp=1697561774&raw=1

Ok, that is the same error I'm seeing, so Confirmed.  I'll try and see if I can
reduce the test case.

[Bug target/111828] rs6000: Parse inline asm string to figure out it requires HTM feature or not.

2023-10-17 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111828

--- Comment #7 from Peter Bergner  ---
(In reply to Peter Bergner from comment #6)
> That said, I think nearly all (all?) HTM usage on Power uses our HTM
> built-in functions.  Maybe we could remove OPTION_MASK_HTM from the
> power8/power9 default flags and only add it back in if we detect the use of
> an HTM built-in function?

Looking at GLIBC, Power's elision-lock.c use of our htm.h uses inline asm with
HTM instructions and not our HTM built-in functions and it doesn't explicitly
add -mhtm to the command line options like S390 does, so I think this idea
won't work. :-(

[Bug target/111828] rs6000: Parse inline asm string to figure out it requires HTM feature or not.

2023-10-17 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111828

--- Comment #6 from Peter Bergner  ---
(In reply to Kewen Lin from comment #3)
> The motivation of this request is to try our best to make power10 attributed
> code inline more power8/power9 attribute code which likely includes some
> inline asm but not HTM related as the quoted OSS shows. For now, for one
> function which has any non-empty inline asm string, we would consider it's
> possible to have HTM code so it's unsafe to inline it.

We've hit this issue (attempting to inline some Power8/9 function into a
Power10 caller) before with another project (I forget which) and the solution
used was to add no-htm to the attribute target options (ie,
"cpu=power8,no-htm").
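
A minimal sketch of that workaround follows.  The function names and the
POWER8_NO_HTM macro are hypothetical, and the guard is an assumption made so
the snippet still compiles on non-PowerPC targets, where the attribute is
simply dropped.  Only the target string "cpu=power8,no-htm" comes from the
comment above.

```c
/* Assumption: only GCC on PowerPC is given the target string, so the
   attribute is compiled away everywhere else.  */
#if defined(__powerpc64__) && defined(__GNUC__) && !defined(__clang__)
# define POWER8_NO_HTM __attribute__ ((target ("cpu=power8,no-htm")))
#else
# define POWER8_NO_HTM
#endif

/* Hypothetical Power8-tuned helper.  Marking it no-htm states that it
   does not rely on HTM, which is the property the inliner cares about
   when the caller targets a CPU without HTM.  */
POWER8_NO_HTM static inline long
sum_power8 (const long *v, long count)
{
  long total = 0;
  for (long i = 0; i < count; i++)
    total += v[i];
  return total;
}

/* Hypothetical caller, assumed to be built with -mcpu=power10.  */
long
sum_from_power10 (const long *v, long count)
{
  return sum_power8 (v, count);
}
```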


> Users usually think higher cpu attributed code can safely inline lower cpu
> attributed code, but it's out of expectation for power10 code inlining
> power8/power9 code as we drops HTM from power10. If we can support it
> better, users don't need more extra efforts to teach about it.

Ideally, we could just go back in time and not enable HTM by default on
Power8/9 and force the user to always use -mhtm if they need HTM support.  That
ship has sailed though.

That said, I think nearly all (all?) HTM usage on Power uses our HTM built-in
functions.  Maybe we could remove OPTION_MASK_HTM from the power8/power9
default flags and only add it back in if we detect the use of an HTM built-in
function?

[Bug target/111828] rs6000: Parse inline asm string to figure out it requires HTM feature or not.

2023-10-17 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111828

--- Comment #5 from Peter Bergner  ---
(In reply to Jan Wassenberg from comment #4)
> I understand the slippery slope concern. But the empty asm string is a
> special case, we and others use it (with +r output and memory clobber) to
> prevent optimizing variables out e.g. during tests.

I agree the empty string is a special case and I'm totally fine with the patch
Ke Wen committed/backported to fix your problem.  I'm just against going
further than that and actually trying to parse the contents of the inline asm
string to determine semantically what it contains.
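
For reference, the empty-asm idiom described in the quoted comment usually
looks something like the sketch below.  The function name opaque_int is
hypothetical; the "+r" operand and the "memory" clobber are the pieces named
in the comment.  It relies on GNU extended asm, so it assumes GCC or a
compatible compiler.

```c
/* The asm string is empty, so no instruction is emitted.  The "+r"
   constraint tells the compiler the value is both read and possibly
   modified, and the "memory" clobber keeps memory accesses from being
   moved across this point.  Together they stop the optimizer from
   folding the value away.  */
static inline int
opaque_int (int value)
{
  __asm__ volatile ("" : "+r" (value) : : "memory");
  return value;
}
```

Since the string is empty there is nothing in it that could require HTM or any
other feature, which is why it can be treated as a special case.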

[Bug target/111828] rs6000: Parse inline asm string to figure out it requires HTM feature or not.

2023-10-16 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111828

--- Comment #2 from Peter Bergner  ---
(In reply to Peter Bergner from comment #1)
> If the user compiles a piece of inline asm that doesn't support the
> features used in that inline asm, then that is user error!

I meant to say: If the user compiles a piece of inline asm using options that
don't support the features used in that inline asm, then that is user error!
