[Bug rtl-optimization/101188] [postreload] Uses content of a clobbered register

2023-06-02 Thread uweigand at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101188

--- Comment #14 from Ulrich Weigand  ---
(In reply to Georg-Johann Lay from comment #13)
> Also I don't have a test case for your scenario.  I can reproduce the bug
> back to v5 on avr and maybe it is even older.  As it appears, this PR lead
> to no hickups on any other target, so for now I'd like to keep the fix
> restricted to what I can test.

I agree that your patch looks correct and unlikely to cause any new problems,
so I won't object to it being committed.  I just wanted to point out that it
might not be a complete fix.

[Bug rtl-optimization/101188] [postreload] Uses content of a clobbered register

2023-06-02 Thread uweigand at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101188

Ulrich Weigand  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |uweigand at gcc dot gnu.org

--- Comment #12 from Ulrich Weigand  ---
Sorry for not responding earlier, I've been out on vacation.

I think your root cause analysis is correct.  In this part of code:

  if (success)
    delete_insn (insn);
  changed |= success;
  insn = next;
  move2add_record_mode (reg);
  reg_offset[regno]
    = trunc_int_for_mode (added_offset + base_offset,
                          mode);
  continue;

the intent seems to be to manually update the move2add data structures to
account for the effects of "next", because the default logic is now skipped for
the "next" insn.  That's why in particular the reg mode and offset are manually
calculated.

This manual logic however is really only correct if "next" is actually just a
simple SET.  Reading the comment before the whole loop:
  /* For simplicity, we only perform this optimization on
     straightforward SETs.  */
makes me suspect the original author assumed that "next" is in fact a
straightforward SET here as well.  This is not true, however, because of how
the "single_set" extractor behaves.  (I'm wondering if "single_set" used to be
defined differently back in the day?)

Your fix does look correct to me as far as handling parallel CLOBBERs goes.
However, looking at "single_set", it seems there is yet another case: the
extractor also accepts a parallel of two or more SETs, as long as all except
one of those SETs have destinations that are dead.  These cases would still not
be handled correctly with your patch, I think.

I'm wondering whether it is even worthwhile to attempt to cover those cases. 
Maybe a more straightforward fix would be to keep in line with the
above-mentioned comment about "straightforward SETs" and just check for a
single SET directly instead of using "single_set" here.  Do you think this
would miss any important optimizations?
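
To make the difference concrete, here is a minimal sketch using a toy model
of an insn pattern.  None of these types or functions are GCC's; "toy_insn",
"toy_single_set" and "toy_plain_set" are hypothetical stand-ins, and
"toy_single_set" only approximates the behavior described above (skip
CLOBBERs, skip SETs whose destination is dead).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for RTL.  These are NOT GCC's data structures.  */
enum toy_code { TOY_SET, TOY_CLOBBER };

struct toy_elt
{
  enum toy_code code;
  int dest_regno;   /* Register written by the SET or named by the CLOBBER.  */
  bool dest_dead;   /* Models a REG_UNUSED note on the destination.  */
};

struct toy_insn
{
  size_t n_elts;                /* 1: plain pattern; >1: models a PARALLEL.  */
  const struct toy_elt *elts;
};

/* Permissive extractor: return the one live SET, ignoring CLOBBERs and
   SETs with dead destinations.  Return NULL if there is no such unique SET.  */
static const struct toy_elt *
toy_single_set (const struct toy_insn *insn)
{
  assert (insn != NULL);
  if (insn->n_elts == 1)
    return insn->elts[0].code == TOY_SET ? &insn->elts[0] : NULL;

  const struct toy_elt *found = NULL;
  for (size_t i = 0; i < insn->n_elts; i++)
    {
      const struct toy_elt *e = &insn->elts[i];
      if (e->code == TOY_CLOBBER || e->dest_dead)
        continue;
      if (found != NULL)
        return NULL;
      found = e;
    }
  return found;
}

/* Strict check: accept only a pattern that is a single SET by itself.  */
static const struct toy_elt *
toy_plain_set (const struct toy_insn *insn)
{
  assert (insn != NULL);
  if (insn->n_elts == 1 && insn->elts[0].code == TOY_SET)
    return &insn->elts[0];
  return NULL;
}

/* Count the registers the insn modifies besides the extracted SET.  Any
   nonzero result means bookkeeping that only updates the SET's destination
   leaves stale data for some other register.  */
static size_t
toy_untracked_writes (const struct toy_insn *insn, const struct toy_elt *set)
{
  size_t n = 0;
  for (size_t i = 0; i < insn->n_elts; i++)
    if (&insn->elts[i] != set)
      n++;
  return n;
}
```

In this model, a PARALLEL of one SET plus one CLOBBER, or of one live SET plus
one dead SET, is accepted by "toy_single_set" but leaves one untracked
register write; "toy_plain_set" rejects both, so the manual update is only
ever applied where it is complete.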

[Bug debug/108996] Proposal for adding DWARF call site information in GCC with -O0

2023-03-07 Thread uweigand at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108996

--- Comment #9 from Ulrich Weigand  ---
(In reply to Andrew Pinski from comment #7)
> (In reply to Ulrich Weigand from comment #4)
> > (In reply to Jakub Jelinek from comment #3)
> > > What is done on other arches?
> > 
> > That depends on the platform ABI.  On some arches, including x86/x86_64 and
> > arm/aarch64, the ABI requires the generated code reloads the return buffer
> > pointer into a defined register at function exit (either the same it was in
> > on function entry, or some other ABI-defined register).  On those arches,
> > GDB can at least inspect the return value at the point the function return
> > happens.
> 
> aarch64 does not require that. GCC produces it yes but that is a missed
> optimization, see PR 103010 which I filed against GCC for that case.

Well, I was looking at GDB code that at least *assumes* that the aarch64 ABI
does require that:
https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=gdb/aarch64-tdep.c;h=5b1b9921f87e588f8251a77d858f8f312be1e5ac;hb=HEAD#l2500

If this is incorrect, I guess GDB would have to be fixed.

[Bug debug/108996] Proposal for adding DWARF call site information in GCC with -O0

2023-03-07 Thread uweigand at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108996

--- Comment #8 from Ulrich Weigand  ---
(In reply to Jakub Jelinek from comment #5)
> Though, relying on DW_OP_entry_value is not reliable, if e.g. tail calls are
> (or could be) involved, then GDB needs to punt.

The only way a tail call could happen is if the return value is
passed through directly to the (caller's) caller, so the return
buffer address should still be correct, right?

> So, I wonder if we just shouldn't ask for a DWARF 6 extension here, have
> some way for the compiler to specify DW_AT_location for the return value.
> Then for -O1+ -g with var-tracking that address could be for PowerPC r3
> register in such functions or wherever its initial value is tracked
> (including DW_OP_entry_value).
> While for -O0, we'd see we've spilled that parameter to stack and would set
> DW_AT_location to that place spilled on the stack.

I don't think it is possible to track the value in the callee - the value may
not be available *anywhere* because it is no longer needed.  (Also, I don't
think the implicit return buffer address is guaranteed to be spilled to the
stack even at -O0.)

[Bug debug/108996] Proposal for adding DWARF call site information in GCC with -O0

2023-03-03 Thread uweigand at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108996

--- Comment #4 from Ulrich Weigand  ---
(In reply to Jakub Jelinek from comment #3)
> What is done on other arches?

That depends on the platform ABI.  On some arches, including x86/x86_64 and
arm/aarch64, the ABI requires that the generated code reload the return buffer
pointer into a defined register at function exit (either the same it was in on
function entry, or some other ABI-defined register).  On those arches, GDB can
at least inspect the return value at the point the function return happens.

On a few arches, in particular SPARC and RISC-V, the ABI even guarantees that
the return buffer pointer register remains valid throughout execution of the
function, so that GDB can inspect and/or modify the return value at any point.

But on most other arches, including s390x and ppc/ppc64, the ABI does not
guarantee anything, so GDB simply cannot access the function return value at
all (after the point the return buffer pointer register is no longer needed by
generated code and the register has been reused).

However, *if* the debug info contains an entry-value record for that register
at the call site in the current caller, then the return buffer can be accessed
at any time, on all arches.  Given that in this specific case, most callers
will actually just point the return buffer register to a local stack buffer
(i.e. set the register to "stack pointer plus some constant"), generating an
entry-value record for these special cases should actually be quite
straightforward for the compiler, without requiring a lot of value-tracking
machinery.
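
For illustration, here is a minimal sketch of the situation at the source
level.  The names "struct big", "make_big", "make_big_lowered" and "caller"
are hypothetical, and the "lowered" form is only an approximation of what a
typical ABI does with a struct returned in memory; the actual register used
for the hidden pointer is ABI-specific.

```c
#include <assert.h>
#include <string.h>

/* Large enough that typical ABIs return it in memory rather than in
   registers (an assumption; the exact threshold is ABI-specific).  */
struct big
{
  char text[64];
  int len;
};

/* What the programmer writes: a function returning a struct by value.  */
static struct big
make_big (const char *s)
{
  struct big b;
  memset (&b, 0, sizeof b);
  strncpy (b.text, s, sizeof b.text - 1);
  b.len = (int) strlen (b.text);
  return b;
}

/* Roughly what the callee looks like after ABI lowering: the caller passes
   the address of the return buffer as a hidden first argument.  */
static void
make_big_lowered (struct big *ret_buf, const char *s)
{
  assert (ret_buf != NULL);
  *ret_buf = make_big (s);
}

/* A typical caller: the return buffer is a local variable, so the hidden
   pointer is simply "stack pointer plus some constant" at the call site.  */
static int
caller (void)
{
  struct big local;
  make_big_lowered (&local, "hello");
  return local.len;
}
```

In "caller", the value passed as "ret_buf" is known at the call site without
any value tracking, which is the case where an entry-value record would be
cheap to emit.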

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2022-10-25 Thread uweigand at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

--- Comment #22 from Ulrich Weigand  ---
(In reply to Jakub Jelinek from comment #15)
> PowerPC I think does, not sure about s390.

For s390x see here:
https://github.com/IBM/s390x-abi

[Bug debug/104194] No way to distinguish IEEE and IBM long double in debug info

2022-07-25 Thread uweigand at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104194

Ulrich Weigand  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |uweigand at gcc dot gnu.org

--- Comment #8 from Ulrich Weigand  ---
(In reply to Jakub Jelinek from comment #7)
> A temporary workaround now applied.

It turns out this workaround is not transparent to users of the debugger.  For
example, if you define a variable as
   long double x;
and then issue the "ptype x" command in GDB, you'll now get "_Float128" - which
is quite surprising if you've never even used that type in your source code. 
(This also causes a few GDB test suite failures.)
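
As a side note on why the size alone cannot tell the two formats apart: both
occupy 16 bytes, but they differ in mantissa width, which the compiler does
expose to source code via <float.h>.  The sketch below classifies a
"long double" by its LDBL_MANT_DIG value; the enum and function names are
hypothetical, and only the well-known mantissa widths are covered.

```c
#include <assert.h>
#include <float.h>

/* Hypothetical labels for the common long double formats.  */
enum ld_format
{
  LD_BINARY64,      /* long double is the same as double.  */
  LD_X87_EXTENDED,  /* x87 80-bit extended precision.  */
  LD_IBM_DOUBLE,    /* IBM double-double (pair of doubles).  */
  LD_BINARY128,     /* IEEE binary128.  */
  LD_UNKNOWN
};

/* Classify a format by its mantissa width in bits, as reported by
   LDBL_MANT_DIG for the current target.  */
static enum ld_format
classify_long_double (int mant_dig)
{
  assert (mant_dig > 0);
  switch (mant_dig)
    {
    case 53:
      return LD_BINARY64;
    case 64:
      return LD_X87_EXTENDED;
    case 106:
      return LD_IBM_DOUBLE;
    case 113:
      return LD_BINARY128;
    default:
      return LD_UNKNOWN;
    }
}
```

Calling classify_long_double (LDBL_MANT_DIG) answers the question at compile
time for the current target; the problem discussed in this PR is that a
debugger looking only at the debug info has no equivalent piece of data.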

> The dwarf-discuss thread seems to prefer using separate DW_ATE_* values
> instead of DW_AT_precision/DW_AT_minimum_exponent, but hasn't converged yet.

When I discussed this back in 2017:
https://slideslive.com/38902369/precise-target-floatingpoint-emulation-in-gdb
(see page 16 in the slides), my suggestion was a simple
  DW_AT_encoding_variant
attribute, which would have let the details of the floating-point format
remain platform-defined (unspecified by DWARF), but simply allow a platform to
define multiple different formats of the same size if required.

[Bug tree-optimization/97970] [11 regression] 'gcc.dg/gomp/pr82374.c scan-tree-dump-times vect "vectorized 1 loops" 2' for 32-bit x86

2020-11-24 Thread uweigand at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97970

--- Comment #2 from Ulrich Weigand  ---
The patch did not handle flag_excess_precision correctly.  I've reverted for
now and will look into a proper fix.  Sorry for the breakage.