Re: [PATCH 2/2] validate_subreg before call gen_lowpart to avoid ICE.

2021-09-10 Thread Hongtao Liu via Gcc-patches
On Sat, Sep 11, 2021 at 8:29 AM Hongtao Liu  wrote:
>
> On Sat, Sep 11, 2021 at 5:21 AM Segher Boessenkool
>  wrote:
> >
> > On Fri, Sep 10, 2021 at 10:25:45PM +0800, Hongtao Liu wrote:
> > > Updated patch.
> > >
> > >   Bootstrapped and regtested on x86_64-linux-gnu{-m32,},  do I need to
> > > run this patch on other targets machine, or the patch is supposed to
> > > have minimal impact on other targets?
> > >   Then, ok for trunk?
> >
> > [-- Attachment #2: 
> > v2-0001-Check-modes_tieable_p-before-call-gen_lowpart-to-.pat
> > ch --]
> > [-- Type: application/octet-stream, Encoding: base64, Size: 1.4K --]
> >
> > [-- application/octet-stream is unsupported (use 'v' to view this part) --]
> >
> > Please send your patches inline, or if you *have to* use attachments,
> > use text attachments.  Without encoding.
> >
> hmm, my local file is
> $file 0001-Check-modes_tieable_p-before-call-gen_lowpart-to-avo.patch
> --mime-type
> 0001-Check-modes_tieable_p-before-call-gen_lowpart-to-avo.patch: text/x-diff
>
> Didn't figure out how to let webgmail not change mime type of attachment.
> > 
> >
> > (It says "strongly discouraged", which means people will put it to the
> > bottom of the stack of things to look at).
> >
> >
> > Segher
>
> Here is an updated patch.
>
>   Bootstrapped and regtested on x86_64-linux-gnu{-m32,}
>   Ok for trunk?
>
> gcc/ChangeLog:
>
> * machmode.h (TRULY_NOOP_TRUNCATION_MODES_P): Check
> SCALAR_INT_MODE_P for both modes.
> ---
>  gcc/machmode.h | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/gcc/machmode.h b/gcc/machmode.h
> index 158351350de..9f95d7d046c 100644
> --- a/gcc/machmode.h
> +++ b/gcc/machmode.h
> @@ -959,9 +959,10 @@ extern scalar_int_mode ptr_mode;
>  /* Target-dependent machine mode initialization - in insn-modes.c.  */
>  extern void init_adjust_machine_modes (void);
>
> -#define TRULY_NOOP_TRUNCATION_MODES_P(MODE1, MODE2) \
> -  (targetm.truly_noop_truncation (GET_MODE_PRECISION (MODE1), \
> - GET_MODE_PRECISION (MODE2)))
> +#define TRULY_NOOP_TRUNCATION_MODES_P(MODE1, MODE2)\
> +  (SCALAR_INT_MODE_P(MODE1) && SCALAR_INT_MODE_P(MODE2)\
I will add the missing spaces here: SCALAR_INT_MODE_P (MODE1) && SCALAR_INT_MODE_P (MODE2)
> +   && targetm.truly_noop_truncation (GET_MODE_PRECISION (MODE1),   \
> +GET_MODE_PRECISION (MODE2)))
>
>  /* Return true if MODE is a scalar integer mode that fits in a
> HOST_WIDE_INT.  */
>
> --
> BR,
> Hongtao



-- 
BR,
Hongtao


Re: [PATCH 2/2] validate_subreg before call gen_lowpart to avoid ICE.

2021-09-10 Thread Hongtao Liu via Gcc-patches
On Sat, Sep 11, 2021 at 5:21 AM Segher Boessenkool
 wrote:
>
> On Fri, Sep 10, 2021 at 10:25:45PM +0800, Hongtao Liu wrote:
> > Updated patch.
> >
> >   Bootstrapped and regtested on x86_64-linux-gnu{-m32,},  do I need to
> > run this patch on other targets machine, or the patch is supposed to
> > have minimal impact on other targets?
> >   Then, ok for trunk?
>
> [-- Attachment #2: 
> v2-0001-Check-modes_tieable_p-before-call-gen_lowpart-to-.pat
> ch --]
> [-- Type: application/octet-stream, Encoding: base64, Size: 1.4K --]
>
> [-- application/octet-stream is unsupported (use 'v' to view this part) --]
>
> Please send your patches inline, or if you *have to* use attachments,
> use text attachments.  Without encoding.
>
hmm, my local file is
$file 0001-Check-modes_tieable_p-before-call-gen_lowpart-to-avo.patch
--mime-type
0001-Check-modes_tieable_p-before-call-gen_lowpart-to-avo.patch: text/x-diff

Didn't figure out how to let webgmail not change mime type of attachment.
> 
>
> (It says "strongly discouraged", which means people will put it to the
> bottom of the stack of things to look at).
>
>
> Segher

Here is an updated patch.

  Bootstrapped and regtested on x86_64-linux-gnu{-m32,}
  Ok for trunk?

gcc/ChangeLog:

* machmode.h (TRULY_NOOP_TRUNCATION_MODES_P): Check
SCALAR_INT_MODE_P for both modes.
---
 gcc/machmode.h | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/gcc/machmode.h b/gcc/machmode.h
index 158351350de..9f95d7d046c 100644
--- a/gcc/machmode.h
+++ b/gcc/machmode.h
@@ -959,9 +959,10 @@ extern scalar_int_mode ptr_mode;
 /* Target-dependent machine mode initialization - in insn-modes.c.  */
 extern void init_adjust_machine_modes (void);

-#define TRULY_NOOP_TRUNCATION_MODES_P(MODE1, MODE2) \
-  (targetm.truly_noop_truncation (GET_MODE_PRECISION (MODE1), \
- GET_MODE_PRECISION (MODE2)))
+#define TRULY_NOOP_TRUNCATION_MODES_P(MODE1, MODE2)\
+  (SCALAR_INT_MODE_P(MODE1) && SCALAR_INT_MODE_P(MODE2)\
+   && targetm.truly_noop_truncation (GET_MODE_PRECISION (MODE1),   \
+GET_MODE_PRECISION (MODE2)))

 /* Return true if MODE is a scalar integer mode that fits in a
HOST_WIDE_INT.  */

-- 
BR,
Hongtao


Re: rs6000: add support for powerpc64le-unknown-freebsd

2021-09-10 Thread Piotr Kubaj via Gcc-patches
Hello again,

it looks like one simple patch got left out by accident. Would it be possible 
for you to commit it?

Thank you,
Piotr Kubaj.

On 20-12-28 06:37:23, Segher Boessenkool wrote:
> On Mon, Dec 28, 2020 at 12:44:15PM +0100, Gerald Pfeifer wrote:
> > On Wed, 16 Dec 2020, Segher Boessenkool wrote:
> > >> Any chance (one of you) can help and commit this?
> > > Done now.
> > > 
> > > Please remind me in a week or so to do the backports?
> > 
> > Thank you, Segher!
> > 
> > And thanks for pushing the backports, too, whenever you get to them,
> > holiday season and such.
> 
> Hey, if I can remember how ssh works ;-)
> 
> Done now (10 and 9).  Cheers!
> 
> 
> Segher

-- 
--- gcc/configure.orig  2021-04-29 10:19:44 UTC
+++ gcc/configure
@@ -29405,7 +29405,7 @@ $as_echo "#define HAVE_LD_PPC_GNU_ATTR_LONG_DOUBLE 1" 
 esac
 
 case "$target:$tm_file" in
-  powerpc64-*-freebsd* | powerpc64*-*-linux* | powerpc*-*-linux*rs6000/biarch64.h*)
+  powerpc64*-*-freebsd* | powerpc64*-*-linux* | powerpc*-*-linux*rs6000/biarch64.h*)
   case "$target" in
  *le-*-linux*)
  emul_name="-melf64lppc"




Re: [PATCH 2/2] validate_subreg before call gen_lowpart to avoid ICE.

2021-09-10 Thread Segher Boessenkool
On Fri, Sep 10, 2021 at 08:36:12PM +0200, Richard Biener wrote:
> On September 10, 2021 6:24:50 PM GMT+02:00, Segher Boessenkool 
>  wrote:
> >Yes, we should not call TRULY_NOOP_TRUNCATION_MODES_P for any random two
> >modes: such a truncation needs to have a meaning at all, for the
> >question to make any sense.  Maybe we can add an assert to this macro to
> >root out nonsensical callers?
> >
> >Btw.  We have
> >#define TRULY_NOOP_TRUNCATION_MODES_P(MODE1, MODE2) \
> >  (targetm.truly_noop_truncation (GET_MODE_PRECISION (MODE1), \
> >  GET_MODE_PRECISION (MODE2)))
> >which is not optimal, either: does truncating DFmode to HFmode behave
> >the same as truncating DImode to HImode, on every target?  On *any*
> >target, even?!
> 
> When is it for any non-scalar integral mode? I suspect this was only 
> meaningful for integer modes from the start. On x86 i387 math any float mode 
> truncation is noop (with not doing actual truncation to inf). 

And trunc on floating point modes was added later?  Yeah, sounds like a
good theory.

So we should have an assertion in TNTMP that both modes are integral?
(Scalar of course).


Segher


Re: [PATCH 2/2] validate_subreg before call gen_lowpart to avoid ICE.

2021-09-10 Thread Segher Boessenkool
On Fri, Sep 10, 2021 at 10:25:45PM +0800, Hongtao Liu wrote:
> Updated patch.
> 
>   Bootstrapped and regtested on x86_64-linux-gnu{-m32,},  do I need to
> run this patch on other targets machine, or the patch is supposed to
> have minimal impact on other targets?
>   Then, ok for trunk?

[-- Attachment #2: v2-0001-Check-modes_tieable_p-before-call-gen_lowpart-to-.pat
ch --]
[-- Type: application/octet-stream, Encoding: base64, Size: 1.4K --]

[-- application/octet-stream is unsupported (use 'v' to view this part) --]

Please send your patches inline, or if you *have to* use attachments,
use text attachments.  Without encoding.



(It says "strongly discouraged", which means people will put it to the
bottom of the stack of things to look at).


Segher


[PATCH] MAINTAINERS: Adding myself to DCO and write after approval

2021-09-10 Thread Petter Tomner via Gcc-patches
>From f75e52427846bc453544833b1d167f8568e7cfd8 Mon Sep 17 00:00:00 2001
From: Petter Tomner 
Date: Fri, 10 Sep 2021 21:37:00 +0200
Subject: [PATCH] MAINTAINERS: Adding myself to DCO and write after approval

2020-09-10  Petter Tomner   

ChangeLog:
* MAINTAINERS: Me added to DCO and write after approval
---
 MAINTAINERS | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 6a1589c4705..1c2f3a1d830 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -635,6 +635,7 @@ Dinar Temirbulatov  

 Kresten Krab Thorup
 Kai Tietz  
 Ilya Tocar 
+Petter Tomner  
 Philipp Tomsich

 Daniel Towner  
 Konrad Trifunovic  
@@ -709,3 +710,4 @@ information.
  Jeff Law  
  Gaius Mulley  
  Trevor Saunders   
+ Petter Tomner 
\ No newline at end of file
-- 
2.30.2



PING – [Patch] Fortran: Fix Bind(C) Array-Descriptor Conversion (Move to Front-End Code)

2021-09-10 Thread Tobias Burnus

Early PING for that patch.

On 06.09.21 12:52, Tobias Burnus wrote:

Hi all,

gfortran's internal array descriptor (gfc descriptor) and
the descriptor used with BIND(C) (CFI descriptor, ISO_Fortran_binding.h
of TS29113 / Fortran 2018) are different. Thus, when calling a BIND(C)
procedure the gfc descriptor has to be converted to cfi – and when a
BIND(C) procedure is implemented in Fortran, the argument has to be
converted back from CFI to gfc.

The current implementation handles part in the FE and part in libgfortran,
but there were several issues, e.g. PR101635 failed due to alias issues,
debugging wasn't working well, uninitialized memory was used in some cases,
etc.

This patch now moves descriptor conversion handling to the FE – which also
can make use of compile-time knowledge, useful both for diagnostics and to
optimize the code.

Additionally:
- Some cases where TS29113 mandates that the array descriptor should be
  used now use the array descriptor, in particular character scalars with
  'len=*' and allocatable/pointer scalars.
- While debugging the alias issue, I simplified 'select rank'. While some
  special case is needed for assumed-shape arrays, those cannot appear when
  the argument has the pointer or allocatable attribute. That's not only a
  missed optimization, pointer/allocatable arrays can also be NULL - such
  that accessing desc->dim.ubound[rank-1] can be uninitialized memory ...

OK?  Comments? Suggestions?

 * * *

For some more dumps, see the discussion about the alias issue at:
https://gcc.gnu.org/pipermail/gcc-patches/2021-August/578364.html
("[RFH] ME optimizes variable assignment away / Fortran bind(C)
descriptor conversion")
plus the original emails:
- https://gcc.gnu.org/pipermail/gcc-patches/2021-August/578271.html
- and (correct dump)
https://gcc.gnu.org/pipermail/gcc-patches/2021-August/578274.html

Debugging - not ideal but not too bad either. For
  subroutine f(x) bind(C)
integer :: x(:)
with an uninitialized size-4 array as argument:

m::f (_x=...) at foo4.f90:3
3   subroutine f(x) bind(C)
(gdb) p x
Cannot access memory at address 0x38
(gdb) p _x
$6 = ( base_addr = 0x7fffe2c0, elem_len = 4, version = 1, rank = 1
'\001', attribute = 2 '\002', type = 1025, dim = (( lower_bound = 0,
extent = 5, sm = 4 )) )
(gdb) s
5 x(1) = 5
(gdb) p x
$7 = (0, 0, 0, -670762413, 0)


Tobias

PS: This patch fixes, though not necessarily fully, the following PRs:
PR fortran/102086 - [F2008][TS29113] Accepts invalid scalar TYPE(*) as
actual argument to assumed-rank
PR fortran/92189 - Fortran-written bind(C) function with allocatable
argument does not update C descriptor on exit
PR fortran/92621 - Problems with memory handling with allocatable
intent(out) arrays with bind(c)
PR fortran/101308 - Bind(C): gfortran does not create C descriptors
for scalar pointer/allocatable arguments
PR fortran/101635 - FAIL: gfortran.dg/PR93963.f90 – alias-handling
issue with BIND(C)'s _gfortran_cfi_desc_to_gfc_desc
PR fortran/92482 - BIND(C) with array-descriptor mishandled for type
character
and possibly some more.

PPS: I should add some additional testcases – I will try to do this as Part
2 of this patch.

PPPS: Once the patch is in, an audit needs to be done of which parts of
those PRs remain as follow-up work. I think some of the remaining issues are
covered by José's pending patches; for those which are now fixed, the
testcases might still be added.




openmp: Implement OpenMP 5.1 atomics, so far for C only

2021-09-10 Thread Jakub Jelinek via Gcc-patches
Hi!

This patch implements OpenMP 5.1 atomics (with clarifications from the
upcoming 5.2).
The most important changes are that it is now possible to write min/max
atomics (for C/C++; for Fortran this was already possible) and, more
importantly, compare and exchange in various forms.
Also, acq_rel is now allowed on read/write and acq_rel/acquire are allowed on
update, and there are new compare, weak and fail clauses.

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk.

C++ support will follow next week hopefully.  Various new tests in
c-c++-common are now with { target c }, that is temporary until the C++
support is there.

2021-09-10  Jakub Jelinek  

gcc/
* tree-core.h (enum omp_memory_order): Add OMP_MEMORY_ORDER_MASK,
OMP_FAIL_MEMORY_ORDER_UNSPECIFIED, OMP_FAIL_MEMORY_ORDER_RELAXED,
OMP_FAIL_MEMORY_ORDER_ACQUIRE, OMP_FAIL_MEMORY_ORDER_RELEASE,
OMP_FAIL_MEMORY_ORDER_ACQ_REL, OMP_FAIL_MEMORY_ORDER_SEQ_CST and
OMP_FAIL_MEMORY_ORDER_MASK enumerators.
(OMP_FAIL_MEMORY_ORDER_SHIFT): Define.
* gimple-pretty-print.c (dump_gimple_omp_atomic_load,
dump_gimple_omp_atomic_store): Print [weak] for weak atomic
load/store.
* gimple.h (enum gf_mask): Change GF_OMP_ATOMIC_MEMORY_ORDER
to 6-bit mask, adjust GF_OMP_ATOMIC_NEED_VALUE value and add
GF_OMP_ATOMIC_WEAK.
(gimple_omp_atomic_weak_p, gimple_omp_atomic_set_weak): New inline
functions.
* tree.h (OMP_ATOMIC_WEAK): Define.
* tree-pretty-print.c (dump_omp_atomic_memory_order): Adjust for
fail memory order being encoded in the same enum and also print
fail clause if present.
(dump_generic_node): Print weak clause if OMP_ATOMIC_WEAK.
* gimplify.c (goa_stabilize_expr): Add target_expr and rhs arguments,
handle pre_p == NULL case as a test mode that only returns value
but doesn't change gimplify nor change anything otherwise, adjust
recursive calls, add MODIFY_EXPR, ADDR_EXPR, COND_EXPR, TARGET_EXPR
and CALL_EXPR handling, adjust COMPOUND_EXPR handling for
__builtin_clear_padding calls, for !rhs gimplify as lvalue rather
than rvalue.
(gimplify_omp_atomic): Adjust goa_stabilize_expr caller.  Handle
COND_EXPR rhs.  Set weak flag on gimple load/store for
OMP_ATOMIC_WEAK.
* omp-expand.c (omp_memory_order_to_fail_memmodel): New function.
(omp_memory_order_to_memmodel): Adjust for fail clause encoded
in the same enum.
(expand_omp_atomic_cas): New function.
(expand_omp_atomic_pipeline): Use omp_memory_order_to_fail_memmodel
function.
(expand_omp_atomic): Attempt to optimize atomic compare and exchange
using expand_omp_atomic_cas.
gcc/c-family/
* c-common.h (c_finish_omp_atomic): Add r and weak arguments.
* c-omp.c: Include gimple-fold.h.
(c_finish_omp_atomic): Add r and weak arguments.  Add support for
OpenMP 5.1 atomics.
gcc/c/
* c-parser.c (c_parser_conditional_expression): If omp_atomic_lhs and
cond.value is >, < or == with omp_atomic_lhs as one of the operands,
don't call build_conditional_expr, instead build a COND_EXPR directly.
(c_parser_binary_expression): Avoid calling parser_build_binary_op
if omp_atomic_lhs even in more cases for >, < or ==.
(c_parser_omp_atomic): Update function comment for OpenMP 5.1 atomics,
parse OpenMP 5.1 atomics and fail, compare and weak clauses, allow
acq_rel on atomic read/write and acq_rel/acquire clauses on update.
* c-typeck.c (build_binary_op): For flag_openmp only handle
MIN_EXPR/MAX_EXPR.
gcc/cp/
* parser.c (cp_parser_omp_atomic): Allow acq_rel on atomic read/write
and acq_rel/acquire clauses on update.
* semantics.c (finish_omp_atomic): Adjust c_finish_omp_atomic caller.
gcc/testsuite/
* c-c++-common/gomp/atomic-17.c (foo): Add tests for atomic read,
write or update with acq_rel clause and atomic update with acquire 
clause.
* c-c++-common/gomp/atomic-18.c (foo): Adjust expected diagnostics
wording, remove tests moved to atomic-17.c.
* c-c++-common/gomp/atomic-21.c: Expect only 2 omp atomic release and
2 omp atomic acq_rel directives instead of 4 omp atomic release.
* c-c++-common/gomp/atomic-25.c: New test.
* c-c++-common/gomp/atomic-26.c: New test.
* c-c++-common/gomp/atomic-27.c: New test.
* c-c++-common/gomp/atomic-28.c: New test.
* c-c++-common/gomp/atomic-29.c: New test.
* c-c++-common/gomp/atomic-30.c: New test.
* c-c++-common/goacc-gomp/atomic.c: Expect 1 omp atomic release and
1 omp atomic_acq_rel instead of 2 omp atomic release directives.
* gcc.dg/gomp/atomic-5.c: Adjust expected error diagnostic wording.
* g++.dg/gomp/atomic-18.C:Expect 4 omp atomic 

Re: [PATCH 2/2] validate_subreg before call gen_lowpart to avoid ICE.

2021-09-10 Thread Richard Biener via Gcc-patches
On September 10, 2021 6:24:50 PM GMT+02:00, Segher Boessenkool 
 wrote:
>On Fri, Sep 10, 2021 at 03:58:47PM +0200, Richard Biener wrote:
>> On September 10, 2021 3:30:10 PM GMT+02:00, Segher Boessenkool 
>>  wrote:
>> >TRULY_NOOP_TRUNCATION does not make sense to ask if changing mode class.
>> 
>> OK, so there's a mode class comparison missing here which should be a better 
>> fix than calling validate_subreg? 
>
>Yes, we should not call TRULY_NOOP_TRUNCATION_MODES_P for any random two
>modes: such a truncation needs to have a meaning at all, for the
>question to make any sense.  Maybe we can add an assert to this macro to
>root out nonsensical callers?
>
>Btw.  We have
>#define TRULY_NOOP_TRUNCATION_MODES_P(MODE1, MODE2) \
>  (targetm.truly_noop_truncation (GET_MODE_PRECISION (MODE1), \
>  GET_MODE_PRECISION (MODE2)))
>which is not optimal, either: does truncating DFmode to HFmode behave
>the same as truncating DImode to HImode, on every target?  On *any*
>target, even?!

When is it for any non-scalar integral mode? I suspect this was only meaningful 
for integer modes from the start. On x86 i387 math any float mode truncation is 
noop (with not doing actual truncation to inf). 

>
>
>Segher



Go patch committed: Correct condition for calling memclrHasPointers

2021-09-10 Thread Ian Lance Taylor via Gcc-patches
This Go frontend patch corrects the condition under which we call
memclrHasPointers.  When compiling append(s, make([]typ, ln)...),
where typ has a pointer, and the append fits within the existing
capacity of s, the condition used to clear out the new elements was
reversed.  This fixes https://golang.org/issue/47771.  Bootstrapped
and ran Go tests on x86_64-pc-linux-gnu.  Committed to trunk and to
GCC 10 and 11 branches.

Ian

patch.txt
62749196c08af5619c386d78609def261e93b507
diff --git a/gcc/go/gofrontend/MERGE b/gcc/go/gofrontend/MERGE
index c3772694780..ff41af787b1 100644
--- a/gcc/go/gofrontend/MERGE
+++ b/gcc/go/gofrontend/MERGE
@@ -1,4 +1,4 @@
-e42c7c0216aec70834e8827174458aa4a50169fa
+21b30eddc59d92a07264c3b21eb032d6c303d16f
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
diff --git a/gcc/go/gofrontend/expressions.cc b/gcc/go/gofrontend/expressions.cc
index 8d4d168f4e3..ddb1d91f3e5 100644
--- a/gcc/go/gofrontend/expressions.cc
+++ b/gcc/go/gofrontend/expressions.cc
@@ -9350,7 +9350,7 @@ Builtin_call_expression::flatten_append(Gogo* gogo, Named_object* function,
   ref2 = Expression::make_cast(uint_type, ref2, loc);
   cond = Expression::make_binary(OPERATOR_GT, ref, ref2, loc);
   zero = Expression::make_integer_ul(0, int_type, loc);
-  call = Expression::make_conditional(cond, call, zero, loc);
+  call = Expression::make_conditional(cond, zero, call, loc);
 }
 }
   else


Re: More aggressive threading causing loop-interchange-9.c regression

2021-09-10 Thread Jeff Law via Gcc-patches




On 9/10/2021 7:53 AM, Aldy Hernandez via Gcc-patches wrote:



On 9/10/21 3:16 PM, Michael Matz wrote:

Hi,

On Fri, 10 Sep 2021, Aldy Hernandez via Gcc-patches wrote:


  }
+
+  /* Threading through a non-empty latch would cause code to be added


"through an *empty* latch".  The test in code is correct, though.


Whoops.



And for the before/after loops flag you added: we have a
cfun->curr_properties field which can be used.  We even already have a
PROP_loops flag but that is set throughout compilation from CFG
construction until the RTL loop optimizers, so can't be re-used for what
is needed here.  But you still could invent another PROP_ value instead of
adding a new field in struct function.


Oooo, even better.  No inline functions.

Like this?
Aldy

0001-Disable-threading-through-latches-until-after-loop-o.patch

 From ff25faa8dd8721da9bb4715706c662fc09fd4e8c Mon Sep 17 00:00:00 2001
From: Aldy Hernandez 
Date: Thu, 9 Sep 2021 20:30:28 +0200
Subject: [PATCH] Disable threading through latches until after loop
  optimizations.

The motivation for this patch was enabling the use of global ranges in
the path solver, but this destroyed certain properties of loops, which
made subsequent loop optimizations fail.
Consequently, this patch's main goal is to disable jump threading
involving the latch until after loop optimizations have run.

As can be seen in the test adjustments, we mostly shift the threading
from the early threaders (ethread, thread[12]) to the late threaders
(thread[34]).  I have nuked some of the early notes in the testcases
that came as part of the jump threader rewrite.  They're mostly noise
now.

Note that we could probably relax some other restrictions in
profitable_path_p when loop optimizations have completed, but it would
require more testing, and I'm hesitant to touch more things than needed
at this point.  I have added a reminder to the function to keep this
in mind.

Finally, perhaps as a follow-up, we should apply the same restrictions to
the forward threader.  At some point I'd like to combine the cost models.

Tested on x86-64 Linux.

p.s. There is a thorough discussion involving the limitations of jump
threading involving loops here:

https://gcc.gnu.org/pipermail/gcc/2021-September/237247.html

gcc/ChangeLog:

* tree-pass.h (PROP_loop_opts_done): New.
* gimple-range-path.cc (path_range_query::internal_range_of_expr):
Intersect with global range.
* tree-ssa-loop.c (tree_ssa_loop_done): Set PROP_loop_opts_done.
* tree-ssa-threadbackward.c
(back_threader_profitability::profitable_path_p): Disable
threading through latches until after loop optimizations have run.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/ssa-dom-thread-2b.c: Adjust for disabling of
threading through latches.
* gcc.dg/tree-ssa/ssa-dom-thread-6.c: Same.
* gcc.dg/tree-ssa/ssa-dom-thread-7.c: Same.

OK
jeff



Re: [PATCH 2/2] validate_subreg before call gen_lowpart to avoid ICE.

2021-09-10 Thread Segher Boessenkool
On Fri, Sep 10, 2021 at 03:58:47PM +0200, Richard Biener wrote:
> On September 10, 2021 3:30:10 PM GMT+02:00, Segher Boessenkool 
>  wrote:
> >TRULY_NOOP_TRUNCATION does not make sense to ask if changing mode class.
> 
> OK, so there's a mode class comparison missing here which should be a better 
> fix than calling validate_subreg? 

Yes, we should not call TRULY_NOOP_TRUNCATION_MODES_P for any random two
modes: such a truncation needs to have a meaning at all, for the
question to make any sense.  Maybe we can add an assert to this macro to
root out nonsensical callers?

Btw.  We have
#define TRULY_NOOP_TRUNCATION_MODES_P(MODE1, MODE2) \
  (targetm.truly_noop_truncation (GET_MODE_PRECISION (MODE1), \
  GET_MODE_PRECISION (MODE2)))
which is not optimal, either: does truncating DFmode to HFmode behave
the same as truncating DImode to HImode, on every target?  On *any*
target, even?!


Segher


Re: [PATCH] Default Alpha/VMS to DWARF2 debugging only

2021-09-10 Thread Douglas Rupp via Gcc-patches
Dwarf2 only works with an ancient GDB. VMS style was useful because it
implemented enough to get a trace back.

But as you say it’s dead, so no basis for objection.

On Fri, Sep 10, 2021 at 6:47 AM Jeff Law  wrote:

>
>
> On 9/10/2021 12:52 AM, Richard Biener via Gcc-patches wrote:
> > This changes the default debug format for Alpha/VMS to DWARF2 only,
> > skipping emission of VMS debug info which is going to be deprecated
> > for GCC 12 alongside the support for STABS.
> >
> > It looks like other flavors of VMS never used VMS_DEBUG by default
> > but only the alpha port did.
> >
> > I have no good means to test anything here, it might be that we have
> > alpha-vms specific testcases that rely on the previous default.
> >
> > OK for trunk?
> >
> > Thanks,
> > Richard.
> >
> > 2021-09-10  Richard Biener  
> >
> >   * config/alpha/vms.h (PREFERRED_DEBUGGING_TYPE): Define to
> >   DWARF2_DEBUG.
> It's a dead target, so yea, go for it.  Worst case it breaks, someone
> notices, and we know someone still cares about alpha-vms :-)
>
> Jeff
>
>


Re: [PATCH, Fortran] Revert to non-multilib-specific ISO_Fortran_binding.h

2021-09-10 Thread Andreas Schwab
This misses the m68k extended real format.

Andreas.

* ISO_Fortran_binding.h (CFI_type_long_double)
(CFI_type_long_double_Complex) [LDBL_MANT_DIG == 64 &&
LDBL_MIN_EXP == -16382 && LDBL_MAX_EXP == 16384]: Define.
---
 libgfortran/ISO_Fortran_binding.h | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/libgfortran/ISO_Fortran_binding.h b/libgfortran/ISO_Fortran_binding.h
index 5335ea471c7..9c42464affa 100644
--- a/libgfortran/ISO_Fortran_binding.h
+++ b/libgfortran/ISO_Fortran_binding.h
@@ -233,6 +233,13 @@ extern int CFI_setpointer (CFI_cdesc_t *, CFI_cdesc_t *, const CFI_index_t []);
 #define CFI_type_long_double (CFI_type_Real + (10 << CFI_type_kind_shift))
 #define CFI_type_long_double_Complex (CFI_type_Complex + (10 << CFI_type_kind_shift))
 
+/* This is the 96-bit encoding on m68k; Fortran assigns it kind 10.  */
+#elif (LDBL_MANT_DIG == 64 \
+   && LDBL_MIN_EXP == -16382 \
+   && LDBL_MAX_EXP == 16384)
+#define CFI_type_long_double (CFI_type_Real + (10 << CFI_type_kind_shift))
+#define CFI_type_long_double_Complex (CFI_type_Complex + (10 << CFI_type_kind_shift))
+
 /* This is the IEEE 128-bit encoding, same as float128.  */
 #elif (LDBL_MANT_DIG == 113 \
    && LDBL_MIN_EXP == -16381 \
-- 
2.33.0

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."


Re: [PATCH] Fix SFmode subreg of DImode and TImode

2021-09-10 Thread Segher Boessenkool
On Fri, Sep 10, 2021 at 11:09:31AM +0800, Hongtao Liu wrote:
> On Fri, Sep 10, 2021 at 7:49 AM Segher Boessenkool
>  wrote:
> > way too long.  But from that same history it follows that anything you
> > do not super carefully (with testing everywhere) will cause some serious
> Frankly, testing everywhere is too heavy a burden for developers,
> after all, everyone has a limited variety of machines, and may not be
> familiar with using  other targets' simulators.

Your change should be tested on enough relevant targets that we can have
confidence it works properly.  That does not necessarily mean you have
to test everywhere yourself (although that is greatly appreciated, makes
life easier for everyone, including yourself, and as David points out
the cfarm is a great help).

So you didn't realise your patch would wreak havoc on most other targets
than what you tested on.  It happens, it helps if you can avoid it, but
you learn only from things that go wrong :-)

(The patch has still not been reverted btw.  I'll do this later tonight
if you don't).

> And back to the problem we were trying to solve at the beginning
> (subreg:HF(reg:SI)), I guess this is not just a problem in x86
> backend, any backend can encounter similar problems, that's why we
> remove all the weird cases in validate_subreg.

Expand should simply expand to two statements: one doing the real
subreg, the other doing the bit_cast.


Segher


Re: [PATCH] Always default to DWARF2 debug for cygwin and mingw

2021-09-10 Thread Jeff Law via Gcc-patches




On 9/10/2021 12:38 AM, Richard Biener via Gcc-patches wrote:

This removes the fallback to STABS as default for cygwin and mingw
when the assembler does not support .secrel32 and the default is
to emit 32bit code.  Support for .secrel32 was added to binutils 2.16
released in 2005 so instead document that as requirement.

I left the now unused check for .secrel32 in configure around
in case somebody wants to turn that into an error or warning.

OK for trunk?  As before I have no good means to test this but
it should change nothing for people using binutils 2.16+

2021-09-10  Richard Biener  

* config/i386/cygming.h: Always default to DWARF2 debugging.
Do not define DBX_DEBUGGING_INFO, that's done via dbxcoff.h
already.
* doc/install.texi: Document binutils 2.16 as minimum
requirement for mingw.

OK.

jeff



Re: [PATCH] Remove DARWIN_PREFER_DWARF and dead code

2021-09-10 Thread Jeff Law via Gcc-patches




On 9/10/2021 1:19 AM, Richard Biener via Gcc-patches wrote:

This removes the always defined DARWIN_PREFER_DWARF and the code
guarded by it being not defined, removing the possibility to
default some i386 darwin configurations to STABS when it would
not be defined.

OK for trunk?

Thanks,
Richard.

2021-09-10  Richard Biener  

* config/darwin.h (DARWIN_PREFER_DWARF): Do not define.
* config/i386/darwin.h (PREFERRED_DEBUGGING_TYPE): Do not
change based on DARWIN_PREFER_DWARF not being defined.
OK.  I'm not too worried about supporting 32bit darwin 8 and earlier.  
That's got to be at least a decade out of service at this point.


jeff



Re: [PATCH] Always default to DWARF2_DEBUG if not specified, warn about deprecated STABS

2021-09-10 Thread Michael Matz via Gcc-patches
Hello,

On Fri, 10 Sep 2021, Richard Biener via Gcc-patches wrote:

> diagnostic from the Ada frontend.  The warnings are pruned from the
> testsuite output via prune_gcc_output but somehow this doesn't work
> for the tests in gfortran.dg/debug which are now failing with excess
> errors.  That seems to be the case for other fortran .exp as well
> when appending -gstabs, something which works fine for gcc or g++ dgs.

Fortran emits warnings with a capitalized 'W'.  Your regexp only checks 
for lower-case.

> +# Ignore stabs obsoletion warnings
> +regsub -all "(^|\n)\[^\n\]*warning: STABS debugging information is obsolete and not supported anymore\[^\n\]*" $text "" text

This needs to be ".arning" or "\[Ww\]arning".


Ciao,
Michael.


Re: [PATCH] Fix SFmode subreg of DImode and TImode

2021-09-10 Thread Segher Boessenkool
On Fri, Sep 10, 2021 at 12:53:37PM +0200, Richard Biener wrote:
> On Fri, Sep 10, 2021 at 1:50 AM Segher Boessenkool
>  wrote:
> > And many targets have strange rules for bit-strings in which modes can
> > be used as bit-strings in which other modes, and at what offsets in
> > which registers.  Now perhaps none of that is optimal (I bet it isn't),
> > but changing this without a transition plan simply does not work.
> 
> But we _do_ already allow some of them :/  Like

Yes.  And all of this is old and ingrained, and targets depend on the
status quo, so changing this will need more care and planning and
cooperation.  It certainly is a worthwhile thing to improve, but it is
not a small project, and it requires a plan.

>   /* ??? Similarly, e.g. with (subreg:DF (reg:TI)).  Though store_bit_field
>  is the culprit here, and not the backends.  */
>   else if (known_ge (osize, regsize) && known_ge (isize, osize))
> ;
> 
> so for the special case where 'regsize' matches osize it would be
> a bit-cast of a full register from int to float.  But as written it also
> allows (subreg:XF (reg:TI))  which will likely wreak havoc?

That does not pass the isize >= osize test?  Or maybe I don't know
what XFmode is well enough :-)  Hey I can read, I have source, and it
is Friday...

Ah.  So XF has different size on 32-bit and on 64-bit, but that doesn't
even matter here.
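
To make the size argument concrete, a simplified C++ model of the quoted special case follows.  It uses plain byte counts where GCC uses poly_uint64 and known_ge, the function name is made up, and taking regsize to be the word size (4 or 8 bytes) is an assumption of this sketch.

```cpp
// Simplified model of the quoted condition
//   known_ge (osize, regsize) && known_ge (isize, osize)
// with plain byte counts instead of GCC's poly_uint64 values.
bool full_register_case_allows(unsigned isize, unsigned osize, unsigned regsize)
{
  return osize >= regsize && isize >= osize;
}
```

With TImode at 16 bytes and XFmode at 12 bytes (32-bit x86) or 16 bytes (64-bit x86), both full_register_case_allows (16, 12, 4) and full_register_case_allows (16, 16, 8) hold, so (subreg:XF (reg:TI)) passes this particular test either way.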

> Similar for the omode == word_mode check which allows
> (subreg:DI (reg:TF ..)).  That is, the existing special-cases look
> too broad to me - and they probably exist because when validate_subreg
> rejects sth then we can't put it together later when expand split it
> into two subregs and a pseudo ...

I said it before, and I'll say it again, it is a very important point:
expand should not try to optimise this, *at all*.  And not just this,
not *anything*.  Expand's job in the current compiler is only to
translate Gimple to RTL, and nothing more.

When later passes try to optimise things they will always ask the target
if it agrees with the change (all RTL has to pass recog() for example).
This simplifies life.


Segher


[PATCH v3 3/3] gimple: allow more folding of memcpy [PR102125]

2021-09-10 Thread Richard Earnshaw via Gcc-patches

The current restriction on folding memcpy to a single element of size
MOVE_MAX is excessively cautious on most machines and limits some
significant further optimizations.  So relax the restriction provided
the copy size does not exceed MOVE_MAX * MOVE_RATIO and that a SET
insn exists for moving the value into machine registers.

Note that there were already checks in place for having misaligned
move operations when one or more of the operands were unaligned.

On Arm this now permits optimizing

uint64_t bar64(const uint8_t *rData1)
{
    uint64_t buffer;
    memcpy(&buffer, rData1, sizeof(buffer));
    return buffer;
}

from
    ldr     r2, [r0]        @ unaligned
    sub     sp, sp, #8
    ldr     r3, [r0, #4]    @ unaligned
    strd    r2, [sp]
    ldrd    r0, [sp]
    add     sp, sp, #8

to
    mov     r3, r0
    ldr     r0, [r0]        @ unaligned
    ldr     r1, [r3, #4]    @ unaligned
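
As a source-level picture of what the improved code does, the sketch below puts the test case next to a hand-written version that performs two 32-bit unaligned loads.  The second function and its name are made up for illustration; it is not what the compiler emits internally, only an equivalent written in C++.

```cpp
#include <cstdint>
#include <cstring>

// The test case from the patch description.
std::uint64_t bar64(const std::uint8_t *rData1)
{
  std::uint64_t buffer;
  std::memcpy(&buffer, rData1, sizeof(buffer));
  return buffer;
}

// Hypothetical hand-written equivalent: two 32-bit unaligned loads, placed
// into the low-address and high-address halves of the result.
std::uint64_t bar64_two_loads(const std::uint8_t *rData1)
{
  std::uint32_t first, second;
  std::memcpy(&first, rData1, sizeof(first));
  std::memcpy(&second, rData1 + sizeof(first), sizeof(second));

  std::uint64_t result;
  unsigned char *bytes = reinterpret_cast<unsigned char *>(&result);
  std::memcpy(bytes, &first, sizeof(first));
  std::memcpy(bytes + sizeof(first), &second, sizeof(second));
  return result;
}
```

Both functions return the same value for any readable 8-byte source, whatever its alignment or the byte order of the machine.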

PR target/102125 - (ARM Cortex-M3 and newer) missed optimization. memcpy not 
needed operations

gcc/ChangeLog:

PR target/102125
* gimple-fold.c (gimple_fold_builtin_memory_op): Allow folding
memcpy if the size is not more than MOVE_MAX * MOVE_RATIO.
---
 gcc/gimple-fold.c | 16 +++-
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index 3f2c176cff6..d9ffb5006f5 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -67,6 +67,8 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-vector-builder.h"
 #include "tree-ssa-strlen.h"
 #include "varasm.h"
+#include "memmodel.h"
+#include "optabs.h"
 
 enum strlen_range_kind {
   /* Compute the exact constant string length.  */
@@ -957,14 +959,17 @@ gimple_fold_builtin_memory_op (gimple_stmt_iterator *gsi,
 	= build_int_cst (build_pointer_type_for_mode (char_type_node,
 		  ptr_mode, true), 0);
 
-  /* If we can perform the copy efficiently with first doing all loads
- and then all stores inline it that way.  Currently efficiently
-	 means that we can load all the memory into a single integer
-	 register which is what MOVE_MAX gives us.  */
+  /* If we can perform the copy efficiently with first doing all loads and
+	 then all stores inline it that way.  Currently efficiently means that
+	 we can load all the memory with a single set operation and that the
+	 total size is less than MOVE_MAX * MOVE_RATIO.  */
   src_align = get_pointer_alignment (src);
   dest_align = get_pointer_alignment (dest);
   if (tree_fits_uhwi_p (len)
-	  && compare_tree_int (len, MOVE_MAX) <= 0
+	  && (compare_tree_int
+	  (len, (MOVE_MAX
+		 * MOVE_RATIO (optimize_function_for_size_p (cfun))))
+	  <= 0)
 	  /* FIXME: Don't transform copies from strings with known length.
 	 Until GCC 9 this prevented a case in gcc.dg/strlenopt-8.c
 	 from being handled, and the case was XFAILed for that reason.
@@ -1000,6 +1005,7 @@ gimple_fold_builtin_memory_op (gimple_stmt_iterator *gsi,
 	  if (type
 		  && is_a <scalar_int_mode> (TYPE_MODE (type), &mode)
 		  && GET_MODE_SIZE (mode) * BITS_PER_UNIT == ilen * 8
+		  && have_insn_for (SET, mode)
 		  /* If the destination pointer is not aligned we must be able
 		 to emit an unaligned store.  */
 		  && (dest_align >= GET_MODE_ALIGNMENT (mode)


[PATCH v3 2/3] arm: expand handling of movmisalign for DImode [PR102125]

2021-09-10 Thread Richard Earnshaw via Gcc-patches

DImode is currently handled only for machines with vector modes
enabled, but this is unduly restrictive and is generally better done
in core registers.

gcc/ChangeLog:

PR target/102125
* config/arm/arm.md (movmisaligndi): New define_expand.
* config/arm/vec-common.md (movmisalign<mode>): Iterate over VDQ mode.
---
 gcc/config/arm/arm.md| 16 
 gcc/config/arm/vec-common.md |  4 ++--
 2 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 5d3f21b91c4..4adc976b8b6 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -12617,6 +12617,22 @@ (define_expand "copysigndf3"
   }"
 )
 
+;; movmisalign for DImode
+(define_expand "movmisaligndi"
+  [(match_operand:DI 0 "general_operand")
+   (match_operand:DI 1 "general_operand")]
+  "unaligned_access"
+{
+  rtx lo_op0 = gen_lowpart (SImode, operands[0]);
+  rtx lo_op1 = gen_lowpart (SImode, operands[1]);
+  rtx hi_op0 = gen_highpart_mode (SImode, DImode, operands[0]);
+  rtx hi_op1 = gen_highpart_mode (SImode, DImode, operands[1]);
+
+  emit_insn (gen_movmisalignsi (lo_op0, lo_op1));
+  emit_insn (gen_movmisalignsi (hi_op0, hi_op1));
+  DONE;
+})
+
 ;; movmisalign patterns for HImode and SImode.
 (define_expand "movmisalign<mode>"
   [(match_operand:HSI 0 "general_operand")
diff --git a/gcc/config/arm/vec-common.md b/gcc/config/arm/vec-common.md
index 68de4f0f943..e71d9b3811f 100644
--- a/gcc/config/arm/vec-common.md
+++ b/gcc/config/arm/vec-common.md
@@ -281,8 +281,8 @@ (define_expand "cml4"
 })
 
 (define_expand "movmisalign<mode>"
- [(set (match_operand:VDQX 0 "neon_perm_struct_or_reg_operand")
-	(unspec:VDQX [(match_operand:VDQX 1 "neon_perm_struct_or_reg_operand")]
+ [(set (match_operand:VDQ 0 "neon_perm_struct_or_reg_operand")
+	(unspec:VDQ [(match_operand:VDQ 1 "neon_perm_struct_or_reg_operand")]
 	 UNSPEC_MISALIGNED_ACCESS))]
  "ARM_HAVE_<MODE>_LDST && !BYTES_BIG_ENDIAN
   && unaligned_access && !TARGET_REALLY_IWMMXT"


[PATCH v3 1/3] rtl: directly handle MEM in gen_highpart [PR102125]

2021-09-10 Thread Richard Earnshaw via Gcc-patches

gen_lowpart_general handles forming a lowpart of a MEM by using
adjust_address to rework and validate a new version of the MEM.
Do the same for gen_highpart rather than calling simplify_gen_subreg
for this case.

gcc/ChangeLog:

PR target/102125
* emit-rtl.c (gen_highpart): Use adjust_address to handle
MEM rather than calling simplify_gen_subreg.
---
 gcc/emit-rtl.c | 23 +--
 1 file changed, 13 insertions(+), 10 deletions(-)

diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c
index 77ea8948ee8..0ba110879aa 100644
--- a/gcc/emit-rtl.c
+++ b/gcc/emit-rtl.c
@@ -1585,19 +1585,22 @@ gen_highpart (machine_mode mode, rtx x)
   gcc_assert (known_le (msize, (unsigned int) UNITS_PER_WORD)
 	  || known_eq (msize, GET_MODE_UNIT_SIZE (GET_MODE (x))));
 
-  result = simplify_gen_subreg (mode, x, GET_MODE (x),
-subreg_highpart_offset (mode, GET_MODE (x)));
-  gcc_assert (result);
-
-  /* simplify_gen_subreg is not guaranteed to return a valid operand for
- the target if we have a MEM.  gen_highpart must return a valid operand,
- emitting code if necessary to do so.  */
-  if (MEM_P (result))
+  /* gen_lowpart_common handles a lot of special cases due to needing to handle
+ paradoxical subregs; it only calls simplify_gen_subreg when certain that
+ it will produce something meaningful.  The only case we need to handle
+ specially here is MEM.  */
+  if (MEM_P (x))
 {
-  result = validize_mem (result);
-  gcc_assert (result);
+  poly_int64 offset = subreg_highpart_offset (mode, GET_MODE (x));
+  return adjust_address (x, mode, offset);
 }
 
+  result = simplify_gen_subreg (mode, x, GET_MODE (x),
+subreg_highpart_offset (mode, GET_MODE (x)));
+  /* Since we handle MEM directly above, we should never get a MEM back
+ from simplify_gen_subreg.  */
+  gcc_assert (result && !MEM_P (result));
+
   return result;
 }
 


[PATCH v3 0/3] lower more cases of memcpy [PR102125]

2021-09-10 Thread Richard Earnshaw via Gcc-patches
Changes since version 2:

patch 1 is reworked again.

patch 2 is unchanged from v2, it's included here only for completeness.

patch 3 is unchanged, it's included here only for completeness.

 -

This short patch series is designed to address some more cases where we
can usefully lower memcpy operations during gimple fold.  The current
code restricts this lowering to a maximum size of MOVE_MAX, ie the size
of a single integer register on the machine, but with modern architectures
this is likely too restrictive.  The motivating example is

uint64_t bar64(const uint8_t *rData1)
{
    uint64_t buffer;
    __builtin_memcpy(&buffer, rData1, sizeof(buffer));
    return buffer;
}

which on a 32-bit machine ends up with an inlined memcpy followed by a
load from the copied buffer.

The patch series is in three parts, although the middle patch is an
arm-specific tweak to handle unaligned 64-bit moves on more versions
of the Arm architecture.

Patch 1 changes gen_highpart to directly handle forming the highpart
of a MEM by calling adjust_address, this removes the need to validate
a MEM on return from simplify_gen_subreg, so we replace that with an
assert.
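
As a rough picture of the offset that adjust_address is given for a MEM, here is a simplified C++ stand-in for subreg_highpart_offset.  It assumes byte and word endianness agree, which the real helper does not require, and the function name is made up for this sketch.

```cpp
// Simplified stand-in for subreg_highpart_offset: byte offset of the most
// significant outer_bytes within a value of inner_bytes, assuming byte and
// word endianness agree.
unsigned highpart_byte_offset(unsigned outer_bytes, unsigned inner_bytes,
                              bool big_endian)
{
  return big_endian ? 0 : inner_bytes - outer_bytes;
}
```

For the SImode highpart of a DImode MEM this gives offset 4 on a little-endian target and offset 0 on a big-endian one, so the highpart is simply the same memory re-addressed at that offset.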

Patch 2 addresses an issue in the arm backend.  Currently
movmisaligndi only supports vector targets.  This patch reworks the
code so that the pattern can work on any architecture version that
supports misaligned accesses.

Patch 3 then relaxes the gimple fold simplification of memcpy to allow
larger memcpy operations to be folded away, provided that the total
size is less than MOVE_MAX * MOVE_RATIO and provided that the machine
has a suitable SET insn for the appropriate integer mode.
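
The two conditions in patch 3 can be sketched as below.  This is only a model: move_max and move_ratio stand in for the target's MOVE_MAX and MOVE_RATIO, have_set_insn stands in for have_insn_for (SET, mode), the function name is made up, and the remaining checks in gimple_fold_builtin_memory_op (integer mode of the right size, alignment) are left out.

```cpp
#include <cstdint>

// Hypothetical model of the relaxed size check plus the SET insn check.
// The patch compares with "<= 0", so a copy of exactly
// move_max * move_ratio bytes is still accepted.
bool may_fold_memcpy(std::uint64_t len, unsigned move_max,
                     unsigned move_ratio, bool have_set_insn)
{
  return len <= static_cast<std::uint64_t>(move_max) * move_ratio
         && have_set_insn;
}
```

With made-up values move_max = 4 and move_ratio = 2, an 8-byte copy such as the one in bar64 qualifies, while a 9-byte copy does not.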

With these three changes, the testcase above now optimizes to

    mov     r3, r0
    ldr     r0, [r0]        @ unaligned
    ldr     r1, [r3, #4]    @ unaligned
    bx      lr
R.


Richard Earnshaw (3):
  rtl: directly handle MEM in gen_highpart [PR102125]
  arm: expand handling of movmisalign for DImode [PR102125]
  gimple: allow more folding of memcpy [PR102125]

 gcc/config/arm/arm.md| 16 
 gcc/config/arm/vec-common.md |  4 ++--
 gcc/emit-rtl.c   | 23 +--
 gcc/gimple-fold.c| 16 +++-
 4 files changed, 42 insertions(+), 17 deletions(-)

-- 
2.25.1



Re: [PATCH] analyzer: Define INCLUDE_UNIQUE_PTR

2021-09-10 Thread David Malcolm via Gcc-patches
On Fri, 2021-09-10 at 09:01 +0100, Maxim Blinov wrote:
> Un-break the build for AArch64 Darwin. Build currently fails with an
> error very similar to pr82091:
> 
> ```
> In file included from ../../../gcc-master-wip-apple-
> si/gcc/analyzer/engine.cc:69:
> In file included from
> /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/
> Developer/SDKs/MacOSX.sdk/usr/include/c++/v1/memory:678:
> /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/
> Developer/SDKs/MacOSX.sdk/usr/include/c++/v1/stdexcept:239:5: error:
> no member named 'fancy_abort' in namespace 'std::__1'; did you mean
> simply 'fancy_abort'?
>     _VSTD::abort();
>     ^~~
> /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/
> Developer/SDKs/MacOSX.sdk/usr/include/c++/v1/__config:852:15: note:
> expanded from macro '_VSTD'
> 
> ../../../gcc-master-wip-apple-si/gcc/system.h:777:13: note:
> 'fancy_abort' declared here
> extern void fancy_abort (const char *, int, const char *)
>     ^
> ```
> 
> Judging from the following comment in gcc/system.h, we just need to
> define INCLUDE_UNIQUE_PTR since commit eafa9d96923 added the inclusion
> of <memory>:
> 
> ```
> /* Some of the headers included by <memory> can use "abort" within a
>    namespace, e.g. "_VSTD::abort();", which fails after we use the
>    preprocessor to redefine "abort" as "fancy_abort" below.
>    Given that unique-ptr.h can use "free", we need to do this after "free"
>    is declared but before "abort" is overridden.  */
> 
> ```
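
The ordering problem described in that comment can be reproduced in miniature.  The sketch below is an assumption-laden stand-in, not GCC's system.h: fancy_abort here only counts calls so the example can run to completion, and checked_index is a made-up function.

```cpp
#include <cstdlib>
#include <memory>   // included before the macro below, which is the safe order

// Stand-in for GCC's fancy_abort; this one only counts calls so the sketch
// can be exercised without terminating.
static int fancy_abort_calls = 0;
void fancy_abort(const char *, int, const char *) { ++fancy_abort_calls; }

// From here on, a textual "abort ()" becomes a call to fancy_abort.  A
// standard header included after this point that spells a namespace-
// qualified "abort ()" would be rewritten to a qualified "fancy_abort (...)"
// and fail to compile, as in the error above.
#define abort() fancy_abort(__FILE__, __LINE__, __func__)

int checked_index(int i)
{
  if (i < 0)
    abort();   // expands to fancy_abort (...)
  return i;
}

#undef abort   // keep the macro local to this sketch
```

Moving the #include <memory> line below the #define is what triggers the kind of failure reported for engine.cc.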

Sorry about the breakage.

Does the patch fix the build for you?

If so, looks good for trunk.  Please reference PR bootstrap/102242 in
the ChangeLog entry.

Thanks
Dave


> 
> gcc/analyzer/ChangeLog:
> * engine.cc: Define INCLUDE_UNIQUE_PTR.
> ---
>  gcc/analyzer/engine.cc | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/gcc/analyzer/engine.cc b/gcc/analyzer/engine.cc
> index 24f0931197d..f21f8e5b78a 100644
> --- a/gcc/analyzer/engine.cc
> +++ b/gcc/analyzer/engine.cc
> @@ -19,6 +19,7 @@ along with GCC; see the file COPYING3.  If not see
>  <http://www.gnu.org/licenses/>.  */
>  
>  #include "config.h"
> +#define INCLUDE_UNIQUE_PTR
>  #include "system.h"
>  #include "coretypes.h"
>  #include "tree.h"




Re: [COMMITTED][patch][version 9]add -ftrivial-auto-var-init and variable attribute "uninitialized" to gcc

2021-09-10 Thread Qing Zhao via Gcc-patches
Hi, Richard, Jose,

Yes, we will try to come up with a patch to gcc-12/changes.html for this 
feature.

Thanks.

Qing

> On Sep 10, 2021, at 8:46 AM, Jose E. Marchesi via Gcc-patches 
>  wrote:
> 
> 
> Hi Richard.
> 
>> On Thu, 9 Sep 2021, Kees Cook wrote:
>> 
>>> On Thu, Sep 09, 2021 at 10:49:11PM +, Qing Zhao wrote:
> >>>> Hi, FYI
> >>>>
> >>>> I just committed the following patch to gcc upstream:
> >>>>
> >>>>
> >>>> https://gcc.gnu.org/pipermail/gcc-cvs/2021-September/353195.html
>>> 
>>> Hurray! Thank you so much for working on this, and thanks also to the
>>> reviewers and everyone else poking at it.
>>> 
>>> I will go update my Linux Plumbers slides to say "supported" instead of
>>> "proposed". :)
>> 
>> Can you two work on wording to add to gcc-12/changes.html for this
>> feature?  I think it deserves a release note.  Likewise the CTF/BTF
>> support btw.
> 
> What about something like this for the BPF, CTF and BTF changes..
> 
> commit 3826495d1a2c265954d5da13ca71925eea390060 (HEAD -> master)
> Author: Jose E. Marchesi 
> Date:   Fri Sep 10 15:44:30 2021 +0200
> 
>gcc-12/changes.html: BPF, CTF and BTF update
> 
>* htdocs/gcc-12/changes.html (BPF): Item about the CO-RE support.
>(Debugging formats): New section with items about the support for
>CTF and BTF.
> 
> diff --git a/htdocs/gcc-12/changes.html b/htdocs/gcc-12/changes.html
> index 946faa49..936af979 100644
> --- a/htdocs/gcc-12/changes.html
> +++ b/htdocs/gcc-12/changes.html
> @@ -143,6 +143,15 @@ a work-in-progress.
> 
> 
> 
> +BPF
> +
> +  Support for CO-RE (compile-once, run-everywhere) has been added
> +  to the BPF backend.  CO-RE allows to compile portable BPF
> +  programs that are able to run among different versions of the
> +  Linux kernel.
> +  
> +
> +
> 
> 
> 
> @@ -210,7 +219,25 @@ a work-in-progress.
> 
> 
> 
> -
> +Other significant improvements
> +
> +Debugging formats
> +
> +
> +  GCC can now generate debugging information
> +  in <a href="https://ctfstd.org">CTF</a>, a lightweight debugging
> +  format that provides information about C types and the
> +  association between functions and data symbols and types.  This
> +  format is designed to be embedded in ELF files and to be very
> +  compact and simple.  A new command-line
> +  option -gctf enables the generation of CTF.
> +  
> +  GCC can now generate debugging information in BTF.  This is a
> +  debugging format mainly used in BPF programs and the Linux
> +  kernel.  The compiler can generate BTF for any target, when
> +  enabled with the command-line option -gbtf
> +  
> +
> 
> 
> 



Re: More aggressive threading causing loop-interchange-9.c regression

2021-09-10 Thread Michael Matz via Gcc-patches
Hello,

On Fri, 10 Sep 2021, Aldy Hernandez via Gcc-patches wrote:

> Like this?

Yep, but I can't approve.


Ciao,
Michael.


Re: [COMMITTED][patch][version 9]add -ftrivial-auto-var-init and variable attribute "uninitialized" to gcc

2021-09-10 Thread Qing Zhao via Gcc-patches
Hi, Thomas,

Thanks for reporting the issue.

> On Sep 10, 2021, at 4:40 AM, Thomas Schwinge  wrote:
> 
> Hi!
> 
> On 2021-09-10T10:47:00+0200, Christophe LYON via Gcc-patches 
>  wrote:
>> On 10/09/2021 00:49, Qing Zhao via Gcc-patches wrote:
>>> I just committed the following patch to gcc upstream:
>>> 
>>> 
>>> https://gcc.gnu.org/pipermail/gcc-cvs/2021-September/353195.html
> 
>> Several of the new tests fail on arm and aarch64 with -mabi=ilp32.
> 
> Similar for 32-bit x86 testing, or x86_64 with '-m32' testing -- as also
> reported by a number of auto-tester instances.
> 
>> On arm:
>> 
>> gcc:gcc.dg/dg.exp=c-c++-common/auto-init-1.c  -Wc++-compat   scan-tree-dump 
>> gimple "temp5 = .DEFERRED_INIT \\(8, 2, 0\\)"
>> gcc:gcc.dg/dg.exp=c-c++-common/auto-init-1.c  -Wc++-compat   
>> scan-tree-dump gimple "temp7 = .DEFERRED_INIT \\(8, 2, 0\\)"
>> gcc:gcc.dg/dg.exp=c-c++-common/auto-init-2.c  -Wc++-compat   
>> scan-tree-dump gimple "temp5 = .DEFERRED_INIT \\(8, 1, 0\\)"
>> gcc:gcc.dg/dg.exp=c-c++-common/auto-init-2.c  -Wc++-compat   
>> scan-tree-dump gimple "temp7 = .DEFERRED_INIT \\(8, 1, 0\\)"
>> gcc:gcc.dg/dg.exp=c-c++-common/auto-init-3.c  -Wc++-compat   
>> scan-tree-dump gimple "temp3 = .DEFERRED_INIT \\(16, 2, 0\\)"
>> gcc:gcc.dg/dg.exp=c-c++-common/auto-init-4.c  -Wc++-compat   
>> scan-tree-dump gimple "temp3 = .DEFERRED_INIT \\(16, 1, 0\\)"
>> gcc:gcc.dg/dg.exp=c-c++-common/auto-init-5.c  -Wc++-compat   
>> scan-tree-dump gimple "temp3 = .DEFERRED_INIT \\(32, 2, 0\\)"
>> gcc:gcc.dg/dg.exp=c-c++-common/auto-init-6.c  -Wc++-compat   
>> scan-tree-dump gimple "temp3 = .DEFERRED_INIT \\(32, 1, 0\\)"
>> gcc:gcc.dg/dg.exp=c-c++-common/auto-init-padding-1.c  -Wc++-compat   
>> scan-tree-dump gimple ".DEFERRED_INIT \\(24, 1, 0\\)"
>> 
>> on aarch64 -mabi=ilp32:
>> 
>> gcc:gcc.dg/dg.exp=c-c++-common/auto-init-1.c  -Wc++-compat   
>> scan-tree-dump gimple "temp5 = .DEFERRED_INIT \\(8, 2, 0\\)"
>> gcc:gcc.dg/dg.exp=c-c++-common/auto-init-1.c  -Wc++-compat   
>> scan-tree-dump gimple "temp7 = .DEFERRED_INIT \\(8, 2, 0\\)"
>> gcc:gcc.dg/dg.exp=c-c++-common/auto-init-2.c  -Wc++-compat   
>> scan-tree-dump gimple "temp5 = .DEFERRED_INIT \\(8, 1, 0\\)"
>> gcc:gcc.dg/dg.exp=c-c++-common/auto-init-2.c  -Wc++-compat   
>> scan-tree-dump gimple "temp7 = .DEFERRED_INIT \\(8, 1, 0\\)"
>> gcc:gcc.dg/dg.exp=c-c++-common/auto-init-padding-1.c  -Wc++-compat   
>> scan-tree-dump gimple ".DEFERRED_INIT \\(24, 1, 0\\)"
>> gcc:gcc.target/aarch64/aarch64.exp=gcc.target/aarch64/auto-init-2.c 
>> scan-rtl-dump-times expand "0xfefefefefefefefe" 2
>> 
>> gcc:gcc.target/aarch64/aarch64.exp=gcc.target/aarch64/auto-init-padding-5.c 
>> scan-assembler-times stp\txzr, xzr, 2
>> 
>> Can you check?
> 
> On 2021-09-10T11:08:22+0200, Martin Liška  wrote:
>> It's the following bug:
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102269
> 
> No, not ICEs, but just "regular 'scan-tree-dump' FAILs".  I suppose these
> are all data-type mismatches: for example, 'long' or 'int *' not mapping
> to the expected '8'.

Yes, I guess so, will double check on this and fix the issue.
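
A small C++ sketch of why a hard-coded 8 is fragile: it builds the dump text from sizeof, on the assumption (read off the failing patterns, not checked against the implementation) that the first .DEFERRED_INIT argument is the object size in bytes.  The helper name and signature are made up.

```cpp
#include <string>

// Hypothetical helper: build the dump text a scan-tree-dump pattern would
// have to match for a variable of type T, taking the size from the ABI in
// use instead of assuming 8.
template <typename T>
std::string deferred_init_text(const std::string &var, int init_type)
{
  return var + " = .DEFERRED_INIT (" + std::to_string(sizeof(T)) + ", "
         + std::to_string(init_type) + ", 0)";
}
```

On an LP64 target deferred_init_text<long> yields a size of 8, but on ILP32 (arm, aarch64 -mabi=ilp32, x86 -m32) it yields 4, which is the kind of mismatch reported above.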
> 
> 
> Unrelated to the above, I've pushed as obvious
> "Fix 'dg-do run' syntax in 'c-c++-common/auto-init-padding-{2,3}.c'"
> to master branch in commit 5c5c2d86e520c3bf37368309b2fe932c88bdd14f, see
> attached.  (All-PASS per my testing.)

Thanks a lot for the help.

Qing
> 
> 
> Grüße
> Thomas
> 
> 
> From 5c5c2d86e520c3bf37368309b2fe932c88bdd14f Mon Sep 17 00:00:00 2001
> From: Thomas Schwinge 
> Date: Fri, 10 Sep 2021 11:26:50 +0200
> Subject: [PATCH] Fix 'dg-do run' syntax in
> 'c-c++-common/auto-init-padding-{2,3}.c'
> 
> Fix-up for recent commit a25e0b5e6ac8a77a71c229e0a7b744603365b0e9
> "Add -ftrivial-auto-var-init option and uninitialized variable attribute".
> 
>   gcc/testsuite/
>   * c-c++-common/auto-init-padding-2.c: Fix 'dg-do run' syntax.
>   * c-c++-common/auto-init-padding-3.c: Likewise.
> ---
> gcc/testsuite/c-c++-common/auto-init-padding-2.c | 2 +-
> gcc/testsuite/c-c++-common/auto-init-padding-3.c | 2 +-
> 2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/testsuite/c-c++-common/auto-init-padding-2.c 
> b/gcc/testsuite/c-c++-common/auto-init-padding-2.c
> index e2b50dc5ae8..462f5aeab91 100644
> --- a/gcc/testsuite/c-c++-common/auto-init-padding-2.c
> +++ b/gcc/testsuite/c-c++-common/auto-init-padding-2.c
> @@ -1,7 +1,7 @@
> /* To test that the compiler can fill all the paddings to zeroes for the 
>structures when the auto variable is partially initialized,  fully 
>initialized, or not initialized for -ftrivial-auto-var-init=zero.  */
> -/* { dg-do run} */
> +/* { dg-do run } */
> /* 

Re: [COMMITTED][patch][version 9]add -ftrivial-auto-var-init and variable attribute "uninitialized" to gcc

2021-09-10 Thread Qing Zhao via Gcc-patches
Hi, Christophe,

Thanks for reporting the issue. 

Yes, the test cases I added were only tested on aarch64 and x86_64, so
failures on other platforms are expected.

I will take a look at this and see how to resolve the testing issues.

Qing

> On Sep 10, 2021, at 3:47 AM, Christophe LYON  
> wrote:
> 
> 
> On 10/09/2021 00:49, Qing Zhao via Gcc-patches wrote:
>> Hi, FYI
>> 
>> I just committed the following patch to gcc upstream:
>> 
>> 
>> https://gcc.gnu.org/pipermail/gcc-cvs/2021-September/353195.html
>> 
> Hi,
> 
> Several of the new tests fail on arm and aarch64 with -mabi=ilp32.
> 
> On arm:
> 
> gcc:gcc.dg/dg.exp=c-c++-common/auto-init-1.c  -Wc++-compat   scan-tree-dump 
> gimple "temp5 = .DEFERRED_INIT \\(8, 2, 0\\)"
>gcc:gcc.dg/dg.exp=c-c++-common/auto-init-1.c  -Wc++-compat   
> scan-tree-dump gimple "temp7 = .DEFERRED_INIT \\(8, 2, 0\\)"
>gcc:gcc.dg/dg.exp=c-c++-common/auto-init-2.c  -Wc++-compat   
> scan-tree-dump gimple "temp5 = .DEFERRED_INIT \\(8, 1, 0\\)"
>gcc:gcc.dg/dg.exp=c-c++-common/auto-init-2.c  -Wc++-compat   
> scan-tree-dump gimple "temp7 = .DEFERRED_INIT \\(8, 1, 0\\)"
>gcc:gcc.dg/dg.exp=c-c++-common/auto-init-3.c  -Wc++-compat   
> scan-tree-dump gimple "temp3 = .DEFERRED_INIT \\(16, 2, 0\\)"
>gcc:gcc.dg/dg.exp=c-c++-common/auto-init-4.c  -Wc++-compat   
> scan-tree-dump gimple "temp3 = .DEFERRED_INIT \\(16, 1, 0\\)"
>gcc:gcc.dg/dg.exp=c-c++-common/auto-init-5.c  -Wc++-compat   
> scan-tree-dump gimple "temp3 = .DEFERRED_INIT \\(32, 2, 0\\)"
>gcc:gcc.dg/dg.exp=c-c++-common/auto-init-6.c  -Wc++-compat   
> scan-tree-dump gimple "temp3 = .DEFERRED_INIT \\(32, 1, 0\\)"
>gcc:gcc.dg/dg.exp=c-c++-common/auto-init-padding-1.c  -Wc++-compat   
> scan-tree-dump gimple ".DEFERRED_INIT \\(24, 1, 0\\)"
> 
> on aarch64 -mabi=ilp32:
> 
>gcc:gcc.dg/dg.exp=c-c++-common/auto-init-1.c  -Wc++-compat   
> scan-tree-dump gimple "temp5 = .DEFERRED_INIT \\(8, 2, 0\\)"
>gcc:gcc.dg/dg.exp=c-c++-common/auto-init-1.c  -Wc++-compat   
> scan-tree-dump gimple "temp7 = .DEFERRED_INIT \\(8, 2, 0\\)"
>gcc:gcc.dg/dg.exp=c-c++-common/auto-init-2.c  -Wc++-compat   
> scan-tree-dump gimple "temp5 = .DEFERRED_INIT \\(8, 1, 0\\)"
>gcc:gcc.dg/dg.exp=c-c++-common/auto-init-2.c  -Wc++-compat   
> scan-tree-dump gimple "temp7 = .DEFERRED_INIT \\(8, 1, 0\\)"
>gcc:gcc.dg/dg.exp=c-c++-common/auto-init-padding-1.c  -Wc++-compat   
> scan-tree-dump gimple ".DEFERRED_INIT \\(24, 1, 0\\)"
>gcc:gcc.target/aarch64/aarch64.exp=gcc.target/aarch64/auto-init-2.c 
> scan-rtl-dump-times expand "0xfefefefefefefefe" 2
>
> gcc:gcc.target/aarch64/aarch64.exp=gcc.target/aarch64/auto-init-padding-5.c 
> scan-assembler-times stp\txzr, xzr, 2
> 
> Can you check?
> 
> 
> Thanks,
> 
> Christophe
> 
> 
> 
>> Thanks.
>> 
>> Qing
>> 
>>> On Sep 6, 2021, at 5:16 AM, Richard Biener  wrote:
>>> 
>>> On Sat, 21 Aug 2021, Qing Zhao wrote:
>>> 
>>>> Hi,
>>>>
>>>> This is the 8th version of the patch for the new security feature for GCC.
>>>> I have tested it with bootstrap on both x86 and aarch64, regression
>>>> testing on both x86 and aarch64.
>>>> Also tested it with the kernel testing case provided by Kees.
>>>> Also compile CPU2017 (running is ongoing), without any issue.
>>>>
>>>> Please take a look at this patch and let me know any issues.
>>> +  /* If this DECL is a VLA, a temporary address variable for it has been
>>> + created, the replacement for DECL is recorded in DECL_VALUE_EXPR
>>> (decl),
>>> + we should use it as the LHS of the call.  */
>>> +
>>> +  tree lhs_call
>>> += is_vla ? DECL_VALUE_EXPR (decl) : decl;
>>> +  gimplify_assign (lhs_call, call, seq_p);
>>> 
>>> you shouldn't need to replace the lhs with DECL_VALUE_EXPR of it
>>> here, gimplify_assign should take care of that.
>>> 
>>> +/* Return true if the DECL need to be automaticly initialized by the
>>> +   compiler.  */
>>> +static bool
>>> +is_var_need_auto_init (tree decl)
>>> +{
>>> +  if (auto_var_p (decl)
>>> +  && (opt_for_fn (current_function_decl, flag_auto_var_init)
>>> 
>>> maybe I said otherwise at some point but you can test 'flag_auto_var_init'
>>> directly when not in an IPA pass, no need to use 'opt_for_fn'
>>> 
>>> +   > AUTO_INIT_UNINITIALIZED)
>>> +  && (!lookup_attribute ("uninitialized", DECL_ATTRIBUTES (decl))))
>>> +return true;
>>> +  return false;
>>> 
>>> 
>>> diff --git a/gcc/tree.c b/gcc/tree.c
>>> index e923e67b6942..23d7b17774ce 100644
>>> --- a/gcc/tree.c
>>> +++ b/gcc/tree.c
>>> @@ -9508,6 +9508,22 @@ build_common_builtin_nodes (void)
>>>   tree tmp, ftype;
>>>   int ecf_flags;
>>> 
>>> +  /* If user requests automatic variables initialization, the builtin
>>> + BUILT_IN_CLEAR_PADDING is needed.  */
>>> +  if (flag_auto_var_init > AUTO_INIT_UNINITIALIZED
>>> +  && !builtin_decl_explicit_p (BUILT_IN_CLEAR_PADDING))
>>> 
>>> I think this is prone to fail with LTO and auto-var-init setting
>>> different in different TUs.  Just build 

Re: [PATCH 2/2] validate_subreg before call gen_lowpart to avoid ICE.

2021-09-10 Thread Hongtao Liu via Gcc-patches
On Fri, Sep 10, 2021 at 9:44 PM Hongtao Liu  wrote:
>
> On Fri, Sep 10, 2021 at 9:32 PM Richard Biener
>  wrote:
> >
> > On September 10, 2021 3:27:09 PM GMT+02:00, Hongtao Liu 
> >  wrote:
> > >On Fri, Sep 10, 2021 at 9:16 PM Richard Biener via Gcc-patches
> > > wrote:
> > >>
> > >> On Fri, Sep 10, 2021 at 2:58 PM liuhongt  wrote:
> > >> >
> > >> > gcc/ChangeLog:
> > >> >
> > >> > * expmed.c (extract_bit_field_using_extv): validate_subreg
> > >> > before call gen_lowpart.
> > >> > ---
> > >> >  gcc/expmed.c | 6 +-
> > >> >  1 file changed, 5 insertions(+), 1 deletion(-)
> > >> >
> > >> > diff --git a/gcc/expmed.c b/gcc/expmed.c
> > >> > index 3143f38e057..10d62d857a8 100644
> > >> > --- a/gcc/expmed.c
> > >> > +++ b/gcc/expmed.c
> > >> > @@ -1571,12 +1571,16 @@ extract_bit_field_using_extv (const 
> > >> > extraction_insn *extv, rtx op0,
> > >> >
> > >> >if (GET_MODE (target) != ext_mode)
> > >> >  {
> > >> > +  machine_mode tmode = GET_MODE (target);
> > >> >/* Don't use LHS paradoxical subreg if explicit truncation is 
> > >> > needed
> > >> >  between the mode of the extraction (word_mode) and the target
> > >> >  mode.  Instead, create a temporary and use convert_move to set
> > >> >  the target.  */
> > >> >if (REG_P (target)
> > >> > - && TRULY_NOOP_TRUNCATION_MODES_P (GET_MODE (target), 
> > >> > ext_mode))
> > >>
> > >> ^^^
> > >>
> > >> I wonder if herein lies the problem in that the HFmode "truncation" from 
> > >> SImode
> > >> is considered noop?  Note the underlying target hook only looks at the 
> > >> mode
> > >> precision and thus receives 16 and 32, and thus maybe that
> > >> TRULY_NOOP_TRUNCATION_MODES_P query only makes sense for
> > >> integer modes?  Though the documentation of the hook only talks about
> > >> "conversion" of "values" ...
> > >>
> > >> So maybe a targetm.modes_tieable_p (GET_MODE (target), extmode) check
> > >> is missing?
> > >
> > >According to the documentation, it should be true for
> > >targetm.modes_tieable_p(HFmode, SImode) since HFmode can be allocated
> > >to gpr.
> > >
> > >
> > >This hook returns true if a value of mode mode1 is accessible in mode
> > >mode2 without
> > >copying
> > >---
> > >
> > >and also here gen_lowpart (SImode, HFmode, target) is called and hit
> > >gcc_assert, not (subreg:HF (reg:SI) 0)
> >
> > I see. Of course that leads to a suggestion to allow the subreg based on 
> > modes_tieable_p, but then others will know why that's the wrong thing to do?
> I'm testing this
>
> 1 file changed, 2 insertions(+), 1 deletion(-)
> gcc/expmed.c | 3 ++-
>
> modified   gcc/expmed.c
> @@ -1576,7 +1576,8 @@ extract_bit_field_using_extv (const
> extraction_insn *extv, rtx op0,
>   mode.  Instead, create a temporary and use convert_move to set
>   the target.  */
>if (REG_P (target)
> -   && TRULY_NOOP_TRUNCATION_MODES_P (GET_MODE (target), ext_mode))
> +   && TRULY_NOOP_TRUNCATION_MODES_P (GET_MODE (target), ext_mode)
> +   && targetm.modes_tieable_p (GET_MODE (target), ext_mode))
>   {
> target = gen_lowpart (ext_mode, target);
> if (partial_subreg_p (GET_MODE (spec_target), ext_mode))
>
Updated patch.

  Bootstrapped and regtested on x86_64-linux-gnu{-m32,}.  Do I need to
test this patch on other targets, or is it expected to have minimal
impact on them?
  Then, ok for trunk?

> >
> > Richard.
> >
> > >>
> > >> > + && TRULY_NOOP_TRUNCATION_MODES_P (tmode, ext_mode)
> > >> > + && validate_subreg (ext_mode, tmode,
> > >> > + target,
> > >> > + subreg_lowpart_offset (ext_mode, tmode)))
> > >> > {
> > >> >   target = gen_lowpart (ext_mode, target);
> > >> >   if (partial_subreg_p (GET_MODE (spec_target), ext_mode))
> > >> > --
> > >> > 2.27.0
> > >> >
> > >
> > >
> > >
> >
>
>
> --
> BR,
> Hongtao



-- 
BR,
Hongtao


v2-0001-Check-modes_tieable_p-before-call-gen_lowpart-to-.patch
Description: Binary data


Re: [PATCH] Fix SFmode subreg of DImode and TImode

2021-09-10 Thread Hongtao Liu via Gcc-patches
On Fri, Sep 10, 2021 at 10:08 PM David Edelsohn  wrote:
>
> On Thu, Sep 9, 2021 at 11:03 PM Hongtao Liu  wrote:
> >
> > On Fri, Sep 10, 2021 at 7:49 AM Segher Boessenkool
> >  wrote:
> > >
> > > On Thu, Sep 09, 2021 at 08:16:16AM +0200, Richard Biener wrote:
> > > > > I think we should (longer term) get rid of the overloaded meanings and
> > > > > uses of subregs.  One fairly simple thing is to make a new rtx code
> > > > > "bit_cast" (or is there a nice short more traditional name for it?)
> > > >
> > > > But subreg _is_ bit_cast.
> > >
> > > It is not.  (subreg:M (reg:N) O) for O>0, little-endian, is not a
> > > bit_cast.  It is taking a part of a register, or a single register from
> > > a multi-register thing.  Paradoxicals are not bit-casts either.
> > >
> > > Subregs from or to (but not both) integer modes are generally bit_cast,
> > > yeah.
> > >
> > > > What is odd to me is that a "disallowed" subreg
> > > > like (subreg:SF (reg:TI ..) 0) magically becomes valid (in terms of
> > > > validate_subreg) if you rewrite it as (subreg:SF (subreg:SI (reg:TI ..) 
> > > > 0) 0).
> > > > Of course that's nested and invalid but just push the inner subreg to a
> > > > new pseudo and the thing becomes valid.
> > >
> > > Bingo.
> > >
> > > And many targets have strange rules for bit-strings in which modes can
> > > be used as bit-strings in which other modes, and at what offsets in
> > > which registers.  Now perhaps none of that is optimal (I bet it isn't),
> > > but changing this without a transition plan simply does not work.
> > >
> > > > > But that is not the core problem we had here.  The behaviour of the
> > > > > generic parts of the compiler was changed, without testing if that
> > > > > works on other targets but x86.  That is an understandable mistake, it
> > > > > takes some experience to know where the morasses are.  But this change
> > > > > should have been accompanied by testcases exercising the changed code.
> > > > > We would have clearly seen there are issues then, simply by watching
> > > > > gcc-testresults@ (and/or maintainers are on top of the test results
> > > > > anyway).  Also, if there were testcases for this, we could have some
> > > > > confidence that a change in this area is robust.
> > > >
> > > > Well, that only works if some maintainers that are familiar enough
> > > > with all this chime in ;)
> > >
> > > Not really.  It works always.  And it works way better than the
> > > pandemonium we now have with broken targets left and right.
> > >
> > > With testcases anyone can see if any specific target is broken here.
> > >
> > > > It's stage1 so it's understandable that some
> > > > > people (like me ...) are trying to help people make progress even
> > > > if that involves trying to decipher 30 years of GCC history in this
> > > > area (without much success in the end as we see) ;)
> > >
> > > Yeah :-)  And my thanks to you and everyone involved for tackling this
> > > problematic part of GCC, which has been neglected and patched over for
> > > way too long.  But from that same history it follows that anything you
> > > do not super carefully (with testing everywhere) will cause some serious
> > Frankly, testing everywhere is too heavy a burden for developers,
> > after all, everyone has a limited variety of machines, and may not be
> > familiar with using  other targets' simulators.
>
> Hongtao,
>
> This is why the GNU Toolchain community sponsors the GCC Compile Farm.
> Not every architecture is provided, but a good subset.  And there are
> a few, powerful Linux on Power systems in the cluster, such as gcc135.
>
I see.
> And the comments in the code warned of port-specific problems.  Even
> if you cannot test the patch yourself, you should have publicly
> requested that other port maintainers test the patch to shake out
> problems.
Yes, sorry for the inconvenience.
>
> The Tree-SSA passes are mostly target-independent, but RTL is
> target-dependent, which can elicit widely different behavior in the
> RTL passes.  When you make changes to RTL passes in the common parts
> of the compiler, you must consider the impact on all targets.  GCC
> explicitly supports a wide variety of architectures, ABIs and OSes.
> All of the developers strive to ensure that changes don't adversely
> affect any targets.
>
> Thanks, David



-- 
BR,
Hongtao


Re: [PATCH 0/3] bpf: add -mcpu and related feature options

2021-09-10 Thread Jose E. Marchesi via Gcc-patches


Hi David.

> New instructions have been added over time to the eBPF ISA, but
> previously there has been no good method to select which version to
> target in GCC.
>
> This patch adds the following options to the BPF backend:
>
>   -mcpu={v1, v2, v3}
> Select which version of the eBPF ISA to target. This enables or
> disables generation of certain instructions. The default is v3.
>
>   -mjmpext
> Enable extra conditional branch instructions.
> Enabled for CPU v2 and above.
>
>   -mjmp32
> Enable 32-bit jump/branch instructions.
> Enabled for CPU v3 and above.
>
>   -malu32
> Enable 32-bit ALU instructions.
> Enabled for CPU v3 and above.
>
> Negative versions of -mjmpext, -mjmp32, and -malu32 options are also
> supported.
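
The defaults listed above can be summarised in a small toy model. This is only a sketch of the mapping from the -mcpu version to the three feature options as described in the cover letter; it is not the bpf backend code, the names BpfFeatures and features_for_cpu are hypothetical, and the negative options (-mno-jmpext and friends) are not modelled.

```cpp
// Toy model of the option defaults described in the cover letter.
// Hypothetical names; this is not code from gcc/config/bpf.
struct BpfFeatures {
  bool jmpext;  // extra conditional branch instructions
  bool jmp32;   // 32-bit jump/branch instructions
  bool alu32;   // 32-bit ALU instructions
};

// Defaults implied by -mcpu=vN, for N in {1, 2, 3}.
constexpr BpfFeatures features_for_cpu(int version)
{
  return BpfFeatures{version >= 2, version >= 3, version >= 3};
}

static_assert(!features_for_cpu(1).jmpext, "v1: no extended jumps");
static_assert(features_for_cpu(2).jmpext, "v2: extended jumps");
static_assert(!features_for_cpu(2).alu32, "v2: no 32-bit ALU");
static_assert(features_for_cpu(3).jmp32, "v3: 32-bit jumps");
```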

The series is OK.
Thanks!

>
> David Faust (3):
>   bpf: add -mcpu and related feature options
>   bpf testsuite: add tests for new feature options
>   doc: document BPF -mcpu and related options
>
>  gcc/config/bpf/bpf-opts.h|  7 
>  gcc/config/bpf/bpf-protos.h  |  1 +
>  gcc/config/bpf/bpf.c | 41 
>  gcc/config/bpf/bpf.md| 44 +++--
>  gcc/config/bpf/bpf.opt   | 29 ++
>  gcc/doc/invoke.texi  | 39 ++-
>  gcc/testsuite/gcc.target/bpf/alu-1.c | 56 +++
>  gcc/testsuite/gcc.target/bpf/jmp-1.c | 57 
>  8 files changed, 253 insertions(+), 21 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/bpf/alu-1.c
>  create mode 100644 gcc/testsuite/gcc.target/bpf/jmp-1.c


[committed] libstdc++: Use "test.invalid." for invalid hostname

2021-09-10 Thread Jonathan Wakely via Gcc-patches
This avoids test.invalid.some.domain being successfully resolved.

libstdc++-v3/ChangeLog:

* testsuite/experimental/net/internet/resolver/ops/lookup.cc:
Fix invalid hostname to only match the .invalid TLD.

Tested x86_64-linux. Committed to trunk.

commit 7f8af6dc82a0dac0d97fdd4d1f2055e932f29216
Author: Jonathan Wakely 
Date:   Fri Sep 10 15:08:27 2021

libstdc++: Use "test.invalid." for invalid hostname

This avoids test.invalid.some.domain being successfully resolved.

libstdc++-v3/ChangeLog:

* testsuite/experimental/net/internet/resolver/ops/lookup.cc:
Fix invalid hostname to only match the .invalid TLD.

diff --git 
a/libstdc++-v3/testsuite/experimental/net/internet/resolver/ops/lookup.cc 
b/libstdc++-v3/testsuite/experimental/net/internet/resolver/ops/lookup.cc
index 69be194fa29..8bd4dbacad2 100644
--- a/libstdc++-v3/testsuite/experimental/net/internet/resolver/ops/lookup.cc
+++ b/libstdc++-v3/testsuite/experimental/net/internet/resolver/ops/lookup.cc
@@ -97,7 +97,7 @@ test03()
   std::error_code ec;
   io_context ctx;
   ip::tcp::resolver resolv(ctx);
-  auto addrs = resolv.resolve("test.invalid", "http", ec);
+  auto addrs = resolv.resolve("test.invalid.", "http", ec);
   VERIFY( ec );
   VERIFY( addrs.size() == 0 );
   VERIFY( addrs.begin() == addrs.end() );
@@ -105,7 +105,7 @@ test03()
 #if __cpp_exceptions
   bool caught = false;
   try {
-resolv.resolve("test.invalid", "http");
+resolv.resolve("test.invalid.", "http");
   } catch (const std::system_error& e) {
 caught = true;
 VERIFY( e.code() == ec );
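
The reason for the trailing dot is that it makes the name absolute, so a resolver configured with search domains should not try "test.invalid.some.domain". A minimal sketch of that distinction follows; is_absolute_name is a hypothetical helper for illustration only and is not part of libstdc++ or its testsuite.

```cpp
// Hypothetical helper, for illustration only.
// Returns true if HOST ends with '.', i.e. it is an absolute (rooted)
// name to which a resolver is not expected to append search domains.
constexpr bool is_absolute_name(const char *host)
{
  const char *last = nullptr;
  for (const char *p = host; *p != '\0'; ++p)
    last = p;  // remember the final character seen so far
  return last != nullptr && *last == '.';
}

static_assert(!is_absolute_name("test.invalid"), "relative name");
static_assert(is_absolute_name("test.invalid."), "absolute name");
```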


Re: [PATCH] Fix SFmode subreg of DImode and TImode

2021-09-10 Thread David Edelsohn via Gcc-patches
On Thu, Sep 9, 2021 at 11:03 PM Hongtao Liu  wrote:
>
> On Fri, Sep 10, 2021 at 7:49 AM Segher Boessenkool
>  wrote:
> >
> > On Thu, Sep 09, 2021 at 08:16:16AM +0200, Richard Biener wrote:
> > > > I think we should (longer term) get rid of the overloaded meanings and
> > > > uses of subregs.  One fairly simple thing is to make a new rtx code
> > > > "bit_cast" (or is there a nice short more traditional name for it?)
> > >
> > > But subreg _is_ bit_cast.
> >
> > It is not.  (subreg:M (reg:N) O) for O>0, little-endian, is not a
> > bit_cast.  It is taking a part of a register, or a single register from
> > a multi-register thing.  Paradoxicals are not bit-casts either.
> >
> > Subregs from or to (but not both) integer modes are generally bit_cast,
> > yeah.
> >
> > > What is odd to me is that a "disallowed" subreg
> > > like (subreg:SF (reg:TI ..) 0) magically becomes valid (in terms of
> > > validate_subreg) if you rewrite it as (subreg:SF (subreg:SI (reg:TI ..) 
> > > 0) 0).
> > > Of course that's nested and invalid but just push the inner subreg to a
> > > new pseudo and the thing becomes valid.
> >
> > Bingo.
> >
> > And many targets have strange rules for bit-strings in which modes can
> > be used as bit-strings in which other modes, and at what offsets in
> > which registers.  Now perhaps none of that is optimal (I bet it isn't),
> > but changing this without a transition plan simply does not work.
> >
> > > > But that is not the core problem we had here.  The behaviour of the
> > > > generic parts of the compiler was changed, without testing if that
> > > > works on other targets but x86.  That is an understandable mistake, it
> > > > takes some experience to know where the morasses are.  But this change
> > > > should have been accompanied by testcases exercising the changed code.
> > > > We would have clearly seen there are issues then, simply by watching
> > > > gcc-testresults@ (and/or maintainers are on top of the test results
> > > > anyway).  Also, if there were testcases for this, we could have some
> > > > confidence that a change in this area is robust.
> > >
> > > Well, that only works if some maintainers that are familiar enough
> > > with all this chime in ;)
> >
> > Not really.  It works always.  And it works way better than the
> > pandemonium we now have with broken targets left and right.
> >
> > With testcases anyone can see if any specific target is broken here.
> >
> > > It's stage1 so it's understandable that some
> > > > people (like me ...) are trying to help people make progress even
> > > if that involves trying to decipher 30 years of GCC history in this
> > > area (without much success in the end as we see) ;)
> >
> > Yeah :-)  And my thanks to you and everyone involved for tackling this
> > problematic part of GCC, which has been neglected and patched over for
> > way too long.  But from that same history it follows that anything you
> > do not super carefully (with testing everywhere) will cause some serious
> Frankly, testing everywhere is too heavy a burden for developers,
> after all, everyone has a limited variety of machines, and may not be
> familiar with using  other targets' simulators.

Hongtao,

This is why the GNU Toolchain community sponsors the GCC Compile Farm.
Not every architecture is provided, but a good subset.  And there are
a few, powerful Linux on Power systems in the cluster, such as gcc135.

And the comments in the code warned of port-specific problems.  Even
if you cannot test the patch yourself, you should have publicly
requested that other port maintainers test the patch to shake out
problems.

The Tree-SSA passes are mostly target-independent, but RTL is
target-dependent, which can elicit widely different behavior in the
RTL passes.  When you make changes to RTL passes in the common parts
of the compiler, you must consider the impact on all targets.  GCC
explicitly supports a wide variety of architectures, ABIs and OSes.
All of the developers strive to ensure that changes don't adversely
affect any targets.

Thanks, David


Re: [COMMITTED][patch][version 9]add -ftrivial-auto-var-init and variable attribute "uninitialized" to gcc

2021-09-10 Thread Jeff Law via Gcc-patches




On 9/10/2021 3:08 AM, Martin Liška wrote:

On 9/10/21 10:47, Christophe LYON via Gcc-patches wrote:

Can you check?


It's the following bug:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102269
And it's already fixed too ;-)  All my targets with a kernel build 
component were failing because of this ;-)


jeff


[PATCH] libsanitizer: Add AM_CCASFLAGS to Makefile.am

2021-09-10 Thread H.J. Lu via Gcc-patches
Add AM_CCASFLAGS to Makefile.am to compile assembly code with $CET_FLAGS.

* asan/Makefile.am (AM_CCASFLAGS): New.  Set to $(EXTRA_ASFLAGS).
* hwasan/Makefile.am (AM_CCASFLAGS): Likewise.
* interception/Makefile.am (AM_CCASFLAGS): Likewise.
* lsan/Makefile.am (AM_CCASFLAGS): Likewise.
* tsan/Makefile.am (AM_CCASFLAGS): Likewise.
* ubsan/Makefile.am (AM_CCASFLAGS): Likewise.
* asan/Makefile.in: Regenerate.
* hwasan/Makefile.in: Likewise.
* interception/Makefile.in: Likewise.
* lsan/Makefile.in: Likewise.
* tsan/Makefile.in: Likewise.
* ubsan/Makefile.in: Likewise.
---
 libsanitizer/asan/Makefile.am | 1 +
 libsanitizer/asan/Makefile.in | 1 +
 libsanitizer/hwasan/Makefile.am   | 1 +
 libsanitizer/hwasan/Makefile.in   | 1 +
 libsanitizer/interception/Makefile.am | 1 +
 libsanitizer/interception/Makefile.in | 1 +
 libsanitizer/lsan/Makefile.am | 1 +
 libsanitizer/lsan/Makefile.in | 1 +
 libsanitizer/tsan/Makefile.am | 1 +
 libsanitizer/tsan/Makefile.in | 1 +
 libsanitizer/ubsan/Makefile.am| 1 +
 libsanitizer/ubsan/Makefile.in| 1 +
 12 files changed, 12 insertions(+)

diff --git a/libsanitizer/asan/Makefile.am b/libsanitizer/asan/Makefile.am
index 74658ca7b9c..4f802f723d6 100644
--- a/libsanitizer/asan/Makefile.am
+++ b/libsanitizer/asan/Makefile.am
@@ -11,6 +11,7 @@ AM_CXXFLAGS = -Wall -W -Wno-unused-parameter -Wwrite-strings 
-pedantic -Wno-long
 AM_CXXFLAGS += $(LIBSTDCXX_RAW_CXX_CXXFLAGS)
 AM_CXXFLAGS += -std=gnu++14
 AM_CXXFLAGS += $(EXTRA_CXXFLAGS)
+AM_CCASFLAGS = $(EXTRA_ASFLAGS)
 ACLOCAL_AMFLAGS = -I $(top_srcdir) -I $(top_srcdir)/config
 
 toolexeclib_LTLIBRARIES = libasan.la
diff --git a/libsanitizer/asan/Makefile.in b/libsanitizer/asan/Makefile.in
index 53efe526f9c..528ab61312c 100644
--- a/libsanitizer/asan/Makefile.in
+++ b/libsanitizer/asan/Makefile.in
@@ -421,6 +421,7 @@ AM_CXXFLAGS = -Wall -W -Wno-unused-parameter 
-Wwrite-strings -pedantic \
-fomit-frame-pointer -funwind-tables -fvisibility=hidden \
-Wno-variadic-macros -fno-ipa-icf \
$(LIBSTDCXX_RAW_CXX_CXXFLAGS) -std=gnu++14 $(EXTRA_CXXFLAGS)
+AM_CCASFLAGS = $(EXTRA_ASFLAGS)
 ACLOCAL_AMFLAGS = -I $(top_srcdir) -I $(top_srcdir)/config
 toolexeclib_LTLIBRARIES = libasan.la
 nodist_toolexeclib_HEADERS = libasan_preinit.o
diff --git a/libsanitizer/hwasan/Makefile.am b/libsanitizer/hwasan/Makefile.am
index 5e3a0f1b0a1..7f5a737a6bb 100644
--- a/libsanitizer/hwasan/Makefile.am
+++ b/libsanitizer/hwasan/Makefile.am
@@ -8,6 +8,7 @@ AM_CXXFLAGS = -Wall -W -Wno-unused-parameter -Wwrite-strings 
-pedantic -Wno-long
 AM_CXXFLAGS += $(LIBSTDCXX_RAW_CXX_CXXFLAGS)
 AM_CXXFLAGS += -std=gnu++14
 AM_CXXFLAGS += $(EXTRA_CXXFLAGS)
+AM_CCASFLAGS = $(EXTRA_ASFLAGS)
 ACLOCAL_AMFLAGS = -I $(top_srcdir) -I $(top_srcdir)/config
 
 toolexeclib_LTLIBRARIES = libhwasan.la
diff --git a/libsanitizer/hwasan/Makefile.in b/libsanitizer/hwasan/Makefile.in
index 22c5266a120..4d216ad4a48 100644
--- a/libsanitizer/hwasan/Makefile.in
+++ b/libsanitizer/hwasan/Makefile.in
@@ -409,6 +409,7 @@ AM_CXXFLAGS = -Wall -W -Wno-unused-parameter 
-Wwrite-strings -pedantic \
-funwind-tables -fvisibility=hidden -Wno-variadic-macros \
-fno-ipa-icf $(LIBSTDCXX_RAW_CXX_CXXFLAGS) -std=gnu++14 \
$(EXTRA_CXXFLAGS)
+AM_CCASFLAGS = $(EXTRA_ASFLAGS)
 ACLOCAL_AMFLAGS = -I $(top_srcdir) -I $(top_srcdir)/config
 toolexeclib_LTLIBRARIES = libhwasan.la
 hwasan_files = \
diff --git a/libsanitizer/interception/Makefile.am 
b/libsanitizer/interception/Makefile.am
index efa90a49aa1..f7013b4ea94 100644
--- a/libsanitizer/interception/Makefile.am
+++ b/libsanitizer/interception/Makefile.am
@@ -8,6 +8,7 @@ AM_CXXFLAGS = -Wall -W -Wno-unused-parameter -Wwrite-strings 
-pedantic -Wno-long
 AM_CXXFLAGS += $(LIBSTDCXX_RAW_CXX_CXXFLAGS)
 AM_CXXFLAGS += -std=gnu++14
 AM_CXXFLAGS += $(EXTRA_CXXFLAGS)
+AM_CCASFLAGS = $(EXTRA_ASFLAGS)
 ACLOCAL_AMFLAGS = -I m4
 
 noinst_LTLIBRARIES = libinterception.la
diff --git a/libsanitizer/interception/Makefile.in 
b/libsanitizer/interception/Makefile.in
index 4a872cb4969..326ee9a1818 100644
--- a/libsanitizer/interception/Makefile.in
+++ b/libsanitizer/interception/Makefile.in
@@ -339,6 +339,7 @@ AM_CXXFLAGS = -Wall -W -Wno-unused-parameter 
-Wwrite-strings -pedantic \
-fomit-frame-pointer -funwind-tables -fvisibility=hidden \
-Wno-variadic-macros $(LIBSTDCXX_RAW_CXX_CXXFLAGS) \
-std=gnu++14 $(EXTRA_CXXFLAGS)
+AM_CCASFLAGS = $(EXTRA_ASFLAGS)
 ACLOCAL_AMFLAGS = -I m4
 noinst_LTLIBRARIES = libinterception.la
 interception_files = \
diff --git a/libsanitizer/lsan/Makefile.am b/libsanitizer/lsan/Makefile.am
index f4db8e37683..6ff28ff5eea 100644
--- a/libsanitizer/lsan/Makefile.am
+++ b/libsanitizer/lsan/Makefile.am
@@ -8,6 +8,7 @@ AM_CXXFLAGS = -Wall -W -Wno-unused-parameter -Wwrite-strings 
-pedantic -Wno-long
 AM_CXXFLAGS += 

Re: [PATCH 2/2] validate_subreg before call gen_lowpart to avoid ICE.

2021-09-10 Thread Richard Biener via Gcc-patches
On September 10, 2021 3:30:10 PM GMT+02:00, Segher Boessenkool 
 wrote:
>On Fri, Sep 10, 2021 at 03:15:56PM +0200, Richard Biener wrote:
>> On Fri, Sep 10, 2021 at 2:58 PM liuhongt  wrote:
>> >if (REG_P (target)
>> > - && TRULY_NOOP_TRUNCATION_MODES_P (GET_MODE (target), ext_mode))
>> 
>> ^^^
>> 
>> I wonder if herein lies the problem in that the HFmode "truncation" from 
>> SImode
>> is considered noop?  Note the underlying target hook only looks at the mode
>> precision and thus receives 16 and 32, and thus maybe that
>> TRULY_NOOP_TRUNCATION_MODES_P query only makes sense for
>> integer modes?  Though the documentation of the hook only talks about
>> "conversion" of "values" ...
>
>@deftypefn {Target Hook} bool TARGET_TRULY_NOOP_TRUNCATION (poly_uint64 
>@var{outprec}, poly_uint64 @var{inprec})
>This hook returns true if it is safe to ``convert'' a value of
>@var{inprec} bits to one of @var{outprec} bits (where @var{outprec} is
>smaller than @var{inprec}) by merely operating on it as if it had only
>@var{outprec} bits.  The default returns true unconditionally, which
>is correct for most machines.  When @code{TARGET_TRULY_NOOP_TRUNCATION}
>returns false, the machine description should provide a @code{trunc}
>optab to specify the RTL that performs the required truncation.
>
>
>@cindex @code{trunc@var{m}@var{n}2} instruction pattern
>@item @samp{trunc@var{m}@var{n}2}
>Truncate operand 1 (valid for mode @var{m}) to mode @var{n} and
>store in operand 0 (which has mode @var{n}).  Both modes must be fixed
>point or both floating point.
>
>
>TRULY_NOOP_TRUNCATION does not make sense to ask if changing mode class.

OK, so there's a mode class comparison missing here which should be a better 
fix than calling validate_subreg? 

Richard. 

>
>Segher
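
The mode class point can be illustrated with a toy model. This is a sketch under stated assumptions, not GCC code: ModeClass, Mode, target_truly_noop_truncation and the two query functions are hypothetical stand-ins, and the stub hook simply mirrors the documented default of returning true unconditionally.

```cpp
// Toy model, not GCC internals.  All names are hypothetical.
enum class ModeClass { Int, Float };

struct Mode {
  ModeClass cls;
  unsigned precision;  // width in bits
};

// Stand-in for the target hook: it only sees precisions, and the
// documented default returns true unconditionally.
constexpr bool target_truly_noop_truncation(unsigned /*outprec*/,
                                            unsigned /*inprec*/)
{
  return true;
}

// Query that looks at precision only, as in the code under discussion.
constexpr bool noop_truncation_unguarded(Mode out, Mode in)
{
  return target_truly_noop_truncation(out.precision, in.precision);
}

// Query that first requires both modes to be integer modes.
constexpr bool noop_truncation_guarded(Mode out, Mode in)
{
  return out.cls == ModeClass::Int && in.cls == ModeClass::Int
         && target_truly_noop_truncation(out.precision, in.precision);
}

constexpr Mode HF{ModeClass::Float, 16};
constexpr Mode HI{ModeClass::Int, 16};
constexpr Mode SI{ModeClass::Int, 32};

static_assert(noop_truncation_unguarded(HF, SI), "precision-only says yes");
static_assert(!noop_truncation_guarded(HF, SI), "class check rejects HF/SI");
static_assert(noop_truncation_guarded(HI, SI), "integer pair still allowed");
```

In this model the unguarded query treats a 32-bit to 16-bit change as a no-op truncation even when the 16-bit mode is a float mode, which is the situation described above; the guarded variant refuses to answer "yes" across mode classes.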



Re: [PATCH] tree-optimization/102155 - fix LIM fill_always_executed_in CFG walk

2021-09-10 Thread Xionghu Luo via Gcc-patches




On 2021/9/9 18:55, Richard Biener wrote:

diff --git a/gcc/tree-ssa-loop-im.c b/gcc/tree-ssa-loop-im.c
index 5d6845478e7..4b187c2cdaf 100644
--- a/gcc/tree-ssa-loop-im.c
+++ b/gcc/tree-ssa-loop-im.c
@@ -3074,15 +3074,13 @@ fill_always_executed_in_1 (class loop *loop, sbitmap 
contains_call)
break;
  
  	  if (bb->loop_father->header == bb)

-   {
- if (!dominated_by_p (CDI_DOMINATORS, loop->latch, bb))
-   break;
-
- /* In a loop that is always entered we may proceed anyway.
-But record that we entered it and stop once we leave it
-since it might not be finite.  */
- inn_loop = bb->loop_father;
-   }
+   /* Record that we enter into a subloop since it might not
+  be finite.  */
+   /* ???  Entering into a not always executed subloop makes
+  fill_always_executed_in quadratic in loop depth since
+  we walk those loops N times.  This is not a problem
+  in practice though, see PR102253 for a worst-case testcase.  */
+   inn_loop = bb->loop_father;



Yes, your two patches extracted get_loop_body_in_dom_order and removed
the inn_loop break logic for the case where it doesn't dominate the outer
loop.  I confirmed that the replacement saves ~10% build time because it no
longer does a full DOM walk, and that it marks the previously ignored
ALWAYS_EXECUTED bbs.
But if we no longer break for the inner loop, why still keep the *inn_loop*
variable?
It seems unnecessary and confusing; could we just remove it and restore the
original infinite loop check in bb->succs so the code is easier to
understand?


diff --git a/gcc/tree-ssa-loop-im.c b/gcc/tree-ssa-loop-im.c
index d1e2104233b..82a0509e0c4 100644
--- a/gcc/tree-ssa-loop-im.c
+++ b/gcc/tree-ssa-loop-im.c
@@ -3200,7 +3200,6 @@ fill_always_executed_in_1 (class loop *loop, sbitmap 
contains_call)
 {
   basic_block bb = NULL, last = NULL;
   edge e;
-  class loop *inn_loop = loop;

   if (ALWAYS_EXECUTED_IN (loop->header) == NULL)
 {
@@ -3213,17 +3212,6 @@ fill_always_executed_in_1 (class loop *loop, sbitmap 
contains_call)
  edge_iterator ei;
  bb = worklist.pop ();

- if (!flow_bb_inside_loop_p (inn_loop, bb))
-   {
- /* When we are leaving a possibly infinite inner loop
-we have to stop processing.  */
- if (!finite_loop_p (inn_loop))
-   break;
- /* If the loop was finite we can continue with processing
-the loop we exited to.  */
- inn_loop = bb->loop_father;
-   }
-
  if (dominated_by_p (CDI_DOMINATORS, loop->latch, bb))
last = bb;

@@ -3232,8 +3220,15 @@ fill_always_executed_in_1 (class loop *loop, sbitmap 
contains_call)

 /* If LOOP exits from this BB stop processing.  */
  FOR_EACH_EDGE (e, ei, bb->succs)
+ {
if (!flow_bb_inside_loop_p (loop, e->dest))
  break;
+   /* Or we enter a possibly non-finite loop.  */
+   if (flow_loop_nested_p (bb->loop_father,
+ e->dest->loop_father)
+   && ! finite_loop_p (e->dest->loop_father))
+ break;
+ }
  if (e)
break;

@@ -3242,15 +3237,6 @@ fill_always_executed_in_1 (class loop *loop, sbitmap 
contains_call)
  if (bb->flags & BB_IRREDUCIBLE_LOOP)
break;

- if (bb->loop_father->header == bb)
-   /* Record that we enter into a subloop since it might not
-  be finite.  */
-   /* ???  Entering into a not always executed subloop makes
-  fill_always_executed_in quadratic in loop depth since
-  we walk those loops N times.  This is not a problem
-  in practice though, see PR102253 for a worst-case testcase.  */
-   inn_loop = bb->loop_father;
-
  /* Walk the body of LOOP sorted by dominance relation.  Additionally,
 if a basic block S dominates the latch, then only blocks dominated
 by S are after it.

 

  
  	  /* Walk the body of LOOP sorted by dominance relation.  Additionally,

 if a basic block S dominates the latch, then only blocks dominated


--
Thanks,
Xionghu


Re: More aggressive threading causing loop-interchange-9.c regression

2021-09-10 Thread Aldy Hernandez via Gcc-patches



On 9/10/21 3:16 PM, Michael Matz wrote:

Hi,

On Fri, 10 Sep 2021, Aldy Hernandez via Gcc-patches wrote:


  }
+
+  /* Threading through a non-empty latch would cause code to be added


"through an *empty* latch".  The test in code is correct, though.


Whoops.



And for the before/after loops flag you added: we have a
cfun->curr_properties field which can be used.  We even already have a
PROP_loops flag but that is set throughout compilation from CFG
construction until the RTL loop optimizers, so can't be re-used for what
is needed here.  But you still could invent another PROP_ value instead of
adding a new field in struct function.
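
The suggestion above can be sketched as a property bitmask. This is a toy model under stated assumptions, not the GCC implementation: the constants and both functions are hypothetical names, and only the idea (a dedicated bit that is set once the loop optimizers are done and that the threader consults) is taken from the discussion.

```cpp
// Toy model of a per-function property bitmask.  Hypothetical names;
// these are not the PROP_* values from tree-pass.h.
enum : unsigned {
  kPropCfg = 1u << 0,
  kPropLoops = 1u << 1,         // set for most of the compilation
  kPropLoopOptsDone = 1u << 2   // new bit: loop optimizations have run
};

// What the end of the loop pipeline would do to the current properties.
constexpr unsigned finish_loop_opts(unsigned props)
{
  return props | kPropLoopOptsDone;
}

// What a threader profitability check could test before threading
// through a latch.
constexpr bool may_thread_through_latch(unsigned props)
{
  return (props & kPropLoopOptsDone) != 0;
}

static_assert(!may_thread_through_latch(kPropCfg | kPropLoops),
              "early threaders must leave latches alone");
static_assert(may_thread_through_latch(finish_loop_opts(kPropCfg | kPropLoops)),
              "late threaders may thread through latches");
```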


Oooo, even better.  No inline functions.

Like this?
Aldy
>From ff25faa8dd8721da9bb4715706c662fc09fd4e8c Mon Sep 17 00:00:00 2001
From: Aldy Hernandez 
Date: Thu, 9 Sep 2021 20:30:28 +0200
Subject: [PATCH] Disable threading through latches until after loop
 optimizations.

The motivation for this patch was enabling the use of global ranges in
the path solver, but this caused certain properties of loops to be
destroyed, which made subsequent loop optimizations fail.
Consequently, this patch's main goal is to disable jump threading
involving the latch until after loop optimizations have run.

As can be seen in the test adjustments, we mostly shift the threading
from the early threaders (ethread, thread[12]) to the late threaders
(thread[34]).  I have nuked some of the early notes in the testcases
that came as part of the jump threader rewrite.  They're mostly noise
now.

Note that we could probably relax some other restrictions in
profitable_path_p when loop optimizations have completed, but it would
require more testing, and I'm hesitant to touch more things than needed
at this point.  I have added a reminder to the function to keep this
in mind.

Finally, perhaps as a follow-up, we should apply the same restrictions to
the forward threader.  At some point I'd like to combine the cost models.

Tested on x86-64 Linux.

p.s. There is a thorough discussion involving the limitations of jump
threading involving loops here:

	https://gcc.gnu.org/pipermail/gcc/2021-September/237247.html

gcc/ChangeLog:

	* tree-pass.h (PROP_loop_opts_done): New.
	* gimple-range-path.cc (path_range_query::internal_range_of_expr):
	Intersect with global range.
	* tree-ssa-loop.c (tree_ssa_loop_done): Set PROP_loop_opts_done.
	* tree-ssa-threadbackward.c
	(back_threader_profitability::profitable_path_p): Disable
	threading through latches until after loop optimizations have run.

gcc/testsuite/ChangeLog:

	* gcc.dg/tree-ssa/ssa-dom-thread-2b.c: Adjust for disabling of
	threading through latches.
	* gcc.dg/tree-ssa/ssa-dom-thread-6.c: Same.
	* gcc.dg/tree-ssa/ssa-dom-thread-7.c: Same.

Co-authored-by: Michael Matz 
---
 gcc/gimple-range-path.cc  |  3 ++
 .../gcc.dg/tree-ssa/ssa-dom-thread-2b.c   |  4 +-
 .../gcc.dg/tree-ssa/ssa-dom-thread-6.c| 37 +--
 .../gcc.dg/tree-ssa/ssa-dom-thread-7.c| 17 +
 gcc/tree-pass.h   |  2 +
 gcc/tree-ssa-loop.c   |  2 +-
 gcc/tree-ssa-threadbackward.c | 28 +-
 7 files changed, 37 insertions(+), 56 deletions(-)

diff --git a/gcc/gimple-range-path.cc b/gcc/gimple-range-path.cc
index a4fa3b296ff..c616b65756f 100644
--- a/gcc/gimple-range-path.cc
+++ b/gcc/gimple-range-path.cc
@@ -127,6 +127,9 @@ path_range_query::internal_range_of_expr (irange &r, tree name, gimple *stmt)
   basic_block bb = stmt ? gimple_bb (stmt) : exit_bb ();
   if (stmt && range_defined_in_block (r, name, bb))
 {
+  if (TREE_CODE (name) == SSA_NAME)
+	r.intersect (gimple_range_global (name));
+
   set_cache (r, name);
   return true;
 }
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-2b.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-2b.c
index e1c33e86cd7..823ada982ff 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-2b.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-2b.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */ 
-/* { dg-options "-O2 -fdump-tree-thread1-stats -fdump-tree-dom2-stats -fdisable-tree-ethread" } */
+/* { dg-options "-O2 -fdump-tree-thread3-stats -fdump-tree-dom2-stats -fdisable-tree-ethread" } */
 
 void foo();
 void bla();
@@ -26,4 +26,4 @@ void thread_latch_through_header (void)
case.  And we want to thread through the header as well.  These
are both caught by threading in DOM.  */
 /* { dg-final { scan-tree-dump-not "Jumps threaded" "dom2"} } */
-/* { dg-final { scan-tree-dump-times "Jumps threaded: 1" 1 "thread1"} } */
+/* { dg-final { scan-tree-dump-times "Jumps threaded: 1" 1 "thread3"} } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-6.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-6.c
index c7bf867b084..ee46759bacc 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-6.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-6.c
@@ -1,41 +1,8 @@
 /* { dg-do compile } 

Re: Remove tilegx port

2021-09-10 Thread Jeff Law via Gcc-patches




On 9/10/2021 5:50 AM, Richard Biener via Libc-alpha wrote:

On Fri, Apr 27, 2018 at 9:32 PM Jeff Law  wrote:

On 04/27/2018 11:42 AM, Richard Biener wrote:

On April 27, 2018 7:26:19 PM GMT+02:00, Jeff Law  wrote:

On 04/27/2018 09:36 AM, Joseph Myers wrote:

Since tile support has been removed from the Linux kernel for 4.17,
this patch removes the (unmaintained) port to tilegx from glibc (the
tilepro support having been previously removed).  This reflects the
general principle that a glibc port needs upstream support for the
architecture in all the components it build-depends on (so binutils,
GCC and the Linux kernel, for the normal case of a port supporting

the

Linux kernel but no other OS), in order to be maintainable.

Apart from removal of sysdeps/tile and sysdeps/unix/sysv/linux/tile
(omitted from the diffs below), there are updates to various comments
referencing tile for which removal of those references seemed
appropriate.  The configuration is removed from README and from
build-many-glibcs.py.  contrib.texi keeps mention of removed
contributions, but I updated Chris Metcalf's entry to reflect that he
also contributed the non-removed support for the generic Linux kernel
syscall interface.  __ASSUME_FADVISE64_64_NO_ALIGN support is

removed,

as it was only used by tile.

Given tilegx/tilepro removal from the kernel and glibc, should we go
ahead and deprecate them in GCC?  The only tilegx/tilepro
configurations
are -linux.

Makes sense to me. Let's deprecate it for GCC 8 and remove from trunk.

Richard.


Jeff

Here's what I committed to the trunk and the release branch.  I'll
find/update the appropriate web page momentarily.

It's been deprecated since GCC 8 now but the port is still on trunk,
guarded by --enable-obsolete - is it time to remove it?

Definitely.  Folks have had years to complain.

If you wanted to add cr16 to the deprecated ports list, I'd fully 
support that as well.  It hasn't built since dropping cc0 and there 
doesn't appear to be anyone interested in making it work.


jeff



Re: [PATCH 2/2] validate_subreg before call gen_lowpart to avoid ICE.

2021-09-10 Thread Hongtao Liu via Gcc-patches
On Fri, Sep 10, 2021 at 9:32 PM Richard Biener
 wrote:
>
> On September 10, 2021 3:27:09 PM GMT+02:00, Hongtao Liu  
> wrote:
> >On Fri, Sep 10, 2021 at 9:16 PM Richard Biener via Gcc-patches
> > wrote:
> >>
> >> On Fri, Sep 10, 2021 at 2:58 PM liuhongt  wrote:
> >> >
> >> > gcc/ChangeLog:
> >> >
> >> > * expmed.c (extract_bit_field_using_extv): validate_subreg
> >> > before call gen_lowpart.
> >> > ---
> >> >  gcc/expmed.c | 6 +-
> >> >  1 file changed, 5 insertions(+), 1 deletion(-)
> >> >
> >> > diff --git a/gcc/expmed.c b/gcc/expmed.c
> >> > index 3143f38e057..10d62d857a8 100644
> >> > --- a/gcc/expmed.c
> >> > +++ b/gcc/expmed.c
> >> > @@ -1571,12 +1571,16 @@ extract_bit_field_using_extv (const 
> >> > extraction_insn *extv, rtx op0,
> >> >
> >> >if (GET_MODE (target) != ext_mode)
> >> >  {
> >> > +  machine_mode tmode = GET_MODE (target);
> >> >/* Don't use LHS paradoxical subreg if explicit truncation is 
> >> > needed
> >> >  between the mode of the extraction (word_mode) and the target
> >> >  mode.  Instead, create a temporary and use convert_move to set
> >> >  the target.  */
> >> >if (REG_P (target)
> >> > - && TRULY_NOOP_TRUNCATION_MODES_P (GET_MODE (target), ext_mode))
> >>
> >> ^^^
> >>
> >> I wonder if herein lies the problem in that the HFmode "truncation" from 
> >> SImode
> >> is considered noop?  Note the underlying target hook only looks at the mode
> >> precision and thus receives 16 and 32, and thus maybe that
> >> TRULY_NOOP_TRUNCATION_MODES_P query only makes sense for
> >> integer modes?  Though the documentation of the hook only talks about
> >> "conversion" of "values" ...
> >>
> >> So maybe a targetm.modes_tieable_p (GET_MODE (target), extmode) check
> >> is missing?
> >
> >According to the documentation, it should be true for
> >targetm.modes_tieable_p(HFmode, SImode) since HFmode can be allocated
> >to gpr.
> >
> >
> >This hook returns true if a value of mode mode1 is accessible in mode
> >mode2 without
> >copying
> >---
> >
> >and also here gen_lowpart (SImode, HFmode, target) is called and hit
> >gcc_assert, not (subreg:HF (reg:SI) 0)
>
> I see. Of course that leads to a suggestion to allow the subreg based on 
> modes_tieable_p, but then others will know why that's the wrong thing to do?
Allowing subregs based on modes_tieable_p would be too strict; it would
even disallow (subreg:SF (reg:SI)).
>
> Richard.
>
> >>
> >> > + && TRULY_NOOP_TRUNCATION_MODES_P (tmode, ext_mode)
> >> > + && validate_subreg (ext_mode, tmode,
> >> > + target,
> >> > + subreg_lowpart_offset (ext_mode, tmode)))
> >> > {
> >> >   target = gen_lowpart (ext_mode, target);
> >> >   if (partial_subreg_p (GET_MODE (spec_target), ext_mode))
> >> > --
> >> > 2.27.0
> >> >
> >
> >
> >
>


-- 
BR,
Hongtao


Re: [PATCH] Always default to DWARF2 debugging for RX, even with -mas100-syntax

2021-09-10 Thread Jeff Law via Gcc-patches




On 9/10/2021 1:05 AM, Richard Biener wrote:

The RX port defaults to STABS when -mas100-syntax is used because
the AS100 assembler does not support some of the pseudo-ops used
by DWARF2 debug emission.  Since STABS is going to be deprecated
that has to change.  The following simply always uses DWARF2,
likely leaving -mas100-syntax broken when debug info is generated.

Can the RX port maintainer please sort out the situation?  One
option might be to drop to NO_DEBUG when -mas100-syntax is
specified but maybe there's AS100 assemblers that now support
all the required pseudo ops or there's a way to define the DWARF
output macros to work around the lack of those (it's by no means
the first target to have such issues).

OK for trunk?

Thanks,
Richard.

2021-09-10  Richard Biener  

* config/rx/rx.h (PREFERRED_DEBUGGING_TYPE): Always define to
DWARF2_DEBUG.
OK.  I think Nick was the rx maintainer, but if he doesn't chime in, my
recommendation would be to drop to NO_DEBUG when -mas100-syntax is enabled.
I'm not immediately aware of any rx-elf users.  While my tester does 
test rx-elf, it's only with gas.


jeff



Re: [PATCH] Default AVR to DWARF2 debug

2021-09-10 Thread Jeff Law via Gcc-patches




On 9/10/2021 1:13 AM, Richard Biener via Gcc-patches wrote:

This switches the AVR port to generate DWARF2 debugging info by
default since the support for STABS is going to be deprecated for
GCC 12.

OK for trunk?

Thanks,
Richard.

2021-09-10  Richard Biener  

* config/avr/elf.h (PREFERRED_DEBUGGING_TYPE): Remove
override, pick up DWARF2_DEBUG define from elfos.h
OK.  And my tester does spin avr-elf, so if there's major breakage we'll 
know.


jeff


Re: [PATCH] Default Alpha/VMS to DWARF2 debugging only

2021-09-10 Thread Jeff Law via Gcc-patches




On 9/10/2021 12:52 AM, Richard Biener via Gcc-patches wrote:

This changes the default debug format for Alpha/VMS to DWARF2 only,
skipping emission of VMS debug info which is going to be deprecated
for GCC 12 alongside the support for STABS.

It looks like other flavors of VMS never used VMS_DEBUG by default
but only the alpha port did.

I have no good means to test anything here, it might be that we have
alpha-vms specific testcases that rely on the previous default.

OK for trunk?

Thanks,
Richard.

2021-09-10  Richard Biener  

* config/alpha/vms.h (PREFERRED_DEBUGGING_TYPE): Define to
DWARF2_DEBUG.

It's a dead target, so yea, go for it.  Worst case, it breaks, someone 
notices, and we know someone still cares about alpha-vms :-)


Jeff



Re: [COMMITTED][patch][version 9]add -ftrivial-auto-var-init and variable attribute "uninitialized" to gcc

2021-09-10 Thread Jose E. Marchesi via Gcc-patches


Hi Richard.

> On Thu, 9 Sep 2021, Kees Cook wrote:
>
>> On Thu, Sep 09, 2021 at 10:49:11PM +, Qing Zhao wrote:
>> > Hi, FYI
>> > 
>> > I just committed the following patch to gcc upstream:
>> > 
>> > 
>> > https://gcc.gnu.org/pipermail/gcc-cvs/2021-September/353195.html
>> 
>> Hurray! Thank you so much for working on this, and thanks also to the
>> reviewers and everyone else poking at it.
>> 
>> I will go update my Linux Plumbers slides to say "supported" instead of
>> "proposed". :)
>
> Can you two work on wording to add to gcc-12/changes.html for this
> feature?  I think it deserves a release note.  Likewise the CTF/BTF
> support btw.

What about something like this for the BPF, CTF and BTF changes..

commit 3826495d1a2c265954d5da13ca71925eea390060 (HEAD -> master)
Author: Jose E. Marchesi 
Date:   Fri Sep 10 15:44:30 2021 +0200

gcc-12/changes.html: BPF, CTF and BTF update

* htdocs/gcc-12/changes.html (BPF): Item about the CO-RE support.
(Debugging formats): New section with items about the support for
CTF and BTF.

diff --git a/htdocs/gcc-12/changes.html b/htdocs/gcc-12/changes.html
index 946faa49..936af979 100644
--- a/htdocs/gcc-12/changes.html
+++ b/htdocs/gcc-12/changes.html
@@ -143,6 +143,15 @@ a work-in-progress.
 
 
 
+BPF
+
+  Support for CO-RE (compile-once, run-everywhere) has been added
+  to the BPF backend.  CO-RE allows to compile portable BPF
+  programs that are able to run among different versions of the
+  Linux kernel.
+  
+
+
 
 
 
@@ -210,7 +219,25 @@ a work-in-progress.
 
 
 
-
+Other significant improvements
+
+Debugging formats
+
+
+  GCC can now generate debugging information
+  in CTF (https://ctfstd.org), a lightweight debugging
+  format that provides information about C types and the
+  association between functions and data symbols and types.  This
+  format is designed to be embedded in ELF files and to be very
+  compact and simple.  A new command-line
+  option -gctf enables the generation of CTF.
+  
+  GCC can now generate debugging information in BTF.  This is a
+  debugging format mainly used in BPF programs and the Linux
+  kernel.  The compiler can generate BTF for any target, when
+  enabled with the command-line option -gbtf
+  
+
 
 
 


Re: [PATCH 2/2] validate_subreg before call gen_lowpart to avoid ICE.

2021-09-10 Thread Hongtao Liu via Gcc-patches
On Fri, Sep 10, 2021 at 9:32 PM Richard Biener
 wrote:
>
> On September 10, 2021 3:27:09 PM GMT+02:00, Hongtao Liu  
> wrote:
> >On Fri, Sep 10, 2021 at 9:16 PM Richard Biener via Gcc-patches
> > wrote:
> >>
> >> On Fri, Sep 10, 2021 at 2:58 PM liuhongt  wrote:
> >> >
> >> > gcc/ChangeLog:
> >> >
> >> > * expmed.c (extract_bit_field_using_extv): validate_subreg
> >> > before call gen_lowpart.
> >> > ---
> >> >  gcc/expmed.c | 6 +-
> >> >  1 file changed, 5 insertions(+), 1 deletion(-)
> >> >
> >> > diff --git a/gcc/expmed.c b/gcc/expmed.c
> >> > index 3143f38e057..10d62d857a8 100644
> >> > --- a/gcc/expmed.c
> >> > +++ b/gcc/expmed.c
> >> > @@ -1571,12 +1571,16 @@ extract_bit_field_using_extv (const 
> >> > extraction_insn *extv, rtx op0,
> >> >
> >> >if (GET_MODE (target) != ext_mode)
> >> >  {
> >> > +  machine_mode tmode = GET_MODE (target);
> >> >/* Don't use LHS paradoxical subreg if explicit truncation is 
> >> > needed
> >> >  between the mode of the extraction (word_mode) and the target
> >> >  mode.  Instead, create a temporary and use convert_move to set
> >> >  the target.  */
> >> >if (REG_P (target)
> >> > - && TRULY_NOOP_TRUNCATION_MODES_P (GET_MODE (target), ext_mode))
> >>
> >> ^^^
> >>
> >> I wonder if herein lies the problem in that the HFmode "truncation" from 
> >> SImode
> >> is considered noop?  Note the underlying target hook only looks at the mode
> >> precision and thus receives 16 and 32, and thus maybe that
> >> TRULY_NOOP_TRUNCATION_MODES_P query only makes sense for
> >> integer modes?  Though the documentation of the hook only talks about
> >> "conversion" of "values" ...
> >>
> >> So maybe a targetm.modes_tieable_p (GET_MODE (target), extmode) check
> >> is missing?
> >
> >According to document, it should be true for
> >targetm.modes_tieable_p(HFmode, SImode) since HFmode can be allocated
> >to gpr.
> >
> >
> >This hook returns true if a value of mode mode1 is accessible in mode
> >mode2 without
> >copying
> >---
> >
> >and also here gen_lowpart (SImode, HFmode, target) is called and hit
> >gcc_assert, not (subreg:HF (reg:SI) 0)
>
> I see. Of course that leads to a suggestion to allow the subreg based on 
> modes_tieable_p, but then others will know why that's the wrong thing to do?
I'm testing this

1 file changed, 2 insertions(+), 1 deletion(-)
gcc/expmed.c | 3 ++-

modified   gcc/expmed.c
@@ -1576,7 +1576,8 @@ extract_bit_field_using_extv (const
extraction_insn *extv, rtx op0,
  mode.  Instead, create a temporary and use convert_move to set
  the target.  */
   if (REG_P (target)
-   && TRULY_NOOP_TRUNCATION_MODES_P (GET_MODE (target), ext_mode))
+   && TRULY_NOOP_TRUNCATION_MODES_P (GET_MODE (target), ext_mode)
+   && targetm.modes_tieable_p (GET_MODE (target), ext_mode))
  {
target = gen_lowpart (ext_mode, target);
if (partial_subreg_p (GET_MODE (spec_target), ext_mode))

>
> Richard.
>
> >>
> >> > + && TRULY_NOOP_TRUNCATION_MODES_P (tmode, ext_mode)
> >> > + && validate_subreg (ext_mode, tmode,
> >> > + target,
> >> > + subreg_lowpart_offset (ext_mode, tmode)))
> >> > {
> >> >   target = gen_lowpart (ext_mode, target);
> >> >   if (partial_subreg_p (GET_MODE (spec_target), ext_mode))
> >> > --
> >> > 2.27.0
> >> >
> >
> >
> >
>


-- 
BR,
Hongtao


Re: [PATCH 2/2] validate_subreg before call gen_lowpart to avoid ICE.

2021-09-10 Thread Hongtao Liu via Gcc-patches
On Fri, Sep 10, 2021 at 9:27 PM Hongtao Liu  wrote:
>
> On Fri, Sep 10, 2021 at 9:16 PM Richard Biener via Gcc-patches
>  wrote:
> >
> > On Fri, Sep 10, 2021 at 2:58 PM liuhongt  wrote:
> > >
> > > gcc/ChangeLog:
> > >
> > > * expmed.c (extract_bit_field_using_extv): validate_subreg
> > > before call gen_lowpart.
> > > ---
> > >  gcc/expmed.c | 6 +-
> > >  1 file changed, 5 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/gcc/expmed.c b/gcc/expmed.c
> > > index 3143f38e057..10d62d857a8 100644
> > > --- a/gcc/expmed.c
> > > +++ b/gcc/expmed.c
> > > @@ -1571,12 +1571,16 @@ extract_bit_field_using_extv (const 
> > > extraction_insn *extv, rtx op0,
> > >
> > >if (GET_MODE (target) != ext_mode)
> > >  {
> > > +  machine_mode tmode = GET_MODE (target);
> > >/* Don't use LHS paradoxical subreg if explicit truncation is 
> > > needed
> > >  between the mode of the extraction (word_mode) and the target
> > >  mode.  Instead, create a temporary and use convert_move to set
> > >  the target.  */
> > >if (REG_P (target)
> > > - && TRULY_NOOP_TRUNCATION_MODES_P (GET_MODE (target), ext_mode))
> >
> > ^^^
> >
> > I wonder if herein lies the problem in that the HFmode "truncation" from 
> > SImode
> > is considered noop?  Note the underlying target hook only looks at the mode
> > precision and thus receives 16 and 32, and thus maybe that
> > TRULY_NOOP_TRUNCATION_MODES_P query only makes sense for
> > integer modes?  Though the documentation of the hook only talks about
> > "conversion" of "values" ...
> >
> > So maybe a targetm.modes_tieable_p (GET_MODE (target), extmode) check
> > is missing?
>
> According to document, it should be true for
> targetm.modes_tieable_p(HFmode, SImode) since HFmode can be allocated
> to gpr.
I was wrong: the condition must hold for *any* r, so targetm.modes_tieable_p
does return false here.

If TARGET_HARD_REGNO_MODE_OK (r, mode1) and TARGET_HARD_REGNO_MODE_OK (r,
mode2) are always the same for any r, then TARGET_MODES_TIEABLE_P (mode1,
mode2) should be true. If they differ for any r, you should define this hook
to return false unless some other mechanism ensures the accessibility of the
value in a narrower mode.
You should define this hook to return true in as many cases as possible
since doing so will allow GCC to perform better register allocation. The
default definition returns true unconditionally.
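
To make the "for any r" wording concrete, here is a minimal sketch of the
rule as a toy model.  It is not GCC code: the register file, the two modes
and the function names are all hypothetical, and the assumption that the
half-float mode is rejected by exactly one register is made up purely for
illustration.

```cpp
#include <cassert>
#include <cstddef>

// Toy model only; none of these names are GCC's.
enum class Mode { SI, HF };
constexpr std::size_t kNumRegs = 4;

// Stand-in for TARGET_HARD_REGNO_MODE_OK (r, mode).
// Assumption for illustration: SI fits every register, HF fits all but r == 3.
constexpr bool hard_regno_mode_ok(std::size_t r, Mode m) {
  return m == Mode::SI ? true : r != 3;
}

// The documented rule: the modes are tieable only if the answers agree
// for every register; a single differing register makes it false.
constexpr bool modes_tieable_p(Mode m1, Mode m2) {
  for (std::size_t r = 0; r < kNumRegs; ++r)
    if (hard_regno_mode_ok(r, m1) != hard_regno_mode_ok(r, m2))
      return false;
  return true;
}

static_assert(modes_tieable_p(Mode::SI, Mode::SI), "a mode ties with itself");
static_assert(!modes_tieable_p(Mode::HF, Mode::SI), "one differing r is enough");
```

So a mode being allocatable to some general register is not sufficient; in
this toy model the hook has to say false as soon as one register disagrees.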

>
> 
> This hook returns true if a value of mode mode1 is accessible in mode
> mode2 without
> copying
> ---
>
> and also here gen_lowpart (SImode, HFmode, target) is called and hit
> gcc_assert, not (subreg:HF (reg:SI) 0)
>
> >
> > > + && TRULY_NOOP_TRUNCATION_MODES_P (tmode, ext_mode)
> > > + && validate_subreg (ext_mode, tmode,
> > > + target,
> > > + subreg_lowpart_offset (ext_mode, tmode)))
> > > {
> > >   target = gen_lowpart (ext_mode, target);
> > >   if (partial_subreg_p (GET_MODE (spec_target), ext_mode))
> > > --
> > > 2.27.0
> > >
>
>
>
> --
> BR,
> Hongtao



-- 
BR,
Hongtao


Re: [PATCH 2/2] validate_subreg before call gen_lowpart to avoid ICE.

2021-09-10 Thread Richard Biener via Gcc-patches
On September 10, 2021 3:27:09 PM GMT+02:00, Hongtao Liu  
wrote:
>On Fri, Sep 10, 2021 at 9:16 PM Richard Biener via Gcc-patches
> wrote:
>>
>> On Fri, Sep 10, 2021 at 2:58 PM liuhongt  wrote:
>> >
>> > gcc/ChangeLog:
>> >
>> > * expmed.c (extract_bit_field_using_extv): validate_subreg
>> > before call gen_lowpart.
>> > ---
>> >  gcc/expmed.c | 6 +-
>> >  1 file changed, 5 insertions(+), 1 deletion(-)
>> >
>> > diff --git a/gcc/expmed.c b/gcc/expmed.c
>> > index 3143f38e057..10d62d857a8 100644
>> > --- a/gcc/expmed.c
>> > +++ b/gcc/expmed.c
>> > @@ -1571,12 +1571,16 @@ extract_bit_field_using_extv (const 
>> > extraction_insn *extv, rtx op0,
>> >
>> >if (GET_MODE (target) != ext_mode)
>> >  {
>> > +  machine_mode tmode = GET_MODE (target);
>> >/* Don't use LHS paradoxical subreg if explicit truncation is needed
>> >  between the mode of the extraction (word_mode) and the target
>> >  mode.  Instead, create a temporary and use convert_move to set
>> >  the target.  */
>> >if (REG_P (target)
>> > - && TRULY_NOOP_TRUNCATION_MODES_P (GET_MODE (target), ext_mode))
>>
>> ^^^
>>
>> I wonder if herein lies the problem in that the HFmode "truncation" from 
>> SImode
>> is considered noop?  Note the underlying target hook only looks at the mode
>> precision and thus receives 16 and 32, and thus maybe that
>> TRULY_NOOP_TRUNCATION_MODES_P query only makes sense for
>> integer modes?  Though the documentation of the hook only talks about
>> "conversion" of "values" ...
>>
>> So maybe a targetm.modes_tieable_p (GET_MODE (target), extmode) check
>> is missing?
>
>According to document, it should be true for
>targetm.modes_tieable_p(HFmode, SImode) since HFmode can be allocated
>to gpr.
>
>
>This hook returns true if a value of mode mode1 is accessible in mode
>mode2 without
>copying
>---
>
>and also here gen_lowpart (SImode, HFmode, target) is called and hit
>gcc_assert, not (subreg:HF (reg:SI) 0)

I see. Of course that leads to a suggestion to allow the subreg based on 
modes_tieable_p, but then others will know why that's the wrong thing to do? 

Richard. 

>>
>> > + && TRULY_NOOP_TRUNCATION_MODES_P (tmode, ext_mode)
>> > + && validate_subreg (ext_mode, tmode,
>> > + target,
>> > + subreg_lowpart_offset (ext_mode, tmode)))
>> > {
>> >   target = gen_lowpart (ext_mode, target);
>> >   if (partial_subreg_p (GET_MODE (spec_target), ext_mode))
>> > --
>> > 2.27.0
>> >
>
>
>



Re: [PATCH 2/2] validate_subreg before call gen_lowpart to avoid ICE.

2021-09-10 Thread Segher Boessenkool
On Fri, Sep 10, 2021 at 03:15:56PM +0200, Richard Biener wrote:
> On Fri, Sep 10, 2021 at 2:58 PM liuhongt  wrote:
> >if (REG_P (target)
> > - && TRULY_NOOP_TRUNCATION_MODES_P (GET_MODE (target), ext_mode))
> 
> ^^^
> 
> I wonder if herein lies the problem in that the HFmode "truncation" from 
> SImode
> is considered noop?  Note the underlying target hook only looks at the mode
> precision and thus receives 16 and 32, and thus maybe that
> TRULY_NOOP_TRUNCATION_MODES_P query only makes sense for
> integer modes?  Though the documentation of the hook only talks about
> "conversion" of "values" ...

@deftypefn {Target Hook} bool TARGET_TRULY_NOOP_TRUNCATION (poly_uint64 
@var{outprec}, poly_uint64 @var{inprec})
This hook returns true if it is safe to ``convert'' a value of
@var{inprec} bits to one of @var{outprec} bits (where @var{outprec} is
smaller than @var{inprec}) by merely operating on it as if it had only
@var{outprec} bits.  The default returns true unconditionally, which
is correct for most machines.  When @code{TARGET_TRULY_NOOP_TRUNCATION}
returns false, the machine description should provide a @code{trunc}
optab to specify the RTL that performs the required truncation.


@cindex @code{trunc@var{m}@var{n}2} instruction pattern
@item @samp{trunc@var{m}@var{n}2}
Truncate operand 1 (valid for mode @var{m}) to mode @var{n} and
store in operand 0 (which has mode @var{n}).  Both modes must be fixed
point or both floating point.


TRULY_NOOP_TRUNCATION does not make sense to ask if changing mode class.
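
A minimal sketch of what that could mean for a caller, as a toy model rather
than GCC code.  The types and names are hypothetical, and refusing the query
outright for non-integer modes is only one possible reading (an assumption
here, not something the hook documentation prescribes).

```cpp
#include <cassert>

// Toy model only; none of these names are GCC's.
enum class ModeClass { Int, Float };
struct Mode {
  ModeClass cls;
  unsigned precision;
};

// Stand-in for the precision-only target hook; the documented default
// returns true unconditionally.
constexpr bool target_truly_noop_truncation(unsigned /*outprec*/,
                                            unsigned /*inprec*/) {
  return true;
}

// Assumption: only ask the hook when both modes are integer modes, because
// precisions alone cannot tell an integer mode from a float mode.
constexpr bool noop_truncation_modes_p(Mode out, Mode in) {
  if (out.cls != ModeClass::Int || in.cls != ModeClass::Int)
    return false;
  return target_truly_noop_truncation(out.precision, in.precision);
}

constexpr Mode kSImode{ModeClass::Int, 32};
constexpr Mode kHImode{ModeClass::Int, 16};
constexpr Mode kHFmode{ModeClass::Float, 16};

static_assert(noop_truncation_modes_p(kHImode, kSImode), "int to int is asked");
static_assert(!noop_truncation_modes_p(kHFmode, kSImode), "class change is not");
```

With precisions alone the hook sees 16 and 32 in both cases, which is why the
mode class has to be checked before the query.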


Segher


Re: [PATCH 2/2] validate_subreg before call gen_lowpart to avoid ICE.

2021-09-10 Thread Hongtao Liu via Gcc-patches
On Fri, Sep 10, 2021 at 9:16 PM Richard Biener via Gcc-patches
 wrote:
>
> On Fri, Sep 10, 2021 at 2:58 PM liuhongt  wrote:
> >
> > gcc/ChangeLog:
> >
> > * expmed.c (extract_bit_field_using_extv): validate_subreg
> > before call gen_lowpart.
> > ---
> >  gcc/expmed.c | 6 +-
> >  1 file changed, 5 insertions(+), 1 deletion(-)
> >
> > diff --git a/gcc/expmed.c b/gcc/expmed.c
> > index 3143f38e057..10d62d857a8 100644
> > --- a/gcc/expmed.c
> > +++ b/gcc/expmed.c
> > @@ -1571,12 +1571,16 @@ extract_bit_field_using_extv (const extraction_insn 
> > *extv, rtx op0,
> >
> >if (GET_MODE (target) != ext_mode)
> >  {
> > +  machine_mode tmode = GET_MODE (target);
> >/* Don't use LHS paradoxical subreg if explicit truncation is needed
> >  between the mode of the extraction (word_mode) and the target
> >  mode.  Instead, create a temporary and use convert_move to set
> >  the target.  */
> >if (REG_P (target)
> > - && TRULY_NOOP_TRUNCATION_MODES_P (GET_MODE (target), ext_mode))
>
> ^^^
>
> I wonder if herein lies the problem in that the HFmode "truncation" from 
> SImode
> is considered noop?  Note the underlying target hook only looks at the mode
> precision and thus receives 16 and 32, and thus maybe that
> TRULY_NOOP_TRUNCATION_MODES_P query only makes sense for
> integer modes?  Though the documentation of the hook only talks about
> "conversion" of "values" ...
>
> So maybe a targetm.modes_tieable_p (GET_MODE (target), extmode) check
> is missing?

According to the documentation, targetm.modes_tieable_p (HFmode, SImode)
should be true, since HFmode can be allocated to a GPR.


This hook returns true if a value of mode mode1 is accessible in mode
mode2 without copying.
---

and also here gen_lowpart (SImode, HFmode, target) is called and hit
gcc_assert, not (subreg:HF (reg:SI) 0)

>
> > + && TRULY_NOOP_TRUNCATION_MODES_P (tmode, ext_mode)
> > + && validate_subreg (ext_mode, tmode,
> > + target,
> > + subreg_lowpart_offset (ext_mode, tmode)))
> > {
> >   target = gen_lowpart (ext_mode, target);
> >   if (partial_subreg_p (GET_MODE (spec_target), ext_mode))
> > --
> > 2.27.0
> >



-- 
BR,
Hongtao


Re: More aggressive threading causing loop-interchange-9.c regression

2021-09-10 Thread Michael Matz via Gcc-patches
Hi,

On Fri, 10 Sep 2021, Aldy Hernandez via Gcc-patches wrote:

>  }
> +
> +  /* Threading through a non-empty latch would cause code to be added

"through an *empty* latch".  The test in code is correct, though.

And for the before/after loops flag you added: we have a 
cfun->curr_properties field which can be used.  We even already have a 
PROP_loops flag but that is set throughout compilation from CFG 
construction until the RTL loop optimizers, so can't be re-used for what 
is needed here.  But you still could invent another PROP_ value instead of 
adding a new field in struct function.
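
Roughly what I mean, as a toy sketch and not a patch: the struct, the flag
names and the bit values below are all hypothetical stand-ins, only the idea
of one more bit in an existing properties word is the point.

```cpp
#include <cassert>

// Toy model only; names and bit values are made up for illustration.
constexpr unsigned PROP_toy_loops = 1u << 0;           // set for most of compilation
constexpr unsigned PROP_toy_loop_opts_done = 1u << 1;  // hypothetical new property

struct ToyFunction {
  unsigned curr_properties = 0;  // stand-in for cfun->curr_properties
};

// A pass after the loop optimizers would set the new bit once.
inline void mark_loop_opts_done(ToyFunction &fn) {
  fn.curr_properties |= PROP_toy_loop_opts_done;
}

// The threader would then test the bit instead of a new struct field.
inline bool loop_opts_done_p(const ToyFunction &fn) {
  return (fn.curr_properties & PROP_toy_loop_opts_done) != 0;
}
```

The existing loops bit stays untouched; the before/after distinction lives
in its own bit.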


Ciao,
Michael.


[PATCH RFC] c++: implement C++17 hardware interference size

2021-09-10 Thread Jason Merrill via Gcc-patches
OK, time to finish this up.  The main change relative to the last patch I sent
to the list is dropping the -finterference-tune flag and making that behavior
the default.  Any more comments?



The last missing piece of the C++17 standard library is the hardware
interference size constants.  Much of the delay in implementing these has
been due to uncertainty about what the right values are, and even whether
there is a single constant value that is suitable; the destructive
interference size is intended to be used in structure layout, so program
ABIs will depend on it.

In principle, both of these values should be the same as the target's L1
cache line size.  When compiling for a generic target that is intended to
support a range of target CPUs with different cache line sizes, the
constructive size should probably be the minimum size, and the destructive
size the maximum, unless you are constrained by ABI compatibility with
previous code.
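
As a minimal sketch of that min/max rule (illustration only, not part of the
patch; the helper name and the struct are hypothetical):

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct InterferenceSizes {
  std::size_t constructive;  // smallest line: data this close surely shares one
  std::size_t destructive;   // largest line: data this far apart never shares one
};

// Hypothetical helper: given the L1 cache line sizes of every CPU a generic
// target is meant to support (must be non-empty), pick the two values.
inline InterferenceSizes pick_sizes(const std::vector<std::size_t> &line_sizes) {
  auto mm = std::minmax_element(line_sizes.begin(), line_sizes.end());
  return {*mm.first, *mm.second};
}
```

For a target whose supported CPUs have 32-byte and 64-byte lines this gives
32 for the constructive size and 64 for the destructive size.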

From discussion on gcc-patches, I've come to the conclusion that the
solution to the difficulty of choosing stable values is to give up on it,
and instead encourage only uses where ABI stability is unimportant: in
particular, uses where the ABI is shared at most between translation units
built at the same time with the same flags.

To that end, I've added a warning for any use of the constant value of
std::hardware_destructive_interference_size in a header or module export.
Appropriate uses within a project can disable the warning.
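
For reference, a minimal sketch of the kind of use I consider appropriate:
the constant is used inside a single source file, so no ABI is exposed.  The
fallback value of 64 and the type names are assumptions for illustration
only, for library versions that do not define the feature-test macro.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <new>

#ifdef __cpp_lib_hardware_interference_size
constexpr std::size_t kDestructive = std::hardware_destructive_interference_size;
#else
constexpr std::size_t kDestructive = 64;  // assumption: fallback guess only
#endif

// Each counter gets its own cache line so that two threads updating
// different counters do not false-share.
struct alignas(kDestructive) PaddedCounter {
  std::atomic<unsigned long> value{0};
};

// Local to this translation unit; the layout is not part of any interface.
struct Counters {
  PaddedCounter produced;
  PaddedCounter consumed;
};

static_assert(alignof(PaddedCounter) == kDestructive, "one line per counter");
static_assert(sizeof(Counters) >= 2 * kDestructive, "counters do not overlap");
```

Moving the same struct into a public header is the case the warning is meant
to catch.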

A previous iteration of this patch included an -finterference-tune flag to
make the value vary with -mtune; this iteration makes that the default
behavior, which should be appropriate for all reasonable uses of the
variable.  The previous default of "stable-ish" seems to me likely to have
been more of an attractive nuisance; since we can't promise actual
stability, we should instead make proper uses more convenient.

JF Bastien's implementation proposal is summarized at
https://github.com/itanium-cxx-abi/cxx-abi/issues/74

I implement this by adding new --params for the two sizes.  Targets can
override these values in targetm.target_option.override() to support a range
of values for the generic target; otherwise, both will default to the L1
cache line size.

64 bytes still seems correct for all x86.

I'm not sure why he proposed 64/64 for generic 32-bit ARM, since the Cortex
A9 has a 32-byte cache line, so I'd think 32/64 would make more sense.

He proposed 64/128 for generic AArch64, but since the A64FX now has a 256B
cache line, I've changed that to 64/256.

With the above choice to reject stability as a goal, getting these values
"right" is now just a matter of what we want the default optimization to be,
and we can feel free to adjust them as CPUs with different cache lines
become more and less common.

gcc/ChangeLog:

* params.opt: Add destructive-interference-size and
constructive-interference-size.
* doc/invoke.texi: Document them.
* config/aarch64/aarch64.c (aarch64_override_options_internal):
Set them.
* config/arm/arm.c (arm_option_override): Set them.
* config/i386/i386-options.c (ix86_option_override_internal):
Set them.

gcc/c-family/ChangeLog:

* c.opt: Add -Winterference-size.
* c-cppbuiltin.c (cpp_atomic_builtins): Add __GCC_DESTRUCTIVE_SIZE
and __GCC_CONSTRUCTIVE_SIZE.

gcc/cp/ChangeLog:

* constexpr.c (maybe_warn_about_constant_value):
Complain about std::hardware_destructive_interference_size.
(cxx_eval_constant_expression): Call it.
* decl.c (cxx_init_decl_processing): Check
--param *-interference-size values.

libstdc++-v3/ChangeLog:

* include/std/version: Define __cpp_lib_hardware_interference_size.
* libsupc++/new: Define hardware interference size variables.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Winterference.h: New file.
* g++.dg/warn/Winterference.C: New test.
* g++.target/aarch64/interference.C: New test.
* g++.target/arm/interference.C: New test.
* g++.target/i386/interference.C: New test.
---
 gcc/doc/invoke.texi   | 65 +++
 gcc/c-family/c.opt|  5 ++
 gcc/params.opt| 16 +
 gcc/c-family/c-cppbuiltin.c   | 12 
 gcc/config/aarch64/aarch64.c  | 22 +++
 gcc/config/arm/arm.c  | 22 +++
 gcc/config/i386/i386-options.c|  6 ++
 gcc/cp/constexpr.c| 33 ++
 gcc/cp/decl.c | 32 +
 gcc/testsuite/g++.dg/warn/Winterference-2.C   | 14 
 gcc/testsuite/g++.dg/warn/Winterference.C |  6 ++
 .../g++.target/aarch64/interference.C |  9 +++
 gcc/testsuite/g++.target/arm/interference.C   |  9 +++
 gcc/testsuite/g++.target/i386/interference.C  |  8 +++
 

Re: [PATCH 2/2] validate_subreg before call gen_lowpart to avoid ICE.

2021-09-10 Thread Richard Biener via Gcc-patches
On Fri, Sep 10, 2021 at 2:58 PM liuhongt  wrote:
>
> gcc/ChangeLog:
>
> * expmed.c (extract_bit_field_using_extv): validate_subreg
> before call gen_lowpart.
> ---
>  gcc/expmed.c | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/gcc/expmed.c b/gcc/expmed.c
> index 3143f38e057..10d62d857a8 100644
> --- a/gcc/expmed.c
> +++ b/gcc/expmed.c
> @@ -1571,12 +1571,16 @@ extract_bit_field_using_extv (const extraction_insn 
> *extv, rtx op0,
>
>if (GET_MODE (target) != ext_mode)
>  {
> +  machine_mode tmode = GET_MODE (target);
>/* Don't use LHS paradoxical subreg if explicit truncation is needed
>  between the mode of the extraction (word_mode) and the target
>  mode.  Instead, create a temporary and use convert_move to set
>  the target.  */
>if (REG_P (target)
> - && TRULY_NOOP_TRUNCATION_MODES_P (GET_MODE (target), ext_mode))

^^^

I wonder if herein lies the problem in that the HFmode "truncation" from SImode
is considered noop?  Note the underlying target hook only looks at the mode
precision and thus receives 16 and 32, and thus maybe that
TRULY_NOOP_TRUNCATION_MODES_P query only makes sense for
integer modes?  Though the documentation of the hook only talks about
"conversion" of "values" ...

So maybe a targetm.modes_tieable_p (GET_MODE (target), extmode) check
is missing?

> + && TRULY_NOOP_TRUNCATION_MODES_P (tmode, ext_mode)
> + && validate_subreg (ext_mode, tmode,
> + target,
> + subreg_lowpart_offset (ext_mode, tmode)))
> {
>   target = gen_lowpart (ext_mode, target);
>   if (partial_subreg_p (GET_MODE (spec_target), ext_mode))
> --
> 2.27.0
>


Re: [PATCH 1/2] Revert "Get rid of all float-int special cases in validate_subreg."

2021-09-10 Thread Richard Biener via Gcc-patches
On Fri, Sep 10, 2021 at 2:58 PM liuhongt  wrote:
>
> This reverts commit d2874d905647a1d146dafa60199d440e837adc4d.

OK.

Richard.

> PR target/102254
> PR target/102154
> PR target/102211
> ---
>  gcc/emit-rtl.c | 40 
>  1 file changed, 40 insertions(+)
>
> diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c
> index 77ea8948ee8..ff3b4449b37 100644
> --- a/gcc/emit-rtl.c
> +++ b/gcc/emit-rtl.c
> @@ -922,6 +922,46 @@ validate_subreg (machine_mode omode, machine_mode imode,
>
>poly_uint64 regsize = REGMODE_NATURAL_SIZE (imode);
>
> +  /* ??? This should not be here.  Temporarily continue to allow word_mode
> + subregs of anything.  The most common offender is (subreg:SI (reg:DF)).
> + Generally, backends are doing something sketchy but it'll take time to
> + fix them all.  */
> +  if (omode == word_mode)
> +;
> +  /* ??? Similarly, e.g. with (subreg:DF (reg:TI)).  Though store_bit_field
> + is the culprit here, and not the backends.  */
> +  else if (known_ge (osize, regsize) && known_ge (isize, osize))
> +;
> +  /* Allow component subregs of complex and vector.  Though given the below
> + extraction rules, it's not always clear what that means.  */
> +  else if ((COMPLEX_MODE_P (imode) || VECTOR_MODE_P (imode))
> +  && GET_MODE_INNER (imode) == omode)
> +;
> +  /* ??? x86 sse code makes heavy use of *paradoxical* vector subregs,
> + i.e. (subreg:V4SF (reg:SF) 0) or (subreg:V4SF (reg:V2SF) 0).  This
> + surely isn't the cleanest way to represent this.  It's questionable
> + if this ought to be represented at all -- why can't this all be hidden
> + in post-reload splitters that make arbitrarily mode changes to the
> + registers themselves.  */
> +  else if (VECTOR_MODE_P (omode)
> +  && GET_MODE_INNER (omode) == GET_MODE_INNER (imode))
> +;
> +  /* Subregs involving floating point modes are not allowed to
> + change size.  Therefore (subreg:DI (reg:DF) 0) is fine, but
> + (subreg:SI (reg:DF) 0) isn't.  */
> +  else if (FLOAT_MODE_P (imode) || FLOAT_MODE_P (omode))
> +{
> +  if (! (known_eq (isize, osize)
> +/* LRA can use subreg to store a floating point value in
> +   an integer mode.  Although the floating point and the
> +   integer modes need the same number of hard registers,
> +   the size of floating point mode can be less than the
> +   integer mode.  LRA also uses subregs for a register
> +   should be used in different mode in on insn.  */
> +|| lra_in_progress))
> +   return false;
> +}
> +
>/* Paradoxical subregs must have offset zero.  */
>if (maybe_gt (osize, isize))
>  return known_eq (offset, 0U);
> --
> 2.27.0
>


[PATCH 2/2] validate_subreg before call gen_lowpart to avoid ICE.

2021-09-10 Thread liuhongt via Gcc-patches
gcc/ChangeLog:

* expmed.c (extract_bit_field_using_extv): validate_subreg
before call gen_lowpart.
---
 gcc/expmed.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/gcc/expmed.c b/gcc/expmed.c
index 3143f38e057..10d62d857a8 100644
--- a/gcc/expmed.c
+++ b/gcc/expmed.c
@@ -1571,12 +1571,16 @@ extract_bit_field_using_extv (const extraction_insn 
*extv, rtx op0,
 
   if (GET_MODE (target) != ext_mode)
 {
+  machine_mode tmode = GET_MODE (target);
   /* Don't use LHS paradoxical subreg if explicit truncation is needed
 between the mode of the extraction (word_mode) and the target
 mode.  Instead, create a temporary and use convert_move to set
 the target.  */
   if (REG_P (target)
- && TRULY_NOOP_TRUNCATION_MODES_P (GET_MODE (target), ext_mode))
+ && TRULY_NOOP_TRUNCATION_MODES_P (tmode, ext_mode)
+ && validate_subreg (ext_mode, tmode,
+ target,
+ subreg_lowpart_offset (ext_mode, tmode)))
{
  target = gen_lowpart (ext_mode, target);
  if (partial_subreg_p (GET_MODE (spec_target), ext_mode))
-- 
2.27.0



[PATCH 1/2] Revert "Get rid of all float-int special cases in validate_subreg."

2021-09-10 Thread liuhongt via Gcc-patches
This reverts commit d2874d905647a1d146dafa60199d440e837adc4d.

PR target/102254
PR target/102154
PR target/102211
---
 gcc/emit-rtl.c | 40 
 1 file changed, 40 insertions(+)

diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c
index 77ea8948ee8..ff3b4449b37 100644
--- a/gcc/emit-rtl.c
+++ b/gcc/emit-rtl.c
@@ -922,6 +922,46 @@ validate_subreg (machine_mode omode, machine_mode imode,
 
   poly_uint64 regsize = REGMODE_NATURAL_SIZE (imode);
 
+  /* ??? This should not be here.  Temporarily continue to allow word_mode
+ subregs of anything.  The most common offender is (subreg:SI (reg:DF)).
+ Generally, backends are doing something sketchy but it'll take time to
+ fix them all.  */
+  if (omode == word_mode)
+;
+  /* ??? Similarly, e.g. with (subreg:DF (reg:TI)).  Though store_bit_field
+ is the culprit here, and not the backends.  */
+  else if (known_ge (osize, regsize) && known_ge (isize, osize))
+;
+  /* Allow component subregs of complex and vector.  Though given the below
+ extraction rules, it's not always clear what that means.  */
+  else if ((COMPLEX_MODE_P (imode) || VECTOR_MODE_P (imode))
+  && GET_MODE_INNER (imode) == omode)
+;
+  /* ??? x86 sse code makes heavy use of *paradoxical* vector subregs,
+ i.e. (subreg:V4SF (reg:SF) 0) or (subreg:V4SF (reg:V2SF) 0).  This
+ surely isn't the cleanest way to represent this.  It's questionable
+ if this ought to be represented at all -- why can't this all be hidden
+ in post-reload splitters that make arbitrarily mode changes to the
+ registers themselves.  */
+  else if (VECTOR_MODE_P (omode)
+  && GET_MODE_INNER (omode) == GET_MODE_INNER (imode))
+;
+  /* Subregs involving floating point modes are not allowed to
+ change size.  Therefore (subreg:DI (reg:DF) 0) is fine, but
+ (subreg:SI (reg:DF) 0) isn't.  */
+  else if (FLOAT_MODE_P (imode) || FLOAT_MODE_P (omode))
+{
+  if (! (known_eq (isize, osize)
+/* LRA can use subreg to store a floating point value in
+   an integer mode.  Although the floating point and the
+   integer modes need the same number of hard registers,
+   the size of floating point mode can be less than the
+   integer mode.  LRA also uses subregs for a register
+   should be used in different mode in on insn.  */
+|| lra_in_progress))
+   return false;
+}
+
   /* Paradoxical subregs must have offset zero.  */
   if (maybe_gt (osize, isize))
 return known_eq (offset, 0U);
-- 
2.27.0



[PATCH 0/2] Revert r12-3277 since it caused regressions on many other targets.

2021-09-10 Thread liuhongt via Gcc-patches
Hi:
  Details discussed in 
https://gcc.gnu.org/pipermail/gcc-patches/2021-September/579170.html.
  
  Bootstrapped and regtested on x86_64-linux-gnu{-m32,}.
  Ok for trunk?

liuhongt (2):
  Revert "Get rid of all float-int special cases in validate_subreg."
  validate_subreg before call gen_lowpart to avoid ICE.

 gcc/emit-rtl.c | 40 
 gcc/expmed.c   |  6 +-
 2 files changed, 45 insertions(+), 1 deletion(-)

-- 
2.27.0



Re: [PATCH] Fix SFmode subreg of DImode and TImode

2021-09-10 Thread Hongtao Liu via Gcc-patches
On Fri, Sep 10, 2021 at 7:25 PM Hongtao Liu  wrote:
>
> On Fri, Sep 10, 2021 at 6:54 PM Richard Biener
>  wrote:
> >
> > On Fri, Sep 10, 2021 at 5:03 AM Hongtao Liu  wrote:
> > >
> > > On Fri, Sep 10, 2021 at 7:49 AM Segher Boessenkool
> > >  wrote:
> > > >
> > > > On Thu, Sep 09, 2021 at 08:16:16AM +0200, Richard Biener wrote:
> > > > > > I think we should (longer term) get rid of the overloaded meanings 
> > > > > > and
> > > > > > uses of subregs.  One fairly simple thing is to make a new rtx code
> > > > > > "bit_cast" (or is there a nice short more traditional name for it?)
> > > > >
> > > > > But subreg _is_ bit_cast.
> > > >
> > > > It is not.  (subreg:M (reg:N) O) for O>0, little-endian, is not a
> > > > bit_cast.  It is taking a part of a register, or a single register from
> > > > a multi-register thing.  Paradoxicals are not bit-casts either.
> > > >
> > > > Subregs from or to (but not both) integer modes are generally bit_cast,
> > > > yeah.
> > > >
> > > > > What is odd to me is that a "disallowed" subreg
> > > > > like (subreg:SF (reg:TI ..) 0) magically becomes valid (in terms of
> > > > > validate_subreg) if you rewrite it as (subreg:SF (subreg:SI (reg:TI 
> > > > > ..) 0) 0).
> > > > > Of course that's nested and invalid but just push the inner subreg to 
> > > > > a
> > > > > new pseudo and the thing becomes valid.
> > > >
> > > > Bingo.
> > > >
> > > > And many targets have strange rules for bit-strings in which modes can
> > > > be used as bit-strings in which other modes, and at what offsets in
> > > > which registers.  Now perhaps none of that is optimal (I bet it isn't),
> > > > but changing this without a transition plan simply does not work.
> > > >
> > > > > > But that is not the core problem we had here.  The behaviour of the
> > > > > > generic parts of the compiler was changed, without testing if that
> > > > > > works on other targets but x86.  That is an understandable mistake, 
> > > > > > it
> > > > > > takes some experience to know where the morasses are.  But this 
> > > > > > change
> > > > > > should have been accompanied by testcases exercising the changed 
> > > > > > code.
> > > > > > We would have clearly seen there are issues then, simply by watching
> > > > > > gcc-testresults@ (and/or maintainers are on top of the test results
> > > > > > anyway).  Also, if there were testcases for this, we could have some
> > > > > > confidence that a change in this area is robust.
> > > > >
> > > > > Well, that only works if some maintainers that are familiar enough
> > > > > with all this chime in ;)
> > > >
> > > > Not really.  It works always.  And it works way better than the
> > > > pandemonium we now have with broken targets left and right.
> > > >
> > > > With testcases anyone can see if any specific target is broken here.
> > > >
> > > > > It's stage1 so it's understandable that some
> > > > > > people (like me ...) are trying to help people making progress even
> > > > > if that involves trying to decipher 30 years of GCC history in this
> > > > > area (without much success in the end as we see) ;)
> > > >
> > > > Yeah :-)  And my thanks to you and everyone involved for tackling this
> > > > problematic part of GCC, which has been neglected and patched over for
> > > > way too long.  But from that same history it follows that anything you
> > > > do not super carefully (with testing everywhere) will cause some serious
> > > Frankly, testing everywhere is too heavy a burden for developers,
> > > after all, everyone has a limited variety of machines, and may not be
> > > familiar with using  other targets' simulators.
> > > And back to the problem we were trying to solve at the beginning
> > > (subreg:HF(reg:SI)), I guess this is not just a problem in x86
> > > backend, any backend can encounter similar problems, that's why we
> > > remove all the weird cases in validate_subreg.
> >
> > So can you please revert the change for now?  I think we need to go
> > back to the issue in extract_bit_field - does it somehow work to use
> > validate_subreg to avoid creating the subreg we ICE on in the first
> > place and what happens then to code quality?
> Sure, let me test the patch.
It survives regtest and code quality seems to be OK,
but we lose some performance:
        .cfi_startproc
-       pinsrw  $0, %edi, %xmm0
+       movw    %di, -2(%rsp)
+       pinsrw  $0, -2(%rsp), %xmm0
        ret
        .cfi_endproc

Anyway I'll post the patches first and see if I can fix the
performance issue in the x86 backend.

> >
> > Thanks,
> > Richard.
> >
> > > > > problems.  And none of these are easy to fix at all -- there is a
> > > > *reason* targets did this nastiness.
> > > >
> > > > > > p.s. Very unrelated...  Should we have __builtin_bit_cast for C as 
> > > > > > well?
> > > > > > Is there any reason this could not work?
> > > >
> > > > Still interested in this btw :-)  (And still very unrelated.)
> > > >
> > > >
> > > > Segher
> > >
> > >
> > >
> > > --
> > > BR,
> > > Hongtao
>
>
>
> --
> 

Re: [PATCH, rs6000] Optimization for vec_xl_sext

2021-09-10 Thread Bill Schmidt via Gcc-patches



On 9/10/21 12:45 AM, HAO CHEN GUI wrote:

Bill,

    Thanks so much for your advice.

    I refined the patch and passed the bootstrap and regression test.
Just one thing, the test case becomes unsupported on P9 if I set "{
dg-require-effective-target power10_ok }". I just want the test case to
be compiled and check its assembly. Do we need to set "power10_ok"?


Yes.  power10_ok tests whether your toolchain supports *assembly* of p10 
instructions.  It isn't unsupported on P9 per se; it's unsupported if 
the toolchain you're using on P9 doesn't have binutils support for P10.


You can look at gcc/testsuite/lib/target-supports.exp to see how this 
stuff works.  In this case we have:


proc check_effective_target_power10_ok { } {
if { ([istarget powerpc64*-*-linux*]) } {
return [check_no_compiler_messages power10_ok object {
int main (void) {
long e;
asm ("pli %0,%1" : "=r" (e) : "n" (0x12345));
return e;
}
} "-mcpu=power10"]
} else {
return 0
}
}



    I tried to disable line wrap in my email editor. Please let me know
if you still see line wrap. Thanks.


Thanks, but unfortunately it's still broken as I still see:

@@ -9779,7 +9779,7 @@ altivec_expand_lxvr_builtin (enum insn_code icode,
tree exp, rtx target, bool bl

This should all be on one line.  The headers show:

Content-Type: text/plain; charset=utf-8; format=flowed

Not sure what mailer you're using, but often you can avoid problems by 
marking the patch portion as untouchable in some way. In Thunderbird, 
you can set it to Preformat, as an example.  Then you can get line wrap 
on the non-patch portions if you prefer that.




ChangeLog

2021-09-10 Haochen Gui 

gcc/
      * config/rs6000/rs6000-call.c (altivec_expand_lxvr_builtin):
      Modify the expansion for sign extension. All extensions are done
      within VSX registers.
      * gcc/config/rs6000/vsx.md (vsx_sign_extend_si_v2di): Define.

gcc/testsuite/
      * gcc.target/powerpc/p10_vec_xl_sext.c: New test.

patch.diff

diff --git a/gcc/config/rs6000/rs6000-call.c
b/gcc/config/rs6000/rs6000-call.c
index b4e13af4dc6..587e9fa2a2a 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -9779,7 +9779,7 @@ altivec_expand_lxvr_builtin (enum insn_code icode,
tree exp, rtx target, bool bl

     if (sign_extend)
   {
-  rtx discratch = gen_reg_rtx (DImode);
+  rtx discratch = gen_reg_rtx (V2DImode);
     rtx tiscratch = gen_reg_rtx (TImode);

     /* Emit the lxvr*x insn.  */
@@ -9788,20 +9788,31 @@ altivec_expand_lxvr_builtin (enum insn_code
icode, tree exp, rtx target, bool bl
      return 0;
     emit_insn (pat);

-  /* Emit a sign extension from QI,HI,WI to double (DI).  */
-  rtx scratch = gen_lowpart (smode, tiscratch);
+  /* Emit a sign extension from V16QI,V8HI,V4SI to V2DI.  */
+  rtx temp1, temp2;
     if (icode == CODE_FOR_vsx_lxvrbx)
-   emit_insn (gen_extendqidi2 (discratch, scratch));
+   {
+ temp1  = simplify_gen_subreg (V16QImode, tiscratch, TImode, 0);
+ emit_insn (gen_vsx_sign_extend_qi_v2di (discratch, temp1));
+   }
     else if (icode == CODE_FOR_vsx_lxvrhx)
-   emit_insn (gen_extendhidi2 (discratch, scratch));
+   {
+ temp1  = simplify_gen_subreg (V8HImode, tiscratch, TImode, 0);
+ emit_insn (gen_vsx_sign_extend_hi_v2di (discratch, temp1));
+   }
     else if (icode == CODE_FOR_vsx_lxvrwx)
-   emit_insn (gen_extendsidi2 (discratch, scratch));
-  /*  Assign discratch directly if scratch is already DI.  */
-  if (icode == CODE_FOR_vsx_lxvrdx)
-   discratch = scratch;
+   {
+ temp1  = simplify_gen_subreg (V4SImode, tiscratch, TImode, 0);
+ emit_insn (gen_vsx_sign_extend_si_v2di (discratch, temp1));
+   }
+  else if (icode == CODE_FOR_vsx_lxvrdx)
+   discratch = simplify_gen_subreg (V2DImode, tiscratch, TImode, 0);
+  else
+   gcc_unreachable ();

-  /* Emit the sign extension from DI (double) to TI (quad). */
-  emit_insn (gen_extendditi2 (target, discratch));
+  /* Emit the sign extension from V2DI (double) to TI (quad).  */
+  temp2 = simplify_gen_subreg (TImode, discratch, V2DImode, 0);
+  emit_insn (gen_extendditi2_vector (target, temp2));

     return target;
   }
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index bcb92be2f5c..987f21bbc22 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -4830,7 +4830,7 @@ (define_insn "vsx_sign_extend_hi_"
     "vextsh2 %0,%1"
     [(set_attr "type" "vecexts")])

-(define_insn "*vsx_sign_extend_si_v2di"
+(define_insn "vsx_sign_extend_si_v2di"
     [(set (match_operand:V2DI 0 "vsx_register_operand" "=v")
      (unspec:V2DI [(match_operand:V4SI 1 "vsx_register_operand" "v")]
   UNSPEC_VSX_SIGN_EXTEND))]
diff --git 

[PATCH] Always default to DWARF2_DEBUG if not specified, warn about deprecated STABS

2021-09-10 Thread Richard Biener via Gcc-patches
This makes defaults.h choose DWARF2_DEBUG if PREFERRED_DEBUGGING_TYPE
is not specified by the target and NO_DEBUG if DWARF is not supported.

It also makes us warn when STABS is enabled and removes the corresponding
diagnostic from the Ada frontend.  The warnings are pruned from the
testsuite output via prune_gcc_output but somehow this doesn't work
for the tests in gfortran.dg/debug which are now failing with excess
errors.  That seems to be the case for other fortran .exp as well
when appending -gstabs, something which works fine for gcc or g++ dgs.

Bootstrapped / tested on x86_64-unknown-linux-gnu with the mentioned
excess errors from all gfortran.dg/debug/

I still need to edit doc/invoke.texi somehow, but I wonder if somebody
has insights into the testsuite pruning issue.

Richard.

2021-09-10  Richard Biener  

gcc/
* defaults.h (PREFERRED_DEBUGGING_TYPE): Choose DWARF2_DEBUG
or NO_DEBUG.
* toplev.c (process_options): Warn when STABS debugging is
enabled.

gcc/ada/
* gcc-interface/misc.c (gnat_post_options): Do not warn
about DBX_DEBUG use here.

gcc/testsuite/
* lib/prune.exp: Prune STABS obsoletion message.
---
 gcc/ada/gcc-interface/misc.c |  6 --
 gcc/defaults.h   | 27 ---
 gcc/testsuite/lib/prune.exp  |  3 +++
 gcc/toplev.c |  5 +
 4 files changed, 12 insertions(+), 29 deletions(-)

diff --git a/gcc/ada/gcc-interface/misc.c b/gcc/ada/gcc-interface/misc.c
index 96199bd4b63..87a4c8662cb 100644
--- a/gcc/ada/gcc-interface/misc.c
+++ b/gcc/ada/gcc-interface/misc.c
@@ -274,12 +274,6 @@ gnat_post_options (const char **pfilename ATTRIBUTE_UNUSED)
   if (!global_options_set.x_flag_diagnostics_show_caret)
 global_dc->show_caret = false;
 
-  /* Warn only if STABS is not the default: we don't want to emit a warning if
- the user did not use a -gstabs option.  */
-  if (PREFERRED_DEBUGGING_TYPE != DBX_DEBUG && write_symbols == DBX_DEBUG)
-warning (0, "STABS debugging information for Ada is obsolete and not "
-   "supported anymore");
-
   /* Copy global settings to local versions.  */
   gnat_encodings = global_options.x_gnat_encodings;
   optimize = global_options.x_optimize;
diff --git a/gcc/defaults.h b/gcc/defaults.h
index ba79a8e48ed..773b93b1a2e 100644
--- a/gcc/defaults.h
+++ b/gcc/defaults.h
@@ -900,34 +900,15 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 #define DEFAULT_GDB_EXTENSIONS 1
 #endif
 
-/* If more than one debugging type is supported, you must define
-   PREFERRED_DEBUGGING_TYPE to choose the default.  */
-
-#if 1 < (defined (DBX_DEBUGGING_INFO) \
- + defined (DWARF2_DEBUGGING_INFO) + defined (XCOFF_DEBUGGING_INFO) \
- + defined (VMS_DEBUGGING_INFO))
 #ifndef PREFERRED_DEBUGGING_TYPE
-#error You must define PREFERRED_DEBUGGING_TYPE
-#endif /* no PREFERRED_DEBUGGING_TYPE */
-
-/* If only one debugging format is supported, define PREFERRED_DEBUGGING_TYPE
-   here so other code needn't care.  */
-#elif defined DBX_DEBUGGING_INFO
-#define PREFERRED_DEBUGGING_TYPE DBX_DEBUG
-
-#elif defined DWARF2_DEBUGGING_INFO || defined DWARF2_LINENO_DEBUGGING_INFO
+/* We default to DWARF2_DEBUGGING_INFO.  */
+#if defined DWARF2_DEBUGGING_INFO || defined DWARF2_LINENO_DEBUGGING_INFO
 #define PREFERRED_DEBUGGING_TYPE DWARF2_DEBUG
-
-#elif defined VMS_DEBUGGING_INFO
-#define PREFERRED_DEBUGGING_TYPE VMS_AND_DWARF2_DEBUG
-
-#elif defined XCOFF_DEBUGGING_INFO
-#define PREFERRED_DEBUGGING_TYPE XCOFF_DEBUG
-
 #else
-/* No debugging format is supported by this target.  */
+/* DWARF is not supported by this target.  */
 #define PREFERRED_DEBUGGING_TYPE NO_DEBUG
 #endif
+#endif
 
 #ifndef FLOAT_LIB_COMPARE_RETURNS_BOOL
 #define FLOAT_LIB_COMPARE_RETURNS_BOOL(MODE, COMPARISON) false
diff --git a/gcc/testsuite/lib/prune.exp b/gcc/testsuite/lib/prune.exp
index 91f165bec38..62fcd3731cc 100644
--- a/gcc/testsuite/lib/prune.exp
+++ b/gcc/testsuite/lib/prune.exp
@@ -90,6 +90,9 @@ proc prune_gcc_output { text } {
 # Ignore dsymutil warning (tool bug is actually linker)
 regsub -all "(^|\n)\[^\n\]*could not find object file symbol for symbol\[^\n\]*" $text "" text
 
+# Ignore stabs obsoletion warnings
+regsub -all "(^|\n)\[^\n\]*warning: STABS debugging information is obsolete and not supported anymore\[^\n\]*" $text "" text
+
 # If dg-enable-nn-line-numbers was provided, then obscure source-margin
 # line numbers by converting them to "NN" form.
 set text [maybe-handle-nn-line-numbers $text]
diff --git a/gcc/toplev.c b/gcc/toplev.c
index 14d1335e79e..2b58fd373bf 100644
--- a/gcc/toplev.c
+++ b/gcc/toplev.c
@@ -1452,6 +1452,11 @@ process_options (void)
   && ctf_debug_info_level == CTFINFO_LEVEL_NONE)
 write_symbols = NO_DEBUG;
 
+  /* Warn if STABS debug gets enabled.  */
+  if (write_symbols & DBX_DEBUG)
+warning (0, "STABS debugging information is obsolete and not "
+"supported 

Re: Remove tilegx port

2021-09-10 Thread Richard Biener via Gcc-patches
On Fri, Apr 27, 2018 at 9:32 PM Jeff Law  wrote:
>
> On 04/27/2018 11:42 AM, Richard Biener wrote:
> > On April 27, 2018 7:26:19 PM GMT+02:00, Jeff Law  wrote:
> >> On 04/27/2018 09:36 AM, Joseph Myers wrote:
> >>> Since tile support has been removed from the Linux kernel for 4.17,
> >>> this patch removes the (unmaintained) port to tilegx from glibc (the
> >>> tilepro support having been previously removed).  This reflects the
> >>> general principle that a glibc port needs upstream support for the
> >>> architecture in all the components it build-depends on (so binutils,
> >>> GCC and the Linux kernel, for the normal case of a port supporting
> >> the
> >>> Linux kernel but no other OS), in order to be maintainable.
> >>>
> >>> Apart from removal of sysdeps/tile and sysdeps/unix/sysv/linux/tile
> >>> (omitted from the diffs below), there are updates to various comments
> >>> referencing tile for which removal of those references seemed
> >>> appropriate.  The configuration is removed from README and from
> >>> build-many-glibcs.py.  contrib.texi keeps mention of removed
> >>> contributions, but I updated Chris Metcalf's entry to reflect that he
> >>> also contributed the non-removed support for the generic Linux kernel
> >>> syscall interface.  __ASSUME_FADVISE64_64_NO_ALIGN support is
> >> removed,
> >>> as it was only used by tile.
> >> Given tilegx/tilepro removal from the kernel and glibc, should we go
> >> ahead and deprecate them in GCC?  The only tilegx/tilepro
> >> configurations
> >> are -linux.
> >
> > Makes sense to me. Let's deprecate it for GCC 8 and remove from trunk.
> >
> > Richard.
> >
> >> Jeff
> >
>
> Here's what I committed to the trunk and the release branch.  I'll
> find/update the appropriate web page momentarily.

It's been deprecated since GCC 8 now but the port is still on trunk,
guarded by --enable-obsolete - is it time to remove it?

Richard.

> jeff


Re: [PATCH] Fix SFmode subreg of DImode and TImode

2021-09-10 Thread Hongtao Liu via Gcc-patches
On Fri, Sep 10, 2021 at 6:54 PM Richard Biener
 wrote:
>
> On Fri, Sep 10, 2021 at 5:03 AM Hongtao Liu  wrote:
> >
> > On Fri, Sep 10, 2021 at 7:49 AM Segher Boessenkool
> >  wrote:
> > >
> > > On Thu, Sep 09, 2021 at 08:16:16AM +0200, Richard Biener wrote:
> > > > > I think we should (longer term) get rid of the overloaded meanings and
> > > > > uses of subregs.  One fairly simple thing is to make a new rtx code
> > > > > "bit_cast" (or is there a nice short more traditional name for it?)
> > > >
> > > > But subreg _is_ bit_cast.
> > >
> > > It is not.  (subreg:M (reg:N) O) for O>0, little-endian, is not a
> > > bit_cast.  It is taking a part of a register, or a single register from
> > > a multi-register thing.  Paradoxicals are not bit-casts either.
> > >
> > > Subregs from or to (but not both) integer modes are generally bit_cast,
> > > yeah.
> > >
> > > > What is odd to me is that a "disallowed" subreg
> > > > like (subreg:SF (reg:TI ..) 0) magically becomes valid (in terms of
> > > > validate_subreg) if you rewrite it as (subreg:SF (subreg:SI (reg:TI ..) 
> > > > 0) 0).
> > > > Of course that's nested and invalid but just push the inner subreg to a
> > > > new pseudo and the thing becomes valid.
> > >
> > > Bingo.
> > >
> > > And many targets have strange rules for bit-strings in which modes can
> > > be used as bit-strings in which other modes, and at what offsets in
> > > which registers.  Now perhaps none of that is optimal (I bet it isn't),
> > > but changing this without a transition plan simply does not work.
> > >
> > > > > But that is not the core problem we had here.  The behaviour of the
> > > > > generic parts of the compiler was changed, without testing if that
> > > > > works on other targets but x86.  That is an understandable mistake, it
> > > > > takes some experience to know where the morasses are.  But this change
> > > > > should have been accompanied by testcases exercising the changed code.
> > > > > We would have clearly seen there are issues then, simply by watching
> > > > > gcc-testresults@ (and/or maintainers are on top of the test results
> > > > > anyway).  Also, if there were testcases for this, we could have some
> > > > > confidence that a change in this area is robust.
> > > >
> > > > Well, that only works if some maintainers that are familiar enough
> > > > with all this chime in ;)
> > >
> > > Not really.  It works always.  And it works way better than the
> > > pandemonium we now have with broken targets left and right.
> > >
> > > With testcases anyone can see if any specific target is broken here.
> > >
> > > > It's stage1 so it's understandable that some
> > > > > people (like me ...) are trying to help people making progress even
> > > > if that involves trying to decipher 30 years of GCC history in this
> > > > area (without much success in the end as we see) ;)
> > >
> > > Yeah :-)  And my thanks to you and everyone involved for tackling this
> > > problematic part of GCC, which has been neglected and patched over for
> > > way too long.  But from that same history it follows that anything you
> > > do not super carefully (with testing everywhere) will cause some serious
> > Frankly, testing everywhere is too heavy a burden for developers,
> > after all, everyone has a limited variety of machines, and may not be
> > familiar with using  other targets' simulators.
> > And back to the problem we were trying to solve at the beginning
> > (subreg:HF(reg:SI)), I guess this is not just a problem in x86
> > backend, any backend can encounter similar problems, that's why we
> > remove all the weird cases in validate_subreg.
>
> So can you please revert the change for now?  I think we need to go
> back to the issue in extract_bit_field - does it somehow work to use
> validate_subreg to avoid creating the subreg we ICE on in the first
> place and what happens then to code quality?
Sure, let me test the patch.
>
> Thanks,
> Richard.
>
> > > > problems.  And none of these are easy to fix at all -- there is a
> > > *reason* targets did this nastiness.
> > >
> > > > > p.s. Very unrelated...  Should we have __builtin_bit_cast for C as 
> > > > > well?
> > > > > Is there any reason this could not work?
> > >
> > > Still interested in this btw :-)  (And still very unrelated.)
> > >
> > >
> > > Segher
> >
> >
> >
> > --
> > BR,
> > Hongtao



-- 
BR,
Hongtao


Re: [PATCH] Fix SFmode subreg of DImode and TImode

2021-09-10 Thread Richard Biener via Gcc-patches
On Fri, Sep 10, 2021 at 5:03 AM Hongtao Liu  wrote:
>
> On Fri, Sep 10, 2021 at 7:49 AM Segher Boessenkool
>  wrote:
> >
> > On Thu, Sep 09, 2021 at 08:16:16AM +0200, Richard Biener wrote:
> > > > I think we should (longer term) get rid of the overloaded meanings and
> > > > uses of subregs.  One fairly simple thing is to make a new rtx code
> > > > "bit_cast" (or is there a nice short more traditional name for it?)
> > >
> > > But subreg _is_ bit_cast.
> >
> > It is not.  (subreg:M (reg:N) O) for O>0, little-endian, is not a
> > bit_cast.  It is taking a part of a register, or a single register from
> > a multi-register thing.  Paradoxicals are not bit-casts either.
> >
> > Subregs from or to (but not both) integer modes are generally bit_cast,
> > yeah.
> >
> > > What is odd to me is that a "disallowed" subreg
> > > like (subreg:SF (reg:TI ..) 0) magically becomes valid (in terms of
> > > validate_subreg) if you rewrite it as (subreg:SF (subreg:SI (reg:TI ..) 
> > > 0) 0).
> > > Of course that's nested and invalid but just push the inner subreg to a
> > > new pseudo and the thing becomes valid.
> >
> > Bingo.
> >
> > And many targets have strange rules for bit-strings in which modes can
> > be used as bit-strings in which other modes, and at what offsets in
> > which registers.  Now perhaps none of that is optimal (I bet it isn't),
> > but changing this without a transition plan simply does not work.
> >
> > > > But that is not the core problem we had here.  The behaviour of the
> > > > generic parts of the compiler was changed, without testing if that
> > > > works on other targets but x86.  That is an understandable mistake, it
> > > > takes some experience to know where the morasses are.  But this change
> > > > should have been accompanied by testcases exercising the changed code.
> > > > We would have clearly seen there are issues then, simply by watching
> > > > gcc-testresults@ (and/or maintainers are on top of the test results
> > > > anyway).  Also, if there were testcases for this, we could have some
> > > > confidence that a change in this area is robust.
> > >
> > > Well, that only works if some maintainers that are familiar enough
> > > with all this chime in ;)
> >
> > Not really.  It works always.  And it works way better than the
> > pandemonium we now have with broken targets left and right.
> >
> > With testcases anyone can see if any specific target is broken here.
> >
> > > It's stage1 so it's understandable that some
> > > > people (like me ...) are trying to help people making progress even
> > > if that involves trying to decipher 30 years of GCC history in this
> > > area (without much success in the end as we see) ;)
> >
> > Yeah :-)  And my thanks to you and everyone involved for tackling this
> > problematic part of GCC, which has been neglected and patched over for
> > way too long.  But from that same history it follows that anything you
> > do not super carefully (with testing everywhere) will cause some serious
> Frankly, testing everywhere is too heavy a burden for developers,
> after all, everyone has a limited variety of machines, and may not be
> familiar with using  other targets' simulators.
> And back to the problem we were trying to solve at the beginning
> (subreg:HF(reg:SI)), I guess this is not just a problem in x86
> backend, any backend can encounter similar problems, that's why we
> remove all the weird cases in validate_subreg.

So can you please revert the change for now?  I think we need to go
back to the issue in extract_bit_field - does it somehow work to use
validate_subreg to avoid creating the subreg we ICE on in the first
place and what happens then to code quality?

Thanks,
Richard.

> > > problems.  And none of these are easy to fix at all -- there is a
> > *reason* targets did this nastiness.
> >
> > > > p.s. Very unrelated...  Should we have __builtin_bit_cast for C as well?
> > > > Is there any reason this could not work?
> >
> > Still interested in this btw :-)  (And still very unrelated.)
> >
> >
> > Segher
>
>
>
> --
> BR,
> Hongtao


Re: [PATCH] Fix SFmode subreg of DImode and TImode

2021-09-10 Thread Richard Biener via Gcc-patches
On Fri, Sep 10, 2021 at 1:50 AM Segher Boessenkool
 wrote:
>
> On Thu, Sep 09, 2021 at 08:16:16AM +0200, Richard Biener wrote:
> > > I think we should (longer term) get rid of the overloaded meanings and
> > > uses of subregs.  One fairly simple thing is to make a new rtx code
> > > "bit_cast" (or is there a nice short more traditional name for it?)
> >
> > But subreg _is_ bit_cast.
>
> It is not.  (subreg:M (reg:N) O) for O>0, little-endian, is not a
> bit_cast.  It is taking a part of a register, or a single register from
> a multi-register thing.  Paradoxicals are not bit-casts either.
>
> Subregs from or to (but not both) integer modes are generally bit_cast,
> yeah.
>
> > What is odd to me is that a "disallowed" subreg
> > like (subreg:SF (reg:TI ..) 0) magically becomes valid (in terms of
> > validate_subreg) if you rewrite it as (subreg:SF (subreg:SI (reg:TI ..) 0) 
> > 0).
> > Of course that's nested and invalid but just push the inner subreg to a
> > new pseudo and the thing becomes valid.
>
> Bingo.
>
> And many targets have strange rules for bit-strings in which modes can
> be used as bit-strings in which other modes, and at what offsets in
> which registers.  Now perhaps none of that is optimal (I bet it isn't),
> but changing this without a transition plan simply does not work.

But we _do_ already allow some of them :/  Like

  /* ??? Similarly, e.g. with (subreg:DF (reg:TI)).  Though store_bit_field
 is the culprit here, and not the backends.  */
  else if (known_ge (osize, regsize) && known_ge (isize, osize))
;

so for the special case where 'regsize' matches osize it would be
a bit-cast of a full register from int to float.  But as written it also
allows (subreg:XF (reg:TI)), which will likely wreak havoc?

Similar for the omode == word_mode check which allows
(subreg:DI (reg:TF ..)).  That is, the existing special-cases look
too broad to me - and they probably exist because when validate_subreg
rejects something, we can't put it together later when expand splits it
into two subregs and a pseudo ...
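
To make the concern concrete, here is a toy model of just the size
comparison in the quoted special case.  It is only a sketch: the function
name is made up, plain integers stand in for the poly-int values that
known_ge compares, and the byte sizes used in the comments (TImode 16,
DFmode 8, SFmode 4, XFmode 16, regsize 8) are assumed x86-64-like values,
not something taken from validate_subreg itself.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the quoted condition
     known_ge (osize, regsize) && known_ge (isize, osize)
   isize:   size in bytes of the inner (register) mode
   osize:   size in bytes of the outer (subreg) mode
   regsize: natural size in bytes of one register
   All three are plain byte counts here; the real code uses poly-ints.  */
bool special_case_accepts(unsigned isize, unsigned osize, unsigned regsize)
{
    return osize >= regsize && isize >= osize;
}

/* Assumed sizes, for illustration only:
     special_case_accepts(16, 8, 8)    models (subreg:DF (reg:TI))
     special_case_accepts(16, 16, 8)   models (subreg:XF (reg:TI))
     special_case_accepts(16, 4, 8)    models (subreg:SF (reg:TI))  */
```

Under those assumed sizes the check accepts both the DFmode and the XFmode
subreg of a TImode register and rejects only the SFmode one, which is the
over-broad behaviour described above.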

> > > But that is not the core problem we had here.  The behaviour of the
> > > generic parts of the compiler was changed, without testing if that
> > > works on other targets but x86.  That is an understandable mistake, it
> > > takes some experience to know where the morasses are.  But this change
> > > should have been accompanied by testcases exercising the changed code.
> > > We would have clearly seen there are issues then, simply by watching
> > > gcc-testresults@ (and/or maintainers are on top of the test results
> > > anyway).  Also, if there were testcases for this, we could have some
> > > confidence that a change in this area is robust.
> >
> > Well, that only works if some maintainers that are familiar enough
> > with all this chime in ;)
>
> Not really.  It works always.  And it works way better than the
> pandemonium we now have with broken targets left and right.
>
> With testcases anyone can see if any specific target is broken here.
>
> > It's stage1 so it's understandable that some
> > > people (like me ...) are trying to help people making progress even
> > if that involves trying to decipher 30 years of GCC history in this
> > area (without much success in the end as we see) ;)
>
> Yeah :-)  And my thanks to you and everyone involved for tackling this
> problematic part of GCC, which has been neglected and patched over for
> way too long.  But from that same history it follows that anything you
> do not super carefully (with testing everywhere) will cause some serious
> > problems.  And none of these are easy to fix at all -- there is a
> *reason* targets did this nastiness.
>
> > > p.s. Very unrelated...  Should we have __builtin_bit_cast for C as well?
> > > Is there any reason this could not work?
>
> Still interested in this btw :-)  (And still very unrelated.)

Sure, why not ...

Richard.

>
> Segher
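
For context on the __builtin_bit_cast aside: without such a builtin in C,
the usual portable way to reinterpret the bits of one type as another is a
memcpy between same-sized objects.  The sketch below shows that idiom; the
helper names are made up for illustration, and the bit pattern mentioned in
the comment assumes an IEEE 754 binary32 float.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return the object representation of a float as a
   32-bit unsigned integer, without violating aliasing rules.  */
uint32_t float_to_bits(float f)
{
    uint32_t u;
    _Static_assert(sizeof(uint32_t) == sizeof(float),
                   "this sketch assumes a 32-bit float");
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Hypothetical helper: the inverse direction.  */
float bits_to_float(uint32_t u)
{
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}

/* With IEEE 754 binary32, float_to_bits(1.0f) is 0x3f800000.  */
```

Compilers typically turn these fixed-size memcpy calls into a plain
register move, so a builtin would mainly add convenience and type checking.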


[PATCH] middle-end/102273 - avoid ICE with auto-init and nested functions

2021-09-10 Thread Richard Biener via Gcc-patches
This refactors expansion to consider non-decl LHS.  I suspect
the is_vla argument is not needed.

Bootstrapped and regtest running on x86_64-unknown-linux-gnu.

Richard.

2021-09-10  Richard Biener  

PR middle-end/102273
* internal-fn.c (expand_DEFERRED_INIT): Always expand non-SSA vars.

* gcc.dg/pr102273.c: New testcase.
---
 gcc/internal-fn.c   | 22 +++---
 gcc/testsuite/gcc.dg/pr102273.c | 11 +++
 2 files changed, 18 insertions(+), 15 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr102273.c

diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c
index ada2a820ff1..b1283690080 100644
--- a/gcc/internal-fn.c
+++ b/gcc/internal-fn.c
@@ -3006,31 +3006,23 @@ expand_DEFERRED_INIT (internal_fn, gcall *stmt)
   tree var_size = gimple_call_arg (stmt, 0);
   enum auto_init_type init_type
 = (enum auto_init_type) TREE_INT_CST_LOW (gimple_call_arg (stmt, 1));
-  bool is_vla = (bool) TREE_INT_CST_LOW (gimple_call_arg (stmt, 2));
   bool reg_lhs = true;
 
   tree var_type = TREE_TYPE (lhs);
   gcc_assert (init_type > AUTO_INIT_UNINITIALIZED);
 
-  if (DECL_P (lhs))
-{
-  rtx tem = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
-  reg_lhs = !MEM_P (tem);
-}
-  else if (TREE_CODE (lhs) == SSA_NAME)
+  if (TREE_CODE (lhs) == SSA_NAME)
 reg_lhs = true;
   else
 {
-  gcc_assert (is_vla);
-  reg_lhs = false;
+  rtx tem = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
+  reg_lhs = !MEM_P (tem);
 }
 
-
   if (!reg_lhs)
 {
-/* If this is a VLA or the variable is not in register,
-   expand to a memset to initialize it.  */
-
+  /* If this is a VLA or the variable is not in register,
+expand to a memset to initialize it.  */
   mark_addressable (lhs);
   tree var_addr = build_fold_addr_expr (lhs);
 
@@ -3045,8 +3037,8 @@ expand_DEFERRED_INIT (internal_fn, gcall *stmt)
 }
   else
 {
-/* If this variable is in a register, use expand_assignment might
-   generate better code.  */
+  /* If this variable is in a register, use expand_assignment might
+generate better code.  */
   tree init = build_zero_cst (var_type);
   unsigned HOST_WIDE_INT total_bytes
= tree_to_uhwi (TYPE_SIZE_UNIT (var_type));
diff --git a/gcc/testsuite/gcc.dg/pr102273.c b/gcc/testsuite/gcc.dg/pr102273.c
new file mode 100644
index 000..568e44ebfef
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr102273.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-ftrivial-auto-var-init=zero" } */
+
+void bar();
+
+struct A { char d; };
+void foo()
+{
+  struct A e;
+  void baz() { bar(e); }
+}
-- 
2.31.1


Re: [COMMITTED][patch][version 9]add -ftrivial-auto-var-init and variable attribute "uninitialized" to gcc

2021-09-10 Thread Thomas Schwinge
Hi!

On 2021-09-10T10:47:00+0200, Christophe LYON via Gcc-patches 
 wrote:
> On 10/09/2021 00:49, Qing Zhao via Gcc-patches wrote:
>> I just committed the following patch to gcc upstream:
>>
>>
>> https://gcc.gnu.org/pipermail/gcc-cvs/2021-September/353195.html

> Several of the new tests fail on arm and aarch64 with -mabi=ilp32.

Similar for 32-bit x86 testing, or x86_64 with '-m32' testing -- as also
reported by a number of auto-tester instances.

> On arm:
>
>  gcc:gcc.dg/dg.exp=c-c++-common/auto-init-1.c  -Wc++-compat   scan-tree-dump gimple "temp5 = .DEFERRED_INIT \\(8, 2, 0\\)"
>  gcc:gcc.dg/dg.exp=c-c++-common/auto-init-1.c  -Wc++-compat   scan-tree-dump gimple "temp7 = .DEFERRED_INIT \\(8, 2, 0\\)"
>  gcc:gcc.dg/dg.exp=c-c++-common/auto-init-2.c  -Wc++-compat   scan-tree-dump gimple "temp5 = .DEFERRED_INIT \\(8, 1, 0\\)"
>  gcc:gcc.dg/dg.exp=c-c++-common/auto-init-2.c  -Wc++-compat   scan-tree-dump gimple "temp7 = .DEFERRED_INIT \\(8, 1, 0\\)"
>  gcc:gcc.dg/dg.exp=c-c++-common/auto-init-3.c  -Wc++-compat   scan-tree-dump gimple "temp3 = .DEFERRED_INIT \\(16, 2, 0\\)"
>  gcc:gcc.dg/dg.exp=c-c++-common/auto-init-4.c  -Wc++-compat   scan-tree-dump gimple "temp3 = .DEFERRED_INIT \\(16, 1, 0\\)"
>  gcc:gcc.dg/dg.exp=c-c++-common/auto-init-5.c  -Wc++-compat   scan-tree-dump gimple "temp3 = .DEFERRED_INIT \\(32, 2, 0\\)"
>  gcc:gcc.dg/dg.exp=c-c++-common/auto-init-6.c  -Wc++-compat   scan-tree-dump gimple "temp3 = .DEFERRED_INIT \\(32, 1, 0\\)"
>  gcc:gcc.dg/dg.exp=c-c++-common/auto-init-padding-1.c  -Wc++-compat   scan-tree-dump gimple ".DEFERRED_INIT \\(24, 1, 0\\)"
>
> on aarch64 -mabi=ilp32:
>
>  gcc:gcc.dg/dg.exp=c-c++-common/auto-init-1.c  -Wc++-compat   
> scan-tree-dump gimple "temp5 = .DEFERRED_INIT \\(8, 2, 0\\)"
>  gcc:gcc.dg/dg.exp=c-c++-common/auto-init-1.c  -Wc++-compat   
> scan-tree-dump gimple "temp7 = .DEFERRED_INIT \\(8, 2, 0\\)"
>  gcc:gcc.dg/dg.exp=c-c++-common/auto-init-2.c  -Wc++-compat   
> scan-tree-dump gimple "temp5 = .DEFERRED_INIT \\(8, 1, 0\\)"
>  gcc:gcc.dg/dg.exp=c-c++-common/auto-init-2.c  -Wc++-compat   
> scan-tree-dump gimple "temp7 = .DEFERRED_INIT \\(8, 1, 0\\)"
>  gcc:gcc.dg/dg.exp=c-c++-common/auto-init-padding-1.c  -Wc++-compat   
> scan-tree-dump gimple ".DEFERRED_INIT \\(24, 1, 0\\)"
>  gcc:gcc.target/aarch64/aarch64.exp=gcc.target/aarch64/auto-init-2.c 
> scan-rtl-dump-times expand "0xfefefefefefefefe" 2
>  
> gcc:gcc.target/aarch64/aarch64.exp=gcc.target/aarch64/auto-init-padding-5.c 
> scan-assembler-times stp\txzr, xzr, 2
>
> Can you check?

On 2021-09-10T11:08:22+0200, Martin Liška  wrote:
> It's the following bug:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102269

No, not ICEs, but just "regular 'scan-tree-dump' FAILs".  I suppose these
are all data-type mismatches: for example, 'long' or 'int *' not mapping
to the expected '8'.
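
To make the suspected mismatch concrete, here is a minimal sketch.  It
assumes that the first argument of '.DEFERRED_INIT' is the variable's
size in bytes, which the remark above suggests but which is not confirmed
here; the helper names are made up for illustration.

```cpp
#include <cstddef>

// Byte count that a ".DEFERRED_INIT (SIZE, ...)" pattern would need to
// expect for a variable of type T on the target being compiled for
// (assumption: SIZE is simply sizeof (T)).
template <typename T>
constexpr std::size_t deferred_init_size ()
{
  return sizeof (T);
}

// True only where both 'long' and 'int *' are 8 bytes wide (LP64).
// On ILP32 targets (arm, aarch64 -mabi=ilp32, x86 -m32) both are 4 bytes,
// so a pattern that hard-codes '8' cannot match.
constexpr bool hardcoded_8_matches ()
{
  return deferred_init_size<long> () == 8
	 && deferred_init_size<int *> () == 8;
}
```

A test that wants to pass on both kinds of target would have to either
restrict itself with a target selector or use fixed-width types.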


Unrelated to the above, I've pushed as obvious
"Fix 'dg-do run' syntax in 'c-c++-common/auto-init-padding-{2,3}.c'"
to master branch in commit 5c5c2d86e520c3bf37368309b2fe932c88bdd14f, see
attached.  (All-PASS per my testing.)


Grüße
 Thomas


-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955
From 5c5c2d86e520c3bf37368309b2fe932c88bdd14f Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Fri, 10 Sep 2021 11:26:50 +0200
Subject: [PATCH] Fix 'dg-do run' syntax in
 'c-c++-common/auto-init-padding-{2,3}.c'

Fix-up for recent commit a25e0b5e6ac8a77a71c229e0a7b744603365b0e9
 "Add -ftrivial-auto-var-init option and uninitialized variable attribute".

	gcc/testsuite/
	* c-c++-common/auto-init-padding-2.c: Fix 'dg-do run' syntax.
	* c-c++-common/auto-init-padding-3.c: Likewise.
---
 gcc/testsuite/c-c++-common/auto-init-padding-2.c | 2 +-
 gcc/testsuite/c-c++-common/auto-init-padding-3.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/c-c++-common/auto-init-padding-2.c b/gcc/testsuite/c-c++-common/auto-init-padding-2.c
index e2b50dc5ae8..462f5aeab91 100644
--- a/gcc/testsuite/c-c++-common/auto-init-padding-2.c
+++ b/gcc/testsuite/c-c++-common/auto-init-padding-2.c
@@ -1,7 +1,7 @@
 /* To test that the compiler can fill all the paddings to zeroes for the 
structures when the auto variable is partially initialized,  fully 
initialized, or not initialized for -ftrivial-auto-var-init=zero.  */
-/* { dg-do run} */
+/* { dg-do run } */
 /* { dg-options "-ftrivial-auto-var-init=zero" } */
 
 /* Structure with no padding. */
diff --git a/gcc/testsuite/c-c++-common/auto-init-padding-3.c b/gcc/testsuite/c-c++-common/auto-init-padding-3.c
index e2c48c002c9..22770142a95 100644
--- a/gcc/testsuite/c-c++-common/auto-init-padding-3.c
+++ b/gcc/testsuite/c-c++-common/auto-init-padding-3.c
@@ -1,7 +1,7 @@
 /* 

[PATCH][RFC] Come up with casm state

2021-09-10 Thread Martin Liška

Hi.

Richi and I are considering some changes to how we emit early debug info.
These would probably include streaming the debug info to a separate
file (different from the normal .o output file).

That means we would need to switch between output assembly files.  The patch
I'm sending is a first step, and we would like to receive comments on the
chosen approach.
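
The idea can be sketched roughly as follows.  This is not the patch's
actual data structure: every name ending in '_sketch', as well as
'set_output', is hypothetical, and only the access pattern
('casm->sections.text', 'casm->in_section') mirrors the diff.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for GCC's 'section'.
struct section_sketch { std::string name; };

// All state that belongs to one assembly output lives in one object,
// instead of in free-standing globals such as 'text_section' or
// 'in_section'.
struct asm_state_sketch
{
  section_sketch text{".text"};
  section_sketch readonly_data{".rodata"};
  section_sketch *in_section = nullptr;	// section currently being emitted
};

// Pointer to the state of the active output, analogous to 'casm'.
static asm_state_sketch *casm_sketch = nullptr;

// Make NEXT the active output and return the previously active one.
asm_state_sketch *set_output (asm_state_sketch *next)
{
  asm_state_sketch *prev = casm_sketch;
  casm_sketch = next;
  return prev;
}

// Record S as the current section of whichever output is active.
void switch_to_section_sketch (section_sketch *s)
{
  assert (casm_sketch != nullptr);
  casm_sketch->in_section = s;
}
```

With the state grouped like this, switching between two output files is
a single pointer swap, and each output keeps its own notion of the
current section.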

Richi, do you want to add something?

Cheers,
Martin

diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c
index 55cb0347149..5bac9beb934 100644
--- a/gcc/cgraphunit.c
+++ b/gcc/cgraphunit.c
@@ -2309,7 +2309,7 @@ symbol_table::compile (void)
   timevar_pop (TV_CGRAPHOPT);
 
   /* Output everything.  */
-  switch_to_section (text_section);
+  switch_to_section (casm->sections.text);
   (*debug_hooks->assembly_start) ();
   if (!quiet_flag)
 fprintf (stderr, "Assembling functions:\n");
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 1fbe9e0daa0..3160d2952d5 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -25714,7 +25714,7 @@ aarch64_sls_emit_blr_function_thunks (FILE *out_file)
  would happen in a different section -- leaving an unmatched
  `.cfi_startproc` in the cold text section and an unmatched `.cfi_endproc`
  in the standard text section.  */
-  section *save_text_section = in_section;
+  section *save_text_section = casm->in_section;
   switch_to_section (function_section (current_function_decl));
   for (int regnum = 0; regnum < 30; ++regnum)
 {
diff --git a/gcc/config/alpha/alpha.c b/gcc/config/alpha/alpha.c
index c702e683c31..54a1df4b821 100644
--- a/gcc/config/alpha/alpha.c
+++ b/gcc/config/alpha/alpha.c
@@ -8055,13 +8055,13 @@ alpha_start_function (FILE *file, const char *fnname,
 
 #ifdef TARGET_VMS_CRASH_DEBUG
   /* Support of minimal traceback info.  */
-  switch_to_section (readonly_data_section);
+  switch_to_section (casm->sections.readonly_data);
   fprintf (file, "\t.align 3\n");
   assemble_name (file, fnname); fputs ("..na:\n", file);
   fputs ("\t.ascii \"", file);
   assemble_name (file, fnname);
   fputs ("\\0\"\n", file);
-  switch_to_section (text_section);
+  switch_to_section (casm->sections.text);
 #endif
 #endif /* TARGET_ABI_OPEN_VMS */
 }
@@ -9446,7 +9446,7 @@ alpha_elf_select_rtx_section (machine_mode mode, rtx x,
 {
   if (TARGET_SMALL_DATA && GET_MODE_SIZE (mode) <= g_switch_value)
 /* ??? Consider using mergeable sdata sections.  */
-return sdata_section;
+return casm->sections.sdata;
   else
 return default_elf_select_rtx_section (mode, x, align);
 }
@@ -9613,7 +9613,7 @@ alpha_write_linkage (FILE *stream, const char *funname)
 {
   fprintf (stream, "\t.link\n");
   fprintf (stream, "\t.align 3\n");
-  in_section = NULL;
+  casm->in_section = NULL;
 
 #ifdef TARGET_VMS_CRASH_DEBUG
   fputs ("\t.name ", stream);
@@ -9665,7 +9665,7 @@ vms_asm_named_section (const char *name, unsigned int flags,
 static void
 vms_asm_out_constructor (rtx symbol, int priority ATTRIBUTE_UNUSED)
 {
-  switch_to_section (ctors_section);
+  switch_to_section (casm->sections.ctors);
   assemble_align (BITS_PER_WORD);
   assemble_integer (symbol, UNITS_PER_WORD, BITS_PER_WORD, 1);
 }
@@ -9673,7 +9673,7 @@ vms_asm_out_constructor (rtx symbol, int priority ATTRIBUTE_UNUSED)
 static void
 vms_asm_out_destructor (rtx symbol, int priority ATTRIBUTE_UNUSED)
 {
-  switch_to_section (dtors_section);
+  switch_to_section (casm->sections.dtors);
   assemble_align (BITS_PER_WORD);
   assemble_integer (symbol, UNITS_PER_WORD, BITS_PER_WORD, 1);
 }
diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c
index 92797db96b7..ee763e59312 100644
--- a/gcc/config/arc/arc.c
+++ b/gcc/config/arc/arc.c
@@ -8832,7 +8832,7 @@ arc_asm_output_aligned_decl_local (FILE * stream, tree decl, const char * name,
 switch_to_section (get_named_section (NULL, ".sbss", 0));
   /*named_section (0,".sbss",0); */
   else
-switch_to_section (bss_section);
+switch_to_section (casm->sections.bss);
 
   if (globalize_p)
 (*targetm.asm_out.globalize_label) (stream, name);
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index f1e628253d0..5f7e36234ea 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -17234,7 +17234,7 @@ get_jump_table_size (rtx_jump_table_data *insn)
 {
   /* ADDR_VECs only take room if read-only data does into the text
  section.  */
-  if (JUMP_TABLES_IN_TEXT_SECTION || readonly_data_section == text_section)
+  if (JUMP_TABLES_IN_TEXT_SECTION || casm->sections.readonly_data == casm->sections.text)
 {
   rtx body = PATTERN (insn);
   int elt = GET_CODE (body) == ADDR_DIFF_VEC ? 1 : 0;
@@ -24642,9 +24642,9 @@ arm_elf_asm_cdtor (rtx symbol, int priority, bool is_ctor)
   s = get_section (buf, SECTION_WRITE | SECTION_NOTYPE, NULL_TREE);
 }
   else if (is_ctor)
-s = ctors_section;
+s = casm->sections.ctors;
   else
-s = dtors_section;
+s = 

[PATCH Take 2] More NEGATE_EXPR folding in match.pd

2021-09-10 Thread Roger Sayle

Hi Richard,
Thanks for the suggestion, which cleanly solves the problem I was encountering.
This revised patch adds a Boolean simplify argument to tree-ssa-sccvn.c's
vn_nary_build_or_lookup_1 to control whether simplification should be
performed before value numbering, updates the callers, and then
avoids simplification when constructing/value-numbering NEGATE_EXPR.
This avoids the regression of gcc.dg/tree-ssa/ssa-free-88.c, and enables the
new test case(s) to pass.  Brilliant, thank you.

This patch has been tested on x86_64-pc-linux-gnu with a "make bootstrap"
and "make -k check" with no new failures.  Ok for mainline?


2021-09-10  Roger Sayle  
Richard Biener  

gcc/ChangeLog
* match.pd (negation simplifications): Implement some negation
folding transformations from fold-const.c's fold_negate_expr.
* tree-ssa-sccvn.c (vn_nary_build_or_lookup_1): Add a SIMPLIFY
argument, to control whether the op should be simplified prior
to looking up/assigning a value number.
(vn_nary_build_or_lookup): Update call to vn_nary_build_or_lookup_1.
(vn_nary_simplify): Likewise.
(visit_nary_op): Likewise, but when constructing a NEGATE_EXPR
now call vn_nary_build_or_lookup_1 disabling simplification.

gcc/testsuite/ChangeLog
* gcc.dg/fold-negate-1.c: New test case.


One potential enhancement request that might be useful to file in Bugzilla
(I'm not familiar enough with sccvn to investigate this myself): there's
a missed optimization opportunity when we recognize one value-number
as the negation of another (and can therefore materialize one result from
the other using a single negation instruction).  The opportunity is that we
currently always select the first value number as the parent, and derive
the second from it, ignoring the expressions themselves.  Sometimes, it
may be profitable to use the second (negated) occurrence as the parent,
and instead negate that to obtain the first.  One could use negate_expr_p
to decide whether one expression is cheaper to negate than the other.

Both examples in gcc.dg/tree-ssa/ssa-free-88.c would benefit from this:
Firstly:
void bar (double x, double z) {
  y[0] = -z / x;
  y[1] = z / x;
}
if we select "z / x" as the parent, and derive -(z/x) from it, we can avoid/
eliminate a negation, over the current code that calculates "(-z)/x" and 
then derives "-((-z)/x)" from it.
Secondly:
void foo (double x) {
  y[0] = x * -3.;
  y[1] = x * 3.;
}
Following Richard's solution/workaround to PR 19988, we'd prefer to keep
positive real constants in the constant pool, hence selecting "x * 3.0" as the
parent and deriving "-(x * 3.0)" from it, would be slightly preferred over the
current behaviour of placing -3 in the constant pool.
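
For the first example, the two choices of parent can be written out by
hand as below.  This is only a source-level illustration of the
arithmetic, not of what sccvn does internally; the struct and function
names are invented for the sketch.

```cpp
// Results corresponding to y[0] = -z / x and y[1] = z / x.
struct neg_pos_sketch
{
  double neg;	// value stored to y[0]
  double pos;	// value stored to y[1]
};

// Parent is the first occurrence, "(-z)/x"; the second result is derived
// as "-((-z)/x)".  Two negations in total.
neg_pos_sketch first_as_parent (double x, double z)
{
  double parent = (-z) / x;
  return { parent, -parent };
}

// Parent is the second occurrence, "z/x"; the first result is derived as
// "-(z/x)".  One negation in total.
neg_pos_sketch second_as_parent (double x, double z)
{
  double parent = z / x;
  return { -parent, parent };
}
```

Both functions produce the same pair of values, since negation is exact
in IEEE arithmetic; they differ only in the number of negations needed.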

Thanks again,
--
Roger

-Original Message-
From: Richard Biener  
Sent: 09 September 2021 13:05
To: Roger Sayle 
Cc: GCC Patches 
Subject: Re: [PATCH] More NEGATE_EXPR folding in match.pd

On Thu, Sep 9, 2021 at 12:08 PM Roger Sayle  wrote:
>
>
> As observed by Jakub in comment #2 of PR 98865, the expression 
> -(a>>63) is optimized in GENERIC but not in GIMPLE.  Investigating 
> further it turns out that this is one of a few transformations 
> performed by fold_negate_expr in fold-const.c that aren't yet performed by 
> match.pd.
> This patch moves/duplicates them there, and should be relatively safe 
> as these transformations are already performed by the compiler, but 
> just in different passes.
>
> Alas the one minor complication is that some of these transformations 
> are only wins, if the intermediate result (of the multiplication or
> division) is only used once, to avoid duplication/performing them again.
> See gcc.dg/tree-ssa/ssa-free-88.c.  Normally, this is the perfect 
> usage of match's single_use (aka SSA's has_single_use).  Alas, 
> single_use is not always accurate in match.pd, as some passes will 
> construct and simplify an expression/stmt before inserting it into 
> GIMPLE, and folding during this process sees the temporary undercount from 
> the data-flow.
> To solve this, this patch introduces a new single_use_is_op_p that 
> double checks that the single_use has the expected tree_code/operation 
> and skips the transformation if we can tell single_use might be invalid.
>
> A follow-up patch might be to investigate whether genmatch.c can be 
> tweaked to use this new helper function to implement the :s qualifier 
> when the enclosing context is/should be known, but that's overkill to 
> just unblock Jakub and Andrew on 98865.
>
> This patch has been tested on x86_64-pc-linux-gnu with a "make bootstrap"
> and "make -k check" with no new failures.  Ok for mainline?

I think that single_use_is_op_p is "bad" heuristics since it stretches the SSA 
operand / immediate use use a bit too far.  Generally fold_stmt and thus 
match.pd patterns may not rely on stmt operands or immediate uses as only 
generated by update_stmt.

In fact the whole 

[PING] Re: [Patch][GCC][middle-end] - Generate FRINTZ for (double)(int) under -ffast-math on aarch64

2021-09-10 Thread Jirui Wu via Gcc-patches
Hi,

Ping: https://gcc.gnu.org/pipermail/gcc-patches/2021-August/577846.html

Ok for master? If OK, can it be committed for me, I have no commit rights.

Jirui Wu
-Original Message-
From: Jirui Wu 
Sent: Friday, September 3, 2021 12:39 PM
To: 'Richard Biener' 
Cc: Richard Biener ; Andrew Pinski 
; Richard Sandiford ; 
i...@airs.com; gcc-patches@gcc.gnu.org; Joseph S. Myers 

Subject: RE: [Patch][GCC][middle-end] - Generate FRINTZ for (double)(int) under 
-ffast-math on aarch64

Ping

-Original Message-
From: Jirui Wu
Sent: Friday, August 20, 2021 4:28 PM
To: Richard Biener 
Cc: Richard Biener ; Andrew Pinski 
; Richard Sandiford ; 
i...@airs.com; gcc-patches@gcc.gnu.org; Joseph S. Myers 

Subject: RE: [Patch][GCC][middle-end] - Generate FRINTZ for (double)(int) under 
-ffast-math on aarch64

> -Original Message-
> From: Richard Biener 
> Sent: Friday, August 20, 2021 8:15 AM
> To: Jirui Wu 
> Cc: Richard Biener ; Andrew Pinski 
> ; Richard Sandiford ; 
> i...@airs.com; gcc-patches@gcc.gnu.org; Joseph S. Myers 
> 
> Subject: RE: [Patch][GCC][middle-end] - Generate FRINTZ for
> (double)(int) under -ffast-math on aarch64
> 
> On Thu, 19 Aug 2021, Jirui Wu wrote:
> 
> > Hi all,
> >
> > This patch generates FRINTZ instruction to optimize type casts.
> >
> > The changes in this patch covers:
> > * Generate FRINTZ for (double)(int) casts.
> > * Add new test cases.
> >
> > The intermediate type is not checked according to the C99 spec.
> > Overflow of the integral part when casting floats to integers causes
> > undefined behavior.
> > As a result, optimization to trunc() is not invalid.
> > I've confirmed that Boolean type does not match the matching condition.
> >
> > Regtested on aarch64-none-linux-gnu and no issues.
> >
> > Ok for master? If OK can it be committed for me, I have no commit rights.
> 
> +/* Detected a fix_trunc cast inside a float type cast,
> +   use IFN_TRUNC to optimize.  */
> +#if GIMPLE
> +(simplify
> +  (float (fix_trunc @0))
> +  (if (direct_internal_fn_supported_p (IFN_TRUNC, type,
> +  OPTIMIZE_FOR_BOTH)
> +   && flag_unsafe_math_optimizations
> +   && type == TREE_TYPE (@0))
> 
> types_match (type, TREE_TYPE (@0))
> 
> please.  Please perform cheap tests first (the flag test).
> 
> + (IFN_TRUNC @0)))
> +#endif
> 
> why only for GIMPLE?  I'm not sure flag_unsafe_math_optimizations is a 
> good test here.  If you say we can use undefined behavior of any 
> overflow of the fix_trunc operation what do we guard here?
> If it's Inf/NaN input then flag_finite_math_only would be more 
> appropriate, if it's behavior for -0. (I suppose trunc (-0.0) == -0.0 
> and thus "wrong") then a && !HONOR_SIGNED_ZEROS (type) is missing 
> instead.  If it's setting of FENV state and possibly trapping on 
> overflow (but it's undefined?!) then flag_trapping_math covers the 
> latter but we don't have any flag for eliding FENV state affecting 
> transforms, so there the kitchen-sink flag_unsafe_math_optimizations might 
> apply.
> 
> So - which is it?
> 
This change is only for GIMPLE because we can't test for the optab support 
without being in GIMPLE. direct_internal_fn_supported_p is defined only for 
GIMPLE. 

IFN_TRUNC's documentation says nothing about zero or NaN/Inf inputs.
So I think the correct guard is just flag_fp_int_builtin_inexact.
!flag_trapping_math because the operation can only still raise inexacts.
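
To illustrate the signed-zero question raised above, here is a small
sketch comparing the cast round trip with a direct truncation.  The
function names are made up, and the inputs are kept inside the range of
'int' so that no undefined behavior is involved.

```cpp
#include <cmath>

// The source-level pattern the patch matches: (double)(int)x.
double via_int_cast (double x)
{
  return static_cast<double> (static_cast<int> (x));
}

// What the pattern would be replaced with.
double via_trunc (double x)
{
  return std::trunc (x);
}

// True if both forms agree in value and in the sign of the result.
bool same_value_and_sign (double x)
{
  double a = via_int_cast (x);
  double b = via_trunc (x);
  return a == b && std::signbit (a) == std::signbit (b);
}
```

For an input such as -3.5 both forms give -3.0.  For -0.5 the cast goes
through the integer 0 and yields +0.0, whereas trunc yields -0.0, which
is the difference the HONOR_SIGNED_ZEROS remark is about.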

The new pattern is moved next to the place you mentioned.

Ok for master? If OK can it be committed for me, I have no commit rights.

Thanks,
Jirui
> Note there's also the pattern
> 
> /* Handle cases of two conversions in a row.  */ (for ocvt (convert 
> float
> fix_trunc)  (for icvt (convert float)
>   (simplify
>(ocvt (icvt@1 @0))
>(with
> {
> ...
> 
> which is related so please put the new pattern next to that (the set 
> of conversions handled there does not include (float (fix_trunc @0)))
> 
> Thanks,
> Richard.
> 
> > Thanks,
> > Jirui
> >
> > gcc/ChangeLog:
> >
> > * match.pd: Generate IFN_TRUNC.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.target/aarch64/merge_trunc1.c: New test.
> >
> > > -Original Message-
> > > From: Richard Biener 
> > > Sent: Tuesday, August 17, 2021 9:13 AM
> > > To: Andrew Pinski 
> > > Cc: Jirui Wu ; Richard Sandiford 
> > > ; i...@airs.com; 
> > > gcc-patches@gcc.gnu.org; rguent...@suse.de
> > > Subject: Re: [Patch][GCC][middle-end] - Generate FRINTZ for
> > > (double)(int) under -ffast-math on aarch64
> > >
> > > On Mon, Aug 16, 2021 at 8:48 PM Andrew Pinski via Gcc-patches
> > >  wrote:
> > > >
> > > > On Mon, Aug 16, 2021 at 9:15 AM Jirui Wu via Gcc-patches 
> > > >  wrote:
> > > > >
> > > > > Hi all,
> > > > >
> > > > > This patch generates FRINTZ instruction to optimize type casts.
> > > > >
> > > > > The changes in this patch covers:
> > > > > * Optimization of a FIX_TRUNC_EXPR cast inside a FLOAT_EXPR using
> > > > >   IFN_TRUNC.
> > > > > * 

Re: [COMMITTED][patch][version 9]add -ftrivial-auto-var-init and variable attribute "uninitialized" to gcc

2021-09-10 Thread Martin Liška

On 9/10/21 10:47, Christophe LYON via Gcc-patches wrote:

Can you check?


It's the following bug:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102269

Martin


Re: [COMMITTED][patch][version 9]add -ftrivial-auto-var-init and variable attribute "uninitialized" to gcc

2021-09-10 Thread Christophe LYON via Gcc-patches



On 10/09/2021 00:49, Qing Zhao via Gcc-patches wrote:

Hi, FYI

I just committed the following patch to gcc upstream:


https://gcc.gnu.org/pipermail/gcc-cvs/2021-September/353195.html


Hi,

Several of the new tests fail on arm and aarch64 with -mabi=ilp32.

On arm:

gcc:gcc.dg/dg.exp=c-c++-common/auto-init-1.c  -Wc++-compat   scan-tree-dump gimple 
"temp5 = .DEFERRED_INIT \\(8, 2, 0\\)"
gcc:gcc.dg/dg.exp=c-c++-common/auto-init-1.c  -Wc++-compat   scan-tree-dump gimple 
"temp7 = .DEFERRED_INIT \\(8, 2, 0\\)"
gcc:gcc.dg/dg.exp=c-c++-common/auto-init-2.c  -Wc++-compat   scan-tree-dump gimple 
"temp5 = .DEFERRED_INIT \\(8, 1, 0\\)"
gcc:gcc.dg/dg.exp=c-c++-common/auto-init-2.c  -Wc++-compat   scan-tree-dump gimple 
"temp7 = .DEFERRED_INIT \\(8, 1, 0\\)"
gcc:gcc.dg/dg.exp=c-c++-common/auto-init-3.c  -Wc++-compat   scan-tree-dump gimple 
"temp3 = .DEFERRED_INIT \\(16, 2, 0\\)"
gcc:gcc.dg/dg.exp=c-c++-common/auto-init-4.c  -Wc++-compat   scan-tree-dump gimple 
"temp3 = .DEFERRED_INIT \\(16, 1, 0\\)"
gcc:gcc.dg/dg.exp=c-c++-common/auto-init-5.c  -Wc++-compat   scan-tree-dump gimple 
"temp3 = .DEFERRED_INIT \\(32, 2, 0\\)"
gcc:gcc.dg/dg.exp=c-c++-common/auto-init-6.c  -Wc++-compat   scan-tree-dump gimple 
"temp3 = .DEFERRED_INIT \\(32, 1, 0\\)"
gcc:gcc.dg/dg.exp=c-c++-common/auto-init-padding-1.c  -Wc++-compat   scan-tree-dump 
gimple ".DEFERRED_INIT \\(24, 1, 0\\)"

on aarch64 -mabi=ilp32:

gcc:gcc.dg/dg.exp=c-c++-common/auto-init-1.c  -Wc++-compat   scan-tree-dump gimple 
"temp5 = .DEFERRED_INIT \\(8, 2, 0\\)"
gcc:gcc.dg/dg.exp=c-c++-common/auto-init-1.c  -Wc++-compat   scan-tree-dump gimple 
"temp7 = .DEFERRED_INIT \\(8, 2, 0\\)"
gcc:gcc.dg/dg.exp=c-c++-common/auto-init-2.c  -Wc++-compat   scan-tree-dump gimple 
"temp5 = .DEFERRED_INIT \\(8, 1, 0\\)"
gcc:gcc.dg/dg.exp=c-c++-common/auto-init-2.c  -Wc++-compat   scan-tree-dump gimple 
"temp7 = .DEFERRED_INIT \\(8, 1, 0\\)"
gcc:gcc.dg/dg.exp=c-c++-common/auto-init-padding-1.c  -Wc++-compat   scan-tree-dump 
gimple ".DEFERRED_INIT \\(24, 1, 0\\)"
gcc:gcc.target/aarch64/aarch64.exp=gcc.target/aarch64/auto-init-2.c 
scan-rtl-dump-times expand "0xfefefefefefefefe" 2
gcc:gcc.target/aarch64/aarch64.exp=gcc.target/aarch64/auto-init-padding-5.c 
scan-assembler-times stp\txzr, xzr, 2

Can you check?


Thanks,

Christophe




Thanks.

Qing


On Sep 6, 2021, at 5:16 AM, Richard Biener  wrote:

On Sat, 21 Aug 2021, Qing Zhao wrote:


Hi,

This is the 8th version of the patch for the new security feature for GCC.
I have tested it with bootstrap on both x86 and aarch64, regression testing on 
both x86 and aarch64.
Also tested it with the kernel testing case provided by Kees.
Also compile CPU2017 (running is ongoing), without any issue.

Please take a look at this patch and let me know any issues.

+  /* If this DECL is a VLA, a temporary address variable for it has been
+ created, the replacement for DECL is recorded in DECL_VALUE_EXPR
(decl),
+ we should use it as the LHS of the call.  */
+
+  tree lhs_call
+= is_vla ? DECL_VALUE_EXPR (decl) : decl;
+  gimplify_assign (lhs_call, call, seq_p);

you shouldn't need to replace the lhs with DECL_VALUE_EXPR of it
here, gimplify_assign should take care of that.

+/* Return true if the DECL need to be automaticly initialized by the
+   compiler.  */
+static bool
+is_var_need_auto_init (tree decl)
+{
+  if (auto_var_p (decl)
+  && (opt_for_fn (current_function_decl, flag_auto_var_init)

maybe I said otherwise at some point but you can test 'flag_auto_var_init'
directly when not in an IPA pass, no need to use 'opt_for_fn'

+   > AUTO_INIT_UNINITIALIZED)
+  && (!lookup_attribute ("uninitialized", DECL_ATTRIBUTES (decl
+return true;
+  return false;


diff --git a/gcc/tree.c b/gcc/tree.c
index e923e67b6942..23d7b17774ce 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -9508,6 +9508,22 @@ build_common_builtin_nodes (void)
   tree tmp, ftype;
   int ecf_flags;

+  /* If user requests automatic variables initialization, the builtin
+ BUILT_IN_CLEAR_PADDING is needed.  */
+  if (flag_auto_var_init > AUTO_INIT_UNINITIALIZED
+  && !builtin_decl_explicit_p (BUILT_IN_CLEAR_PADDING))

I think this is prone to fail with LTO and auto-var-init setting
different in different TUs.  Just build the builtin unconditionally
when it's not available.

+{
+  ftype = build_function_type_list (void_type_node,
+   ptr_type_node,
+   ptr_type_node,
+   integer_type_node,
+   NULL_TREE);
+  local_define_builtin ("__builtin_clear_padding", ftype,
+   BUILT_IN_CLEAR_PADDING,
+   "__builtin_clear_padding",
+ ECF_LEAF | ECF_NOTHROW);
+}


With these changes the patch looks OK to me, so please go ahead
after fixing the above.

Thanks,

[PATCH] middle-end/102269 - avoid auto-init of empty types

2021-09-10 Thread Richard Biener via Gcc-patches
This avoids initializing empty types for which we'll eventually
leave a .DEFERRED_INIT call without a LHS.

Bootstrap & regtest running on x86_64-unknown-linux-gnu.

2021-09-10  Richard Biener  

PR middle-end/102269
* gimplify.c (is_var_need_auto_init): Empty types do not need
initialization.

* gcc.dg/pr102269.c: New testcase.
---
 gcc/gimplify.c  | 3 ++-
 gcc/testsuite/gcc.dg/pr102269.c | 4 
 2 files changed, 6 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr102269.c

diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index 3314f76cf3f..8820f873993 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -1826,7 +1826,8 @@ is_var_need_auto_init (tree decl)
 {
   if (auto_var_p (decl)
   && (flag_auto_var_init > AUTO_INIT_UNINITIALIZED)
-  && (!lookup_attribute ("uninitialized", DECL_ATTRIBUTES (decl
+  && (!lookup_attribute ("uninitialized", DECL_ATTRIBUTES (decl)))
+  && !is_empty_type (TREE_TYPE (decl)))
 return true;
   return false;
 }
diff --git a/gcc/testsuite/gcc.dg/pr102269.c b/gcc/testsuite/gcc.dg/pr102269.c
new file mode 100644
index 000..9d41b8fd7c7
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr102269.c
@@ -0,0 +1,4 @@
+/* { dg-do compile } */
+/* { dg-options "-ftrivial-auto-var-init=zero" } */
+
+void fn() { int a[0]; }
-- 
2.31.1
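
As a side note on why the testcase's 'int a[0]' counts as having nothing
to initialize, a minimal sketch (zero-length arrays are a GNU extension,
so this assumes GCC or a compatible compiler; the function name is
invented):

```cpp
#include <cstddef>

// Size of a zero-length array object, as declared in the testcase.
std::size_t zero_length_array_size ()
{
  int a[0];	// GNU extension: occupies no storage
  return sizeof a;
}
```

With GCC this returns 0, i.e. there are no bytes that an automatic
initialization could fill.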


Re: Host and offload targets have no common meaning of address spaces

2021-09-10 Thread Thomas Schwinge
Hi!

Ping.  Patch again attached for easy reference.


Plus, incrementally, the two "should we" questions cited below?


Grüße
 Thomas


On 2021-08-24T12:23:07+0200, I wrote:
> Hi!
>
> On 2021-08-19T22:13:56+0200, I wrote:
>> On 2021-08-16T10:21:04+0200, Jakub Jelinek  wrote:
>>> On Mon, Aug 16, 2021 at 10:08:42AM +0200, Thomas Schwinge wrote:
>> |> Concerning the current 'gcc/omp-low.c:omp_build_component_ref', for the
>> |> current set of offloading testcases, we never see a
>> |> '!ADDR_SPACE_GENERIC_P' there, so the address space handling doesn't seem
>> |> to be necessary there (but also won't do any harm: no-op).
>>>
>>> Are you sure this can't trigger?
>>> Say
>>> extern int __seg_fs a;
>>>
>>> void
>>> foo (void)
>>> {
>>>   #pragma omp parallel private (a)
>>>   a = 2;
>>> }
>>
>> That test case doesn't run into 'omp_build_component_ref' at all,
>> but [I've pushed an altered and extended variant that does],
>> "Add 'libgomp.c/address-space-1.c'".
>>
>> In this case, 'omp_build_component_ref' called via host compilation
>> 'pass_lower_omp', it's the 'field_type' that has 'address-space-1', not
>> 'obj_type', so indeed Kwok's new code is a no-op:
>>
>> (gdb) call debug_tree(field_type)
>>  > type 
>>> I think keeping the qual addr space here is the wrong thing to do,
>>> it should keep the other quals and clear the address space instead,
>>> the whole struct is going to be in generic address space, isn't it?
>>
>> Correct for 'omp_build_component_ref' called via host compilation
>> 'pass_lower_omp'
>
>> However, regarding the former comment -- shouldn't we force generic
>> address space for all 'tree' types read in via LTO streaming for
>> offloading compilation?  I assume that (in the general case) address
>> spaces are never compatible between host and offloading compilation?
>> For [...] "Add 'libgomp.c/address-space-1.c'", propagating the
>> '__seg_fs' address space across the offloading boundary (assuming I did
>> interpret the dumps correctly) doesn't seem to cause any problems
>
> As I found later, actually the 'address-space-1' per host '__seg_fs' does
> cause the "Intel MIC (emulated) offloading execution failure"
> mentioned/XFAILed for 'libgomp.c/address-space-1.c': SIGSEGV, like
> (expected) for host execution.  For GCN offloading target, it maps to
> GCN 'ADDR_SPACE_FLAT' which apparently doesn't cause any ill effects (for
> that simple test case).  The nvptx offloading target doesn't consider
> address spaces at all.
>
> Is the attached "Host and offload targets have no common meaning of
> address spaces" OK to push?
>
>
> Then, is that the way to do this, or should we add in
> 'gcc/tree-streamer-out.c:pack_ts_base_value_fields':
>
> if (lto_stream_offload_p)
>   gcc_assert (ADDR_SPACE_GENERIC_P (TYPE_ADDR_SPACE (expr)));
>
> ..., and elsewhere sanitize this for offloading compilation?  Jakub's
> suggestion above, regarding 'gcc/omp-low.c:omp_build_component_ref':
>
> | I think keeping the qual addr space here is the wrong thing to do,
> | it should keep the other quals and clear the address space instead
>
> But it's not obvious to me that indeed this is the one place where this
> would need to be done?  (It ought to work for
> 'libgomp.c/address-space-1.c', and any other occurrences would run into
> the 'assert', so that ought to be "fine", though?)
>
>
> And, should we have a new hook
> 'void targetm.addr_space.validate (addr_space_t as)' (better name?),
> called via 'gcc/emit-rtl.c:set_mem_attrs' (only? -- assuming this is the
> appropriate canonic function where address space use is observed?), to
> make sure that the requested 'as' is valid for the target?
> 'default_addr_space_validate' would refuse everything but
> 'ADDR_SPACE_GENERIC_P (as)'; this hook would need implementing for all
> handful of targets making use of address spaces (supposedly matching the
> logic how they call 'c_register_addr_space'?).  (The closest existing
> hook seems to be 'targetm.addr_space.diagnose_usage', only defined for
> AVR, and called from "the front ends" (C only).)
>
>
> Grüße
>  Thomas


From e01e06bd17bf2c7cb182d30bed02babc5edfa183 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Tue, 24 Aug 2021 11:14:10 +0200
Subject: [PATCH] Host and offload targets have no common meaning of address
 spaces

	gcc/
	* tree-streamer-out.c (pack_ts_base_value_fields): Don't pack
	'TYPE_ADDR_SPACE' for offloading.
	* tree-streamer-in.c (unpack_ts_base_value_fields): Don't unpack
	'TYPE_ADDR_SPACE' for offloading.
	libgomp/
	* testsuite/libgomp.c/address-space-1.c: Remove 'dg-xfail-run-if'
	for 'offload_device_intel_mic'.
---
 gcc/tree-streamer-in.c| 2 ++
 gcc/tree-streamer-out.c   

[PATCH] analyzer: Define INCLUDE_UNIQUE_PTR

2021-09-10 Thread Maxim Blinov
Un-break the build for AArch64 Darwin. Build currently fails with an
error very similar to pr82091:

```
In file included from 
../../../gcc-master-wip-apple-si/gcc/analyzer/engine.cc:69:
In file included from 
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include/c++/v1/memory:678:
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include/c++/v1/stdexcept:239:5:
 error: no member named 'fancy_abort' in namespace 'std::__1'; did you mean 
simply 'fancy_abort'?
_VSTD::abort();
^~~
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include/c++/v1/__config:852:15:
 note: expanded from macro '_VSTD'

../../../gcc-master-wip-apple-si/gcc/system.h:777:13: note: 'fancy_abort' 
declared here
extern void fancy_abort (const char *, int, const char *)
^
```

Judging from the following comment in gcc/system.h, we just need to
define INCLUDE_UNIQUE_PTR since commit eafa9d96923 added the inclusion
of <memory>:

```
/* Some of the headers included by <memory> can use "abort" within a
   namespace, e.g. "_VSTD::abort();", which fails after we use the
   preprocessor to redefine "abort" as "fancy_abort" below.
   Given that unique-ptr.h can use "free", we need to do this after "free"
   is declared but before "abort" is overridden.  */

```
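
The ordering constraint described in that comment can be reproduced in
miniature as below.  This is a sketch only: the 'fancy_abort' here is a
counting stand-in (the real one does not return), and the macro is
written similarly to, not copied from, gcc/system.h.

```cpp
#include <cstdlib>	// declares std::abort; included BEFORE the macro

static int fancy_abort_calls = 0;

// Hypothetical stand-in for GCC's fancy_abort.
void fancy_abort (const char *, int, const char *)
{
  ++fancy_abort_calls;
}

// From here on, every "abort ()" in this file is rewritten.
#define abort() fancy_abort (__FILE__, __LINE__, __func__)

// A header included at this point that contains "std::abort ();" would
// expand to "std::fancy_abort (...)" and fail to compile, which is the
// kind of error shown in the log above.

void checked_path ()
{
  abort ();	// expands to a call to fancy_abort
}

// Keep the macro from leaking into anything included later.
#undef abort
```

Defining INCLUDE_UNIQUE_PTR before including "system.h" has the same
effect as the first line of the sketch: the standard header is pulled in
before "abort" is redefined.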

gcc/analyzer/ChangeLog:
* engine.cc: Define INCLUDE_UNIQUE_PTR.
---
 gcc/analyzer/engine.cc | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/analyzer/engine.cc b/gcc/analyzer/engine.cc
index 24f0931197d..f21f8e5b78a 100644
--- a/gcc/analyzer/engine.cc
+++ b/gcc/analyzer/engine.cc
@@ -19,6 +19,7 @@ along with GCC; see the file COPYING3.  If not see
 .  */
 
 #include "config.h"
+#define INCLUDE_UNIQUE_PTR
 #include "system.h"
 #include "coretypes.h"
 #include "tree.h"
-- 
2.30.1 (Apple Git-130)



[PING] Re: Fix 'hash_table::expand' to destruct stale Value objects

2021-09-10 Thread Thomas Schwinge
Hi!

On 2021-09-01T19:31:19-0600, Martin Sebor via Gcc-patches 
 wrote:
> On 8/30/21 4:46 AM, Thomas Schwinge wrote:
>> Ping -- we still need to plug the memory leak; see patch attached, and/or
>> long discussion here:
>
> Thanks for answering my questions.  I have no concerns with going
> forward with the patch as is.

Thanks, Martin.  Ping for formal approval (and review for using proper
C++ terminology in the 'gcc/hash-table.h:hash_table::expand' source code
comment that I'm adding).  Patch again attached, for easy reference.
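
For readers without the patch at hand, the kind of leak under discussion
can be sketched in isolation.  This is not the 'hash_table::expand'
code: counted_value, make_slots, expand_slots and destroy_slots are all
hypothetical names, and the live counter stands in for a resource held
by a non-trivial Value.  The sketch only shows that after elements are
moved into new raw storage, the stale source objects still need their
destructors run before the old storage is freed:

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>
#include <utility>

// Hypothetical value type; 'live' counts objects not yet destructed.
struct counted_value
{
  static int live;
  int data;

  explicit counted_value(int d) : data(d) { ++live; }
  counted_value(counted_value &&other) noexcept : data(other.data) { ++live; }
  counted_value(const counted_value &) = delete;
  counted_value &operator=(const counted_value &) = delete;
  ~counted_value() { --live; }
};
int counted_value::live = 0;

// Allocate uninitialized storage for 'capacity' objects.
static counted_value *raw_slots(std::size_t capacity)
{
  void *mem = std::malloc(capacity * sizeof(counted_value));
  if (!mem)
    throw std::bad_alloc();
  return static_cast<counted_value *>(mem);
}

// Construct 'n' objects in fresh storage.
counted_value *make_slots(std::size_t n)
{
  counted_value *slots = raw_slots(n);
  for (std::size_t i = 0; i < n; ++i)
    new (static_cast<void *>(&slots[i])) counted_value(static_cast<int>(i));
  return slots;
}

// Move 'n' objects into storage of size 'new_capacity' (>= n) and free
// the old storage.  With destruct_stale == false the moved-from objects
// are never destructed, which models the leak.
counted_value *expand_slots(counted_value *old_slots, std::size_t n,
                            std::size_t new_capacity, bool destruct_stale)
{
  counted_value *fresh = raw_slots(new_capacity);
  for (std::size_t i = 0; i < n; ++i)
    {
      new (static_cast<void *>(&fresh[i]))
        counted_value(std::move(old_slots[i]));
      if (destruct_stale)
        old_slots[i].~counted_value();
    }
  std::free(old_slots);
  return fresh;
}

// Destruct 'n' objects and release their storage.
void destroy_slots(counted_value *slots, std::size_t n)
{
  for (std::size_t i = 0; i < n; ++i)
    slots[i].~counted_value();
  std::free(slots);
}
```

With destruct_stale set, counted_value::live is unchanged across an
expansion; without it, every expansion leaves one undestructed object
per element behind.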


> Just a suggestion/request: unless
> this patch fixes all the outstanding problems you know of or suspect
> in this area (leaks/missing dtor calls) and unless you plan to work
> on those in the near future, please open a bug for them with a brain
> dump of what you learned.  That should save us time when the day
> comes to tackle those.

ACK.  I'm not aware of any additional known problems.  (In our email
discussion, we did have some "vague ideas" for opportunities of
clarification/clean-up, but these aren't worth filing PRs for; needs
someone to gain understanding, taking a look.)


Grüße
 Thomas


>> On 2021-08-16T14:10:00-0600, Martin Sebor  wrote:
>>> On 8/16/21 6:44 AM, Thomas Schwinge wrote:
 On 2021-08-12T17:15:44-0600, Martin Sebor via Gcc  wrote:
> On 8/6/21 10:57 AM, Thomas Schwinge wrote:
>> So I'm trying to do some C++...  ;-)
>>
>> Given:
>>
>>    /* A map from SSA names or var decls to record fields.  */
>>    typedef hash_map<tree, tree> field_map_t;
>>
>>    /* For each propagation record type, this is a map from SSA names or
>>       var decls to propagate, to the field in the record type that should
>>       be used for transmission and reception.  */
>>    typedef hash_map<tree, field_map_t> record_field_map_t;
>>
>> Thus, that's a 'hash_map<tree, hash_map<tree, tree>>'.  (I may do that,
>> right?)  Looking through GCC implementation files, nearly all uses
>> of 'hash_map' boil down to pointer key ('tree', for example) and
>> pointer/integer value.
>
> Right.  Because most GCC containers rely exclusively on GCC's own
> uses for testing, if your use case is novel in some way, chances
> are it might not work as intended in all circumstances.
>
> I've wrestled with hash_map a number of times.  A use case that's
> close to yours (i.e., a non-trivial value type) is in cp/parser.c:
> see class_to_loc_map_t.

 Indeed, at the time you sent this email, I already had started looking
 into that one!  (The Fortran test cases that I originally analyzed, which
 triggered other cases of non-POD/non-trivial destructor, all didn't
 result in a memory leak, because the non-trivial constructor doesn't
 actually allocate any resources dynamically -- that's indeed different in
 this case here.)  ..., and indeed:

> (I don't remember if I tested it for leaks
> though.  It's used to implement -Wmismatched-tags so compiling
> a few tests under Valgrind should show if it does leak.)

 ... it does leak memory at present.  :-| (See attached commit log for
 details for one example.)
>>
>> (Attached "Fix 'hash_table::expand' to destruct stale Value objects"
>> again.)
>>
 To that effect, to document the current behavior, I propose to
 "Add more self-tests for 'hash_map' with Value type with non-trivial
 constructor/destructor"
>>
>> (We've done that in commit e4f16e9f357a38ec702fb69a0ffab9d292a6af9b
>> "Add more self-tests for 'hash_map' with Value type with non-trivial
>> constructor/destructor", quickly followed by bug fix
>> commit bb04a03c6f9bacc890118b9e12b657503093c2f8
>> "Make 'gcc/hash-map-tests.c:test_map_of_type_with_ctor_and_dtor_expand'
>> work on 32-bit architectures [PR101959]".
>>
 (Also cherry-pick into release branches, eventually?)
>>
>> Then:
>>
>>record_field_map_t field_map ([...]); // see below
>>for ([...])
>>  {
>>tree record_type = [...];
>>[...]
>>bool existed;
>>field_map_t &fields
>>  = field_map.get_or_insert (record_type, &existed);
>>gcc_checking_assert (!existed);
>>[...]
>>for ([...])
>>  fields.put ([...], [...]);
>>[...]
>>  }
>>[stuff that looks up elements from 'field_map']
>>field_map.empty ();
>>
>> This generally works.
>>
>> If I instantiate 'record_field_map_t field_map (40);', Valgrind is happy.
>> If however I instantiate 'record_field_map_t field_map (13);' (where '13'
>> would be the default for 'hash_map'), Valgrind complains:
>>
>>2,080 bytes in 10 blocks are definitely lost in loss record 828 
>> of 876
>>   at 0x483DD99: calloc (vg_replace_malloc.c:762)
>>   by 0x175F010: xcalloc 

[PING] Re: [Committed] [PATCH 2/4] (v4) On-demand locations within string-literals

2021-09-10 Thread Thomas Schwinge
Hi!

Ping.  My patches again attached, for easy reference.


Grüße
 Thomas


On 2021-09-03T18:33:37+0200, I wrote:
> Hi!
>
> On 2021-09-02T21:09:54+0200, I wrote:
>> On 2021-09-02T15:59:14+0200, I wrote:
>>> On 2016-08-05T14:16:58-0400, David Malcolm  wrote:
 Committed to trunk as r239175; I'm attaching the final version of the
 patch for reference.
>>>
>>> David, you've added here 'gcc/input.h:struct location_hash' (see quoted
>>> below), which will be useful elsewhere, so:
>>>
 --- a/gcc/input.c
 +++ b/gcc/input.c
>>>
 +/* Internal function.  Canonicalize LOC into a form suitable for
 +   use as a key within the database, stripping away macro expansion,
 +   ad-hoc information, and range information, using the location of
 +   the start of LOC within an ordinary linemap.  */
 +
 +location_t
 +string_concat_db::get_key_loc (location_t loc)
 +{
 +  loc = linemap_resolve_location (line_table, loc, LRK_SPELLING_LOCATION,
 + NULL);
 +
 +  loc = get_range_from_loc (line_table, loc).m_start;
 +
 +  return loc;
 +}
>>>
>>> OK to push the attached
>>> "Harden 'gcc/input.c:string_concat_db::get_key_loc'"?  (This fell out of
>>> my analysis for development work elsewhere.)
>>
>> My suggested patch was:
>>
>> --- a/gcc/input.c
>> +++ b/gcc/input.c
>> @@ -1483,6 +1483,9 @@ string_concat_db::get_key_loc (location_t loc)
>>
>>loc = get_range_from_loc (line_table, loc).m_start;
>>
>> +  /* Ascertain that 'loc' is valid as a key in 'm_table'.  */
>> +  gcc_checking_assert (!RESERVED_LOCATION_P (loc));
>> +
>>return loc;
>>  }
>>
>> Uh, I should've looked at the correct test logs...  This change actually
>> does regress 'c-c++-common/substring-location-PR-87721.c' and
>> 'gcc.dg/plugin/diagnostic-test-string-literals-1.c': for these, we do see
>> 'BUILTINS_LOCATION' (via 'string_concat_db::record_string_concatenation').
>> Unless someone tells me that's unexpected (I'm completely lost in this
>> code...)
>
> I think I convinced myself that the current code doesn't have stable
> behavior, so...
>
>> I shall change/generalize my changes to provide both a
>> 'location_hash' only using 'UNKNOWN_LOCATION' as a spare value for
>> 'Empty' (as currently used here) and another variant additionally using
>> 'BUILTINS_LOCATION' as spare value for 'Deleted'.
>
> ... I didn't do this, but instead would like to push the attached
> "Don't record string concatenation data for 'RESERVED_LOCATION_P'"
> (replacing "Harden 'gcc/input.c:string_concat_db::get_key_loc'" as
> originally proposed).  OK?
>
>
> ... and then re:
>
 --- a/gcc/input.h
 +++ b/gcc/input.h
>>>
 +struct location_hash : int_hash <location_t, UNKNOWN_LOCATION> { };
 +
 +class GTY(()) string_concat_db
 +{
 +[...]
 +  hash_map <location_hash, string_concat *> *m_table;
 +};
>>>
>>> OK to push the attached
>>> "Generalize 'gcc/input.h:struct location_hash'"?
>
> Attached again.
>
>
> Grüße
>  Thomas


>From 9f1066fcb770397d6e791aa0594f067a755e2ed6 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Fri, 3 Sep 2021 18:25:10 +0200
Subject: [PATCH] Don't record string concatenation data for
 'RESERVED_LOCATION_P'

'RESERVED_LOCATION_P' means 'UNKNOWN_LOCATION' or 'BUILTINS_LOCATION'.
We're using 'UNKNOWN_LOCATION' as a spare value for 'Empty', so should
ascertain that we don't use it as a key additionally.  Similarly for
'BUILTINS_LOCATION' that we'd later like to use as a spare value for
'Deleted'.

As discussed in the source code comment added, for these we didn't have
stable behavior anyway.
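
The constraint can be illustrated with a small standalone table; this is
a sketch under stated assumptions, not the 'int_hash'/'hash_map' code.
Here empty_key and deleted_key are stand-ins for 'UNKNOWN_LOCATION' and
'BUILTINS_LOCATION', reserved_p mimics 'RESERVED_LOCATION_P', and
tiny_loc_map is a hypothetical fixed-size open-addressing map without
removal.  Because the table marks free slots with a key value, that
value can never be stored as a real key, so recording is skipped for it:

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using loc_t = std::uint32_t;

constexpr loc_t empty_key = 0;    // stand-in for UNKNOWN_LOCATION
constexpr loc_t deleted_key = 1;  // stand-in for BUILTINS_LOCATION

// Mimics RESERVED_LOCATION_P: true for the two sentinel values.
constexpr bool reserved_p(loc_t loc) { return loc <= deleted_key; }

// Hypothetical open-addressing map with linear probing; a slot whose
// key equals empty_key is free.
class tiny_loc_map
{
  static constexpr std::size_t capacity = 8;
  std::array<loc_t, capacity> keys_{};   // all slots start as empty_key
  std::array<int, capacity> values_{};

public:
  // Returns false if the key is reserved or the table is full.
  bool record(loc_t key, int value)
  {
    if (reserved_p(key))
      return false;  // a sentinel stored as a key would look like a free slot
    for (std::size_t i = 0; i < capacity; ++i)
      {
        std::size_t slot = (key + i) % capacity;
        if (keys_[slot] == empty_key || keys_[slot] == key)
          {
            keys_[slot] = key;
            values_[slot] = value;
            return true;
          }
      }
    return false;
  }

  // Returns a pointer to the stored value, or nullptr if absent.
  const int *lookup(loc_t key) const
  {
    if (reserved_p(key))
      return nullptr;
    for (std::size_t i = 0; i < capacity; ++i)
      {
        std::size_t slot = (key + i) % capacity;
        if (keys_[slot] == key)
          return &values_[slot];
        if (keys_[slot] == empty_key)
          return nullptr;
      }
    return nullptr;
  }
};
```

The second sentinel is unused by this sketch beyond being rejected; it
is reserved in the same way so that it can later mark removed slots.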

Follow-up to r239175 (commit 88faa309e5d6c6171b957daaf2f800920869)
"On-demand locations within string-literals".

	gcc/
	* input.c (string_concat_db::record_string_concatenation)
	(string_concat_db::get_string_concatenation): Skip for
	'RESERVED_LOCATION_P'.
	gcc/testsuite/
	* gcc.dg/plugin/diagnostic-test-string-literals-1.c: Adjust
	expected error diagnostics.
---
 gcc/input.c  | 9 +
 .../gcc.dg/plugin/diagnostic-test-string-literals-1.c| 4 ++--
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/gcc/input.c b/gcc/input.c
index 4b809862e02..dd753decfa0 100644
--- a/gcc/input.c
+++ b/gcc/input.c
@@ -1437,6 +1437,11 @@ string_concat_db::record_string_concatenation (int num, location_t *locs)
   gcc_assert (locs);
 
   location_t key_loc = get_key_loc (locs[0]);
+  /* We don't record data for 'RESERVED_LOCATION_P (key_loc)' key values:
+ any data now recorded under key 'key_loc' would be overwritten by a
+ subsequent call with the same key 'key_loc'.  */
+  if (RESERVED_LOCATION_P (key_loc))
+return;
 
   string_concat *concat
 

[PING] Don't maintain a warning spec for 'UNKNOWN_LOCATION'/'BUILTINS_LOCATION' [PR101574] (was: [PATCH 1/13] v2 [PATCH 1/13] Add support for per-location warning groups (PR 74765))

2021-09-10 Thread Thomas Schwinge
Hi!

Ping.

On 2021-09-03T21:16:46+0200, Thomas Schwinge  wrote:
> Martin, thanks for your review.  Now need someone to formally approve the
> third patch.

Again attached for easy reference.


Grüße
 Thomas


> On 2021-09-01T18:14:46-0600, Martin Sebor  wrote:
>> On 9/1/21 1:35 PM, Thomas Schwinge wrote:
>>> On 2021-06-23T13:47:08-0600, Martin Sebor via Gcc-patches 
>>>  wrote:
 On 6/22/21 5:28 PM, David Malcolm wrote:
> On Tue, 2021-06-22 at 19:18 -0400, David Malcolm wrote:
>> On Fri, 2021-06-04 at 15:41 -0600, Martin Sebor wrote:
>>> The attached patch introduces the suppress_warning(),
>>> warning_suppressed(), and copy_no_warning() APIs [etc.]
>
>>> I now had a bit of a deep dive into some aspects of this, in context of
>>>  "gcc/sparseset.h:215:20: error: suggest
>>> parentheses around assignment used as truth value [-Werror=parentheses]"
>>> that I recently filed.  This seems difficult to reproduce, but I'm still
>>> able to reliably reproduce it in one specific build
>>> configuration/directory/machine/whatever.  Initially, we all quickly
>>> assumed that it'd be some GC issue -- but "alas", it's not, at least not
>>> directly.  (But I'll certainly assume that some GC aspects are involved
>>> which make this issue come and go across different GCC sources revisions,
>>> and difficult to reproduce.)
>
>>> First, two pieces of cleanup:
>
> ACKed by Martin, again attached for convenience.
>
>
 --- /dev/null
 +++ b/gcc/diagnostic-spec.h
>>>
 +typedef location_t key_type_t;
 +typedef int_hash  xint_hash_t;
>
>> By the way, it seems we should probably also use a manifest constant
>> for Empty (probably UNKNOWN_LOCATION since we're reserving it).
>
> Yes, that will be part of another patch here -- waiting for approval of
> "Generalize 'gcc/input.h:struct location_hash'" posted elsewhere.
>
>
>>> attached "Don't maintain a warning spec for
>>> 'UNKNOWN_LOCATION'/'BUILTINS_LOCATION' [PR101574]".  OK to push?
>>
>> [...].  So I agree that it ought to be fixed.
>
>>> I'm reasonably confident that my changes are doing the right things in
>>> general, but please carefully review, especially here:
>>>
>>>- 'gcc/warning-control.cc:suppress_warning' functions: is it correct to
>>>  conditionalize on '!RESERVED_LOCATION_P' the 'suppress_warning_at'
>>>  calls and 'supp' update?  Or, should instead 'suppress_warning_at'
>>>  handle the case of '!RESERVED_LOCATION_P'?  (How?)
>>
>> It seems like six of one vs half a dozen of the other.  I'd say go
>> with whatever makes more sense to you here :)
>
> OK, was just trying to make sure that I don't fail to see any non-obvious
> intentions here.
>
>>>- 'gcc/diagnostic-spec.c:copy_warning' and
>>>  'gcc/warning-control.cc:copy_warning': is the rationale correct for
>>>  the 'gcc_checking_assert (!from_spec)': "If we cannot set no-warning
>>>  dispositions for 'to', ascertain that we don't have any for 'from'.
>>>  Otherwise, we'd lose these."?  If the rationale is correct, then
>>>  observing that in 'gcc/warning-control.cc:copy_warning' this
>>>  currently "triggers during GCC build" is something to be looked into,
>>>  later, I suppose, and otherwise, how should I change this code?
>>
>> copy_warning(location_t, location_t) is called [only] from
>> gimple_set_location().  The middle end does clear the location of
>> some statements for which it was previously valid (e.g., return
>> statements).
>
> What I observed was that the 'assert' never triggered for the
> 'location_t' variant "called [only] from gimple_set_location" -- but does
> trigger for some other variant.  Anyway:
>
>> So I wouldn't expect this assumption to be safe.  If
>> that happens, we have no choice but to lose the per-warning detail
>> and fall back on the no-warning bit.
>
> ACK.  I'm thus clarifying that as follows:
>
> --- gcc/diagnostic-spec.c
> +++ gcc/diagnostic-spec.c
> @@ -185,7 +185,5 @@ copy_warning (location_t to, location_t from)
>if (RESERVED_LOCATION_P (to))
> -{
> -  /* If we cannot set no-warning dispositions for 'to', ascertain 
> that we
> -don't have any for 'from'.  Otherwise, we'd lose these.  */
> -  gcc_checking_assert (!from_spec);
> -}
> +/* We cannot set no-warning dispositions for 'to', so we have no 
> chance but
> +   lose those potentially set for 'from'.  */
> +;
>else
> --- gcc/warning-control.cc
> +++ gcc/warning-control.cc
> @@ -197,9 +197,5 @@ void copy_warning (ToType to, FromType from)
>if (RESERVED_LOCATION_P (to_loc))
> -{
> -#if 0 //TODO triggers during GCC build
> -  /* If we cannot set no-warning dispositions for 'to', ascertain 
> that we
> -don't have any for 'from'.  Otherwise, we'd lose these.  */
> -  gcc_checking_assert (!from_spec);
> -#endif
> -}
> +/* We cannot set 

Re: [PATCH][www] Add note that STABS is deprecated

2021-09-10 Thread Richard Biener via Gcc-patches
On Fri, 10 Sep 2021, Jakub Jelinek wrote:

> On Fri, Sep 10, 2021 at 09:28:16AM +0200, Richard Biener via Gcc-patches 
> wrote:
> > This amends the caveats section of the GCC 12 release notes according
> > to the STABS deprecation.
> > 
> > I'll wait with pushing to the point it reflects the state on trunk
> > rather than projecting what will be the state for GCC 12.
> 
> Most of the ports actually default to emit DWARF5, some DWARF4, some DWARF2,
> so perhaps better would be to emit DWARF (version 2 or later) debugging info?

True, I adjusted the change accordingly.

Thanks,
Richard.

> > diff --git a/htdocs/gcc-12/changes.html b/htdocs/gcc-12/changes.html
> > index 946faa49..c4a59c3b 100644
> > --- a/htdocs/gcc-12/changes.html
> > +++ b/htdocs/gcc-12/changes.html
> > @@ -47,6 +47,12 @@ a work-in-progress.
> >  which still supports -std=f95 and is recommended to be 
> > used
> >  instead in general.
> >
> > +  
> > +STABS:
> > +Support for emitting the STABS debugging format is deprecated and will
> > +be removed in the next release.  All ports now default to emit DWARF2
> > +debugging info.
> > +  
> >  
> >  
> >  
> > -- 
> > 2.31.1
> 
>   Jakub
> 
> 


Re: [PATCH][www] Add note that STABS is deprecated

2021-09-10 Thread Jakub Jelinek via Gcc-patches
On Fri, Sep 10, 2021 at 09:28:16AM +0200, Richard Biener via Gcc-patches wrote:
> This amends the caveats section of the GCC 12 release notes according
> to the STABS deprecation.
> 
> I'll wait with pushing to the point it reflects the state on trunk
> rather than projecting what will be the state for GCC 12.

Most of the ports actually default to emit DWARF5, some DWARF4, some DWARF2,
so perhaps better would be to emit DWARF (version 2 or later) debugging info?

> diff --git a/htdocs/gcc-12/changes.html b/htdocs/gcc-12/changes.html
> index 946faa49..c4a59c3b 100644
> --- a/htdocs/gcc-12/changes.html
> +++ b/htdocs/gcc-12/changes.html
> @@ -47,6 +47,12 @@ a work-in-progress.
>  which still supports -std=f95 and is recommended to be used
>  instead in general.
>
> +  
> +STABS:
> +Support for emitting the STABS debugging format is deprecated and will
> +be removed in the next release.  All ports now default to emit DWARF2
> +debugging info.
> +  
>  
>  
>  
> -- 
> 2.31.1

Jakub



[PATCH][www] Add note that STABS is deprecated

2021-09-10 Thread Richard Biener via Gcc-patches
This amends the caveats section of the GCC 12 release notes according
to the STABS deprecation.

I'll wait with pushing to the point it reflects the state on trunk
rather than projecting what will be the state for GCC 12.

---
 htdocs/gcc-12/changes.html | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/htdocs/gcc-12/changes.html b/htdocs/gcc-12/changes.html
index 946faa49..c4a59c3b 100644
--- a/htdocs/gcc-12/changes.html
+++ b/htdocs/gcc-12/changes.html
@@ -47,6 +47,12 @@ a work-in-progress.
 which still supports -std=f95 and is recommended to be used
 instead in general.
   
+  
+STABS:
+Support for emitting the STABS debugging format is deprecated and will
+be removed in the next release.  All ports now default to emit DWARF2
+debugging info.
+  
 
 
 
-- 
2.31.1


Re: [COMMITTED][patch][version 9]add -ftrivial-auto-var-init and variable attribute "uninitialized" to gcc

2021-09-10 Thread Richard Biener via Gcc-patches
On Thu, 9 Sep 2021, Kees Cook wrote:

> On Thu, Sep 09, 2021 at 10:49:11PM +, Qing Zhao wrote:
> > Hi, FYI
> > 
> > I just committed the following patch to gcc upstream:
> > 
> > 
> > https://gcc.gnu.org/pipermail/gcc-cvs/2021-September/353195.html
> 
> Hurray! Thank you so much for working on this, and thanks also to the
> reviewers and everyone else poking at it.
> 
> I will go update my Linux Plumbers slides to say "supported" instead of
> "proposed". :)

Can you two work on wording to add to gcc-12/changes.html for this
feature?  I think it deserves a release note.  Likewise the CTF/BTF
support btw.

Thanks,
Richard.


[PATCH] Remove DARWIN_PREFER_DWARF and dead code

2021-09-10 Thread Richard Biener via Gcc-patches
This removes the always defined DARWIN_PREFER_DWARF and the code
guarded by it being not defined, removing the possibility to
default some i386 darwin configurations to STABS when it would
not be defined.

OK for trunk?

Thanks,
Richard.

2021-09-10  Richard Biener  

* config/darwin.h (DARWIN_PREFER_DWARF): Do not define.
* config/i386/darwin.h (PREFERRED_DEBUGGING_TYPE): Do not
change based on DARWIN_PREFER_DWARF not being defined.
---
 gcc/config/darwin.h  |  3 +--
 gcc/config/i386/darwin.h | 11 ---
 2 files changed, 1 insertion(+), 13 deletions(-)

diff --git a/gcc/config/darwin.h b/gcc/config/darwin.h
index f1d92f87e9a..6396586c138 100644
--- a/gcc/config/darwin.h
+++ b/gcc/config/darwin.h
@@ -499,9 +499,8 @@ extern GTY(()) int darwin_ms_struct;
 /* We now require C++11 to bootstrap and newer tools than those based on
stabs, so require DWARF-2, even if stabs is supported by the assembler.  */
 
-#define PREFERRED_DEBUGGING_TYPE DWARF2_DEBUG
-#define DARWIN_PREFER_DWARF
 #define DWARF2_DEBUGGING_INFO 1
+#define PREFERRED_DEBUGGING_TYPE DWARF2_DEBUG
 
 #ifdef HAVE_AS_STABS_DIRECTIVE
 #define DBX_DEBUGGING_INFO 1
diff --git a/gcc/config/i386/darwin.h b/gcc/config/i386/darwin.h
index da0ae5b3ee7..c4a6f4dfda7 100644
--- a/gcc/config/i386/darwin.h
+++ b/gcc/config/i386/darwin.h
@@ -264,17 +264,6 @@ along with GCC; see the file COPYING3.  If not see
   target_flags &= ~MASK_MACHO_DYNAMIC_NO_PIC;  \
   } while (0)
 
-/* Darwin on x86_64 uses dwarf-2 by default.  Pre-darwin9 32-bit
-   compiles default to stabs+.  darwin9+ defaults to dwarf-2.  */
-#ifndef DARWIN_PREFER_DWARF
-#undef PREFERRED_DEBUGGING_TYPE
-#ifdef HAVE_AS_STABS_DIRECTIVE
-#define PREFERRED_DEBUGGING_TYPE (TARGET_64BIT ? DWARF2_DEBUG : DBX_DEBUG)
-#else
-#define PREFERRED_DEBUGGING_TYPE DWARF2_DEBUG
-#endif
-#endif
-
 /* Darwin uses the standard DWARF register numbers but the default
register numbers for STABS.  Fortunately for 64-bit code the
default and the standard are the same.  */
-- 
2.31.1


[PATCH] Default AVR to DWARF2 debug

2021-09-10 Thread Richard Biener via Gcc-patches
This switches the AVR port to generate DWARF2 debugging info by
default since the support for STABS is going to be deprecated for
GCC 12.

OK for trunk?

Thanks,
Richard.

2021-09-10  Richard Biener  

* config/avr/elf.h (PREFERRED_DEBUGGING_TYPE): Remove
override, pick up DWARF2_DEBUG define from elfos.h
---
 gcc/config/avr/elf.h | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/gcc/config/avr/elf.h b/gcc/config/avr/elf.h
index 2f0eb0de775..5f2d1e03936 100644
--- a/gcc/config/avr/elf.h
+++ b/gcc/config/avr/elf.h
@@ -22,9 +22,6 @@
 
 #undef PCC_BITFIELD_TYPE_MATTERS
 
-#undef PREFERRED_DEBUGGING_TYPE
-#define PREFERRED_DEBUGGING_TYPE DBX_DEBUG
-
 #undef MAX_OFILE_ALIGNMENT
 #define MAX_OFILE_ALIGNMENT (32768 * 8)
 
-- 
2.31.1


[PATCH] Always default to DWARF2 debugging for RX, even with -mas100-syntax

2021-09-10 Thread Richard Biener via Gcc-patches
The RX port defaults to STABS when -mas100-syntax is used because
the AS100 assembler does not support some of the pseudo-ops used
by DWARF2 debug emission.  Since STABS is going to be deprecated
that has to change.  The following simply always uses DWARF2,
likely leaving -mas100-syntax broken when debug info is generated.

Can the RX port maintainer please sort out the situation?  One
option might be to drop to NO_DEBUG when -mas100-syntax is
specified but maybe there's AS100 assemblers that now support
all the required pseudo ops or there's a way to define the DWARF
output macros to work around the lack of those (it's by no means
the first target to have such issues).

OK for trunk?

Thanks,
Richard.

2021-09-10  Richard Biener  

* config/rx/rx.h (PREFERRED_DEBUGGING_TYPE): Always define to
DWARF2_DEBUG.
---
 gcc/config/rx/rx.h | 8 +---
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/gcc/config/rx/rx.h b/gcc/config/rx/rx.h
index 4078440fa31..3cb411db192 100644
--- a/gcc/config/rx/rx.h
+++ b/gcc/config/rx/rx.h
@@ -620,14 +620,8 @@ typedef unsigned int CUMULATIVE_ARGS;
 /* Like REG_P except that this macro is true for SET expressions.  */
 #define SET_P(rtl)(GET_CODE (rtl) == SET)
 
-/* The AS100 assembler does not support .leb128 and .uleb128, but
-   the compiler-build-time configure tests will have enabled their
-   use because GAS supports them.  So default to generating STABS
-   debug information instead of DWARF2 when generating AS100
-   compatible output.  */
 #undef  PREFERRED_DEBUGGING_TYPE
-#define PREFERRED_DEBUGGING_TYPE (TARGET_AS100_SYNTAX \
- ? DBX_DEBUG : DWARF2_DEBUG)
+#define PREFERRED_DEBUGGING_TYPE DWARF2_DEBUG
 
 #define DBX_DEBUGGING_INFO 1
 #define DWARF2_DEBUGGING_INFO 1
-- 
2.31.1


Re: More aggressive threading causing loop-interchange-9.c regression

2021-09-10 Thread Aldy Hernandez via Gcc-patches
I think things are clear enough to propose a patch.  Thanks for 
everyone's input.


I have added some comments and tweaked Michael's patch to include the 
final BB where we're threading to, in the check.  Without this last 
check, we fail in tree-ssa/cunroll-15.c because the latch becomes 
polluted with PHI nodes.


OK for trunk?
Aldy

commit e735a2cd01485773635bd66c97af21059992d5a7 (HEAD -> 
pending/global-ranges)

Author: Aldy Hernandez 
Date:   Thu Sep 9 20:30:28 2021 +0200

Disable threading through latches until after loop optimizations.

The motivation for this patch was enabling the use of global ranges in
the path solver, but this caused certain properties of loops to be
destroyed, which made subsequent loop optimizations fail.
Consequently, this patch's main goal is to disable jump threading
involving the latch until after loop optimizations have run.

I have added a bit in cfun to indicate when the "loopdone" optimization
has completed.  This helps the path threader determine when it's safe to
thread through loop structures.  I can adapt the patch if there is an
alternate way of determining this.
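
As a rough illustration of the gating idea only (this is not the code in
the patch), the check can be pictured as below.  The names
function_state, path_block and threading_path_allowed_p are hypothetical
stand-ins; the real patch stores the bit in struct function and does the
test inside profitable_path_p:

```cpp
#include <vector>

// Hypothetical stand-in for the per-function bit described above.
struct function_state
{
  bool loop_optimizers_done = false;
};

// Hypothetical stand-in for a basic block on a candidate threading path.
struct path_block
{
  int index;
  bool is_loop_latch;
};

// Reject any path that touches a loop latch (including the final block
// being threaded to, if the caller puts it in 'path') until the loop
// optimizers have finished.
bool threading_path_allowed_p(const function_state &fn,
                              const std::vector<path_block> &path)
{
  if (fn.loop_optimizers_done)
    return true;
  for (const path_block &bb : path)
    if (bb.is_loop_latch)
      return false;
  return true;
}
```

Before the bit is set, a path through a latch is refused; once it is
set, the same path is accepted, which matches the shift from the early
to the late threaders described below.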

As can be seen in the test adjustments, we mostly shift the threading
from the early threaders (ethread, thread[12]) to the late threaders
(thread[34]).  I have nuked some of the early notes in the testcases
that came as part of the jump threader rewrite.  They're mostly noise
now.

Note that we could probably relax some other restrictions in
profitable_path_p when loop optimizations have completed, but it would
require more testing, and I'm hesitant to touch more things than needed
at this point.  I have added a reminder to the function to keep this
in mind.

Finally, perhaps as a follow-up, we should apply the same restrictions to
the forward threader.  At some point I'd like to combine the cost models.


Tested on x86-64 Linux.

p.s. There is a thorough discussion involving the limitations of jump
threading involving loops here:

https://gcc.gnu.org/pipermail/gcc/2021-September/237247.html
>From e735a2cd01485773635bd66c97af21059992d5a7 Mon Sep 17 00:00:00 2001
From: Aldy Hernandez 
Date: Thu, 9 Sep 2021 20:30:28 +0200
Subject: [PATCH] Disable threading through latches until after loop
 optimizations.

The motivation for this patch was enabling the use of global ranges in
the path solver, but this caused certain properties of loops to be
destroyed, which made subsequent loop optimizations fail.
Consequently, this patch's main goal is to disable jump threading
involving the latch until after loop optimizations have run.

I have added a bit in cfun to indicate when the "loopdone" optimization
has completed.  This helps the path threader determine when it's safe to
thread through loop structures.  I can adapt the patch if there is an
alternate way of determining this.

As can be seen in the test adjustments, we mostly shift the threading
from the early threaders (ethread, thread[12]) to the late threaders
(thread[34]).  I have nuked some of the early notes in the testcases
that came as part of the jump threader rewrite.  They're mostly noise
now.

Note that we could probably relax some other restrictions in
profitable_path_p when loop optimizations have completed, but it would
require more testing, and I'm hesitant to touch more things than needed
at this point.  I have added a reminder to the function to keep this
in mind.

Finally, perhaps as a follow-up, we should apply the same restrictions to
the forward threader.  At some point I'd like to combine the cost models.

Tested on x86-64 Linux.

p.s. There is a thorough discussion involving the limitations of jump
threading involving loops here:

	https://gcc.gnu.org/pipermail/gcc/2021-September/237247.html

gcc/ChangeLog:

	* function.h (struct function): Add loop_optimizers_done.
	(set_loop_optimizers_done): New.
	(loop_optimizers_done_p): New.
	* gimple-range-path.cc (path_range_query::internal_range_of_expr):
	Intersect with global range.
	* tree-ssa-loop.c (tree_ssa_loop_done): Call set_loop_optimizers_done.
	* tree-ssa-threadbackward.c
	(back_threader_profitability::profitable_path_p): Disable
	threading through latches until after loop optimizations have run.

gcc/testsuite/ChangeLog:

	* gcc.dg/tree-ssa/ssa-dom-thread-2b.c: Adjust for disabling of
	threading through latches.
	* gcc.dg/tree-ssa/ssa-dom-thread-6.c: Same.
	* gcc.dg/tree-ssa/ssa-dom-thread-7.c: Same.

Co-authored-by: Michael Matz 
---
 gcc/function.h| 15 
 gcc/gimple-range-path.cc  |  3 ++
 .../gcc.dg/tree-ssa/ssa-dom-thread-2b.c   |  4 +-
 .../gcc.dg/tree-ssa/ssa-dom-thread-6.c| 37 +--
 .../gcc.dg/tree-ssa/ssa-dom-thread-7.c| 17 +
 gcc/tree-ssa-loop.c   |  1 +
 gcc/tree-ssa-threadbackward.c | 28 +-
 7 files 

Re: [PATCH 09/62] AVX512FP16: Enable _Float16 autovectorization

2021-09-10 Thread Hongtao Liu via Gcc-patches
On Thu, Jul 1, 2021 at 2:17 PM liuhongt  wrote:
>
> From: "H.J. Lu" 
>
> gcc/ChangeLog:
>
> * config/i386/i386-expand.c
> (ix86_avx256_split_vector_move_misalign): Handle V16HF mode.
> * config/i386/i386.c
> (ix86_preferred_simd_mode): Handle HF mode.
> * config/i386/sse.md (V_256H): New mode iterator.
> (avx_vextractf128): Use it.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/i386/vect-float16-1.c: New test.
> * gcc.target/i386/vect-float16-10.c: Ditto.
> * gcc.target/i386/vect-float16-11.c: Ditto.
> * gcc.target/i386/vect-float16-12.c: Ditto.
> * gcc.target/i386/vect-float16-2.c: Ditto.
> * gcc.target/i386/vect-float16-3.c: Ditto.
> * gcc.target/i386/vect-float16-4.c: Ditto.
> * gcc.target/i386/vect-float16-5.c: Ditto.
> * gcc.target/i386/vect-float16-6.c: Ditto.
> * gcc.target/i386/vect-float16-7.c: Ditto.
> * gcc.target/i386/vect-float16-8.c: Ditto.
> * gcc.target/i386/vect-float16-9.c: Ditto.
I'm going to check in this patch with a small change: the
TARGET_AVX512FP16 requirement is removed for vector HF modes when
vpinsrw/../vpextrw instructions are used for V*HFmode vector_init and
vector_extract{,_lo/hi}.
An updated patch is attached.
I'm also checking in six more patches, [PATCH 10/62] to [PATCH 15/62].

[PATCH 10/62] AVX512FP16: Add vaddsh/vsubsh/vmulsh/vdivsh.
[PATCH 11/62] AVX512FP16: Add testcase for vaddsh/vsubsh/vmulsh/vdivsh.
[PATCH 12/62] AVX512FP16: Add vmaxph/vminph/vmaxsh/vminsh.
[PATCH 13/62] AVX512FP16: Add testcase for vmaxph/vmaxsh/vminph/vminsh.
[PATCH 14/62] AVX512FP16: Add vcmpph/vcmpsh/vcomish/vucomish.
[PATCH 15/62] AVX512FP16: Add testcase for vcmpph/vcmpsh/vcomish/vucomish.

  Bootstrapped and regtested on x86_64-linux-gnu{-m32,}.
  The newly added runtime testcases were also run on sde/SPR.

[10] https://gcc.gnu.org/pipermail/gcc-patches/2021-July/574128.html
[11] https://gcc.gnu.org/pipermail/gcc-patches/2021-July/574127.html
[12] https://gcc.gnu.org/pipermail/gcc-patches/2021-July/574129.html
[13] https://gcc.gnu.org/pipermail/gcc-patches/2021-July/574130.html
[14] https://gcc.gnu.org/pipermail/gcc-patches/2021-July/574131.html
[15] https://gcc.gnu.org/pipermail/gcc-patches/2021-July/574132.html

> ---
>  gcc/config/i386/i386-expand.c   |  4 
>  gcc/config/i386/i386.c  | 14 ++
>  gcc/config/i386/sse.md  |  7 ++-
>  gcc/testsuite/gcc.target/i386/vect-float16-1.c  | 14 ++
>  gcc/testsuite/gcc.target/i386/vect-float16-10.c | 14 ++
>  gcc/testsuite/gcc.target/i386/vect-float16-11.c | 14 ++
>  gcc/testsuite/gcc.target/i386/vect-float16-12.c | 14 ++
>  gcc/testsuite/gcc.target/i386/vect-float16-2.c  | 14 ++
>  gcc/testsuite/gcc.target/i386/vect-float16-3.c  | 14 ++
>  gcc/testsuite/gcc.target/i386/vect-float16-4.c  | 14 ++
>  gcc/testsuite/gcc.target/i386/vect-float16-5.c  | 14 ++
>  gcc/testsuite/gcc.target/i386/vect-float16-6.c  | 14 ++
>  gcc/testsuite/gcc.target/i386/vect-float16-7.c  | 14 ++
>  gcc/testsuite/gcc.target/i386/vect-float16-8.c  | 14 ++
>  gcc/testsuite/gcc.target/i386/vect-float16-9.c  | 14 ++
>  15 files changed, 192 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/vect-float16-1.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/vect-float16-10.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/vect-float16-11.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/vect-float16-12.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/vect-float16-2.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/vect-float16-3.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/vect-float16-4.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/vect-float16-5.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/vect-float16-6.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/vect-float16-7.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/vect-float16-8.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/vect-float16-9.c
>
> diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c
> index 39647eb2cf1..df50c72ab16 100644
> --- a/gcc/config/i386/i386-expand.c
> +++ b/gcc/config/i386/i386-expand.c
> @@ -498,6 +498,10 @@ ix86_avx256_split_vector_move_misalign (rtx op0, rtx op1)
>extract = gen_avx_vextractf128v32qi;
>mode = V16QImode;
>break;
> +case E_V16HFmode:
> +  extract = gen_avx_vextractf128v16hf;
> +  mode = V8HFmode;
> +  break;
>  case E_V8SFmode:
>extract = gen_avx_vextractf128v8sf;
>mode = V4SFmode;
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index 79e6880d9dd..dc0d440061b 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -22360,6 +22360,20 @@ 

[PATCH] Default Alpha/VMS to DWARF2 debugging only

2021-09-10 Thread Richard Biener via Gcc-patches
This changes the default debug format for Alpha/VMS to DWARF2 only,
skipping emission of VMS debug info, which is going to be deprecated
for GCC 12 alongside the support for STABS.

It looks like other flavors of VMS never used VMS_DEBUG by default
but only the alpha port did.

I have no good means to test anything here; it might be that we have
alpha-vms specific testcases that rely on the previous default.

OK for trunk?

Thanks,
Richard.

2021-09-10  Richard Biener  

* config/alpha/vms.h (PREFERRED_DEBUGGING_TYPE): Define to
DWARF2_DEBUG.
---
 gcc/config/alpha/vms.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/alpha/vms.h b/gcc/config/alpha/vms.h
index b8673b6b6fb..2a9917cde62 100644
--- a/gcc/config/alpha/vms.h
+++ b/gcc/config/alpha/vms.h
@@ -244,7 +244,7 @@ typedef struct {int num_args; enum avms_arg_type atypes[6];} avms_arg_info;
  while (0)
 
 #undef PREFERRED_DEBUGGING_TYPE
-#define PREFERRED_DEBUGGING_TYPE VMS_AND_DWARF2_DEBUG
+#define PREFERRED_DEBUGGING_TYPE DWARF2_DEBUG
 
 #define ASM_PN_FORMAT "%s___%lu"
 
-- 
2.31.1


[PATCH] Always default to DWARF2 debug for cygwin and mingw

2021-09-10 Thread Richard Biener via Gcc-patches
This removes the fallback to STABS as the default for cygwin and mingw
when the assembler does not support .secrel32 and the default is
to emit 32-bit code.  Support for .secrel32 was added in binutils 2.16,
released in 2005, so document that as a requirement instead.

I left the now unused check for .secrel32 in configure around
in case somebody wants to turn that into an error or warning.

OK for trunk?  As before, I have no good means to test this, but
it should change nothing for people using binutils 2.16+.
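
For illustration only, the default selection before and after this
change can be modelled in plain C as below.  This is a sketch, not code
from the patch: the enum and the function names are made up, and the two
booleans merely stand in for TARGET_64BIT_DEFAULT and
HAVE_GAS_PE_SECREL32_RELOC.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the debug formats discussed here;
   these are not GCC's real enumerators.  */
enum debug_format { FORMAT_STABS, FORMAT_DWARF2 };

/* Old default: DWARF2 only when the target defaults to 64-bit code or
   the assembler supports .secrel32; otherwise fall back to STABS.  */
enum debug_format
old_default_format (bool target_64bit_default, bool have_secrel32)
{
  if (target_64bit_default || have_secrel32)
    return FORMAT_DWARF2;
  return FORMAT_STABS;
}

/* New default: always DWARF2; both inputs are ignored.  */
enum debug_format
new_default_format (bool target_64bit_default, bool have_secrel32)
{
  (void) target_64bit_default;
  (void) have_secrel32;
  return FORMAT_DWARF2;
}
```

The two functions differ only when both inputs are false, i.e. a
32-bit default with an assembler older than binutils 2.16.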

2021-09-10  Richard Biener  

* config/i386/cygming.h: Always default to DWARF2 debugging.
Do not define DBX_DEBUGGING_INFO, that's done via dbxcoff.h
already.
* doc/install.texi: Document binutils 2.16 as minimum
requirement for mingw.
---
 gcc/config/i386/cygming.h | 9 -
 gcc/doc/install.texi  | 4 
 2 files changed, 4 insertions(+), 9 deletions(-)

diff --git a/gcc/config/i386/cygming.h b/gcc/config/i386/cygming.h
index ac458cdfee1..da872d10cd3 100644
--- a/gcc/config/i386/cygming.h
+++ b/gcc/config/i386/cygming.h
@@ -18,17 +18,10 @@ You should have received a copy of the GNU General Public License
 along with GCC; see the file COPYING3.  If not see
 <http://www.gnu.org/licenses/>.  */
 
-#define DBX_DEBUGGING_INFO 1
-#if TARGET_64BIT_DEFAULT || defined (HAVE_GAS_PE_SECREL32_RELOC)
 #define DWARF2_DEBUGGING_INFO 1
-#endif
 
 #undef PREFERRED_DEBUGGING_TYPE
-#if (DWARF2_DEBUGGING_INFO)
 #define PREFERRED_DEBUGGING_TYPE DWARF2_DEBUG
-#else
-#define PREFERRED_DEBUGGING_TYPE DBX_DEBUG
-#endif
 
 #undef TARGET_SEH
 #define TARGET_SEH  (TARGET_64BIT_MS_ABI && flag_unwind_tables)
@@ -97,7 +90,6 @@ along with GCC; see the file COPYING3.  If not see
 #undef DWARF_FRAME_REGISTERS
 #define DWARF_FRAME_REGISTERS (TARGET_64BIT ? 33 : 17)
 
-#ifdef HAVE_GAS_PE_SECREL32_RELOC
 /* Use section relative relocations for debugging offsets.  Unlike
other targets that fake this by putting the section VMA at 0, PE
won't allow it.  */
@@ -129,7 +121,6 @@ along with GCC; see the file COPYING3.  If not see
gcc_unreachable (); \
   }\
   } while (0)
-#endif
 
 #define TARGET_EXECUTABLE_SUFFIX ".exe"
 
diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
index 99b44706836..88e453c3f6b 100644
--- a/gcc/doc/install.texi
+++ b/gcc/doc/install.texi
@@ -5131,6 +5131,10 @@ GCC will build with and support only MinGW runtime 3.12 and later.
 Earlier versions of headers are incompatible with the new default semantics
 of @code{extern inline} in @code{-std=c99} and @code{-std=gnu99} modes.
 
+To support emitting DWARF debugging info you need to use GNU binutils
+version 2.16 or above containing support for the @code{.secrel32}
+assembler pseudo-op.
+
 @html
 
 @end html
-- 
2.31.1


[PATCH] Remove vestiges of --with-stabs

2021-09-10 Thread Richard Biener via Gcc-patches
This removes the --with-stabs configure option, which has had no effect
for quite some time.

Will push once it has been included in a bootstrap/regtest cycle.

2021-09-10  Richard Biener  

* configure.ac (--with-stabs): Remove.
* configure: Regenerate.
* doc/install.texi: Remove --with-stabs documentation.
---
 gcc/configure        | 16 ++--
 gcc/configure.ac     |  7 ---
 gcc/doc/install.texi |  5 -
 3 files changed, 2 insertions(+), 26 deletions(-)

diff --git a/gcc/configure b/gcc/configure
index 500e3f68215..27293279eba 100755
--- a/gcc/configure
+++ b/gcc/configure
@@ -958,7 +958,6 @@ enable_checking
 enable_coverage
 enable_gather_detailed_mem_stats
 enable_valgrind_annotations
-with_stabs
 enable_multilib
 enable_multiarch
 with_stack_clash_protection_guard_size
@@ -1820,7 +1819,6 @@ Optional Packages:
   pathname)
   --with-gnu-as   arrange to work with GNU as
   --with-as   arrange to use the specified as (full pathname)
-  --with-stabs   arrange to use stabs instead of host debug format
   --with-stack-clash-protection-guard-size=size
   Set the default stack clash protection guard size
   for specific targets as a power of two in bytes.
@@ -7604,16 +7602,6 @@ fi
 # Miscenalleous configure options
 # ---
 
-# With stabs
-
-# Check whether --with-stabs was given.
-if test "${with_stabs+set}" = set; then :
-  withval=$with_stabs; stabs="$with_stabs"
-else
-  stabs=no
-fi
-
-
 # Determine whether or not multilibs are enabled.
 # Check whether --enable-multilib was given.
 if test "${enable_multilib+set}" = set; then :
@@ -19480,7 +19468,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 19483 "configure"
+#line 19471 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -19586,7 +19574,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 19589 "configure"
+#line 19577 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
diff --git a/gcc/configure.ac b/gcc/configure.ac
index 6f768e02aa4..259c933b415 100644
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -845,13 +845,6 @@ fi
 # Miscenalleous configure options
 # ---
 
-# With stabs
-AC_ARG_WITH(stabs,
-[AS_HELP_STRING([--with-stabs],
-   [arrange to use stabs instead of host debug format])],
-stabs="$with_stabs",
-stabs=no)
-
 # Determine whether or not multilibs are enabled.
 AC_ARG_ENABLE(multilib,
 [AS_HELP_STRING([--enable-multilib],
diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
index 8e974d2952e..99b44706836 100644
--- a/gcc/doc/install.texi
+++ b/gcc/doc/install.texi
@@ -1088,11 +1088,6 @@ but for the linker.
 Same as @uref{#with-as,,@option{--with-as}}
 but for the debug linker (only used on Darwin platforms so far).
 
-@item --with-stabs
-Specify that stabs debugging
-information should be used instead of whatever format the host normally
-uses.  Normally GCC uses the same debug format as the host system.
-
 @item --with-tls=@var{dialect}
 Specify the default TLS dialect, for systems were there is a choice.
 For ARM targets, possible values for @var{dialect} are @code{gnu} or
-- 
2.31.1


Re: [PATCH] Remove dbx.h, do not set PREFERRED_DEBUGGING_TYPE from dbxcoff.h, lynx.h

2021-09-10 Thread Richard Biener via Gcc-patches
On Thu, 9 Sep 2021, Jeff Law wrote:

> 
> 
> On 9/9/2021 7:19 AM, Richard Biener via Gcc-patches wrote:
> > The following removes the unused config/dbx.h file and removes the
> > setting of PREFERRED_DEBUGGING_TYPE from dbxcoff.h which is
> > overridden by all users (djgpp/mingw/cygwin) via either including
> > config/i386/djgpp.h or config/i386/cygming.h
> >
> > There are still circumstances where mingw and cygwin default to
> > STABS, namely when HAVE_GAS_PE_SECREL32_RELOC is not defined and
> > the target defaults to 32bit code generation.
> >
> > The new style handling DBX_DEBUGGING_INFO is in line with
> > dbxelf.h which does not define PREFERRED_DEBUGGING_TYPE either.
> >
> > The patch also removes the PREFERRED_DEBUGGING_TYPE define from
> > lynx.h which always follows elfos.h already defaulting to DWARF,
> > so the comment about STABS being the default is misleading and
> > outdated.  There's no listed maintainer for Lynx OS.
> >
> > I have not tested this in any way but I also have no idea how
> > to meaningfully do so.
> >
> > OK?
> >
> > Thanks,
> > Richard.
> >
> > 2021-09-09  Richard Biener  
> >
> >  PR target/102255
> >  * config/dbx.h: Remove.
> >  * config/dbxcoff.h: Do not define PREFERRED_DEBUGGING_TYPE.
> >  * config/lynx.h: Likewise.
> I'd go ahead and install.  We're on a clear path to kill dbx/stabs and if this
> breaks those ports, better to do so as early as possible to give folks a
> chance to fix 'em.
> 
> I can't really help on the testing side for this -- my tester doesn't try to
> test djgpp, mingw or cygwin.

I see.  Mind that the patch doesn't change anything (unless my analysis
was flawed).  It merely reduces the grep hits for DBX_DEBUGGING_INFO and
PREFERRED_DEBUGGING_TYPE ;)

My immediate goal is to get rid of PREFERRED_DEBUGGING_TYPE
(it will always be DWARF_DEBUGGING_INFO) and _default_ all ports
to DWARF so that my original goal of being able to deprecate STABS for
GCC 12 works out (deprecating it would be somewhat odd if some port
still defaulted to it).

I've pushed this change now.

Richard.