date:20210922

[PATCH 7/7] AVX512FP16: Enable vec_cmpmn/vcondmn expanders for HF modes.

2021-09-22 Thread liuhongt via Gcc-patches

From: Hongyu Wang 

gcc/ChangeLog:

* config/i386/i386-expand.c (ix86_use_mask_cmp_p): Enable
HFmode mask_cmp.
* config/i386/sse.md (sseintvecmodelower): Add HF vector modes.
(_store_mask): Extend to support HF vector modes.
(vec_cmp): Likewise.
(vcond_mask_): Likewise.
(vcond): New expander.
(vcond): Likewise.
(vcond): Likewise.
(vcondu): Likewise.

gcc/testsuite/ChangeLog:

* g++.target/i386/avx512fp16-vcondmn-vec.C: New test.
* g++.target/i386/avx512fp16-vcondmn-minmax.C: Ditto.
* gcc.target/i386/avx512fp16-vcondmn-loop-1.c: Ditto.
* gcc.target/i386/avx512fp16-vcondmn-loop-2.c: Ditto.
* gcc.target/i386/avx512fp16-vec_cmpmn.c: Ditto.
---
 gcc/config/i386/i386-expand.c |   2 +
 gcc/config/i386/sse.md|  84 --
 .../i386/avx512fp16-vcondmn-minmax.C  |  25 +++
 .../g++.target/i386/avx512fp16-vcondmn-vec.C  |  70 +
 .../i386/avx512fp16-vcondmn-loop-1.c  |  70 +
 .../i386/avx512fp16-vcondmn-loop-2.c  | 143 ++
 .../gcc.target/i386/avx512fp16-vec_cmpmn.c|  32 
 7 files changed, 414 insertions(+), 12 deletions(-)
 create mode 100644 gcc/testsuite/g++.target/i386/avx512fp16-vcondmn-minmax.C
 create mode 100644 gcc/testsuite/g++.target/i386/avx512fp16-vcondmn-vec.C
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16-vcondmn-loop-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16-vcondmn-loop-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16-vec_cmpmn.c

diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c
index dbbf5e34656..94ac303585e 100644
--- a/gcc/config/i386/i386-expand.c
+++ b/gcc/config/i386/i386-expand.c
@@ -3638,6 +3638,8 @@ ix86_use_mask_cmp_p (machine_mode mode, machine_mode cmp_mode,
 return false;
   else if (vector_size == 64)
 return true;
+  else if (GET_MODE_INNER (cmp_mode) == HFmode)
+return true;
 
   /* When op_true is NULL, op_false must be NULL, or vice versa.  */
   gcc_assert (!op_true == !op_false);
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index a48c8e8bede..084fc7f4693 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -989,9 +989,9 @@ (define_mode_attr sseintvecmode2
(V16HF "OI") (V8HF "TI")])
 
 (define_mode_attr sseintvecmodelower
-  [(V16SF "v16si") (V8DF "v8di")
-   (V8SF "v8si") (V4DF "v4di")
-   (V4SF "v4si") (V2DF "v2di")
+  [(V32HF "v32hi") (V16SF "v16si") (V8DF "v8di")
+   (V16HF "v16hi") (V8SF "v8si") (V4DF "v4di")
+   (V8HF "v8hi") (V4SF "v4si") (V2DF "v2di")
(V8SI "v8si") (V4DI "v4di")
(V4SI "v4si") (V2DI "v2di")
(V16HI "v16hi") (V8HI "v8hi")
@@ -1568,9 +1568,9 @@ (define_insn "_store_mask"
(set_attr "mode" "")])
 
 (define_insn "_store_mask"
-  [(set (match_operand:VI12_AVX512VL 0 "memory_operand" "=m")
-   (vec_merge:VI12_AVX512VL
- (match_operand:VI12_AVX512VL 1 "register_operand" "v")
+  [(set (match_operand:VI12HF_AVX512VL 0 "memory_operand" "=m")
+   (vec_merge:VI12HF_AVX512VL
+ (match_operand:VI12HF_AVX512VL 1 "register_operand" "v")
  (match_dup 0)
  (match_operand: 2 "register_operand" "Yk")))]
   "TARGET_AVX512BW"
@@ -3810,8 +3810,8 @@ (define_insn "_comi"
 (define_expand "vec_cmp"
   [(set (match_operand: 0 "register_operand")
(match_operator: 1 ""
- [(match_operand:V48_AVX512VL 2 "register_operand")
-  (match_operand:V48_AVX512VL 3 "nonimmediate_operand")]))]
+ [(match_operand:V48H_AVX512VL 2 "register_operand")
+  (match_operand:V48H_AVX512VL 3 "nonimmediate_operand")]))]
   "TARGET_AVX512F"
 {
   bool ok = ix86_expand_mask_vec_cmp (operands[0], GET_CODE (operands[1]),
@@ -4018,6 +4018,51 @@ (define_expand "vcond"
   DONE;
 })
 
+(define_expand "vcond"
+  [(set (match_operand:VF_AVX512FP16VL 0 "register_operand")
+   (if_then_else:VF_AVX512FP16VL
+ (match_operator 3 ""
+   [(match_operand:VF_AVX512FP16VL 4 "vector_operand")
+(match_operand:VF_AVX512FP16VL 5 "vector_operand")])
+ (match_operand:VF_AVX512FP16VL 1 "general_operand")
+ (match_operand:VF_AVX512FP16VL 2 "general_operand")))]
+  "TARGET_AVX512FP16"
+{
+  bool ok = ix86_expand_fp_vcond (operands);
+  gcc_assert (ok);
+  DONE;
+})
+
+(define_expand "vcond"
+  [(set (match_operand:VF_AVX512FP16VL 0 "register_operand")
+   (if_then_else:VF_AVX512FP16VL
+ (match_operator 3 ""
+   [(match_operand: 4 "vector_operand")
+(match_operand: 5 "vector_operand")])
+ (match_operand:VF_AVX512FP16VL 1 "general_operand")
+ (match_operand:VF_AVX512FP16VL 2 "general_operand")))]
+  "TARGET_AVX512FP16"
+{
+  bool ok = ix86_expand_int_vcond (operands);
+  gcc_assert (ok);
+  DONE;
+})
+
+(define_expand "vcond"
+  [(set (match_operand: 0 "register_operand")
+   (if_then_else:
+

[PATCH 6/7] AVX512FP16: add truncmn2/extendmn2 expanders

2021-09-22 Thread liuhongt via Gcc-patches

From: Hongyu Wang 

gcc/ChangeLog:

* config/i386/sse.md (extend2):
New expander.
(extendv4hf2): Likewise.
(extendv2hfv2df2): Likewise.
(trunc2): Likewise.
(avx512fp16_vcvt2ph_): Rename to ...
(truncv4hf2): ... this, and drop constraints.
(avx512fp16_vcvtpd2ph_v2df): Rename to ...
(truncv2dfv2hf2): ... this, and likewise.

gcc/testsuite/ChangeLog:

* gcc.target/i386/avx512fp16-trunc-extendvnhf.c: New test.
---
 gcc/config/i386/sse.md| 75 +--
 .../i386/avx512fp16-trunc-extendvnhf.c| 55 ++
 2 files changed, 123 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16-trunc-extendvnhf.c

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 66062dc3bcf..a48c8e8bede 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -6328,6 +6328,12 @@ (define_mode_attr ph2pssuffix
   [(V16SF "x") (V8SF "x") (V4SF "x")
(V8DF "") (V4DF "") (V2DF "")])
 
+(define_expand "extend2"
+  [(set (match_operand:VF48H_AVX512VL 0 "register_operand")
+   (float_extend:VF48H_AVX512VL
+ (match_operand: 1 "nonimmediate_operand")))]
+  "TARGET_AVX512FP16")
+
 (define_insn "avx512fp16_float_extend_ph2"
   [(set (match_operand:VF48H_AVX512VL 0 "register_operand" "=v")
(float_extend:VF48H_AVX512VL
@@ -6338,6 +6344,21 @@ (define_insn "avx512fp16_float_extend_ph2"
(set_attr "prefix" "evex")
(set_attr "mode" "")])
 
+(define_expand "extendv4hf2"
+  [(set (match_operand:VF4_128_8_256 0 "register_operand")
+   (float_extend:VF4_128_8_256
+ (match_operand:V4HF 1 "nonimmediate_operand")))]
+  "TARGET_AVX512FP16 && TARGET_AVX512VL"
+{
+  if (!MEM_P (operands[1]))
+{
+  operands[1] = lowpart_subreg (V8HFmode, operands[1], V4HFmode);
+  emit_insn (gen_avx512fp16_float_extend_ph2
+(operands[0], operands[1]));
+  DONE;
+}
+})
+
 (define_insn "avx512fp16_float_extend_ph2"
   [(set (match_operand:VF4_128_8_256 0 "register_operand" "=v")
(float_extend:VF4_128_8_256
@@ -6360,6 +6381,21 @@ (define_insn "*avx512fp16_float_extend_ph2_load"
(set_attr "prefix" "evex")
(set_attr "mode" "")])
 
+(define_expand "extendv2hfv2df2"
+  [(set (match_operand:V2DF 0 "register_operand")
+   (float_extend:V2DF
+ (match_operand:V2HF 1 "nonimmediate_operand")))]
+  "TARGET_AVX512FP16 && TARGET_AVX512VL"
+{
+  if (!MEM_P (operands[1]))
+{
+  operands[1] = lowpart_subreg (V8HFmode, operands[1], V2HFmode);
+  emit_insn (gen_avx512fp16_float_extend_phv2df2
+(operands[0], operands[1]));
+  DONE;
+}
+})
+
 (define_insn "avx512fp16_float_extend_phv2df2"
   [(set (match_operand:V2DF 0 "register_operand" "=v")
(float_extend:V2DF
@@ -6382,6 +6418,12 @@ (define_insn "*avx512fp16_float_extend_phv2df2_load"
(set_attr "prefix" "evex")
(set_attr "mode" "TI")])
 
+(define_expand "trunc2"
+  [(set (match_operand: 0 "register_operand")
+   (float_truncate:
+ (match_operand:VF48H_AVX512VL 1 "nonimmediate_operand")))]
+  "TARGET_AVX512FP16")
+
 (define_insn "avx512fp16_vcvt2ph_"
   [(set (match_operand: 0 "register_operand" "=v")
(float_truncate:
@@ -6392,11 +6434,21 @@ (define_insn "avx512fp16_vcvt2ph_"
(set_attr "prefix" "evex")
(set_attr "mode" "")])
 
-(define_expand "avx512fp16_vcvt2ph_"
-  [(set (match_operand:V8HF 0 "register_operand" "=v")
+(define_expand "truncv4hf2"
+  [(set (match_operand:V4HF 0 "register_operand")
+   (float_truncate:V4HF (match_operand:VF4_128_8_256 1 "vector_operand")))]
+  "TARGET_AVX512FP16 && TARGET_AVX512VL"
+{
+  operands[0] = lowpart_subreg (V8HFmode, operands[0], V4HFmode);
+  emit_insn (gen_avx512fp16_truncv4hf2 (operands[0], operands[1]));
+  DONE;
+})
+
+(define_expand "avx512fp16_truncv4hf2"
+  [(set (match_operand:V8HF 0 "register_operand")
(vec_concat:V8HF
(float_truncate:V4HF
- (match_operand:VF4_128_8_256 1 "vector_operand" "vm"))
+ (match_operand:VF4_128_8_256 1 "vector_operand"))
(match_dup 2)))]
   "TARGET_AVX512FP16 && TARGET_AVX512VL"
   "operands[2] = CONST0_RTX (V4HFmode);")
@@ -6461,11 +6513,20 @@ (define_insn "*avx512fp16_vcvt2ph__mask_1"
(set_attr "prefix" "evex")
(set_attr "mode" "")])
 
-(define_expand "avx512fp16_vcvtpd2ph_v2df"
-  [(set (match_operand:V8HF 0 "register_operand" "=v")
+(define_expand "truncv2dfv2hf2"
+  [(set (match_operand:V2HF 0 "register_operand")
+   (float_truncate:V2HF (match_operand:V2DF 1 "vector_operand")))]
+  "TARGET_AVX512FP16 && TARGET_AVX512VL"
+{
+  operands[0] = lowpart_subreg (V8HFmode, operands[0], V2HFmode);
+  emit_insn (gen_avx512fp16_truncv2dfv2hf2 (operands[0], operands[1]));
+  DONE;
+})
+
+(define_expand "avx512fp16_truncv2dfv2hf2"
+  [(set (match_operand:V8HF 0 "register_operand")
(vec_concat:V8HF
- (float_truncate:V2HF
-

[PATCH 5/7] AVX512FP16: Add float(uns)?mn2 expander

2021-09-22 Thread liuhongt via Gcc-patches

From: Hongyu Wang 

gcc/ChangeLog:

* config/i386/sse.md (float2):
New expander.
(avx512fp16_vcvt2ph_):
Rename to ...
(floatv4hf2): ... this, and drop constraints.
(avx512fp16_vcvtqq2ph_v2di): Rename to ...
(floatv2div2hf2): ... this, and likewise.

gcc/testsuite/ChangeLog:

* gcc.target/i386/avx512fp16-floatvnhf.c: New test.
---
 gcc/config/i386/sse.md| 46 +++---
 .../gcc.target/i386/avx512fp16-floatvnhf.c| 61 +++
 2 files changed, 99 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16-floatvnhf.c

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index f8a5f197f3c..66062dc3bcf 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -6006,6 +6006,12 @@ (define_insn "avx512fp16_vcvtph2_<
(set_attr "prefix" "evex")
(set_attr "mode" "")])
 
+(define_expand "float2"
+  [(set (match_operand: 0 "register_operand")
+   (any_float:
+ (match_operand:VI2H_AVX512VL 1 "nonimmediate_operand")))]
+  "TARGET_AVX512FP16")
+
(define_insn "avx512fp16_vcvt2ph_"
   [(set (match_operand: 0 "register_operand" "=v")
(any_float:
@@ -6016,11 +6022,23 @@ (define_insn 
"avx512fp16_vcvt2ph_")])
 
-(define_expand "avx512fp16_vcvt2ph_"
-  [(set (match_operand:V8HF 0 "register_operand" "=v")
+(define_expand "floatv4hf2"
+  [(set (match_operand:V4HF 0 "register_operand")
+   (any_float:V4HF
+ (match_operand:VI4_128_8_256 1 "vector_operand")))]
+  "TARGET_AVX512FP16 && TARGET_AVX512VL"
+{
+  operands[0] = lowpart_subreg (V8HFmode, operands[0], V4HFmode);
+  emit_insn (gen_avx512fp16_floatv4hf2 (operands[0],
+ operands[1]));
+  DONE;
+})
+
+(define_expand "avx512fp16_floatv4hf2"
+  [(set (match_operand:V8HF 0 "register_operand")
(vec_concat:V8HF
-   (any_float:V4HF (match_operand:VI4_128_8_256 1 "vector_operand" "vm"))
-   (match_dup 2)))]
+ (any_float:V4HF (match_operand:VI4_128_8_256 1 "vector_operand"))
+ (match_dup 2)))]
   "TARGET_AVX512FP16 && TARGET_AVX512VL"
   "operands[2] = CONST0_RTX (V4HFmode);")
 
@@ -6079,11 +6097,23 @@ (define_insn "*avx512fp16_vcvt2ph__mask_1"
(set_attr "prefix" "evex")
(set_attr "mode" "")])
 
-(define_expand "avx512fp16_vcvtqq2ph_v2di"
-  [(set (match_operand:V8HF 0 "register_operand" "=v")
+(define_expand "floatv2div2hf2"
+  [(set (match_operand:V2HF 0 "register_operand")
+   (any_float:V2HF
+ (match_operand:V2DI 1 "vector_operand")))]
+  "TARGET_AVX512FP16 && TARGET_AVX512VL"
+{
+  operands[0] = lowpart_subreg (V8HFmode, operands[0], V2HFmode);
+  emit_insn (gen_avx512fp16_floatv2div2hf2 (operands[0],
+   operands[1]));
+  DONE;
+})
+
+(define_expand "avx512fp16_floatv2div2hf2"
+  [(set (match_operand:V8HF 0 "register_operand")
(vec_concat:V8HF
-   (any_float:V2HF (match_operand:V2DI 1 "vector_operand" "vm"))
-   (match_dup 2)))]
+ (any_float:V2HF (match_operand:V2DI 1 "vector_operand"))
+ (match_dup 2)))]
   "TARGET_AVX512FP16 && TARGET_AVX512VL"
   "operands[2] = CONST0_RTX (V6HFmode);")
 
diff --git a/gcc/testsuite/gcc.target/i386/avx512fp16-floatvnhf.c b/gcc/testsuite/gcc.target/i386/avx512fp16-floatvnhf.c
new file mode 100644
index 000..112ac3e74d5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512fp16-floatvnhf.c
@@ -0,0 +1,61 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mavx512fp16 -mavx512vl -ftree-slp-vectorize -mprefer-vector-width=512" } */
+
+extern long long di[8];
+extern unsigned long long udi[8];
+extern int si[16];
+extern unsigned int usi[16];
+extern short hi[32];
+extern unsigned short uhi[32];
+extern _Float16 hf[32];
+
+#define DO_PRAGMA(X) _Pragma(#X)
+
+#define FLOATHFVV(size, mode)  \
+  void __attribute__ ((noinline, noclone))  \
+float##v##size##mode##v##size##hf ()   \
+{\
+  int i;  \
+  DO_PRAGMA (GCC unroll size)  \
+  for (i = 0; i < size; i++)  \
+hf[i] = (_Float16) mode[i];  \
+}
+
+FLOATHFVV(32, hi)
+FLOATHFVV(16, hi)
+FLOATHFVV(8, hi)
+FLOATHFVV(16, si)
+FLOATHFVV(8, si)
+FLOATHFVV(4, si)
+FLOATHFVV(8, di)
+FLOATHFVV(4, di)
+FLOATHFVV(2, di)
+
+FLOATHFVV(32, uhi)
+FLOATHFVV(16, uhi)
+FLOATHFVV(8, uhi)
+FLOATHFVV(16, usi)
+FLOATHFVV(8, usi)
+FLOATHFVV(4, usi)
+FLOATHFVV(8, udi)
+FLOATHFVV(4, udi)
+FLOATHFVV(2, udi)
+
+/* { dg-final { scan-assembler-times "vcvtqq2phz\[ \\t\]+\[^\{\n\]*\[^\n\r]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vcvtuqq2phz\[ \\t\]+\[^\{\n\]*\[^\n\r]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vcvtqq2phy\[ \\t\]+\[^\{\n\]*\[^\n\r]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 { xfail *-*-* } } } */
+/* { dg-final { scan-assembler-times "vcvtuqq2phy\[ \\t\]+\[^\{\n\]*\[^\n\r]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 { xfail *-*-* }

[PATCH 4/7] AVX512FP16: Add fix(uns)?_truncmn2 for HF scalar and vector modes

2021-09-22 Thread liuhongt via Gcc-patches

From: Hongyu Wang 

NB: 64-bit/32-bit vectorization for HFmode is not supported for now;
this patch will be adjusted once V2HF/V4HF operations are supported.

gcc/ChangeLog:

* config/i386/i386.md (fix_trunchf2): New expander.
(fixuns_trunchfhi2): Likewise.
(*fixuns_trunchfsi2zext): New define_insn.
* config/i386/sse.md (ssePHmodelower): New mode_attr.
(fix_trunc2):
New expander for same element vector fix_truncate.
(fix_trunc2):
Likewise for V4HF to V4SI/V4DI fix_truncate.
(fix_truncv2hfv2di2):
Likewise for V2HF to V2DI fix_truncate.

gcc/testsuite/ChangeLog:

* gcc.target/i386/avx512fp16-trunchf.c: New test.
* gcc.target/i386/avx512fp16-truncvnhf.c: Ditto.
---
 gcc/config/i386/i386.md   | 29 +
 gcc/config/i386/sse.md| 43 +
 .../gcc.target/i386/avx512fp16-trunchf.c  | 59 ++
 .../gcc.target/i386/avx512fp16-truncvnhf.c| 61 +++
 4 files changed, 192 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16-trunchf.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16-truncvnhf.c

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index a087e557d7f..c6279e620c9 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -4810,6 +4810,16 @@ (define_expand "fix_truncdi2"
}
 })
 
+(define_insn "fix_trunchf2"
+  [(set (match_operand:SWI48 0 "register_operand" "=r")
+   (any_fix:SWI48
+ (match_operand:HF 1 "nonimmediate_operand" "vm")))]
+  "TARGET_AVX512FP16"
+  "vcvttsh2si\t{%1, %0|%0, %1}"
+  [(set_attr "type" "sseicvt")
+   (set_attr "prefix" "evex")
+   (set_attr "mode" "")])
+
 ;; Signed conversion to SImode.
 
 (define_expand "fix_truncxfsi2"
@@ -4917,6 +4927,17 @@ (define_insn "fixuns_truncsi2_avx512f"
(set_attr "prefix" "evex")
(set_attr "mode" "SI")])
 
+(define_insn "*fixuns_trunchfsi2zext"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+   (zero_extend:DI
+ (unsigned_fix:SI
+   (match_operand:HF 1 "nonimmediate_operand" "vm"]
+  "TARGET_64BIT && TARGET_AVX512FP16"
+  "vcvttsh2usi\t{%1, %k0|%k0, %1}"
+  [(set_attr "type" "sseicvt")
+   (set_attr "prefix" "evex")
+   (set_attr "mode" "SI")])
+
 (define_insn "*fixuns_truncsi2_avx512f_zext"
   [(set (match_operand:DI 0 "register_operand" "=r")
(zero_extend:DI
@@ -4949,6 +4970,14 @@ (define_insn_and_split "*fixuns_trunc_1"
 ;; Without these patterns, we'll try the unsigned SI conversion which
 ;; is complex for SSE, rather than the signed SI conversion, which isn't.
 
+(define_expand "fixuns_trunchfhi2"
+  [(set (match_dup 2)
+   (fix:SI (match_operand:HF 1 "nonimmediate_operand")))
+   (set (match_operand:HI 0 "nonimmediate_operand")
+   (subreg:HI (match_dup 2) 0))]
+  "TARGET_AVX512FP16"
+  "operands[2] = gen_reg_rtx (SImode);")
+
 (define_expand "fixuns_trunchi2"
   [(set (match_dup 2)
(fix:SI (match_operand:MODEF 1 "nonimmediate_operand")))
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 1ca95984afc..f8a5f197f3c 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -1034,6 +1034,13 @@ (define_mode_attr ssePHmode
(V8DI "V8HF") (V4DI "V8HF") (V2DI "V8HF")
(V8DF "V8HF") (V16SF "V16HF") (V8SF "V8HF")])
 
+;; Mapping of vector modes to vector hf modes of same element.
+(define_mode_attr ssePHmodelower
+  [(V32HI "v32hf") (V16HI "v16hf") (V8HI "v8hf")
+   (V16SI "v16hf") (V8SI "v8hf") (V4SI "v4hf")
+   (V8DI "v8hf") (V4DI "v4hf") (V2DI "v2hf")
+   (V8DF "v8hf") (V16SF "v16hf") (V8SF "v8hf")])
+
 ;; Mapping of vector modes to packed single mode of the same size
 (define_mode_attr ssePSmode
   [(V16SI "V16SF") (V8DF "V16SF")
@@ -6175,6 +6182,12 @@ (define_insn "avx512fp16_vcvtsi2sh"
(set_attr "prefix" "evex")
(set_attr "mode" "HF")])
 
+(define_expand "fix_trunc2"
+  [(set (match_operand:VI2H_AVX512VL 0 "register_operand")
+   (any_fix:VI2H_AVX512VL
+ (match_operand: 1 "nonimmediate_operand")))]
+  "TARGET_AVX512FP16")
+
(define_insn "avx512fp16_fix_trunc2"
   [(set (match_operand:VI2H_AVX512VL 0 "register_operand" "=v")
(any_fix:VI2H_AVX512VL
@@ -6185,6 +6198,21 @@ (define_insn 
"avx512fp16_fix_trunc2")])
 
+(define_expand "fix_truncv4hf2"
+  [(set (match_operand:VI4_128_8_256 0 "register_operand")
+   (any_fix:VI4_128_8_256
+ (match_operand:V4HF 1 "nonimmediate_operand")))]
+  "TARGET_AVX512FP16 && TARGET_AVX512VL"
+{
+  if (!MEM_P (operands[1]))
+{
+  operands[1] = lowpart_subreg (V8HFmode, operands[1], V4HFmode);
+  emit_insn (gen_avx512fp16_fix_trunc2 (operands[0],
+   operands[1]));
+  DONE;
+}
+})
+
 (define_insn "avx512fp16_fix_trunc2"
   [(set (match_operand:VI4_128_8_256 0 "register_operand" "=v")
(any_fix:VI4_128_8_256
@@ -6207,6 +6235,21 @@ (define_insn
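
A side note on the fixuns_trunchfhi2 expander above: it performs a signed
SImode conversion into a temporary and then takes the low HImode part,
following the existing MODEF pattern. The sketch below illustrates why
this preserves the value for every input that fits in an unsigned 16-bit
result. It is only an illustration under stated assumptions: float stands
in for _Float16 so that it builds without AVX512FP16 support, and both
helper names are hypothetical, not GCC functions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper mimicking the two steps of the expander:
   1. a truncating signed 32-bit conversion (the "fix:SI" step),
   2. keeping the low 16 bits (the "subreg:HI ... 0" step on x86).  */
static uint16_t
fixuns_via_signed_si (float x)
{
  /* Precondition: the truncated value fits in uint16_t.  Outside this
     range the direct C conversion below would be undefined as well.  */
  assert (x > -1.0f && x < 65536.0f);
  int32_t wide = (int32_t) x;          /* truncates toward zero */
  return (uint16_t) (wide & 0xffff);   /* low half of the SI value */
}

/* Reference: the direct unsigned conversion written in the source.  */
static uint16_t
fixuns_direct (float x)
{
  return (uint16_t) x;
}
```

Every value in the unsigned 16-bit range is also representable as a
non-negative signed 32-bit integer, so the signed instruction can be
reused and the upper half is simply dropped.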

[PATCH 3/7] AVX512FP16: Add expander for smin/maxhf3.

2021-09-22 Thread liuhongt via Gcc-patches

From: Hongyu Wang 

gcc/ChangeLog:

* config/i386/i386.md (hf3): New expander.

gcc/testsuite/ChangeLog:

* gcc.target/i386/avx512fp16-builtin-minmax-1.c: New test.
---
 gcc/config/i386/i386.md   | 11 ++
 .../i386/avx512fp16-builtin-minmax-1.c| 35 +++
 2 files changed, 46 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16-builtin-minmax-1.c

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 4b13a59be82..a087e557d7f 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -19946,6 +19946,17 @@ (define_insn "3"
(set_attr "type" "sseadd")
(set_attr "mode" "")])
 
+(define_insn "hf3"
+  [(set (match_operand:HF 0 "register_operand" "=v")
+   (smaxmin:HF
+ (match_operand:HF 1 "nonimmediate_operand" "%v")
+ (match_operand:HF 2 "nonimmediate_operand" "vm")))]
+  "TARGET_AVX512FP16"
+  "vsh\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "prefix" "evex")
+   (set_attr "type" "sseadd")
+   (set_attr "mode" "HF")])
+
 ;; These versions of the min/max patterns implement exactly the operations
 ;;   min = (op1 < op2 ? op1 : op2)
 ;;   max = (!(op1 < op2) ? op1 : op2)
diff --git a/gcc/testsuite/gcc.target/i386/avx512fp16-builtin-minmax-1.c b/gcc/testsuite/gcc.target/i386/avx512fp16-builtin-minmax-1.c
new file mode 100644
index 000..90080e44216
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512fp16-builtin-minmax-1.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-options "-Ofast -mavx512fp16 -mprefer-vector-width=512" } */
+
+_Float16
+minf1 (_Float16 a, _Float16 b)
+{
+  return __builtin_fminf16 (a, b);
+}
+
+void
+minf2 (_Float16* __restrict psrc1, _Float16* __restrict psrc2,
+   _Float16* __restrict pdst)
+{
+  for (int i = 0; i != 32; i++)
+pdst[i] = __builtin_fminf16 (psrc1[i], psrc2[i]);
+}
+
+_Float16
+maxf1 (_Float16 a, _Float16 b)
+{
+  return __builtin_fmaxf16 (a, b);
+}
+
+void
+maxf2 (_Float16* __restrict psrc1, _Float16* __restrict psrc2,
+   _Float16* __restrict pdst)
+{
+  for (int i = 0; i != 32; i++)
+pdst[i] = __builtin_fmaxf16 (psrc1[i], psrc2[i]);
+}
+
+/* { dg-final { scan-assembler-times "vmaxsh\[^\n\r\]*xmm\[0-9\]" 1 } } */
+/* { dg-final { scan-assembler-times "vmaxph\[^\n\r\]*zmm\[0-9\]" 1 } } */
+/* { dg-final { scan-assembler-times "vminsh\[^\n\r\]*xmm\[0-9\]" 1 } } */
+/* { dg-final { scan-assembler-times "vminph\[^\n\r\]*zmm\[0-9\]" 1 } } */
-- 
2.27.0

[PATCH 2/7] AVX512FP16: Add expander for fmahf4

2021-09-22 Thread liuhongt via Gcc-patches

gcc/ChangeLog:

* config/i386/sse.md (FMAMODEM): Extend to handle FP16.
(VFH_SF_AVX512VL): Extend to handle HFmode.
(VF_SF_AVX512VL): Deleted.

gcc/testsuite/ChangeLog:

* gcc.target/i386/avx512fp16-fma-1.c: New test.
* gcc.target/i386/avx512fp16vl-fma-1.c: New test.
* gcc.target/i386/avx512fp16vl-fma-vectorize-1.c: New test.
---
 gcc/config/i386/sse.md| 11 +--
 .../gcc.target/i386/avx512fp16-fma-1.c| 69 ++
 .../gcc.target/i386/avx512fp16vl-fma-1.c  | 70 +++
 .../i386/avx512fp16vl-fma-vectorize-1.c   | 45 
 4 files changed, 190 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16-fma-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16vl-fma-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16vl-fma-vectorize-1.c

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 9079613e829..1ca95984afc 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -4650,7 +4650,11 @@ (define_mode_iterator FMAMODEM
(V8SF "TARGET_FMA || TARGET_FMA4 || TARGET_AVX512VL")
(V4DF "TARGET_FMA || TARGET_FMA4 || TARGET_AVX512VL")
(V16SF "TARGET_AVX512F")
-   (V8DF "TARGET_AVX512F")])
+   (V8DF "TARGET_AVX512F")
+   (HF "TARGET_AVX512FP16")
+   (V8HF "TARGET_AVX512FP16 && TARGET_AVX512VL")
+   (V16HF "TARGET_AVX512FP16 && TARGET_AVX512VL")
+   (V32HF "TARGET_AVX512FP16")])
 
 (define_expand "fma4"
   [(set (match_operand:FMAMODEM 0 "register_operand")
@@ -4758,14 +4762,11 @@ (define_insn "*fma_fmadd_"
(set_attr "mode" "")])
 
 ;; Suppose AVX-512F as baseline
-(define_mode_iterator VF_SF_AVX512VL
-  [SF V16SF (V8SF "TARGET_AVX512VL") (V4SF "TARGET_AVX512VL")
-   DF V8DF (V4DF "TARGET_AVX512VL") (V2DF "TARGET_AVX512VL")])
-
 (define_mode_iterator VFH_SF_AVX512VL
   [(V32HF "TARGET_AVX512FP16")
(V16HF "TARGET_AVX512FP16 && TARGET_AVX512VL")
(V8HF "TARGET_AVX512FP16 && TARGET_AVX512VL")
+   (HF "TARGET_AVX512FP16")
SF V16SF (V8SF "TARGET_AVX512VL") (V4SF "TARGET_AVX512VL")
DF V8DF (V4DF "TARGET_AVX512VL") (V2DF "TARGET_AVX512VL")])
 
diff --git a/gcc/testsuite/gcc.target/i386/avx512fp16-fma-1.c b/gcc/testsuite/gcc.target/i386/avx512fp16-fma-1.c
new file mode 100644
index 000..d78d7629838
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512fp16-fma-1.c
@@ -0,0 +1,69 @@
+/* { dg-do compile } */
+/* { dg-options "-Ofast -mavx512fp16" } */
+
+typedef _Float16 v32hf __attribute__ ((__vector_size__ (64)));
+
+_Float16
+foo1 (_Float16 a, _Float16 b, _Float16 c)
+{
+  return a * b + c;
+}
+
+/* { dg-final { scan-assembler-times "vfmadd132sh\[^\n\r\]*xmm\[0-9\]" 1 } } */
+
+_Float16
+foo2 (_Float16 a, _Float16 b, _Float16 c)
+{
+  return -a * b + c;
+}
+
+/* { dg-final { scan-assembler-times "vfnmadd132sh\[^\n\r\]*xmm\[0-9\]" 1 } } */
+
+_Float16
+foo3 (_Float16 a, _Float16 b, _Float16 c)
+{
+  return a * b - c;
+}
+
+/* { dg-final { scan-assembler-times "vfmsub132sh\[^\n\r\]*xmm\[0-9\]" 1 } } */
+
+_Float16
+foo4 (_Float16 a, _Float16 b, _Float16 c)
+{
+  return -a * b - c;
+}
+
+/* { dg-final { scan-assembler-times "vfnmsub132sh\[^\n\r\]*xmm\[0-9\]" 1 } } */
+
+v32hf
+foo5 (v32hf a, v32hf b, v32hf c)
+{
+  return a * b + c;
+}
+
+/* { dg-final { scan-assembler-times "vfmadd132ph\[^\n\r\]*zmm\[0-9\]" 1 } } */
+
+v32hf
+foo6 (v32hf a, v32hf b, v32hf c)
+{
+  return -a * b + c;
+}
+
+/* { dg-final { scan-assembler-times "vfnmadd132ph\[^\n\r\]*zmm\[0-9\]" 1 } } */
+
+v32hf
+foo7 (v32hf a, v32hf b, v32hf c)
+{
+  return a * b - c;
+}
+
+/* { dg-final { scan-assembler-times "vfmsub132ph\[^\n\r\]*zmm\[0-9\]" 1 } } */
+
+v32hf
+foo8 (v32hf a, v32hf b, v32hf c)
+{
+  return -a * b - c;
+}
+
+/* { dg-final { scan-assembler-times "vfnmsub132ph\[^\n\r\]*zmm\[0-9\]" 1 } } */
+
diff --git a/gcc/testsuite/gcc.target/i386/avx512fp16vl-fma-1.c b/gcc/testsuite/gcc.target/i386/avx512fp16vl-fma-1.c
new file mode 100644
index 000..1a832f37d6c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512fp16vl-fma-1.c
@@ -0,0 +1,70 @@
+/* { dg-do compile } */
+/* { dg-options "-Ofast -mavx512fp16 -mavx512vl" } */
+
+typedef _Float16 v8hf __attribute__ ((__vector_size__ (16)));
+typedef _Float16 v16hf __attribute__ ((__vector_size__ (32)));
+
+v8hf
+foo1 (v8hf a, v8hf b, v8hf c)
+{
+  return a * b + c;
+}
+
+/* { dg-final { scan-assembler-times "vfmadd132ph\[^\n\r\]*xmm\[0-9\]" 1 } } */
+
+v8hf
+foo2 (v8hf a, v8hf b, v8hf c)
+{
+  return -a * b + c;
+}
+
+/* { dg-final { scan-assembler-times "vfnmadd132ph\[^\n\r\]*xmm\[0-9\]" 1 } } */
+
+v8hf
+foo3 (v8hf a, v8hf b, v8hf c)
+{
+  return a * b - c;
+}
+
+/* { dg-final { scan-assembler-times "vfmsub132ph\[^\n\r\]*xmm\[0-9\]" 1 } } */
+
+v8hf
+foo4 (v8hf a, v8hf b, v8hf c)
+{
+  return -a * b - c;
+}
+
+/* { dg-final { scan-assembler-times "vfnmsub132ph\[^\n\r\]*xmm\[0-9\]" 1 } } */
+
+v16hf
+foo5 (v16hf a, v16hf b, v16hf c)
+{
+  return a * b + c;
+}

[PATCH 1/7] AVX512FP16: Add expander for rint/nearbyinthf2.

2021-09-22 Thread liuhongt via Gcc-patches

gcc/ChangeLog:

* config/i386/i386.md (rinthf2): New expander.
(nearbyinthf2): New expander.

gcc/testsuite/ChangeLog:

* gcc.target/i386/avx512fp16-builtin-round-1.c: Add new testcase.
---
 gcc/config/i386/i386.md   | 22 +++
 .../i386/avx512fp16-builtin-round-1.c | 14 
 2 files changed, 36 insertions(+)

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 60d877668d5..4b13a59be82 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -18287,6 +18287,17 @@ (define_insn "rintxf2"
(set_attr "znver1_decode" "vector")
(set_attr "mode" "XF")])
 
+(define_expand "rinthf2"
+  [(match_operand:HF 0 "register_operand")
+   (match_operand:HF 1 "nonimmediate_operand")]
+  "TARGET_AVX512FP16"
+{
+  emit_insn (gen_sse4_1_roundhf2 (operands[0],
+ operands[1],
+ GEN_INT (ROUND_MXCSR)));
+  DONE;
+})
+
 (define_expand "rint2"
   [(use (match_operand:MODEF 0 "register_operand"))
(use (match_operand:MODEF 1 "nonimmediate_operand"))]
@@ -18320,6 +18331,17 @@ (define_expand "nearbyintxf2"
   "TARGET_USE_FANCY_MATH_387
&& !flag_trapping_math")
 
+(define_expand "nearbyinthf2"
+  [(match_operand:HF 0 "register_operand")
+   (match_operand:HF 1 "nonimmediate_operand")]
+  "TARGET_AVX512FP16"
+{
+  emit_insn (gen_sse4_1_roundhf2 (operands[0],
+ operands[1],
+ GEN_INT (ROUND_MXCSR | ROUND_NO_EXC)));
+  DONE;
+})
+
 (define_expand "nearbyint2"
   [(use (match_operand:MODEF 0 "register_operand"))
(use (match_operand:MODEF 1 "nonimmediate_operand"))]
diff --git a/gcc/testsuite/gcc.target/i386/avx512fp16-builtin-round-1.c b/gcc/testsuite/gcc.target/i386/avx512fp16-builtin-round-1.c
index 3cab1526967..a1c6636e354 100644
--- a/gcc/testsuite/gcc.target/i386/avx512fp16-builtin-round-1.c
+++ b/gcc/testsuite/gcc.target/i386/avx512fp16-builtin-round-1.c
@@ -25,7 +25,21 @@ f4 (_Float16 x)
   return __builtin_roundevenf16 (x);
 }
 
+_Float16
+f5 (_Float16 x)
+{
+  return __builtin_rintf16 (x);
+}
+
+_Float16
+f6 (_Float16 x)
+{
+  return __builtin_nearbyintf16 (x);
+}
+
 /* { dg-final { scan-assembler-times "vrndscalesh\[ \\t\]+\\\$11\[^\n\r\]*xmm\[0-9\]" 1 } } */
 /* { dg-final { scan-assembler-times "vrndscalesh\[ \\t\]+\\\$10\[^\n\r\]*xmm\[0-9\]" 1 } } */
 /* { dg-final { scan-assembler-times "vrndscalesh\[ \\t\]+\\\$9\[^\n\r\]*xmm\[0-9\]" 1 } } */
 /* { dg-final { scan-assembler-times "vrndscalesh\[ \\t\]+\\\$8\[^\n\r\]*xmm\[0-9\]" 1 } } */
+/* { dg-final { scan-assembler-times "vrndscalesh\[ \\t\]+\\\$4\[^\n\r\]*xmm\[0-9\]" 1 } } */
+/* { dg-final { scan-assembler-times "vrndscalesh\[ \\t\]+\\\$12\[^\n\r\]*xmm\[0-9\]" 1 } } */
-- 
2.27.0
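
A note on the immediates the round-1 test scans for in patch 1/7: the
rinthf2 expander passes ROUND_MXCSR and nearbyinthf2 passes
ROUND_MXCSR | ROUND_NO_EXC, which is where the $4 and $12 operands of
vrndscalesh come from. The sketch below spells out that arithmetic. The
numeric values are assumed to match the ROUND_* constants in
gcc/config/i386/i386.md (they are consistent with the scanned
immediates); vrndscale_imm is a hypothetical helper, not a GCC function.

```c
#include <assert.h>

/* Assumed to mirror the ROUND_* constants of the i386 backend.  */
enum
{
  ROUND_ROUNDEVEN = 0x0,
  ROUND_FLOOR     = 0x1,
  ROUND_CEIL      = 0x2,
  ROUND_TRUNC     = 0x3,
  ROUND_MXCSR     = 0x4,  /* use the rounding mode currently in MXCSR */
  ROUND_NO_EXC    = 0x8   /* do not raise the inexact exception */
};

/* Hypothetical helper: build the immediate from a rounding selector
   and a flag saying whether the inexact exception is suppressed.  */
static int
vrndscale_imm (int rounding, int suppress_inexact)
{
  assert (rounding >= ROUND_ROUNDEVEN && rounding <= ROUND_MXCSR);
  return rounding | (suppress_inexact ? ROUND_NO_EXC : 0);
}
```

With these values, rint maps to vrndscale_imm (ROUND_MXCSR, 0) and
nearbyint to vrndscale_imm (ROUND_MXCSR, 1); the only difference between
the two expanders is the exception-suppression bit.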

[PATCH 0/7] AVX512FP16: Support bunch of expanders for HFmode and vector HFmodes

2021-09-22 Thread liuhongt via Gcc-patches

  xfails are added to the testcases related to the truncmn2/extendmn2
expanders since V2HF/V4HF modes are not supported yet; they should be
removed later.

  Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
  Newly added runtime testcases passed on sde{-m32,}.

Hongyu Wang (5):
  AVX512FP16: Add expander for smin/maxhf3.
  AVX512FP16: Add fix(uns)?_truncmn2 for HF scalar and vector modes
  AVX512FP16: Add float(uns)?mn2 expander
  AVX512FP16: add truncmn2/extendmn2 expanders
  AVX512FP16: Enable vec_cmpmn/vcondmn expanders for HF modes.

liuhongt (2):
  AVX512FP16: Add expander for rint/nearbyinthf2.
  AVX512FP16: Add expander for fmahf4

 gcc/config/i386/i386-expand.c |   2 +
 gcc/config/i386/i386.md   |  62 +
 gcc/config/i386/sse.md| 259 +++---
 .../i386/avx512fp16-vcondmn-minmax.C  |  25 ++
 .../g++.target/i386/avx512fp16-vcondmn-vec.C  |  70 +
 .../i386/avx512fp16-builtin-minmax-1.c|  35 +++
 .../i386/avx512fp16-builtin-round-1.c |  14 +
 .../gcc.target/i386/avx512fp16-floatvnhf.c|  61 +
 .../gcc.target/i386/avx512fp16-fma-1.c|  69 +
 .../i386/avx512fp16-trunc-extendvnhf.c|  55 
 .../gcc.target/i386/avx512fp16-trunchf.c  |  59 
 .../gcc.target/i386/avx512fp16-truncvnhf.c|  61 +
 .../i386/avx512fp16-vcondmn-loop-1.c  |  70 +
 .../i386/avx512fp16-vcondmn-loop-2.c  | 143 ++
 .../gcc.target/i386/avx512fp16-vec_cmpmn.c|  32 +++
 .../gcc.target/i386/avx512fp16vl-fma-1.c  |  70 +
 .../i386/avx512fp16vl-fma-vectorize-1.c   |  45 +++
 17 files changed, 1100 insertions(+), 32 deletions(-)
 create mode 100644 gcc/testsuite/g++.target/i386/avx512fp16-vcondmn-minmax.C
 create mode 100644 gcc/testsuite/g++.target/i386/avx512fp16-vcondmn-vec.C
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16-builtin-minmax-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16-floatvnhf.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16-fma-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16-trunc-extendvnhf.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16-trunchf.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16-truncvnhf.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16-vcondmn-loop-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16-vcondmn-loop-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16-vec_cmpmn.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16vl-fma-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16vl-fma-vectorize-1.c

-- 
2.27.0

[PATCH] wwwdocs: [GCC12] Mention Intel AVX512-FP16.

2021-09-22 Thread liuhongt via Gcc-patches

---
 htdocs/gcc-12/changes.html | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/htdocs/gcc-12/changes.html b/htdocs/gcc-12/changes.html
index 81f62fe3..14149212 100644
--- a/htdocs/gcc-12/changes.html
+++ b/htdocs/gcc-12/changes.html
@@ -165,8 +165,12 @@ a work-in-progress.
   
 
 
-
-
+IA-32/x86-64
+
+  New ISA extension support for Intel AVX512-FP16 was added to GCC.
+  AVX512FP16 intrinsics are available via the -mavx512fp16
+  compiler switch.
+
 
 
 
-- 
2.18.1

Re: [PATCH][Hashtable 6/6] PR 68303 small size optimization

2021-09-22 Thread François Dumont via Gcc-patches


Ping ?

On 16/08/21 9:03 pm, François Dumont wrote:

On 17/07/20 2:58 pm, Jonathan Wakely wrote:

On 17/11/19 22:31 +0100, François Dumont wrote:

This is an implementation of PR 68303.

I try to use this idea as much as possible to avoid computation of 
hash codes.


Note that tests are not showing any gain. I guess hash computation 
must be quite bad to get a benefit from it. So I am only activating 
it when hash code is not cached and/or when computation is not fast.


If the tests don't show any benefit, why bother making the change?


I eventually managed to demonstrate this optimization through a 
performance test case.




Does it help the example in the PR?


No, the code attached to the PR just shows what the user has done to 
put this optimization in place on his side.


What I needed was a hash code computation that is slow compared to the 
equality operation. I realized that I had to use longer strings to achieve this.


Moreover, making this optimization dependent on 
_Hashtable_traits::__hash_cached was just wrong, as we cannot use the 
cached hash code here: the input is a key instance, not a node.


I introduce _Hashtable_hash_traits<_Hash> to offer a single 
customization point, as this optimization depends highly on the difference 
in cost between a hash code computation and a comparison. Maybe I should put 
it at std namespace scope to ease partial specialization?


Performance test results before the patch:

unordered_small_size.cc      std::unordered_set: 1st insert       40r   32u    8s   264000112mem    0pf
unordered_small_size.cc      std::unordered_set: find/erase       22r   22u    0s  -191999808mem    0pf
unordered_small_size.cc      std::unordered_set: 2nd insert       36r   36u    0s   191999776mem    0pf
unordered_small_size.cc      std::unordered_set: erase key        25r   25u    0s  -191999808mem    0pf
unordered_small_size.cc      std::unordered_set: 1st insert      404r  244u  156s -1989936256mem    0pf
unordered_small_size.cc      std::unordered_set: find/erase      315r  315u    0s  2061942272mem    0pf
unordered_small_size.cc      std::unordered_set: 2nd insert      233r  233u    0s -2061942208mem    0pf
unordered_small_size.cc      std::unordered_set: erase key       299r  298u    0s  2061942208mem    0pf


after the patch:

unordered_small_size.cc      std::unordered_set: 1st insert       41r   33u    7s   264000112mem    0pf
unordered_small_size.cc      std::unordered_set: find/erase       24r   25u    1s  -191999808mem    0pf
unordered_small_size.cc      std::unordered_set: 2nd insert       34r   34u    0s   191999776mem    0pf
unordered_small_size.cc      std::unordered_set: erase key        25r   25u    0s  -191999808mem    0pf
unordered_small_size.cc      std::unordered_set: 1st insert      399r  232u  165s -1989936256mem    0pf
unordered_small_size.cc      std::unordered_set: find/erase      196r  197u    0s  2061942272mem    0pf
unordered_small_size.cc      std::unordered_set: 2nd insert      221r  222u    0s -2061942208mem    0pf
unordered_small_size.cc      std::unordered_set: erase key       178r  178u    0s  2061942208mem    0pf


    libstdc++: Optimize operations on small size hashtable [PR 68303]

    When hasher is identified as slow and the number of elements is limited in the
    container use a brute-force loop on those elements to look for a given key using
    the key_equal functor. For the moment the default threshold below which the
    container is considered as small is 20.
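
As an illustration only, the idea described in this commit message can be
sketched outside of libstdc++ as below. The names hash_traits_sketch,
small_size_find, CountingHash, g_hash_calls and small_size_find_demo are
hypothetical and are not taken from the patch; only the default threshold
of 20 comes from the text above. The sketch scans a small container with
key_equal instead of hashing the key, and counts hasher calls to make the
effect visible.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_set>

// Counts hasher invocations so the effect of the linear scan is observable.
std::size_t g_hash_calls = 0;

// A deliberately "slow" hasher stand-in (hypothetical, for the demo only).
struct CountingHash {
  std::size_t operator()(const std::string& s) const {
    ++g_hash_calls;
    return std::hash<std::string>()(s);
  }
};

// Hypothetical stand-in for a per-hasher customization point: by default a
// container with fewer than 20 elements is considered small.
template <typename Hash>
struct hash_traits_sketch {
  static constexpr std::size_t small_size_threshold() { return 20; }
};

// A hasher assumed to be fast opts out by using a threshold of 0.
template <>
struct hash_traits_sketch<std::hash<int>> {
  static constexpr std::size_t small_size_threshold() { return 0; }
};

// Look up KEY in S. Below the threshold, compare against every element with
// key_equal and never call the hasher; otherwise use the regular find().
template <typename Set>
typename Set::const_iterator
small_size_find(const Set& s, const typename Set::key_type& key) {
  using traits = hash_traits_sketch<typename Set::hasher>;
  if (s.size() < traits::small_size_threshold()) {
    typename Set::key_equal eq = s.key_eq();
    for (auto it = s.begin(); it != s.end(); ++it)
      if (eq(key, *it))
        return it;
    return s.end();
  }
  return s.find(key);
}

// Usage: lookups in a three-element set do not invoke the hasher at all.
void small_size_find_demo() {
  std::unordered_set<std::string, CountingHash> s{"alpha", "beta", "gamma"};
  const std::size_t before = g_hash_calls;
  assert(small_size_find(s, "beta") != s.end());
  assert(small_size_find(s, "zeta") == s.end());
  assert(g_hash_calls == before);
}
```

Whether such a scan pays off depends on how expensive the hash is relative
to the equality comparison, which is why the threshold is attached to the
hasher type in this sketch.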

    libstdc++-v3/ChangeLog:

    PR libstdc++/68303
    * include/bits/hashtable_policy.h
    (_Hashtable_hash_traits<_Hash>): New.
    (_Hash_code_base<>::_M_hash_code(const _Hash_node_value<>&)): New.
    (_Hashtable_base<>::_M_key_equals): New.
    (_Hashtable_base<>::_M_equals): Use latter.
    (_Hashtable_base<>::_M_key_equals_tr): New.
    (_Hashtable_base<>::_M_equals_tr): Use latter.
    * include/bits/hashtable.h
    (_Hashtable<>::__small_size_threshold()): New, use _Hashtable_hash_traits.
    (_Hashtable<>::find): Loop through elements to look for key if size is lower
    than __small_size_threshold().
    (_Hashtable<>::_M_emplace(true_type, _Args&&...)): Likewise.
    (_Hashtable<>::_M_insert_unique(_Kt&&, _Args&&, const _NodeGenerator&)): Likewise.
    (_Hashtable<>::_M_compute_hash_code(const_iterator, const key_type&)): New.
    (_Hashtable<>::_M_emplace(const_iterator, false_type, _Args&&...)): Use latter.
    (_Hashtable<>::_M_find_before_node(const key_type&)): New.
    (_Hashtable<>::_M_erase(true_type, const key_type&)): Use latter.
    (_Hashtable<>::_M_erase(false_type, const key_type&)): Likewise.
    * src/c++11/hashtable_c++0x.cc: Include 
.

    * testsuite/util/testsuite_performance.h
    (report_performance): Use 9 width to display memory.
    *

Re: [RFC] Don't move cold code out of loop by checking bb count

2021-09-22 Thread Xionghu Luo via Gcc-patches





On 2021/9/23 10:13, Xionghu Luo via Gcc-patches wrote:



On 2021/9/22 17:14, Richard Biener wrote:

On Thu, Sep 9, 2021 at 3:56 AM Xionghu Luo  wrote:




On 2021/8/26 19:33, Richard Biener wrote:
On Tue, Aug 10, 2021 at 4:03 AM Xionghu Luo  wrote:


Hi,

On 2021/8/6 20:15, Richard Biener wrote:
On Mon, Aug 2, 2021 at 7:05 AM Xiong Hu Luo  wrote:


There was a patch trying to avoid moving cold blocks out of loops:

https://gcc.gnu.org/pipermail/gcc/2014-November/215551.html

Richard suggested to "never hoist anything from a bb with lower execution
frequency to a bb with higher one in LIM invariantness_dom_walker
before_dom_children".

This patch does this profile count check in both gimple LIM
move_computations_worker and RTL loop-invariant.c find_invariants_bb:
if the loop bb is colder than the loop preheader, don't hoist it out of
the loop.

Also, the profile count in the loop split pass should be corrected to avoid
mismatched lim2 and lim4 behavior.  Currently, the new loop preheader generated
by loop_version is set to "[count: 0]:", so lim4 after the lsplit pass will
unexpectedly move a statement out of the loop when lim2 didn't move it.  This
change could fix the regression on 544.nab_r from -1.55% to +0.46%.

SPEC2017 performance evaluation shows 1% performance improvement for
intrate GEOMEAN and no obvious regression for others.  Especially,
500.perlbench_r +7.52% (Perf shows function S_regtry of perlbench is
largely improved.), and 548.exchange2_r +1.98%, 526.blender_r +1.00%
on P8LE.

Bootstrapped and regression tested on P8LE.  Any comments?  Thanks.


While I'm not familiar with the RTL invariant motion pass the patch there
looks reasonable.  Note that we should assess the profile quality
somehow - I'm not sure how to do that, CCed Honza for that.


Thanks.



For the GIMPLE part the patch looks quite complicated - but note it
probably has to be since LIM performs kind of a "CSE" on loads
(and stores for store-motion), so when there are multiple stmts
affected by a hoisting decision the biggest block count has to be
accounted.  Likewise when there are dependent stmts involved
that might include conditional stmts (a "PHI"), but the overall
cost should be looked at.


Currently, the gimple code checks two situations with the patch:
1) The statement's or PHI's BB is *colder* than the preheader: don't move it
out of the loop;
2) The statement's or PHI's BB is *hotter* than the preheader, but one of its
rhs operands couldn't be moved out of the loop: also don't move it out of the
loop, to avoid a "definition does not dominate use" error.


But part 2) is obviously already done.  What I tried to say is your heuristic
doesn't integrate nicely with the pass but I admitted that it might be a bit
difficult to find a place to add this heuristic.

There is lim_data->cost which we could bias negatively but then this is
a cost that is independent on the hoisting distance.  But doing this would
work at least for the case where the immediately enclosing loop preheader
is hotter than the stmt and with this it would be a patch that's similarly
simple as the RTL one.

Another possibility is to simply only adjust PHI processing in
compute_invariantness, capping movement according to the hotness
heuristic.  The same could be done for regular stmts there but I'm
not sure that will do good in the end since this function is supposed
to compute "correctness" (well, it also has the cost stuff), and it's
not the place to do overall cost considerations.


Thanks.  I found that adding a function find_coldest_out_loop and checking it
in outermost_invariant_loop, to find the coldest invariant loop between the
outermost loop and the statement's own loop, could also achieve the purpose.
Then the gimple code check is redundant and could be removed.



Maybe I could collect the number of instructions not hoisted with the patch
on regression tests and SPEC2017 to estimate whether "multiple stmts affected"
and "overall cost" need to be considered?  But it seems move_computations_worker
couldn't roll back if we still want to hoist multiple stmts out during the
iterations?




Now - GIMPLE LIM "costing" is somewhat backward right now
and it isn't set up to consider those multiple involved stmts.  Plus
the store-motion part does not have any cost part (but it depends
on previously decided invariant motions).

I think the way you implemented the check will cause no hoisting
to be performed instead of, say, hoisting to a different loop level
only.  Possibly shown when you consider a loop nest like

 for (;;)
   if (unlikely_cond)
     for (;;)
       invariant;

we want to hoist 'invariant' but only from the inner loop even if it
is invariant also in the outer loop.



For this case, theoretically I think the master GCC will optimize it to:


    invariant;
    for (;;)
      if (unlikely_cond)
        for (;;)
          ;

'invariant' is moved out of outer loop, but with the patch, it will get:


    for (;;)
      if (unlikely_cond)
        {
          invariant;

Re: [RFC] Don't move cold code out of loop by checking bb count

2021-09-22 Thread Xionghu Luo via Gcc-patches





On 2021/9/22 17:14, Richard Biener wrote:

On Thu, Sep 9, 2021 at 3:56 AM Xionghu Luo  wrote:




On 2021/8/26 19:33, Richard Biener wrote:

On Tue, Aug 10, 2021 at 4:03 AM Xionghu Luo  wrote:


Hi,

On 2021/8/6 20:15, Richard Biener wrote:

On Mon, Aug 2, 2021 at 7:05 AM Xiong Hu Luo  wrote:


There was a patch trying to avoid moving cold blocks out of loops:

https://gcc.gnu.org/pipermail/gcc/2014-November/215551.html

Richard suggested to "never hoist anything from a bb with lower execution
frequency to a bb with higher one in LIM invariantness_dom_walker
before_dom_children".

This patch does this profile count check in both gimple LIM
move_computations_worker and RTL loop-invariant.c find_invariants_bb:
if the loop bb is colder than the loop preheader, don't hoist it out of
the loop.

Also, the profile count in the loop split pass should be corrected to avoid
mismatched lim2 and lim4 behavior.  Currently, the new loop preheader generated
by loop_version is set to "[count: 0]:", so lim4 after the lsplit pass will
unexpectedly move a statement out of the loop when lim2 didn't move it.  This
change could fix the regression on 544.nab_r from -1.55% to +0.46%.

SPEC2017 performance evaluation shows 1% performance improvement for
intrate GEOMEAN and no obvious regression for others.  Especially,
500.perlbench_r +7.52% (Perf shows function S_regtry of perlbench is
largely improved.), and 548.exchange2_r +1.98%, 526.blender_r +1.00%
on P8LE.

Bootstrapped and regression tested on P8LE.  Any comments?  Thanks.


While I'm not familiar with the RTL invariant motion pass the patch there
looks reasonable.  Note that we should assess the profile quality
somehow - I'm not sure how to do that, CCed Honza for that.


Thanks.



For the GIMPLE part the patch looks quite complicated - but note it
probably has to be since LIM performs kind of a "CSE" on loads
(and stores for store-motion), so when there are multiple stmts
affected by a hoisting decision the biggest block count has to be
accounted.  Likewise when there are dependent stmts involved
that might include conditional stmts (a "PHI"), but the overall
cost should be looked at.


Currently, the gimple code checks two situations with the patch:
1) The statement's or PHI's BB is *colder* than the preheader: don't move it
out of the loop;
2) The statement's or PHI's BB is *hotter* than the preheader, but one of its
rhs operands couldn't be moved out of the loop: also don't move it out of the
loop, to avoid a "definition does not dominate use" error.


But part 2) is obviously already done.  What I tried to say is your heuristic
doesn't integrate nicely with the pass but I admitted that it might be a bit
difficult to find a place to add this heuristic.

There is lim_data->cost which we could bias negatively but then this is
a cost that is independent on the hoisting distance.  But doing this would
work at least for the case where the immediately enclosing loop preheader
is hotter than the stmt and with this it would be a patch that's similarly
simple as the RTL one.

Another possibility is to simply only adjust PHI processing in
compute_invariantness, capping movement according to the hotness
heuristic.  The same could be done for regular stmts there but I'm
not sure that will do good in the end since this function is supposed
to compute "correctness" (well, it also has the cost stuff), and it's
not the place to do overall cost considerations.


Thanks.  I found that adding a function find_coldest_out_loop and checking it
in outermost_invariant_loop, to find the coldest invariant loop between the
outermost loop and the statement's own loop, could also achieve the purpose.
Then the gimple code check is redundant and could be removed.




Maybe I could collect the number of instructions not hoisted with the patch
on regression tests and SPEC2017 to estimate whether "multiple stmts affected"
and "overall cost" need to be considered?  But it seems move_computations_worker
couldn't roll back if we still want to hoist multiple stmts out during the
iterations?



Now - GIMPLE LIM "costing" is somewhat backward right now
and it isn't set up to consider those multiple involved stmts.  Plus
the store-motion part does not have any cost part (but it depends
on previously decided invariant motions).

I think the way you implemented the check will cause no hoisting
to be performed instead of, say, hoisting to a different loop level
only.  Possibly shown when you consider a loop nest like

 for (;;)
   if (unlikely_cond)
     for (;;)
       invariant;

we want to hoist 'invariant' but only from the inner loop even if it
is invariant also in the outer loop.
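
The "hoist only as far as it stays cold" behaviour asked for here can be
modelled with a small standalone sketch. This is a simplified illustration
under stated assumptions, not the code from the patch and not GCC's data
structures: hoist_levels and hoist_levels_demo are hypothetical names, and
the profile counts in the demo are made up. A statement is moved outward one
loop level at a time and stops as soon as the next preheader is hotter than
the place it currently sits in.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns how many enclosing loops a statement may be hoisted out of.
// STMT_COUNT is the profile count of the block holding the statement.
// PREHEADER_COUNTS[0] is the preheader count of the innermost enclosing
// loop; the last entry belongs to the outermost loop in which the statement
// is invariant.
std::size_t hoist_levels(std::uint64_t stmt_count,
                         const std::vector<std::uint64_t>& preheader_counts) {
  std::size_t levels = 0;
  std::uint64_t current = stmt_count;
  for (std::uint64_t preheader : preheader_counts) {
    if (preheader > current)
      break;               // The target block is hotter: stop hoisting here.
    current = preheader;   // The statement now lives in this preheader.
    ++levels;
  }
  return levels;
}

void hoist_levels_demo() {
  // Loop nest from the mail with made-up counts: the inner loop body runs
  // 100 times, its preheader once (unlikely_cond is rarely true), and the
  // outer preheader 10 times.  Only the inner loop is left.
  assert(hoist_levels(100, {1, 10}) == 1);
  // If the outer preheader is the coldest block, both loops are left.
  assert(hoist_levels(100, {2, 1}) == 2);
  // A block colder than its own loop's preheader is not hoisted at all.
  assert(hoist_levels(5, {10, 1}) == 0);
}
```

With this model a cold block is not simply excluded from hoisting; the
decision is made per loop level, which is the distinction drawn in the
review comment above.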



For this case, theoretically I think the master GCC will optimize it to:

invariant;
for (;;)
  if (unlikely_cond)
    for (;;)
      ;

'invariant' is moved out of outer loop, but with the patch, it will get:

for (;;)
  if (unlikely_cond)
    {
      invariant;
      for (;;)
        ;
    }

'invariant' is *cold* for outer loop, but it is still

[PATCH v4 1/2] Add -f[no-]direct-extern-access

2021-09-22 Thread H.J. Lu via Gcc-patches

Add -f[no-]direct-extern-access and the nodirect_extern_access attribute.
-fdirect-extern-access is the default.  Undefined data and function symbols
with the nodirect_extern_access attribute are always accessed via the GOT,
including in PIE and non-PIE.  With -fno-direct-extern-access:

1. Always use GOT to access undefined data and function symbols,
   including in PIE and non-PIE.  This avoids copy relocations
   in executables.  This is compatible with existing executables and
   shared libraries.
2. In executable and shared library, bind symbols with the STV_PROTECTED
   visibility locally:
   a. The address of data symbol is the address of data body.
   b. For systems without function descriptor, the function pointer is
  the address of function body.
   c. The resulting shared libraries may not be compatible with
  executables which have copy relocations on protected symbols or
  use executable PLT entries as function addresses for protected
  functions in shared libraries.
3. Update asm_preferred_eh_data_format to select PC relative EH encoding
format with -fno-direct-extern-access to avoid copy relocation.
4. Add ix86_reloc_rw_mask for TARGET_ASM_RELOC_RW_MASK to avoid copy
relocation with -fno-direct-extern-access.

gcc/

PR target/35513
PR target/100593
* common.opt: Add -fdirect-extern-access.
* config/i386/i386-protos.h (ix86_force_load_from_GOT_p): Add a
bool argument.
* config/i386/i386.c (ix86_force_load_from_GOT_p): Add a bool
argument to indicate call operand.  Force non-call load
from GOT for -fno-direct-extern-access or nodirect_extern_access
attribute.
(legitimate_pic_address_disp_p): Avoid copy relocation in PIE
for -fno-direct-extern-access or nodirect_extern_access attribute.
(ix86_print_operand): Pass true to ix86_force_load_from_GOT_p
for call operand.
(asm_preferred_eh_data_format): Use PC-relative format for
-fno-direct-extern-access to avoid copy relocation.  Check
ptr_mode instead of TARGET_64BIT when selecting DW_EH_PE_sdata4.
(ix86_binds_local_p): Don't treat protected data as extern and
avoid copy relocation on common symbol with
-fno-direct-extern-access or nodirect_extern_access attribute.
(ix86_reloc_rw_mask): New to avoid copy relocation for
-fno-direct-extern-access.
(TARGET_ASM_RELOC_RW_MASK): New.
* doc/extend.texi: Document nodirect_extern_access attribute.
* doc/invoke.texi: Document -f[no-]direct-extern-access.

gcc/c-family/

PR target/35513
PR target/100593
* c-attribs.c (handle_nodirect_extern_access_attribute): New.
(c_common_attribute_table): Add nodirect_extern_access.

gcc/testsuite/

PR target/35513
PR target/100593
* g++.dg/pr35513-1.C: New file.
* g++.dg/pr35513-2.C: Likewise.
* gcc.target/i386/pr35513-1a.c: Likewise.
* gcc.target/i386/pr35513-1b.c: Likewise.
* gcc.target/i386/pr35513-2a.c: Likewise.
* gcc.target/i386/pr35513-2b.c: Likewise.
* gcc.target/i386/pr35513-3a.c: Likewise.
* gcc.target/i386/pr35513-3b.c: Likewise.
* gcc.target/i386/pr35513-4a.c: Likewise.
* gcc.target/i386/pr35513-4b.c: Likewise.
* gcc.target/i386/pr35513-5a.c: Likewise.
* gcc.target/i386/pr35513-5b.c: Likewise.
* gcc.target/i386/pr35513-6a.c: Likewise.
* gcc.target/i386/pr35513-6b.c: Likewise.
* gcc.target/i386/pr35513-7a.c: Likewise.
* gcc.target/i386/pr35513-7b.c: Likewise.
* gcc.target/i386/pr35513-8a.c: Likewise.
* gcc.target/i386/pr35513-8b.c: Likewise.
* gcc.target/i386/pr35513-9a.c: Likewise.
* gcc.target/i386/pr35513-9b.c: Likewise.
* gcc.target/i386/pr35513-10a.c: Likewise.
* gcc.target/i386/pr35513-10b.c: Likewise.
* gcc.target/i386/pr35513-11a.c: Likewise.
* gcc.target/i386/pr35513-11b.c: Likewise.
* gcc.target/i386/pr35513-12a.c: Likewise.
* gcc.target/i386/pr35513-12b.c: Likewise.
---
 gcc/c-family/c-attribs.c| 34 +++
 gcc/common.opt  |  4 ++
 gcc/config/i386/i386-protos.h   |  2 +-
 gcc/config/i386/i386.c  | 62 -
 gcc/doc/extend.texi |  6 ++
 gcc/doc/invoke.texi | 13 +
 gcc/testsuite/g++.dg/pr35513-1.C| 25 +
 gcc/testsuite/g++.dg/pr35513-2.C| 53 ++
 gcc/testsuite/gcc.target/i386/pr35513-10a.c | 17 ++
 gcc/testsuite/gcc.target/i386/pr35513-10b.c | 17 ++
 gcc/testsuite/gcc.target/i386/pr35513-11a.c | 17 ++
 gcc/testsuite/gcc.target/i386/pr35513-11b.c | 17 ++
 gcc/testsuite/gcc.target/i386/pr35513-12a.c | 17 ++
 gcc/testsuite/gcc.target/i386/pr35513-12b.c | 17 ++

[PATCH v4 2/2] Add TARGET_ASM_EMIT_GNU_PROPERTY_NOTE

2021-09-22 Thread H.J. Lu via Gcc-patches

Generate the marker for -fno-direct-extern-access to indicate that the
object file uses GOT to access all external symbols.  Access to protected
symbols in the resulting shared library is treated as local, which requires
canonical function pointers and cannot be used with copy relocation.

This marker can be used in the following ways:

1. Linker can decide the best way to resolve a relocation against a
protected symbol before seeing all relocations against the symbol.
2. Dynamic linker can decide if it is an error to have a copy relocation
in executable against the protected symbol in a shared library by checking
if the shared library is built with -fno-direct-extern-access.

* configure.ac (HAVE_LD_INDIRECT_EXTERN_ACCESS_SUPPORT): New.
Define to 1 if linker supports
GNU_PROPERTY_1_NEEDED_INDIRECT_EXTERN_ACCESS.
* output.h (emit_gnu_property): New.
(emit_gnu_property_note): Likewise.
* target.def (emit_gnu_property_note): Add a targetm.asm_out hook.
* toplev.c (compile_file): Call emit_gnu_property_note before
file_end.
* varasm.c (emit_gnu_property): New.
(emit_gnu_property_note): Likewise.
* config.in: Regenerated.
* configure: Likewise.
* doc/tm.texi: Likewise.
* config/i386/gnu-property.c (emit_gnu_property): Removed.
(TARGET_ASM_EMIT_GNU_PROPERTY_NOTE): New.
* doc/tm.texi.in: Add TARGET_ASM_EMIT_GNU_PROPERTY_NOTE.
---
 gcc/config.in  |  7 +
 gcc/config/i386/gnu-property.c | 31 --
 gcc/config/i386/i386.c |  2 ++
 gcc/configure  | 27 +++
 gcc/configure.ac   | 23 +
 gcc/doc/tm.texi|  5 
 gcc/doc/tm.texi.in |  2 ++
 gcc/output.h   |  2 ++
 gcc/target.def |  8 ++
 gcc/toplev.c   |  3 +++
 gcc/varasm.c   | 47 ++
 11 files changed, 126 insertions(+), 31 deletions(-)

diff --git a/gcc/config.in b/gcc/config.in
index 61cafe4f6c0..8a756aa3541 100644
--- a/gcc/config.in
+++ b/gcc/config.in
@@ -1690,6 +1690,13 @@
 #endif
 
 
+/* Define to 1 if your linker supports
+   GNU_PROPERTY_1_NEEDED_INDIRECT_EXTERN_ACCESS. */
+#ifndef USED_FOR_TARGET
+#undef HAVE_LD_INDIRECT_EXTERN_ACCESS_SUPPORT
+#endif
+
+
 /* Define if your PowerPC64 linker supports a large TOC. */
 #ifndef USED_FOR_TARGET
 #undef HAVE_LD_LARGE_TOC
diff --git a/gcc/config/i386/gnu-property.c b/gcc/config/i386/gnu-property.c
index 4ba04403002..9fe8d00132e 100644
--- a/gcc/config/i386/gnu-property.c
+++ b/gcc/config/i386/gnu-property.c
@@ -24,37 +24,6 @@ along with GCC; see the file COPYING3.  If not see
 #include "output.h"
 #include "linux-common.h"
 
-static void
-emit_gnu_property (unsigned int type, unsigned int data)
-{
-  int p2align = ptr_mode == SImode ? 2 : 3;
-
-  switch_to_section (get_section (".note.gnu.property",
- SECTION_NOTYPE, NULL));
-
-  ASM_OUTPUT_ALIGN (asm_out_file, p2align);
-  /* name length.  */
-  fprintf (asm_out_file, ASM_LONG "1f - 0f\n");
-  /* data length.  */
-  fprintf (asm_out_file, ASM_LONG "4f - 1f\n");
-  /* note type: NT_GNU_PROPERTY_TYPE_0.  */
-  fprintf (asm_out_file, ASM_LONG "5\n");
-  fprintf (asm_out_file, "0:\n");
-  /* vendor name: "GNU".  */
-  fprintf (asm_out_file, STRING_ASM_OP "\"GNU\"\n");
-  fprintf (asm_out_file, "1:\n");
-  ASM_OUTPUT_ALIGN (asm_out_file, p2align);
-  /* pr_type.  */
-  fprintf (asm_out_file, ASM_LONG "0x%x\n", type);
-  /* pr_datasz.  */
-  fprintf (asm_out_file, ASM_LONG "3f - 2f\n");
-  fprintf (asm_out_file, "2:\n");
-  fprintf (asm_out_file, ASM_LONG "0x%x\n", data);
-  fprintf (asm_out_file, "3:\n");
-  ASM_OUTPUT_ALIGN (asm_out_file, p2align);
-  fprintf (asm_out_file, "4:\n");
-}
-
 void
 file_end_indicate_exec_stack_and_gnu_property (void)
 {
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 9ca1ef512a4..3b678c4d5b6 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -24569,6 +24569,8 @@ ix86_libgcc_floating_mode_supported_p
 #if !TARGET_MACHO && !TARGET_DLLIMPORT_DECL_ATTRIBUTES
 # undef TARGET_ASM_RELOC_RW_MASK
 # define TARGET_ASM_RELOC_RW_MASK ix86_reloc_rw_mask
+# undef TARGET_ASM_EMIT_GNU_PROPERTY_NOTE
+# define TARGET_ASM_EMIT_GNU_PROPERTY_NOTE emit_gnu_property_note
 #endif
 
 static bool ix86_libc_has_fast_function (int fcode ATTRIBUTE_UNUSED)
diff --git a/gcc/configure b/gcc/configure
index b3de17009b8..13fe041b0b6 100755
--- a/gcc/configure
+++ b/gcc/configure
@@ -32172,6 +32172,33 @@ fi
 { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ld_bndplt_support" >&5
 $as_echo "$ld_bndplt_support" >&6; }
 
+# Check if linker supports GNU_PROPERTY_1_NEEDED_INDIRECT_EXTERN_ACCESS.
+ld_indirect_extern_access=0
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking linker with 
GNU_PROPERTY_1_NEEDED_INDIRECT_EXTERN_ACCESS" >&5
+$as_echo_n "checking

[PATCH v4 0/2] Implement indirect external access

2021-09-22 Thread H.J. Lu via Gcc-patches

Changes in the v4 patch.

1. Add nodirect_extern_access attribute.

Changes in the v3 patch.

1. GNU_PROPERTY_1_NEEDED_INDIRECT_EXTERN_ACCESS support has been added to
GNU binutils 2.38.  But the -z indirect-extern-access linker option is
only available for Linux/x86.  However, the --max-cache-size=SIZE linker
option was also added within a day.  --max-cache-size=SIZE is used to
check for GNU_PROPERTY_1_NEEDED_INDIRECT_EXTERN_ACCESS support.

Changes in the v2 patch.

1. Rename the option to -fdirect-extern-access.

---
On systems with copy relocation:
* A copy in executable is created for the definition in a shared library
at run-time by ld.so.
* The copy is referenced by executable and shared libraries.
* Executable can access the copy directly.

Issues are:
* Overhead of a copy, time and space, may be visible at run-time.
* Read-only data in the shared library becomes a read-write copy in
executable at run-time.
* Local access to data with the STV_PROTECTED visibility in the shared
library must use GOT.

On systems without function descriptor, function pointers vary depending
on where and how the functions are defined.
* If the function is defined in executable, it can be the address of
function body.
* If the function, including the function with STV_PROTECTED visibility,
is defined in the shared library, it can be the address of the PLT entry
in executable or shared library.

Issues are:
* The address of function body may not be used as its function pointer.
* ld.so needs to search loaded shared libraries for the function pointer
of the function with STV_PROTECTED visibility.

Here is a proposal to remove copy relocation and use canonical function
pointer:

1. Accesses, including in PIE and non-PIE, to undefined symbols must
use GOT.
  a. Linker may optimize out GOT access if the data is defined in PIE or
  non-PIE.
2. Read-only data in the shared library remains read-only at run-time.
3. Address of global data with the STV_PROTECTED visibility in the shared
library is the address of data body.
  a. Can use IP-relative access.
  b. May need GOT without IP-relative access.
4. For systems without function descriptor,
  a. All global function pointers of undefined functions in PIE and
  non-PIE must use GOT.  Linker may optimize out GOT access if the
  function is defined in PIE or non-PIE.
  b. Function pointer of functions with the STV_PROTECTED visibility in
  executable and shared library is the address of function body.
   i. Can use IP-relative access.
   ii. May need GOT without IP-relative access.
   iii. Branches to undefined functions may use PLT.
5. Single global definition marker:

Add GNU_PROPERTY_1_NEEDED:

#define GNU_PROPERTY_1_NEEDED GNU_PROPERTY_UINT32_OR_LO

to indicate the needed properties by the object file.

Add GNU_PROPERTY_1_NEEDED_INDIRECT_EXTERN_ACCESS:

#define GNU_PROPERTY_1_NEEDED_INDIRECT_EXTERN_ACCESS (1U << 0)

to indicate that the object file requires canonical function pointers and
cannot be used with copy relocation.  This bit should be cleared in
executable when there are non-GOT or non-PLT relocations in relocatable
input files without this bit set.

  a. Protected symbol access within the shared library can be treated as
  local.
  b. Copy relocation should be disallowed at link-time and run-time.
  c. GOT function pointer reference is required at link-time and run-time.
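
For illustration, the marker could be laid out in a .note.gnu.property
section as in the sketch below. The layout follows the emit_gnu_property
helper shown in patch 2/2 (note type 5, vendor name "GNU", alignment of 4 or
8 bytes depending on pointer size). The numeric value used for
GNU_PROPERTY_UINT32_OR_LO (0xb0008000) is an assumption that is not stated in
this thread, and build_gnu_property_note, put32, pad_to and the k* constants
are hypothetical names. Values are written in host byte order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

constexpr std::uint32_t kNtGnuPropertyType0 = 5;               // note type
constexpr std::uint32_t kGnuPropertyUint32OrLo = 0xb0008000u;  // assumed value
constexpr std::uint32_t kGnuProperty1Needed = kGnuPropertyUint32OrLo;
constexpr std::uint32_t kIndirectExternAccess = 1u << 0;

// Append a 32-bit value in host byte order.
static void put32(std::vector<std::uint8_t>& out, std::uint32_t value) {
  std::uint8_t bytes[4];
  std::memcpy(bytes, &value, sizeof bytes);
  out.insert(out.end(), bytes, bytes + 4);
}

// Append zero bytes until the size is a multiple of ALIGN.
static void pad_to(std::vector<std::uint8_t>& out, std::size_t align) {
  while (out.size() % align != 0)
    out.push_back(0);
}

// Build one note holding a single 4-byte property (pr_type, pr_data).
std::vector<std::uint8_t> build_gnu_property_note(std::uint32_t pr_type,
                                                  std::uint32_t pr_data,
                                                  bool is_64bit) {
  const std::size_t align = is_64bit ? 8 : 4;
  const char name[4] = {'G', 'N', 'U', '\0'};

  // Descriptor: pr_type, pr_datasz, data, then padding to the alignment.
  std::vector<std::uint8_t> desc;
  put32(desc, pr_type);
  put32(desc, 4);
  put32(desc, pr_data);
  pad_to(desc, align);

  // Header: name length, descriptor length, note type, then the name.
  std::vector<std::uint8_t> note;
  put32(note, static_cast<std::uint32_t>(sizeof name));
  put32(note, static_cast<std::uint32_t>(desc.size()));
  put32(note, kNtGnuPropertyType0);
  note.insert(note.end(), name, name + sizeof name);
  pad_to(note, align);  // 12 + 4 = 16 bytes, so no padding is added here.
  note.insert(note.end(), desc.begin(), desc.end());
  return note;
}

void gnu_property_note_demo() {
  const std::vector<std::uint8_t> n64 =
      build_gnu_property_note(kGnuProperty1Needed, kIndirectExternAccess, true);
  const std::vector<std::uint8_t> n32 =
      build_gnu_property_note(kGnuProperty1Needed, kIndirectExternAccess, false);
  assert(n64.size() == 32);  // 16-byte header and name + 16-byte descriptor
  assert(n32.size() == 28);  // 16-byte header and name + 12-byte descriptor
  assert(std::memcmp(&n64[12], "GNU", 4) == 0);
}
```

The only difference between the two variants is the trailing padding of the
descriptor, which comes from the 8-byte alignment used for 64-bit objects.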

The indirect external access marker can be used in the following ways:

1. Linker can decide the best way to resolve a relocation against a
protected symbol before seeing all relocations against the symbol.
2. Dynamic linker can decide if it is an error to have a copy relocation
in executable against the protected symbol in a shared library by checking
if the shared library is built with -fno-direct-extern-access.

Add a compiler option, -fdirect-extern-access. -fdirect-extern-access is
the default.  With -fno-direct-extern-access:

1. Always use GOT to access undefined symbols, including in PIE and
non-PIE.  This is safe to do and does not break the ABI.
2. In executable and shared library, for symbols with the STV_PROTECTED
visibility:
  a. The address of data symbol is the address of data body.
  b. For systems without function descriptor, the function pointer is
  the address of function body.
These break the ABI and resulting shared libraries may not be compatible
with executables which are not compiled with -fno-direct-extern-access.
3. Generate an indirect external access marker in relocatable objects if
supported by linker.

H.J. Lu (2):
  Add -f[no-]direct-extern-access
  Add TARGET_ASM_EMIT_GNU_PROPERTY_NOTE

 gcc/c-family/c-attribs.c| 34 +++
 gcc/common.opt  |  4 ++
 gcc/config.in   |  7 +++
 gcc/config/i386/gnu-property.c  | 31 --
 gcc/config/i386/i386-protos.h   |  2 +-
 gcc/config/i386/i386.c  | 64 -
 gcc/configure

Re: [PATCH] Enable auto-vectorization at O2 with very-cheap cost model.

2021-09-22 Thread Hongtao Liu via Gcc-patches

On Thu, Sep 23, 2021 at 9:48 AM Hongtao Liu  wrote:
>
> On Wed, Sep 22, 2021 at 10:21 PM Martin Sebor  wrote:
> >
> > On 9/21/21 7:38 PM, Hongtao Liu wrote:
> > > On Mon, Sep 20, 2021 at 4:13 AM Martin Sebor  wrote:
> > ...
> > > diff --git a/gcc/testsuite/c-c++-common/Wstringop-overflow-2.c 
> > > b/gcc/testsuite/c-c++-common/Wstringop-overflow-2.c
> > > index 1d79930cd58..9351f7e7a1a 100644
> > > --- a/gcc/testsuite/c-c++-common/Wstringop-overflow-2.c
> > > +++ b/gcc/testsuite/c-c++-common/Wstringop-overflow-2.c
> > > @@ -1,7 +1,7 @@
> > >/* PR middle-end/91458 - inconsistent warning for writing past the 
> > > end
> > >   of an array member
> > >   { dg-do compile }
> > > -   { dg-options "-O2 -Wall -Wno-array-bounds -fno-ipa-icf" } */
> > > +   { dg-options "-O2 -Wall -Wno-array-bounds -fno-ipa-icf 
> > > -fno-tree-vectorize" } */
> > 
> >  The testcase is large - what part requires this change?  Given the
> >  testcase was added for inconsistent warnings do they now become
> >  inconsistent again as we enable vectorization at -O2?
> > 
> >  That said, the testcase adjustments need some explaining - I suppose
> >  you didn't just slap -fno-tree-vectorize to all of those changing
> >  behavior?
> > 
> > >>> void ga1_ (void)
> > >>> {
> > >>> a1_.a[0] = 0;
> > >>> a1_.a[1] = 1; // { dg-warning 
> > >>> "\\\[-Wstringop-overflow" }
> > >>> a1_.a[2] = 2; // { dg-warning 
> > >>> "\\\[-Wstringop-overflow" }
> > >>>
> > >>> struct A1 a;
> > >>> a.a[0] = 0;
> > >>> a.a[1] = 1;   // { dg-warning 
> > >>> "\\\[-Wstringop-overflow" }
> > >>> a.a[2] = 2;   // { dg-warning 
> > >>> "\\\[-Wstringop-overflow" }
> > >>> sink ();
> > >>> }
> > >>>
> > >>> It's supposed to be 2 warning for a.a[1] = 1 and a.a[2] = 1 since
> > >>> there are 2 accesses, but after enabling vectorization, there's only
> > >>> one access, so one warning is missing which causes the failure.
> >
> > With the stores vectorized, is the warning on the correct line or
> > does it point to the first store, the one that's in bounds, as
> > it does with -O3?  The latter would be a regression at -O2.
> For the upper case, It points to the second store which is out of
> bounds, the third store warning is missing.
> >
> > >>
> > >> I would find it preferable to change the test code over disabling
> > >> optimizations that are on by default.  My concern is that the test
> > >> would no longer exercise the default behavior.  (The same goes for
> > >> the -fno-ipa-icf option.)
> > > Hmm, it's a middle-end test, for some backend, it may not do
> > > vectorization(it depends on TARGET_VECTOR_MODE_SUPPORTED_P and
> > > relative cost model).
> >
> > Yes, there are quite a few warning tests like that.  Their main
> > purpose is to verify that in common GCC invocations (i.e., without
> > any special options) warnings are a) issued when expected and b)
> > not issued when not expected.  Otherwise, middle end warnings are
> > known to have both false positives and false negatives in some
> > invocations, depending on what optimizations are in effect.
> > Indiscriminately disabling common optimizations for these large
> > tests and invoking them under artificial conditions would
> > compromise this goal and hide the problems.
> >
> > If enabling vectorization at -O2 causes regressions in the quality
> > of diagnostics (as the test failure above indicates seems to be
> > happening) we should investigate these and open bugs for them so
> > they can be fixed.  We can then tweak the specific failing test
> > cases to avoid the failures until they are fixed.
> There are indeed cases of false positives and false negatives
> .i.e.
> // Verify warning for access to a definition with an initializer that
> // initializes the one-element array member.
> struct A1 a1i_1 = { 0, { 1 } };
>
> void ga1i_1 (void)
> {
>   a1i_1.a[0] = 0;
>   a1i_1.a[1] = 1;   // { dg-warning "\\\[-Wstringop-overflow" }
>   a1i_1.a[2] = 2;   // { dg-warning "\\\[-Wstringop-overflow" }
>
>   struct A1 a = { 0, { 1 } }; --- false positive here.
>   a.a[0] = 1;
>   a.a[1] = 2;   // { dg-warning
> "\\\[-Wstringop-overflow" } false negative here.
>   a.a[2] = 3;   // { dg-warning
> "\\\[-Wstringop-overflow" } false negative here.
>   sink ();
> }
Similar for
* gcc.dg/Warray-bounds-51.c.
* gcc.dg/Warray-parameter-3.c
* gcc.dg/Wstringop-overflow-14.c
* gcc.dg/Wstringop-overflow-21.c

So there are 3 situations.
1. All accesses are out of bounds, and after vectorization, some
warnings are missing.
2. Some accesses are in bounds and some are out of bounds, and after
vectorization, the warning moves from the out-of-bounds line to the
in-bounds line.
3. All accesses are out of bounds, and after vectorization, all the
expected warnings are missing and a warning appears on a different
line instead (a false positive).
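
As a rough illustration of how warnings can go missing or move, here is
a toy model (not GCC's actual implementation).  It assumes, purely for
illustration, that adjacent stores are merged into groups of `width`
stores and that one warning is reported per group, attributed to the
line of the group's first store.  The names Store, scalar_warnings and
grouped_warnings are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One scalar store: the source line it sits on and the array index it writes.
struct Store { int line; std::size_t index; };

// Scalar checking: one warning per store whose index is past the bound.
std::vector<int> scalar_warnings(const std::vector<Store>& stores,
                                 std::size_t bound) {
  std::vector<int> lines;
  for (const Store& s : stores)
    if (s.index >= bound) lines.push_back(s.line);
  return lines;
}

// Grouped checking (assumed model): stores are merged in groups of `width`;
// a group containing any out-of-bounds store yields a single warning,
// reported on the line of the group's first store.
std::vector<int> grouped_warnings(const std::vector<Store>& stores,
                                  std::size_t bound, std::size_t width) {
  assert(width > 0);
  std::vector<int> lines;
  for (std::size_t i = 0; i < stores.size(); i += width) {
    bool out_of_bounds = false;
    for (std::size_t j = i; j < stores.size() && j < i + width; ++j)
      if (stores[j].index >= bound) out_of_bounds = true;
    if (out_of_bounds) lines.push_back(stores[i].line);
  }
  return lines;
}
```

With a bound of 1 and stores to indices 1 and 2 on lines 11 and 12, the
scalar check reports both lines, while a group of width 2 reports only
line 11.  If an in-bounds store on line 10 is merged into the same group
(width 3), the single warning lands on line 10.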

Re: [PATCH] Enable auto-vectorization at O2 with very-cheap cost model.

2021-09-22 Thread Hongtao Liu via Gcc-patches

On Wed, Sep 22, 2021 at 10:21 PM Martin Sebor  wrote:
>
> On 9/21/21 7:38 PM, Hongtao Liu wrote:
> > On Mon, Sep 20, 2021 at 4:13 AM Martin Sebor  wrote:
> ...
> > diff --git a/gcc/testsuite/c-c++-common/Wstringop-overflow-2.c 
> > b/gcc/testsuite/c-c++-common/Wstringop-overflow-2.c
> > index 1d79930cd58..9351f7e7a1a 100644
> > --- a/gcc/testsuite/c-c++-common/Wstringop-overflow-2.c
> > +++ b/gcc/testsuite/c-c++-common/Wstringop-overflow-2.c
> > @@ -1,7 +1,7 @@
> >/* PR middle-end/91458 - inconsistent warning for writing past the 
> > end
> >   of an array member
> >   { dg-do compile }
> > -   { dg-options "-O2 -Wall -Wno-array-bounds -fno-ipa-icf" } */
> > +   { dg-options "-O2 -Wall -Wno-array-bounds -fno-ipa-icf 
> > -fno-tree-vectorize" } */
> 
>  The testcase is large - what part requires this change?  Given the
>  testcase was added for inconsistent warnings do they now become
>  inconsistent again as we enable vectorization at -O2?
> 
>  That said, the testcase adjustments need some explaining - I suppose
>  you didn't just slap -fno-tree-vectorize to all of those changing
>  behavior?
> 
> >>> void ga1_ (void)
> >>> {
> >>> a1_.a[0] = 0;
> >>> a1_.a[1] = 1; // { dg-warning "\\\[-Wstringop-overflow" }
> >>> a1_.a[2] = 2; // { dg-warning "\\\[-Wstringop-overflow" }
> >>>
> >>> struct A1 a;
> >>> a.a[0] = 0;
> >>> a.a[1] = 1;   // { dg-warning "\\\[-Wstringop-overflow" }
> >>> a.a[2] = 2;   // { dg-warning "\\\[-Wstringop-overflow" }
> >>> sink ();
> >>> }
> >>>
> >>> It's supposed to be 2 warning for a.a[1] = 1 and a.a[2] = 1 since
> >>> there are 2 accesses, but after enabling vectorization, there's only
> >>> one access, so one warning is missing which causes the failure.
>
> With the stores vectorized, is the warning on the correct line or
> does it point to the first store, the one that's in bounds, as
> it does with -O3?  The latter would be a regression at -O2.
For the case above, it points to the second store, which is out of
bounds; the warning for the third store is missing.
>
> >>
> >> I would find it preferable to change the test code over disabling
> >> optimizations that are on by default.  My concern is that the test
> >> would no longer exercise the default behavior.  (The same goes for
> >> the -fno-ipa-icf option.)
> > Hmm, it's a middle-end test, for some backend, it may not do
> > vectorization(it depends on TARGET_VECTOR_MODE_SUPPORTED_P and
> > relative cost model).
>
> Yes, there are quite a few warning tests like that.  Their main
> purpose is to verify that in common GCC invocations (i.e., without
> any special options) warnings are a) issued when expected and b)
> not issued when not expected.  Otherwise, middle end warnings are
> known to have both false positives and false negatives in some
> invocations, depending on what optimizations are in effect.
> Indiscriminately disabling common optimizations for these large
> tests and invoking them under artificial conditions would
> compromise this goal and hide the problems.
>
> If enabling vectorization at -O2 causes regressions in the quality
> of diagnostics (as the test failure above indicates seems to be
> happening) we should investigate these and open bugs for them so
> they can be fixed.  We can then tweak the specific failing test
> cases to avoid the failures until they are fixed.
There are indeed cases of false positives and false negatives,
e.g.:
// Verify warning for access to a definition with an initializer that
// initializes the one-element array member.
struct A1 a1i_1 = { 0, { 1 } };

void ga1i_1 (void)
{
  a1i_1.a[0] = 0;
  a1i_1.a[1] = 1;   // { dg-warning "\\\[-Wstringop-overflow" }
  a1i_1.a[2] = 2;   // { dg-warning "\\\[-Wstringop-overflow" }

  struct A1 a = { 0, { 1 } }; --- false positive here.
  a.a[0] = 1;
  a.a[1] = 2;   // { dg-warning "\\\[-Wstringop-overflow" } false negative here.
  a.a[2] = 3;   // { dg-warning "\\\[-Wstringop-overflow" } false negative here.
  sink ();
}

The last two warnings are missing, and there is a new warning on the
line
  struct A1 a = { 0, { 1 } };
>
> Martin



-- 
BR,
Hongtao

[PATCH] Fix value uninitialization in vn_reference_insert_pieces [PR102400]

2021-09-22 Thread Feng Xue OS via Gcc-patches

Bootstrapped/regtested on x86_64-linux.

Thanks,
Feng
---
2021-09-23  Feng Xue  

gcc/ChangeLog
PR tree-optimization/102400
* tree-ssa-sccvn.c (vn_reference_insert_pieces): Initialize
result_vdef to zero value.
---
 gcc/tree-ssa-sccvn.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/tree-ssa-sccvn.c b/gcc/tree-ssa-sccvn.c
index a901f51a025..e8b1c39184d 100644
--- a/gcc/tree-ssa-sccvn.c
+++ b/gcc/tree-ssa-sccvn.c
@@ -3811,6 +3811,7 @@ vn_reference_insert_pieces (tree vuse, alias_set_type set,
   if (result && TREE_CODE (result) == SSA_NAME)
 result = SSA_VAL (result);
   vr1->result = result;
+  vr1->result_vdef = NULL_TREE;
 
   slot = valid_info->references->find_slot_with_hash (vr1, vr1->hashcode,
  INSERT);
-- 
2.17.1

[PATCH] Fix null-pointer dereference in delete_dead_or_redundant_call [PR102451]

2021-09-22 Thread Feng Xue OS via Gcc-patches

Bootstrapped/regtested on x86_64-linux and aarch64-linux.

Thanks,
Feng

---
2021-09-23  Feng Xue  

gcc/ChangeLog:
PR tree-optimization/102451
* tree-ssa-dse.c (delete_dead_or_redundant_call): Record bb of stmt
before removal.
---
 gcc/tree-ssa-dse.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/gcc/tree-ssa-dse.c b/gcc/tree-ssa-dse.c
index 98daa8ab24c..27287fe88ee 100644
--- a/gcc/tree-ssa-dse.c
+++ b/gcc/tree-ssa-dse.c
@@ -978,6 +978,7 @@ delete_dead_or_redundant_call (gimple_stmt_iterator *gsi, 
const char *type)
   fprintf (dump_file, "\n");
 }
 
+  basic_block bb = gimple_bb (stmt);
   tree lhs = gimple_call_lhs (stmt);
   if (lhs)
 {
@@ -985,7 +986,7 @@ delete_dead_or_redundant_call (gimple_stmt_iterator *gsi, 
const char *type)
   gimple *new_stmt = gimple_build_assign (lhs, ptr);
   unlink_stmt_vdef (stmt);
   if (gsi_replace (gsi, new_stmt, true))
-bitmap_set_bit (need_eh_cleanup, gimple_bb (stmt)->index);
+   bitmap_set_bit (need_eh_cleanup, bb->index);
 }
   else
 {
@@ -994,7 +995,7 @@ delete_dead_or_redundant_call (gimple_stmt_iterator *gsi, 
const char *type)
 
   /* Remove the dead store.  */
   if (gsi_remove (gsi, true))
-   bitmap_set_bit (need_eh_cleanup, gimple_bb (stmt)->index);
+   bitmap_set_bit (need_eh_cleanup, bb->index);
   release_defs (stmt);
 }
 }
-- 
2.17.1
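
The pattern behind this fix can be sketched with a small stand-alone
model.  It assumes, as the diff suggests, that a statement no longer
has a valid block pointer once it has been removed or replaced, so the
block must be captured first.  Block, Stmt, remove_stmt and
remove_and_report_block are hypothetical names, not GCC APIs.

```cpp
#include <cassert>

struct Block { int index; };
struct Stmt { Block* bb; };

// Hypothetical stand-in for removing a statement: the statement is
// detached from its block, so its block pointer becomes null.
bool remove_stmt(Stmt& s) {
  s.bb = nullptr;
  return true;
}

// Capture the block before the removal and use the saved pointer
// afterwards; dereferencing s.bb after remove_stmt would be a
// null-pointer dereference.
int remove_and_report_block(Stmt& s) {
  Block* bb = s.bb;
  assert(bb != nullptr);
  if (remove_stmt(s))
    return bb->index;
  return -1;
}
```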

Re: [PATCH] rs6000: Add psabi diagnostic for C++ zero-width bit field ABI change (PR102024)

2021-09-22 Thread Segher Boessenkool

Hi!

On Tue, Sep 21, 2021 at 05:35:56PM -0500, Bill Schmidt wrote:
> Previously zero-width bit fields were removed from structs, so that 
> otherwise
> homogeneous aggregates were treated as such and passed in FPRs and VSRs.
> This was incorrect behavior per the ELFv2 ABI.  Now that these fields are no
> longer being removed, we generate the correct parameter passing code.  Alert
> the unwary user in the rare cases where this behavior changes.

> Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no 
> regressions.
> Is this okay for trunk?

This can obviously not change anything for other ABIs, so it doesn't
need testing anywhere else.  Good :-)

> @@ -6227,7 +6227,7 @@ const struct altivec_builtin_types 
> altivec_overloaded_builtins[] = {
>  
>  static int
>  rs6000_aggregate_candidate (const_tree type, machine_mode *modep,
> - int *empty_base_seen)
> + int *empty_base_seen, int *zero_width_bf_seen)

The new function parameter should be described in the function comment.

> + if (DECL_FIELD_CXX_ZERO_WIDTH_BIT_FIELD (field))
> +   {
> + *zero_width_bf_seen = 1;
> + continue;
> +   }

Please add a comment here, saying what it is for?  I'd do it inside the
braces.

>   sub_count = rs6000_aggregate_candidate (TREE_TYPE (field), modep,
> - empty_base_seen);
> + empty_base_seen,
> + zero_width_bf_seen);

So it is important that this function only ever sets *zero_width_bf_seen
to 1, never resets it; please document that with it as well?
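
To illustrate why the "set only, never reset" property matters for a
recursive walk, here is a small sketch.  It is not the rs6000 code;
Field and count_elements are hypothetical names, and the field model is
heavily simplified.

```cpp
#include <cassert>
#include <vector>

// Simplified model of an aggregate member (hypothetical).
struct Field {
  bool zero_width_bit_field;   // true for a member like "int : 0;"
  std::vector<Field> members;  // non-empty for nested aggregates
};

// Counts leaf elements, skipping zero-width bit fields.  The flag is
// only ever set to 1 here, never cleared, so a hit anywhere in the
// recursion is still visible to the outermost caller.
int count_elements(const Field& f, int* zero_width_bf_seen) {
  if (f.zero_width_bit_field) {
    *zero_width_bf_seen = 1;
    return 0;
  }
  if (f.members.empty())
    return 1;
  int count = 0;
  for (const Field& m : f.members)
    count += count_elements(m, zero_width_bf_seen);
  return count;
}
```

If a nested call reset the flag to 0 on entry, a zero-width bit field
found in an earlier sibling would be forgotten by the time the walk
returns.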

> +   inform (input_location,
> +   "parameter passing for an argument containing "
> +   "zero-width bit fields but that is otherwise a 
> "
> +   "homogeneous aggregate changed in GCC 12.1");

Just "GCC 12" please.  You might want to indicate that older compilers
did the wrong thing here.  And maybe say this is only for ELFv2 somehow?
In some way that doesn't make it more confusing than not saying it :-)

> +double
> +foo (a_thing a) // { dg-message "parameter passing for an argument 
> containing zero-width bit fields but that is otherwise a homogeneous 
> aggregate changed in GCC 12.1" }

I think you used "format=flawed" again?

Okay for trunk with such comment updates.  Thanks!

Segher

Re: [PATCH] c++: fix wrong fixit hints for misspelled typedef [PR77565]

2021-09-22 Thread Michel Morin via Gcc-patches

On Thu, Sep 23, 2021 at 5:09 AM Jason Merrill  wrote:
>
> On 9/21/21 20:53, Michel Morin wrote:
> > On Tue, Sep 21, 2021 at 5:24 AM Jason Merrill  wrote:
> >>
> >> On 9/17/21 13:31, Michel Morin wrote:
> >>> On Fri, Sep 17, 2021 at 3:23 AM Jason Merrill  wrote:
> 
>  On 9/16/21 11:50, Michel Morin wrote:
> > On Thu, Sep 16, 2021 at 5:44 AM Jason Merrill  wrote:
> >>
> >> On 9/14/21 04:29, Michel Morin via Gcc-patches wrote:
> >>> On Tue, Sep 14, 2021 at 7:14 AM David Malcolm  
> >>> wrote:
> 
>  On Tue, 2021-09-14 at 03:35 +0900, Michel Morin via Gcc-patches 
>  wrote:
> > Hi,
> >
> > PR77565 reports that, with the code `typdef int Int;`, GCC emits
> > "did you mean 'typeof'?" instead of "did you mean 'typedef'?".
> >
> > This happens because the typo corrector determines that `typeof` is 
> > a
> > candidate for suggestion (through
> > `cp_keyword_starts_decl_specifier_p`),
> > but `typedef` is not.
> >
> > This patch fixes the issue by adding `typedef` as a candidate. The
> > patch
> > additionally adds the `inline` specifier and cv-specifiers as a
> > candidate.
> > Here is a patch (tests `make check-gcc` pass on darwin):
> 
>  Thanks for this patch (and for reporting the bug in the first place).
> 
>  I notice that, as well as being used for fix-it hints by
>  lookup_name_fuzzy (indirectly via suggest_rid_p),
>  cp_keyword_starts_decl_specifier_p is also used by
>  cp_lexer_next_token_is_decl_specifier_keyword, which is used by
>  cp_parser_lambda_declarator_opt and 
>  cp_parser_constructor_declarator_p.
> >>>
> >>> Ah, you're right! Thank you for pointing this out.
> >>> I failed to grep those functions somehow.
> >>>
> >>> One thing that confuses me is that cp_keyword_starts_decl_specifier_p
> >>> misses many keywords that can start decl-specifiers (e.g.
> >>> typedef/inline/cv-qual and friend/explicit/virtual).
> >>> So let's wait for the C++ frontend maintainers ;)
> >>
> >> That is strange.  Let's add all the rest of them as well.
> >
> > Done. Thanks for your help!
> >
> > One more thing — cp_keyword_starts_decl_specifier_p includes 
> > RID_ATTRIBUTE
> > (from the beginning; see https://gcc.gnu.org/PR28261 ), but attributes 
> > are
> > not decl-specifiers. Would it be reasonable to remove this?
> 
>  It looks like the place that PR28261 used
>  cp_lexer_next_token_is_decl_specifier_keyword specifically exempts
>  attributes:
> 
> > && (!cp_lexer_next_token_is_decl_specifier_keyword 
> > (parser->lexer)
> > /* GNU attributes can actually appear both at the start 
> > of
> >a parameter and parenthesized declarator.
> >S (__attribute__((unused)) int);
> >is a constructor, but
> >S (__attribute__((unused)) foo) (int);
> >is a function declaration.  */
> > || (cp_parser_allow_gnu_extensions_p (parser)
> > && cp_next_tokens_can_be_gnu_attribute_p (parser)))
> 
>  So yes, let's remove RID_ATTRIBUTE and the || clause there.  I'd keep
>  the comment, but move it to go with the test for C++11 attributes below.
> >>>
> >>> Done. No regressions introduced.
> >>>
> > One more thing — cp_keyword_starts_decl_specifier_p includes 
> > RID_ATTRIBUTE
> > (from the beginning; see https://gcc.gnu.org/PR28261 ), but attributes 
> > are
> > not decl-specifiers.
> >>>
> >>> Oh, this is wrong. I thought that, since C++11 attributes are not a
> >>> decl-specifier, neither are GNU attributes. But the comment just before
> >>> cp_parser_decl_specifier_seq says that GNU attributes are considered as a
> >>> decl-specifier. So I'm not confident about the removal of RID_ATTRIBUTE in
> >>> cp_keyword_starts_decl_specifier_p...
> >>
> >> GNU attributes can appear in lots of places, and the only two callers of
> >> cp_lexer_next_token_is_decl_specifier_keyword don't want to treat
> >> attributes accordingly.
> >
> > Makes sense.
> >
> >> Let's go with both your patches, and also
> >> remove the consequently-unnecessary attributes check in
> >> cp_parser_lambda_declarator_opt:
> >>
> >>>if (cp_lexer_next_token_is_decl_specifier_keyword (parser->lexer)
> >>>&& !cp_next_tokens_can_be_gnu_attribute_p (parser))
> >>
> >> OK with that change.
> >
> > Updated and rebased the patch. No regressions on x86_64-apple-darwin.
> >
> > Thank you for your help!
>
> Looks good, thanks.  You can push the patches yourself, right?

This is my first patch contribution to GCC, and I don't have write access.
So it'd be great if someone pushes the patches.
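
For readers following the PR77565 discussion, the effect of the
candidate set on the suggestion can be sketched with a plain
edit-distance comparison.  This is only an illustration: GCC's real
spell checker is more elaborate, and edit_distance and best_suggestion
are hypothetical helpers.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Classic Levenshtein distance using two rolling rows.
std::size_t edit_distance(const std::string& a, const std::string& b) {
  std::vector<std::size_t> prev(b.size() + 1), cur(b.size() + 1);
  for (std::size_t j = 0; j <= b.size(); ++j) prev[j] = j;
  for (std::size_t i = 1; i <= a.size(); ++i) {
    cur[0] = i;
    for (std::size_t j = 1; j <= b.size(); ++j) {
      std::size_t subst = prev[j - 1] + (a[i - 1] == b[j - 1] ? 0 : 1);
      cur[j] = std::min({subst, prev[j] + 1, cur[j - 1] + 1});
    }
    std::swap(prev, cur);
  }
  return prev[b.size()];
}

// Returns the candidate closest to the typo (first one wins on ties).
std::string best_suggestion(const std::string& typo,
                            const std::vector<std::string>& candidates) {
  assert(!candidates.empty());
  std::string best = candidates.front();
  for (const std::string& c : candidates)
    if (edit_distance(typo, c) < edit_distance(typo, best))
      best = c;
  return best;
}
```

In this model "typdef" is at distance 1 from "typedef" and distance 2
from "typeof", so "typeof" can only win when "typedef" is not among the
candidates.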

I assume these

Re: [PATCH] rs6000: Modify the way for extra penalized cost

2021-09-22 Thread Segher Boessenkool

Hi!

On Tue, Sep 21, 2021 at 11:24:08AM +0800, Kewen.Lin wrote:
> on 2021/9/18 上午6:01, Segher Boessenkool wrote:
> > On Thu, Sep 16, 2021 at 09:14:15AM +0800, Kewen.Lin wrote:
> >> The way with nunits * stmt_cost can get one much exaggerated
> >> penalized cost, such as: for V16QI on P8, it's 16 * 20 = 320,
> >> that's why we need one bound.  To make it scale, this patch
> >> doesn't use nunits * stmt_cost any more, but it still keeps
> >> nunits since there are actually nunits scalar loads there.  So
> >> it uses one cost adjusted from stmt_cost, since the current
> >> stmt_cost sort of considers nunits, we can stabilize the cost
> >> for big nunits and retain the cost for small nunits.  After
> >> some tries, this patch gets the adjusted cost as:
> >>
> >> stmt_cost / (log2(nunits) * log2(nunits))
> > 
> > So for  V16QI it gives *16/(4*4) so *1
> > V8HI  it gives *8/(3*3)  so *8/9
> > V4SI  it gives *4/(2*2)  so *1
> > V2DI  it gives *2/(1*1)  so *2
> > and for V1TI  it gives *1/(0*0) which is UB (no, does not crash for us,
> > just gives wildly wrong answers; the div returns 0 on recent systems).
> 
> I don't expect we will have V1TI for strided/elementwise load,
> if it's one unit vector, it's the whole vector itself.
> Besides, the below assertion should exclude it already.

Yes.  But ignoring the UB for unexpectedly large vector components, the
1 / 0.889 / 1 / 2  scoring does not make much sense.  The formulas
"look" smooth and even sort of reasonable, but as soon as you look at
what it *means*, and realise the domain of the function is discrete
(only four or five possible inputs), and then see how the function
behaves on that...  Hrm :-)
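
To make the table above easy to reproduce, here is a small stand-alone
sketch of the multiplier nunits / (log2(nunits) * log2(nunits)).  It is
not the rs6000 code; exact_log2_sketch and penalty_factor are
hypothetical helpers, and the former only imitates the behaviour
described in this thread (-1 for inputs that are not a power of 2).

```cpp
#include <cassert>

// Returns log2(n) when n is a power of two, and -1 otherwise.
int exact_log2_sketch(unsigned n) {
  if (n == 0 || (n & (n - 1)) != 0)
    return -1;
  int log = 0;
  while (n > 1) {
    n >>= 1;
    ++log;
  }
  return log;
}

// Multiplier applied to stmt_cost for a vector with nunits elements.
double penalty_factor(unsigned nunits) {
  int log = exact_log2_sketch(nunits);
  // Rejects both non-powers of two (log == -1) and nunits == 1
  // (log == 0), which would otherwise divide by zero.
  assert(log > 0);
  return static_cast<double>(nunits) / (log * log);
}
```

The only inputs that occur in practice are 2, 4, 8 and 16, which is why
the discrete shape of the function matters more than its smooth
appearance.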

> > This of course is assuming nunits will always be a power of 2, but I'm
> > sure that we have many other places in the compiler assuming that
> > already, so that is fine.  And if one day this stops being true we will
> > get a nice ICE, pretty much the best we could hope for.
> 
> Yeah, exact_log2 returns -1 for non power of 2 input, for example:

Exactly.

> >> +unsigned int adjusted_cost = stmt_cost / nunits_sq;
> > 
> > But this can divide by 0.  Or are we somehow guaranteed that nunits
> > will never be 1?  Yes the log2 check above, sure, but that ICEs if this
> > is violated; is there anything that actually guarantees it is true?
> 
> As I mentioned above, I don't expect we can have nunits 1 strided/ew load,
> and the ICE should check this and ensure dividing by zero never happens.  :)

Can you assert that *directly* then please?

> > A magic crazy formula like this is no good.  If you want to make the
> > cost of everything but V2D* be the same, and that of V2D* be twice that,
> > that is a weird heuristic, but we can live with that perhaps.  But that
> > beats completely unexplained (and unexplainable) magic!
> > 
> > Sorry.
> 
> That's all right, thanks for the comments!  let's improve it.  :)

I like that spirit :-)

> How about just assigning 2 for V2DI and 1 for the others for the
> penalized_cost_per_load with some detailed commentary, it should have
> the same effect with this "magic crazy formula", but I guess it can
> be more clear.

That is fine yes!  (Well, V2DF the same I guess?  Or you'll need very
detailed commentary :-) )

It is fine to say "this is just a heuristic without much supporting
theory" in places.  That is what most of our --param= are as well, for
example.  If counting two-element vectors as twice as expensive as all
other vectors helps performance, then so be it: if there is no better
way to cost things (or we do not know one), then what else are we to do?
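
A minimal sketch of the simpler heuristic agreed on above, assuming
that "two-element vectors" can be identified by nunits == 2.  The name
penalized_cost_per_load comes from the mail; the signature and the
direct assertion are assumptions for illustration only.

```cpp
#include <cassert>

// Heuristic without much supporting theory: two-element vectors
// (V2DI, V2DF) count twice as much as all other vectors.
unsigned penalized_cost_per_load(unsigned nunits) {
  assert(nunits > 1);  // single-unit vectors are not expected here
  return nunits == 2 ? 2 : 1;
}
```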


Segher

Re: [PATCH] Overhaul jump thread state in forward threader.

2021-09-22 Thread Jeff Law via Gcc-patches





On 9/22/2021 12:41 PM, Aldy Hernandez wrote:

I've been pulling state from across the forward jump threader into the
jt_state class, but it still didn't feel right.  The ultimate goal
was to keep track of candidate threading paths so that the simplifier
could simplify statements with the path as context.  This patch completes
the transition, while cleaning up a lot of things in the process.

I've revamped both state and the simplifier such that a base state class
contains only the blocks as they're registered, and any pass specific
knowledge is where it belongs... in the pass.  This allows VRP to keep
its const and copies business, and DOM to keep this as well as its evrp
client.  This makes the threader cleaner, as it will now have no knowledge
of either const/copies or evrp.

This also paves the way for the upcoming hybrid threader, which will
just derive the state class and provide almost nothing, since the ranger
doesn't need to register any equivalences or ranges as it folds.

There is some code duplication in the simplifier, since both the DOM and
VRP clients use a vr_values based simplifier, but this is temporary as
the VRP client is about to be replaced with a hybrid ranger.

For a better view of what this patch achieves, here are the base
classes:

class jt_state
{
public:
  virtual ~jt_state () { }
  virtual void push (edge);
  virtual void pop ();
  virtual void register_equiv (tree dest, tree src, bool update_range = false);
  virtual void register_equivs_edge (edge e);
  virtual void register_equivs_stmt (gimple *, basic_block,
				     class jt_simplifier *);
  virtual void record_ranges_from_stmt (gimple *stmt, bool temporary);
  void get_path (vec<basic_block> &);
  void append_path (basic_block);
  void dump (FILE *);
  void debug ();
private:
  auto_vec<basic_block> m_blocks;
};

class jt_simplifier
{
public:
  virtual ~jt_simplifier () { }
  virtual tree simplify (gimple *, gimple *, basic_block, jt_state *) = 0;
};

There are no functional changes.
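
The split between a generic base state and pass-specific derived
classes can be sketched outside of GCC as follows.  This is a
simplified model under stated assumptions: block_id stands in for
basic_block, and path_state and equiv_counting_state are hypothetical
classes, not the ones in the patch.

```cpp
#include <cassert>
#include <vector>

using block_id = int;  // hypothetical stand-in for basic_block

// Base state: only tracks the blocks registered on the current path.
class path_state {
public:
  virtual ~path_state() {}
  virtual void push(block_id b) { m_blocks.push_back(b); }
  virtual void pop() {
    assert(!m_blocks.empty());
    m_blocks.pop_back();
  }
  // The base class has no pass-specific knowledge, so this is a no-op.
  virtual void register_equiv(int /*dest*/, int /*src*/) {}
  std::vector<block_id> get_path() const { return m_blocks; }

private:
  std::vector<block_id> m_blocks;
};

// A pass keeps its own bookkeeping by overriding the virtual hooks.
class equiv_counting_state : public path_state {
public:
  void register_equiv(int /*dest*/, int /*src*/) override { ++m_equivs; }
  int equivs() const { return m_equivs; }

private:
  int m_equivs = 0;
};
```

Code written against the base class can drive either variant, which is
the property that lets a client derive the state class and provide
almost nothing.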

OK pending tests?

p.s. It's sad that this is starting to look clean, just in time to get
wiped out.

Sometimes you have to get it to this state so that it can get zapped.



gcc/ChangeLog:

* tree-ssa-dom.c (class dom_jump_threader_simplifier): Rename...
(class dom_jt_state): ...this and provide virtual overrides.
(dom_jt_state::register_equiv): New.
(class dom_jt_simplifier): Rename from
dom_jump_threader_simplifier.
(dom_jump_threader_simplifier::simplify): Rename...
(dom_jt_simplifier::simplify): ...to this.
(pass_dominator::execute): Use dom_jt_simplifier and
dom_jt_state.
* tree-ssa-threadedge.c (jump_threader::jump_threader):
Clean-up.
(jt_state::register_equivs_stmt): Abstract out...
(jump_threader::record_temporary_equivalences_from_stmts_at_dest):
...from here.
(jump_threader::thread_around_empty_blocks): Update state.
(jump_threader::thread_through_normal_block): Same.
(jt_state::jt_state): Remove.
(jt_state::push): Remove pass specific bits.  Keep block vector
updated.
(jt_state::append_path): New.
(jt_state::pop): Remove pass specific bits.
(jt_state::register_equiv): Same.
(jt_state::record_ranges_from_stmt): Same.
(jt_state::register_equivs_on_edge): Same.  Rename...
(jt_state::register_equivs_edge):  ...to this.
(jt_state::dump): New.
(jt_state::debug): New.
(jump_threader_simplifier::simplify): Remove.
(jt_state::get_path): New.
* tree-ssa-threadedge.h (class jt_simplifier): Make into a base
class.  Expose common functionality as virtual methods.
(class jump_threader_simplifier): Same.  Rename...
(class jt_simplifier): ...to this.
* tree-vrp.c (class vrp_jump_threader_simplifier): Rename...
(class vrp_jt_simplifier): ...to this. Provide pass specific
overrides.
(class vrp_jt_state): New.
(vrp_jump_threader_simplifier::simplify): Rename...
(vrp_jt_simplifier::simplify): ...to this.  Inline code from
what used to be the base class.
(vrp_jump_threader::vrp_jump_threader): Use vrp_jt_state and
vrp_jt_simplifier.

OK
jeff

Re: [PATCH] configure: Update --help output for --with-multilib-list

2021-09-22 Thread Jim Wilson

On Fri, Sep 17, 2021 at 4:39 AM Jonathan Wakely via Gcc-patches <
gcc-patches@gcc.gnu.org> wrote:

> The list of architectures that support the option is incomplete.
>
> gcc/ChangeLog:
>
> * configure.ac: Fix --with-multilib-list description.
> * configure: Regenerate.
>
> OK for trunk?
>

Looks like or1k has --with-multilib-list support too.  I'd suggest adding
that to the list as well.  OK with that change.

Jim

Fortran: Improve file-reading error diagnostic [PR55534] (was: Re: [Patch] Fortran: Improve -Wmissing-include-dirs warnings [PR55534])

2021-09-22 Thread Tobias Burnus


Hi Harald,

On 22.09.21 20:29, Harald Anlauf via Gcc-patches wrote:

What I find a bit confusing - from the viewpoint of a user - is the
case of using the preprocessor (-cpp), as one gets e.g.

: Warning: ./no/such/dir: No such file or directory
[-Wmissing-include-dirs]

while without -cpp:

f951: Warning: Nonexistent include directory './no/such/dir/'
[-Wmissing-include-dirs]


C/C++ do something likewise (grep for that string).

The reason for the  is the code in cpp.c's gfc_cpp_init,
which uses:
  cpp_change_file (cpp_in, LC_RENAME, _(""));

It might be possible to reset it by passing NULL to it, at the end
of that function but I don't know whether that causes side effects.
At least linemap_add then uses set->depth--.
It might work just fine, but I do not know.
(Additionally, cb_file_change or print_line needs to be updated
to handle to_file == NULL.)

Feel free to experiment there. Otherwise, I leave it as is.

 * * *

However, this patch now improves the diagnostic printed by
load_file: it directly issues a fatal error instead of emitting
an ordinary error and then propagating the failure to the callers.

Errors are now also properly colored.
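
The recursive-include check that now ends in a fatal error walks the
chain of currently open files.  A stand-alone sketch of that walk is
shown here; file_frame and is_recursive_include are hypothetical names,
and the plain string comparison is a simplification of the filename
comparison used in scanner.c.

```cpp
#include <cassert>
#include <string>

// One entry of the include stack; `up` points to the including file.
struct file_frame {
  std::string filename;
  const file_frame* up;
};

// Returns true if `filename` is already open somewhere up the chain,
// i.e. including it again would recurse.
bool is_recursive_include(const file_frame* current,
                          const std::string& filename) {
  for (const file_frame* f = current; f; f = f->up)
    if (f->filename == filename)
      return true;
  return false;
}
```

With a fatal error at the point of detection, callers no longer need to
check a return value and exit themselves.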

Note:
* -fpre-include= is not easily testable. It works when calling
  the compiler itself (f951) but the driver (gfortran) overrides
  it here with:
   -fpre-include=/usr/include/finclude/math-vector-fortran.h
  which exists.

* I did not include the test "include_22.f90" with:
include "include_22.f90"  ! { dg-error "File 'include_22.f90' is being included recursively" }
  as the error message seemingly confuses DejaGNU and causes it
  to enter an endless loop.

OK for mainline?

Tobias

Fortran: Improve file-reading error diagnostic [PR55534]

	PR fortran/55534

gcc/fortran/ChangeLog:

	* scanner.c (load_file): Return void, call (gfc_)fatal_error for
	all errors.
	(include_line, include_stmt, gfc_new_file): Remove exit call
	for failed load_file run.

gcc/testsuite/ChangeLog:

	* gfortran.dg/include_9.f90: Add dg-prune-output.
	* gfortran.dg/include_23.f90: New test.
	* gfortran.dg/include_24.f90: New test.

 gcc/fortran/scanner.c| 66 
 gcc/testsuite/gfortran.dg/include_23.f90 |  4 ++
 gcc/testsuite/gfortran.dg/include_24.f90 |  4 ++
 gcc/testsuite/gfortran.dg/include_9.f90  |  1 +
 4 files changed, 33 insertions(+), 42 deletions(-)

diff --git a/gcc/fortran/scanner.c b/gcc/fortran/scanner.c
index 52124bd5d36..5a450692ba3 100644
--- a/gcc/fortran/scanner.c
+++ b/gcc/fortran/scanner.c
@@ -47,6 +47,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "toplev.h"	/* For set_src_pwd.  */
 #include "debug.h"
 #include "options.h"
+#include "diagnostic-core.h"  /* For fatal_error. */
 #include "cpp.h"
 #include "scanner.h"
 
@@ -2230,7 +2231,7 @@ preprocessor_line (gfc_char_t *c)
 }
 
 
-static bool load_file (const char *, const char *, bool);
+static void load_file (const char *, const char *, bool);
 
 /* include_line()-- Checks a line buffer to see if it is an include
line.  If so, we call load_file() recursively to load the included
@@ -2396,9 +2397,7 @@ include_line (gfc_char_t *line)
 		   read by anything else.  */
 
   filename = gfc_widechar_to_char (begin, -1);
-  if (!load_file (filename, NULL, false))
-exit (FATAL_EXIT_CODE);
-
+  load_file (filename, NULL, false);
   free (filename);
   return 1;
 }
@@ -2505,9 +2504,7 @@ include_stmt (gfc_linebuf *b)
   filename[i] = (unsigned char) c;
 }
   filename[length] = '\0';
-  if (!load_file (filename, NULL, false))
-exit (FATAL_EXIT_CODE);
-
+  load_file (filename, NULL, false);
   free (filename);
 
 do_ret:
@@ -2525,9 +2522,11 @@ do_ret:
   return ret;
 }
 
+
+
 /* Load a file into memory by calling load_line until the file ends.  */
 
-static bool
+static void
 load_file (const char *realfilename, const char *displayedname, bool initial)
 {
   gfc_char_t *line;
@@ -2549,13 +2548,8 @@ load_file (const char *realfilename, const char *displayedname, bool initial)
 
   for (f = current_file; f; f = f->up)
 if (filename_cmp (filename, f->filename) == 0)
-  {
-	fprintf (stderr, "%s:%d: Error: File '%s' is being included "
-		 "recursively\n", current_file->filename, current_file->line,
-		 filename);
-	return false;
-  }
-
+  fatal_error (linemap_line_start (line_table, current_file->line, 0),
+		   "File %qs is being included recursively", filename);
   if (initial)
 {
   if (gfc_src_file)
@@ -2567,10 +2561,7 @@ load_file (const char *realfilename, const char *displayedname, bool initial)
 	input = gfc_open_file (realfilename);
 
   if (input == NULL)
-	{
-	  gfc_error_now ("Cannot open file %qs", filename);
-	  return false;
-	}
+	gfc_fatal_error ("Cannot

Re: [PATCH v3] Fix ICE when mixing VLAs and statement expressions [PR91038]

2021-09-22 Thread Jason Merrill via Gcc-patches


On 9/5/21 15:14, Uecker, Martin wrote:


Here is the third version of the patch. This also
fixes the index zero case.  Thus, this should be
a complete fix for 91038 and should fix all cases
also supported by clang.  Still not working is
returning a struct of variable size from a
statement expression (29970) when the size depends
on computations inside the statement expression.

Bootstrapped and regression tested
on x86-64 for all languages.

Martin




Fix ICE when mixing VLAs and statement expressions [PR91038]

Returning VM-types from statement expressions can lead to an ICE
when declarations from the statement expression are referred to
later. Most of these issues can be addressed by gimplifying the
base expression earlier in gimplify_compound_lval. Another issue
is fixed by not reordering some size-related expressions during
folding. This fixes PR91038 and some of the test cases from
PR29970 (structs with VLA members need further work).

 
 2021-08-01  Martin Uecker  
 
 gcc/

PR c/91038
PR c/29970
* gimplify.c (gimplify_var_or_parm_decl): Update comment.
(gimplify_compound_lval): Gimplify base expression first.
(gimplify_target_expr): Do not gimplify size expression.
* fold-const.c (fold_binary_loc): Do not reorder SAVE_EXPR
in pointer arithmetic for variably modified types.
 
 gcc/testsuite/

PR c/91038
PR c/29970
* gcc.dg/vla-stexp-3.c: New test.
* gcc.dg/vla-stexp-4.c: New test.
* gcc.dg/vla-stexp-5.c: New test.
* gcc.dg/vla-stexp-6.c: New test.
* gcc.dg/vla-stexp-7.c: New test.
* gcc.dg/vla-stexp-8.c: New test.
* gcc.dg/vla-stexp-9.c: New test.


diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index ff23f12f33c..1e6f50692b5 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -10854,7 +10854,15 @@ fold_binary_loc (location_t loc, enum tree_code code, 
tree type,
  return build2_loc (loc, COMPOUND_EXPR, type, TREE_OPERAND (arg0, 0),
 tem);
}
-  if (TREE_CODE (arg1) == COMPOUND_EXPR)
+  /* This interleaves execution of the two sub-expressions
+which is allowed in C.  For pointer arithmetic when the
+the pointer has a variably modified type, the right expression
+might have a SAVE_EXPR which depends on the left expr, so
+do not fold in this case.  */
+  if (TREE_CODE (arg1) == COMPOUND_EXPR
+ && !(code == POINTER_PLUS_EXPR
+  && TREE_CODE (TREE_OPERAND (arg1, 0)) == SAVE_EXPR)
+  && variably_modified_type_p (type, NULL_TREE))


This seems pretty fragile.  If the problem is that the SAVE_EXPR depends 
on a statement-expr on the LHS, can't that happen with expressions other 
than POINTER_PLUS_EXPR?


Maybe we should include the statement-expr in the SAVE_EXPR?


{
  tem = fold_build2_loc (loc, code, type, op0,
 fold_convert_loc (loc, TREE_TYPE (op1),
diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index 99d1c7fcce4..8ee205f593c 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -2840,7 +2840,10 @@ gimplify_var_or_parm_decl (tree *expr_p)
   declaration, for which we've already issued an error.  It would
   be really nice if the front end wouldn't leak these at all.
   Currently the only known culprit is C++ destructors, as seen
- in g++.old-deja/g++.jason/binding.C.  */
+ in g++.old-deja/g++.jason/binding.C.
+ Another possible culpit are size expressions for variably modified
+ types which are lost in the FE or not gimplified correctly.
+  */
if (VAR_P (decl)
&& !DECL_SEEN_IN_BIND_EXPR_P (decl)
&& !TREE_STATIC (decl) && !DECL_EXTERNAL (decl)
@@ -2985,16 +2988,22 @@ gimplify_compound_lval (tree *expr_p, gimple_seq 
*pre_p, gimple_seq *post_p,
   expression until we deal with any variable bounds, sizes, or
   positions in order to deal with PLACEHOLDER_EXPRs.
  
- So we do this in three steps.  First we deal with the annotations

- for any variables in the components, then we gimplify the base,
- then we gimplify any indices, from left to right.  */
+ The base expression may contain a statement expression that
+ has declarations used in size expressions, so has to be
+ gimplified before gimplifying the size expressions.
+
+ So we do this in three steps.  First we deal with variable
+ bounds, sizes, and positions, then we gimplify the base,
+ then we deal with the annotations for any variables in the
+ components and any indices, from left to right.  */
+
for (i = expr_stack.length () - 1; i >= 0; i--)
  {
tree t = expr_stack[i];
  
if (TREE_CODE (t) == ARRAY_REF || TREE_CODE (t) == ARRAY_RANGE_REF)

{
- /* Gimplify the low bound and element type size and put them into
+ /* Deal with the low bound and element type size and put them into

Re: [PATCH] rs6000: Add psabi diagnostic for C++ zero-width bit field ABI change (PR102024)

2021-09-22 Thread Bill Schmidt via Gcc-patches


Hi Jakub,

On 9/22/21 11:33 AM, Jakub Jelinek wrote:

On Wed, Sep 22, 2021 at 05:02:15PM +0200, Jakub Jelinek via Gcc-patches wrote:

@@ -6298,7 +6298,8 @@ rs6000_aggregate_candidate (const_tree type, machine_mode 
*modep,
  return -1;
count = rs6000_aggregate_candidate (TREE_TYPE (type), modep,
-   empty_base_seen);
+   empty_base_seen,
+   zero_width_bf_seen);
if (count == -1
|| !index
|| !TYPE_MAX_VALUE (index)
@@ -6336,6 +6337,12 @@ rs6000_aggregate_candidate (const_tree type, 
machine_mode *modep,
if (TREE_CODE (field) != FIELD_DECL)
  continue;
+   if (DECL_FIELD_CXX_ZERO_WIDTH_BIT_FIELD (field))
+ {
+   *zero_width_bf_seen = 1;
+   continue;
+ }

So, from what you wrote, the intent in the ppc* psABIs is that :0 is not
ignored, right?
In that case I don't really understand the above (the continue in
particular).  Because the continue means it is ignored for C++ and not
ignored for C, so basically you return to the 4.5-11 ABI incompatibility
between C and C++.
C++ :0 will have DECL_FIELD_CXX_ZERO_WIDTH_BIT_FIELD set, C :0 will not...

To be more precise, I'd expect what most targets want to do for the
actual ABI decisions not to use DECL_FIELD_CXX_ZERO_WIDTH_BIT_FIELD at all.
I.e. do:
   if (TREE_CODE (field) != FIELD_DECL)
 continue;
   if (DECL_BIT_FIELD (field) && integer_zerop (DECL_SIZE (field)))
 {
   // :0
   // in some psABIs, ignore it, i.e. continue;
   // in other psABIs, take them into account, i.e. do nothing.
 }
and use DECL_FIELD_CXX_ZERO_WIDTH_BIT_FIELD only for the -Wpsabi purposes.

The only exception would be for targets that decide to keep GCC 4.5-11
compatibility, leaving C incompatible with C++.


I think you're misunderstanding what I'm trying to do with this patch.  
I am not changing the code generation at all (your patch did that).  All 
I'm doing is detecting when the old code generation and new code 
generation will differ, and emitting a diagnostic in that case.


The way I do that is to allow rs6000_aggregate_candidate to *think* that 
something is a homogeneous aggregate even when zero-width bitfields are 
present (hence the continue), but record the fact that we saw one.  This 
gives the same answer as we gave before your patch.  Then, in 
rs6000_discover_homogeneous_aggregate, once we think we have a 
homogeneous aggregate, we check whether we actually had a zero-width 
bitfield present.  If so, then we diagnose the change in code generation 
and return false, indicating that we didn't actually find a homogeneous 
aggregate.  Other than the diagnostic, this matches the behavior after 
your patch.
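
The two-step scheme described above can be sketched with toy types. This is only an illustration of the control flow: Field, scan_candidate, Result and discover_homogeneous are hypothetical names invented here, not the rs6000 code, and the "diagnose" flag merely stands in for emitting a -Wpsabi note.

```cpp
#include <cassert>
#include <vector>

// Toy model of a struct field; not GCC's tree representation.
struct Field {
  bool is_float;       // counts toward a homogeneous floating-point aggregate
  bool zero_width_bf;  // models a C++ zero-width bit-field
};

// Step 1: count fields the way the old code did, skipping zero-width
// bit-fields but recording that one was seen.  Returns -1 when some
// other field rules out a homogeneous aggregate.
int scan_candidate(const std::vector<Field>& fields, bool* zero_width_bf_seen) {
  int count = 0;
  for (const Field& f : fields) {
    if (f.zero_width_bf) {
      *zero_width_bf_seen = true;
      continue;  // pretend it is not there, as before the ABI change
    }
    if (!f.is_float)
      return -1;
    ++count;
  }
  return count;
}

struct Result {
  bool homogeneous;  // answer used for code generation
  bool diagnose;     // whether an ABI-change note would be emitted
};

// Step 2: if the old rules would have said "homogeneous" but a zero-width
// bit-field was seen, report the change and answer "not homogeneous".
Result discover_homogeneous(const std::vector<Field>& fields) {
  bool seen = false;
  int count = scan_candidate(fields, &seen);
  if (count <= 0)
    return {false, false};
  if (seen)
    return {false, true};
  return {true, false};
}
```

In this model the diagnostic fires only for inputs where the old and new answers differ, which is the property the patch is after.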


I've verified that we didn't change code generation for C code with 
zero-width bitfields as a result of either your patch or mine. Before 
and after, a C struct containing a zero-width bitfield causes us to 
avoid generating code for a homogeneous aggregate, just as we do for C++ 
after your patch.


I hope this helps clear things up, and I apologize for not giving a 
better description of my intent.


Thanks!
Bill


Jakub

Re: [PATCH] warn for more impossible null pointer tests [PR102103]

2021-09-22 Thread Jason Merrill via Gcc-patches


On 9/21/21 20:34, Martin Sebor wrote:

On 9/21/21 3:40 PM, Jason Merrill wrote:

On 9/17/21 12:02, Martin Sebor wrote:

On 9/8/21 2:06 PM, Jason Merrill wrote:

On 9/2/21 7:53 PM, Martin Sebor wrote:
@@ -4622,28 +4622,94 @@ warn_for_null_address (location_t location, 
tree op, tsubst_flags_t complain)

    if (!warn_address
    || (complain & tf_warning) == 0
    || c_inhibit_evaluation_warnings != 0
-  || warning_suppressed_p (op, OPT_Waddress))
+  || warning_suppressed_p (op, OPT_Waddress)
+  || processing_template_decl != 0)


Completely suppressing this warning in templates seems like a 
regression;  I'd think we could recognize many relevant cases before 
instantiation.  You just can't assume that ADDR_EXPR has the default 
meaning if it has unknown type (i.e. because op0 is type-dependent).


I added the suppression to keep g++.dg/warn/pr101219.C from failing
but in hindsight I should have questioned the reasoning behind
the "no warning emitted here (no instantiation)" comment in the test.

I agree that it would be helpful to diagnose the type-independent
subset of the problem even in uninstantiated templates.  Current
trunk doesn't (it never has), but with my patch and the suppression
above removed it does.  I've updated the tests to expect it.
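
For illustration, here is a small sketch of the distinction between a type-independent and a type-dependent address test inside a template. The names (global_counter, nondependent_test, dependent_test, Odd) are made up for this example, and nothing here claims which GCC versions warn on which line.

```cpp
#include <cassert>

int global_counter = 0;

// The operand of & does not depend on T, so the comparison can be judged
// without instantiating the template: the address of an object is never null.
template <class T>
bool nondependent_test(T) {
  return &global_counter == nullptr;  // always false
}

// Here the operand has a dependent type, so what & means is not known
// until T is known: T may overload unary operator&.
template <class T>
bool dependent_test(T& t) {
  return &t == nullptr;
}

// A type whose overloaded operator& really does return a null pointer.
struct Odd {
  Odd* operator&() { return nullptr; }
};
```

With T = int the second function behaves like the first, but with T = Odd the comparison is true, which is why the built-in meaning of the address-of expression cannot be assumed when the operand is type-dependent.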

Please see the attached revision.

Martin

PS There are still more opportunities to issue -Waddress in templates
that this patch doesn't handle, e.g.,:

   template <class T> bool f (T *p) { return &p == 0; }

Handling this will take more surgery.

PPS It seems that most other warnings (and even some errors, like
-Wnarrowing) are suppressed in uninstantiated templates as well,
even for non-dependent expressions.  In the few test cases I looked
at Clang does better.  It sounds like you'd like to see improvements
in this direction not just for -Waddress but in general.  Just for
the avoidance of doubt, can you confirm that for future reference?


Yes, in general it's better to diagnose sooner.


+  if (TREE_CODE (cop) == NON_LVALUE_EXPR)
+    /* Unwrap the expression for C++ 98.  */
+    cop = TREE_OPERAND (cop, 0);


What does this have to do with C++98?


The code is needed to avoid failures in C++ 98 in the test below
where COP is a NON_LVALUE_EXPR which isn't handled below otherwise.
I didn't investigate why that happens (it works fine if f() is
an ordinary member function).

   void f (bool);

   void g ()
   {
     struct A { virtual void vf (); };

     f (&A::vf);   // missing -Waddress in C++ 98 mode
   }
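
As a side note, a namespace-scope variant of this test case shows why the warning is justified: a pointer to member formed from a named member is never the null member pointer, so its conversion to bool is always true. The helper names below (takes_bool and the two wrapper functions) are invented for this sketch.

```cpp
#include <cassert>

struct A {
  virtual void vf() {}
  int data = 0;
};

bool takes_bool(bool b) { return b; }

// &A::vf names an existing member function, so the implicit conversion of
// the pointer to member to bool yields true.
bool member_function_address_is_true() {
  return takes_bool(&A::vf);
}

// The same holds for a pointer to a nonstatic data member.
bool data_member_address_is_true() {
  return takes_bool(&A::data);
}
```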




+  if (TREE_CODE (cop) == PTRMEM_CST)
+    {
+  /* The address of a nonstatic data member is never null.  */
+  warning_at (location, OPT_Waddress,
+  "the address %qE will never be NULL",


Capitalizing NULL when talking about pointers-to-members seems a bit 
odd, but I guess it's fine.


I agree.  My personal preference is for lowercase null (in all
languages) since that's the technical term for it.  I used NULL
here only to conform to the existing style.  I'm willing to
change all these warnings to either use null or to some form
that doesn't mention null (there are two in use, although if
I had my druthers I'd choose some other phrasing altogether).
Let me know if you would support such a change.


I don't feel strongly about it.  I agree that lowercase or another 
phrasing would be better, but probably better to avoid adding work for 
the translators with the churn.



The C++ changes are OK.


Jeff, should I take your previous "Generally OK" as an approval
for the rest of the patch as well?  (It has not changed in v2.)
I have just submitted a Glibc patch to suppress the new instances
there.

Martin

Re: [PATCH] c++: fix wrong fixit hints for misspelled typedef [PR77565]

2021-09-22 Thread Jason Merrill via Gcc-patches


On 9/21/21 20:53, Michel Morin wrote:

On Tue, Sep 21, 2021 at 5:24 AM Jason Merrill  wrote:


On 9/17/21 13:31, Michel Morin wrote:

On Fri, Sep 17, 2021 at 3:23 AM Jason Merrill  wrote:


On 9/16/21 11:50, Michel Morin wrote:

On Thu, Sep 16, 2021 at 5:44 AM Jason Merrill  wrote:


On 9/14/21 04:29, Michel Morin via Gcc-patches wrote:

On Tue, Sep 14, 2021 at 7:14 AM David Malcolm  wrote:


On Tue, 2021-09-14 at 03:35 +0900, Michel Morin via Gcc-patches wrote:

Hi,

PR77565 reports that, with the code `typdef int Int;`, GCC emits
"did you mean 'typeof'?" instead of "did you mean 'typedef'?".

This happens because the typo corrector determines that `typeof` is a
candidate for suggestion (through
`cp_keyword_starts_decl_specifier_p`),
but `typedef` is not.
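
A rough model of the effect, assuming the suggestion is picked by smallest edit distance among the allowed candidates. The code below uses a plain Levenshtein distance as a stand-in and is not GCC's spellchecker; edit_distance and best_candidate are names made up for this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Plain Levenshtein distance (insert, delete, substitute), two-row version.
std::size_t edit_distance(const std::string& a, const std::string& b) {
  std::vector<std::size_t> prev(b.size() + 1), cur(b.size() + 1);
  for (std::size_t j = 0; j <= b.size(); ++j)
    prev[j] = j;
  for (std::size_t i = 1; i <= a.size(); ++i) {
    cur[0] = i;
    for (std::size_t j = 1; j <= b.size(); ++j) {
      std::size_t subst = prev[j - 1] + (a[i - 1] == b[j - 1] ? 0 : 1);
      cur[j] = std::min({prev[j] + 1, cur[j - 1] + 1, subst});
    }
    std::swap(prev, cur);
  }
  return prev[b.size()];
}

// Return the candidate closest to the typo; the first one wins on ties.
std::string best_candidate(const std::string& typo,
                           const std::vector<std::string>& candidates) {
  std::string best;
  std::size_t best_dist = static_cast<std::size_t>(-1);
  for (const std::string& c : candidates) {
    std::size_t d = edit_distance(typo, c);
    if (d < best_dist) {
      best_dist = d;
      best = c;
    }
  }
  return best;
}
```

Under this model `typdef` is one edit away from `typedef` and two from `typeof`, so `typeof` can only win when `typedef` is missing from the candidate list.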

This patch fixes the issue by adding `typedef` as a candidate. The patch
additionally adds the `inline` specifier and cv-specifiers as candidates.
Here is a patch (tests `make check-gcc` pass on darwin):


Thanks for this patch (and for reporting the bug in the first place).

I notice that, as well as being used for fix-it hints by
lookup_name_fuzzy (indirectly via suggest_rid_p),
cp_keyword_starts_decl_specifier_p is also used by
cp_lexer_next_token_is_decl_specifier_keyword, which is used by
cp_parser_lambda_declarator_opt and cp_parser_constructor_declarator_p.


Ah, you're right! Thank you for pointing this out.
I failed to grep those functions somehow.

One thing that confuses me is that cp_keyword_starts_decl_specifier_p
misses many keywords that can start decl-specifiers (e.g.
typedef/inline/cv-qual and friend/explicit/virtual).
So let's wait for the C++ frontend maintainers ;)


That is strange.  Let's add all the rest of them as well.


Done. Thanks for your help!

One more thing — cp_keyword_starts_decl_specifier_p includes RID_ATTRIBUTE
(from the beginning; see https://gcc.gnu.org/PR28261 ), but attributes are
not decl-specifiers. Would it be reasonable to remove this?


It looks like the place that PR28261 used
cp_lexer_next_token_is_decl_specifier_keyword specifically exempts
attributes:


&& (!cp_lexer_next_token_is_decl_specifier_keyword (parser->lexer)
/* GNU attributes can actually appear both at the start of
   a parameter and parenthesized declarator.
   S (__attribute__((unused)) int);
   is a constructor, but
   S (__attribute__((unused)) foo) (int);
   is a function declaration.  */
|| (cp_parser_allow_gnu_extensions_p (parser)
&& cp_next_tokens_can_be_gnu_attribute_p (parser)))


So yes, let's remove RID_ATTRIBUTE and the || clause there.  I'd keep
the comment, but move it to go with the test for C++11 attributes below.


Done. No regressions introduced.


One more thing — cp_keyword_starts_decl_specifier_p includes RID_ATTRIBUTE
(from the beginning; see https://gcc.gnu.org/PR28261 ), but attributes are
not decl-specifiers.


Oh, this is wrong. I thought that, since C++11 attributes are not a
decl-specifier, neither are GNU attributes. But the comment just before
cp_parser_decl_specifier_seq says that GNU attributes are considered as a
decl-specifier. So I'm not confident about the removal of RID_ATTRIBUTE in
cp_keyword_starts_decl_specifier_p...


GNU attributes can appear in lots of places, and the only two callers of
cp_lexer_next_token_is_decl_specifier_keyword don't want to treat
attributes accordingly.


Makes sense.


Let's go with both your patches, and also
remove the consequently-unnecessary attributes check in
cp_parser_lambda_declarator_opt:


   if (cp_lexer_next_token_is_decl_specifier_keyword (parser->lexer)
   && !cp_next_tokens_can_be_gnu_attribute_p (parser))


OK with that change.


Updated and rebased the patch. No regressions on x86_64-apple-darwin.

Thank you for your help!


Looks good, thanks.  You can push the patches yourself, right?


I've split the patch into two. The first one is for adding missing keywords to
fix PR77565 and the second one is for removing the "attribute" keyword.
Here is the second patch (if this is not applied, that's no problem ;) )

==
c++: adjust the handling of RID_ATTRIBUTE.

gcc/cp/ChangeLog:

* parser.c (cp_keyword_starts_decl_specifier_p): Do not
handle RID_ATTRIBUTE.
(cp_parser_constructor_declarator_p): Remove now-redundant
checks.

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 40308d0d33f..d184a3aca7e 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -1062,7 +1062,6 @@ cp_keyword_starts_decl_specifier_p (enum rid keyword)
   case RID_TYPEDEF:
   case RID_INLINE:
 /* GNU extensions.  */
-case RID_ATTRIBUTE:
   case RID_TYPEOF:
 /* C++11 extensions.  */
   case RID_DECLTYPE:
@@ -30798,23 +30797,22 @@ cp_parser_constructor_declarator_p
(cp_parser *parser, cp_parser_flags flags,
  /* A parameter declaration begins with a

Re: [PATCH] c++: improve dumping of templated decls

2021-09-22 Thread Jason Merrill via Gcc-patches


On 9/22/21 11:56, Patrick Palka wrote:

This makes the dumping routines output more information for templated
decls, to help streamline debugging.

Tested on x86_64-pc-linux-gnu, does this look OK for trunk?


OK.


gcc/cp/ChangeLog:

* ptree.c (cxx_print_decl): Dump the DECL_TEMPLATE_RESULT of
a TEMPLATE_DECL.  Dump the DECL_TEMPLATE_INFO rather than just
printing its pointer.
---
  gcc/cp/ptree.c | 10 +++---
  1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/gcc/cp/ptree.c b/gcc/cp/ptree.c
index 7f140f5f06b..1dcd764af01 100644
--- a/gcc/cp/ptree.c
+++ b/gcc/cp/ptree.c
@@ -51,6 +51,7 @@ cxx_print_decl (FILE *file, tree node, int indent)
  }
else if (TREE_CODE (node) == TEMPLATE_DECL)
  {
+  print_node (file, "result", DECL_TEMPLATE_RESULT (node), indent + 4);
print_node (file, "parms", DECL_TEMPLATE_PARMS (node), indent + 4);
indent_to (file, indent + 3);
fprintf (file, " full-name \"%s\"",
@@ -115,13 +116,8 @@ cxx_print_decl (FILE *file, tree node, int indent)

if (VAR_OR_FUNCTION_DECL_P (node)

&& DECL_TEMPLATE_INFO (node))
-{
-  if (need_indent)
-   indent_to (file, indent + 3);
-  fprintf (file, " template-info %p",
-  (void *) DECL_TEMPLATE_INFO (node));
-  need_indent = false;
-}
+print_node (file, "template-info", DECL_TEMPLATE_INFO (node),
+   indent + 4);
  }
  
  void

Re: [PATCH] c++: error message for dependent template members [PR70417]

2021-09-22 Thread Jason Merrill via Gcc-patches


On 9/17/21 18:22, Anthony Sharp wrote:

And also re-attaching the patch!

On Fri, 17 Sept 2021 at 23:17, Anthony Sharp  wrote:


Re-adding gcc-patches@gcc.gnu.org.

-- Forwarded message -
From: Anthony Sharp 
Date: Fri, 17 Sept 2021 at 23:11
Subject: Re: [PATCH] c++: error message for dependent template members [PR70417]
To: Jason Merrill 


Hi Jason! Apologies for the delay.


This is basically core issue 1835, http://wg21.link/cwg1835



This was changed for C++23 by the paper "Declarations and where to find
them", http://wg21.link/p1787


Interesting, I was not aware of that. I was very vaguely aware that a
template-id in a class member access expression could be found by
ordinary lookup (very bottom of here
https://en.cppreference.com/w/cpp/language/dependent_name), but it's
interesting to see it is deeper than I realised.


But in either case, whether create is in a dependent scope depends on
how we resolve impl::, we don't need to remember further back in the
expression.  So your dependent_expression_p parameter seems like the
wrong approach.  Note that when we're looking up the name after ->, the
type of the object expression is in parser->context->object_type.


That's true. I think my thinking was that since it already got figured
out in cp_parser_postfix_dot_deref_expression, which is where . and ->
access expressions come from, I thought I might as well pass it
through, since it seemed to work. But looking again, you're right,
it's not really worth the hassle; might as well just call
dependent_scope_p again.


The cases you fixed in symbol-summary.h are indeed mistakes, but not
ill-formed, so giving an error on them is wrong.  For example, here is a
well-formed program that is rejected with your patch:



template <class T> void f(T t) { t.m<0>(0); }
struct A { int m; } a;
int main() { f(a); }


I suppose there was always going to be edge-cases when doing it the
way I've done. But yes, it can be worked-around by making it a warning
instead. Interestingly Clang doesn't trip up on that example, so I
guess they must be examining it some other way (e.g. at instantiation
time) - but that approach perhaps misses out on the slight performance
improvement this seems to bring.


Now that we're writing C++, I'd prefer to avoid this kind of pattern in
favor of RAII, such as saved_token_sentinel.  If this is still relevant
after addressing the above comments.


Sorry, it's the junior developer in me showing! So this confused me at
first. After having mucked around a bit I tried using
saved_token_sentinel but didn't see any benefit since it doesn't
rollback on going out of scope, and I'll always want to rollback. I
can call rollback directly, but then I might as well save and restore
myself. So what I did was use it but also modify it slightly to
rollback by default on going out of scope (in my mind that makes more
sense, since if something goes wrong you wouldn't normally want to
commit anything that happened [edit 1: unless committing was part of
the whole sanity checking thing] [edit 2: well I guess you could also
argue that since this is a parser after all, we like to KEEP things
sometimes]). But anyways, I made this configurable; it now has three
modes - roll-back, commit or do nothing. Let me know if you think
that's not the way to go.


I like adding the configurability, but I think let's keep committing as 
the default behavior.  And adding the parameter to the rollback function 
seems unnecessary.  For the behavior argument, let's use an enum to be 
clearer.
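
To make the suggestion concrete, here is a toy RAII sentinel with an enum selecting what happens at scope exit and committing as the default. ToyLexer, SentinelMode, TokenSentinel and peek_ahead are hypothetical names for this sketch; it does not reproduce saved_token_sentinel or the cp_lexer interface.

```cpp
#include <cassert>
#include <cstddef>

// Toy stand-in for a lexer: only a read position.
struct ToyLexer {
  std::size_t pos = 0;
  void consume() { ++pos; }
};

// What the sentinel does when it goes out of scope.
enum class SentinelMode { Commit, Rollback, Nothing };

class TokenSentinel {
public:
  explicit TokenSentinel(ToyLexer& lexer,
                         SentinelMode mode = SentinelMode::Commit)
    : lexer_(lexer), saved_pos_(lexer.pos), mode_(mode) {}
  TokenSentinel(const TokenSentinel&) = delete;
  TokenSentinel& operator=(const TokenSentinel&) = delete;

  // Explicit rollback takes no parameter; afterwards the destructor is a no-op.
  void rollback() {
    lexer_.pos = saved_pos_;
    mode_ = SentinelMode::Nothing;
  }

  ~TokenSentinel() {
    if (mode_ == SentinelMode::Rollback)
      lexer_.pos = saved_pos_;
    // Commit and Nothing both keep the current position in this toy model;
    // a real lexer would additionally drop its saved state on Commit.
  }

private:
  ToyLexer& lexer_;
  std::size_t saved_pos_;
  SentinelMode mode_;
};

// Consume n tokens under a sentinel and report how far the scan got.
std::size_t peek_ahead(ToyLexer& lexer, std::size_t n, SentinelMode mode) {
  TokenSentinel guard(lexer, mode);
  for (std::size_t i = 0; i < n; ++i)
    lexer.consume();
  return lexer.pos;  // evaluated before the sentinel's destructor runs
}
```

A tentative scan would pass SentinelMode::Rollback, while callers that do not say anything keep the committing behaviour.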



This code doesn't handle skipping matched ()/{}/[] in the
template-argument-list.  You probably want to involve
cp_parser_skip_to_end_of_template_parameter_list somehow.


Good point. It required some refactoring, but I have used it. Also,
just putting it out there, this line from
cp_parser_skip_to_end_of_template_parameter_list makes zero sense to
me (why throw an error OR immediately return?), but I have worked
around it, since it seems to break without it:


/* Are we ready, yet?  If not, issue error message.  */
if (cp_parser_require (parser, CPP_GREATER, RT_GREATER))
   return false;


If the next token is >, we're already at the end of the template 
argument list.  If not, then something has gone wrong, so give an error 
and skip ahead.



Last thing - I initially made a mistake. I put something like:

(next_token->type == CPP_NAME
  && MAYBE_CLASS_TYPE_P (parser->scope)
  && !constructor_name_p (cp_expr (next_token->u.value,
   
next_token->location),
parser->scope))

Instead of:

!(next_token->type == CPP_NAME
   && MAYBE_CLASS_TYPE_P (parser->scope)
   && constructor_name_p (cp_expr (next_token->u.value,
   
next_token->location),
parser->scope))

This meant a lot of things were being excluded that weren't supposed
to be. Oops!
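
The difference between the two conditions is easy to see by enumerating the three boolean terms. The function names below are invented for this sketch and the parameters simply stand for the three sub-tests in the condition.

```cpp
#include <cassert>

// The condition as first written: only the last term is negated.
bool mistaken(bool is_name, bool class_scope, bool ctor_name) {
  return is_name && class_scope && !ctor_name;
}

// The intended condition: the whole conjunction is negated.
bool intended(bool is_name, bool class_scope, bool ctor_name) {
  return !(is_name && class_scope && ctor_name);
}

// Count the input combinations on which the two forms give different answers.
int count_disagreements() {
  int n = 0;
  for (int bits = 0; bits < 8; ++bits) {
    bool a = (bits & 4) != 0;
    bool b = (bits & 2) != 0;
    bool c = (bits & 1) != 0;
    if (mistaken(a, b, c) != intended(a, b, c))
      ++n;
  }
  return n;
}
```

The two forms agree only when the first two terms are both true; in every other case the mistaken form rejects input that the intended form accepts.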

Re: [PATCH V3 0/6] Initial support for AVX512FP16

2021-09-22 Thread Joseph Myers

On Wed, 22 Sep 2021, Iain Sandoe wrote:

> However, note that the use of __MACH__ to guard the Mach-O code has been
> there for a long time (it is present in all open branches).  So it's possible 
> that has
> been silently doing the wrong thing for some time,

That's "wrong thing" as in having previously been suboptimal; the 
semantics of the function aliases would still have been correct (until the 
HFmode changes introduced a build failure), it would just have been less 
efficient for them to be wrappers rather than proper aliases.

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [PATCH] Objective-C: fix class_ro layout for non-LP64

2021-09-22 Thread Iain Sandoe

Hi Matt,

thanks for the patch.

> On 21 Sep 2021, at 23:29, Matt Jacobson via Gcc-patches 
>  wrote:
> 
> Fix class_ro layout for non-LP64.  

> On LP64, the requisite padding is added at a lower level.  

However, the behaviour is changed - the existing implementation is explicit 
about the fields and
clears the reserved ones (and, ISTR, that was based on what the gcc-4.2.1 
compiler did).
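
For reference, the layout question can be illustrated on the host with two toy structs, one relying on natural alignment and one with an explicit reserved word. These are simplified stand-ins written for this sketch (only the first fields of class_ro_t, with made-up struct and function names), not the types the compiler builds.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Layout relying on the ABI's natural alignment of the pointer member.
struct ClassRoImplicit {
  std::uint32_t flags;
  std::uint32_t instance_start;
  std::uint32_t instance_size;
  const std::uint8_t* ivar_layout;
};

// Layout that spells out a 32-bit reserved word before the pointer.
struct ClassRoExplicit {
  std::uint32_t flags;
  std::uint32_t instance_start;
  std::uint32_t instance_size;
  std::uint32_t reserved;
  const std::uint8_t* ivar_layout;
};

constexpr std::size_t implicit_offset() {
  return offsetof(ClassRoImplicit, ivar_layout);
}

constexpr std::size_t explicit_offset() {
  return offsetof(ClassRoExplicit, ivar_layout);
}

// True when both layouts put the pointer at the same offset.
bool layouts_agree() { return implicit_offset() == explicit_offset(); }
```

Where pointers are 8-byte aligned both structs place ivar_layout at offset 16, so the explicit field only changes what the padding bytes are called (and that they are initialized). With 4-byte or smaller pointer alignment the explicit field moves the pointer from offset 12 to 16, which is the incompatibility the patch addresses.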

(as an aside, in general, I have to test changes in the runtime across the 
range of supported
 Darwin versions, since there have been clang changes too).

> For non-LP64, this fixes binary compatibility with clang-built 
> classes/runtimes.

> Tested by examining the generated assembly for a class_ro in both cases (and 
> in 
> the case of clang), for both x86_64 (64-bit pointers) and AVR (16-bit 
> pointers).
> Tested by running a program on AVR with a GCC-built class using a clang-built 
> Objective-C runtime.  Tested by running a program on x86_64/Darwin with a GCC-
> built class and the clang-built system Objective-C runtime.
> 
> Patch also available at:
> 
> 
> I don't have commit access, so if this patch is suitable, I'd need someone 
> else 
> to commit it for me.  Thanks.

how about we keep the LP64 behaviour unchanged and amend the comments and 
ifdefs as below,
does this work for your case?
(typed into email and thus untested).

thanks
Iain

> gcc/objc/ChangeLog:
> 
> 2021-09-21  Matt Jacobson  
> 
>   * objc-next-runtime-abi-02.c (struct class_ro_t): Remove explicit 
> alignment 
>   padding.
>   (build_v2_class_templates): Remove explicit alignment padding.
>   (build_v2_class_ro_t_initializer): Adjust initializer.
> 
> 
> diff --git a/gcc/objc/objc-next-runtime-abi-02.c 
> b/gcc/objc/objc-next-runtime-abi-02.c
> index 42645e22316..c3af369ff0d 100644
> --- a/gcc/objc/objc-next-runtime-abi-02.c
> +++ b/gcc/objc/objc-next-runtime-abi-02.c
> @@ -632,9 +632,7 @@ struct class_ro_t
> uint32_t const flags;
> uint32_t const instanceStart;
> uint32_t const instanceSize;\

> -#ifdef __LP64__
> -uint32_t const reserved;
> -#endif
> +// [32 bits of reserved space here on LP64 platforms]
^revert this

> const uint8_t * const ivarLayout;
> const char *const name;
> const struct method_list_t * const baseMethods;
> @@ -677,11 +675,6 @@ build_v2_class_templates (void)
>   /* uint32_t const instanceSize; */
>   add_field_decl (integer_type_node, "instanceSize", );
> 
#ifdef __LP64__
/* For compatibility with existing implementations of the 64 bit NeXT
   library, explicitly describe reserved fields used for alignment
   padding.  */

>   /* uint32_t const reserved; */
>   add_field_decl (integer_type_node, "reserved", );
#endif
> 
>   /* const uint8_t * const ivarLayout; */
>   cnst_strg_type = build_pointer_type (unsigned_char_type_node);
>   add_field_decl (cnst_strg_type, "ivarLayout", );
> @@ -3225,12 +3218,6 @@ build_v2_class_ro_t_initializer (tree type, tree name,
>   CONSTRUCTOR_APPEND_ELT (initlist, NULL_TREE,
> build_int_cst (integer_type_node, instanceSize));
#ifdef __LP64__
/* For compatibility with existing implementations of the 64 bit NeXT
   library, ensure that reserved padding fields are 0-initialized.  */

>   CONSTRUCTOR_APPEND_ELT (initlist, NULL_TREE,
>   build_int_cst (integer_type_node, 0));
#endif
> 
>   /* ivarLayout */
>   unsigned_char_star = build_pointer_type (unsigned_char_type_node);
>   if (ivarLayout)

[PATCH] Overhaul jump thread state in forward threader.

2021-09-22 Thread Aldy Hernandez via Gcc-patches

I've been pulling state from across the forward jump threader into the
jt_state class, but it still didn't feel right.  The ultimate goal
was to keep track of candidate threading paths so that the simplifier
could simplify statements with the path as context.  This patch completes
the transition, while cleaning up a lot of things in the process.

I've revamped both state and the simplifier such that a base state class
contains only the blocks as they're registered, and any pass specific
knowledge is where it belongs... in the pass.  This allows VRP to keep
its const and copies business, and DOM to keep this as well as its evrp
client.  This makes the threader cleaner, as it will now have no knowledge
of either const/copies or evrp.

This also paves the way for the upcoming hybrid threader, which will
just derive the state class and provide almost nothing, since the ranger
doesn't need to register any equivalences or ranges as it folds.

There is some code duplication in the simplifier, since both the DOM and
VRP clients use a vr_values based simplifier, but this is temporary as
the VRP client is about to be replaced with a hybrid ranger.

For a better view of what this patch achieves, here are the base
classes:

class jt_state
{
public:
  virtual ~jt_state () { }
  virtual void push (edge);
  virtual void pop ();
  virtual void register_equiv (tree dest, tree src, bool update_range = false);
  virtual void register_equivs_edge (edge e);
  virtual void register_equivs_stmt (gimple *, basic_block,
				     class jt_simplifier *);
  virtual void record_ranges_from_stmt (gimple *stmt, bool temporary);
  void get_path (vec<basic_block> &);
  void append_path (basic_block);
  void dump (FILE *);
  void debug ();
private:
  auto_vec<basic_block> m_blocks;
};

class jt_simplifier
{
public:
  virtual ~jt_simplifier () { }
  virtual tree simplify (gimple *, gimple *, basic_block, jt_state *) = 0;
};
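
The split between a base state that only tracks the path and a derived state that adds pass-specific bookkeeping can be sketched with toy types. ToyEdge, PathState, EquivTrackingState and walk are names invented for this sketch; the real classes operate on GCC's edge, basic_block and tree types and do considerably more.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Toy stand-in for a CFG edge between numbered blocks.
struct ToyEdge { int src; int dest; };

// Base state: knows only about the blocks on the candidate path.
class PathState {
public:
  virtual ~PathState() {}
  virtual void push(const ToyEdge& e) {
    m_marks.push_back(m_blocks.size());
    m_blocks.push_back(e.dest);
  }
  // Precondition: there was a matching push.
  virtual void pop() {
    m_blocks.resize(m_marks.back());
    m_marks.pop_back();
  }
  // The base class has nothing to record for an equivalence.
  virtual void register_equiv(int /*dest*/, int /*src*/) {}
  std::vector<int> get_path() const { return m_blocks; }

private:
  std::vector<int> m_blocks;
  std::vector<std::size_t> m_marks;
};

// Derived state: keeps pass-specific equivalences and unwinds them in
// step with the path.
class EquivTrackingState : public PathState {
public:
  void push(const ToyEdge& e) override {
    PathState::push(e);
    m_equiv_marks.push_back(m_equivs.size());
  }
  void pop() override {
    m_equivs.resize(m_equiv_marks.back());
    m_equiv_marks.pop_back();
    PathState::pop();
  }
  void register_equiv(int dest, int src) override {
    m_equivs.push_back(std::make_pair(dest, src));
  }
  std::size_t equiv_count() const { return m_equivs.size(); }

private:
  std::vector<std::pair<int, int> > m_equivs;
  std::vector<std::size_t> m_equiv_marks;
};

// A driver that sees only the base interface, as the threader would.
void walk(PathState& state, const std::vector<ToyEdge>& edges) {
  for (const ToyEdge& e : edges) {
    state.push(e);
    state.register_equiv(e.dest, e.src);
  }
}
```

The driver never needs to know which kind of state it was handed, which is the property that lets a client derive the state class and override almost nothing.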

There are no functional changes.

OK pending tests?

p.s. It's sad that this is starting to look clean, just in time to get
wiped out.

gcc/ChangeLog:

* tree-ssa-dom.c (class dom_jump_threader_simplifier): Rename...
(class dom_jt_state): ...this and provide virtual overrides.
(dom_jt_state::register_equiv): New.
(class dom_jt_simplifier): Rename from
dom_jump_threader_simplifier.
(dom_jump_threader_simplifier::simplify): Rename...
(dom_jt_simplifier::simplify): ...to this.
(pass_dominator::execute): Use dom_jt_simplifier and
dom_jt_state.
* tree-ssa-threadedge.c (jump_threader::jump_threader):
Clean-up.
(jt_state::register_equivs_stmt): Abstract out...
(jump_threader::record_temporary_equivalences_from_stmts_at_dest):
...from here.
(jump_threader::thread_around_empty_blocks): Update state.
(jump_threader::thread_through_normal_block): Same.
(jt_state::jt_state): Remove.
(jt_state::push): Remove pass specific bits.  Keep block vector
updated.
(jt_state::append_path): New.
(jt_state::pop): Remove pass specific bits.
(jt_state::register_equiv): Same.
(jt_state::record_ranges_from_stmt): Same.
(jt_state::register_equivs_on_edge): Same.  Rename...
(jt_state::register_equivs_edge):  ...to this.
(jt_state::dump): New.
(jt_state::debug): New.
(jump_threader_simplifier::simplify): Remove.
(jt_state::get_path): New.
* tree-ssa-threadedge.h (class jt_simplifier): Make into a base
class.  Expose common functionality as virtual methods.
(class jump_threader_simplifier): Same.  Rename...
(class jt_simplifier): ...to this.
* tree-vrp.c (class vrp_jump_threader_simplifier): Rename...
(class vrp_jt_simplifier): ...to this. Provide pass specific
overrides.
(class vrp_jt_state): New.
(vrp_jump_threader_simplifier::simplify): Rename...
(vrp_jt_simplifier::simplify): ...to this.  Inline code from
what used to be the base class.
(vrp_jump_threader::vrp_jump_threader): Use vrp_jt_state and
vrp_jt_simplifier.
---
 gcc/tree-ssa-dom.c| 134 ++--
 gcc/tree-ssa-threadedge.c | 322 --
 gcc/tree-ssa-threadedge.h |  51 +++---
 gcc/tree-vrp.c|  81 --
 4 files changed, 351 insertions(+), 237 deletions(-)

diff --git a/gcc/tree-ssa-dom.c b/gcc/tree-ssa-dom.c
index 49d8f96408f..f58b6b78a41 100644
--- a/gcc/tree-ssa-dom.c
+++ b/gcc/tree-ssa-dom.c
@@ -585,31 +585,137 @@ record_edge_info (basic_block bb)
 }
 }
 
-class dom_jump_threader_simplifier : public jump_threader_simplifier
+class dom_jt_state : public jt_state
 {
 public:
-  dom_jump_threader_simplifier (vr_values *v,
-   avail_exprs_stack *avails)
-: jump_threader_simplifier (v), m_avail_exprs_stack (avails) { }
+  dom_jt_state (const_and_copies

Re: [Patch] Fortran: Improve -Wmissing-include-dirs warnings [PR55534]

2021-09-22 Thread Harald Anlauf via Gcc-patches


Hi Tobias,

Am 21.09.21 um 21:22 schrieb Tobias Burnus:

While the previous patch fixed -Wno-missing-include-dirs and sorted
out some inconsistencies with libcpp warnings, it had two issues:

* Some superfluous warnings were printed, e.g. for
     gfortran nonexisting/file.f90
   there was a warning about include path "nonexisting" not existing
   and twice the error that the "nonexisting/file.f90" could not be
   read.

* At least as invoked when building GCC or when running the GCC testsuite,
   the passed -B, -isystem etc. arguments lead to proper but pointless
   diagnostics about 'finclude' or other directories not being found,
   causing excess-error FAILs and -Werror build failures.

While the latter could be fixed by adding -Wno-missing-include-dirs,
it still felt like the wrong approach.


I concur.


While the testsuite does run for me, others reported that they do
see missing-include-dirs warnings. Instead of adding a bunch of
-Wno-missing-include-dirs to the test config, I now only warn for
-I and -J by default (similar to the previous state) and only emit the
full set of warnings when the user explicitly passes
-Wmissing-include-dirs. The Fortran behavior is now also properly
documented in the manual.


I had actually only looked at my use cases (-I, -J), which worked
as expected.

What I find a bit confusing - from the viewpoint of a user - is the
case of using the preprocessor (-cpp), as one gets e.g.

: Warning: ./no/such/dir: No such file or directory 
[-Wmissing-include-dirs]


while without -cpp:

f951: Warning: Nonexistent include directory './no/such/dir/' 
[-Wmissing-include-dirs]


If you feel like me that the printing "" should be
documented, feel free to do so.  I failed to find it.

(In some build setups the users do not normally see the actual
command line invoking the compiler, and they have to guess.)


In order to handle the silencing of the diagnostic and to avoid
double output via the Fortran code and libcpp (or rather: gcc/incpath.c),
I had to add some not that clean and obvious diagnostic flags.
I hope they still make sense and are somewhat readable.

OK? Comments?


I think that's good to go.  Even better if you have a user-friendly
solution to my above comment.

Thanks for the patch!

Harald


Tobias

PS: There is also some inconsistency in whether fprintf to stderr or
gfc_error is used. All calls in load_file could be fatal errors,
as all exit with an error, and similar issues get different
error messages for no good reason. I have not tried to solve
this issue, but can do so in a follow-up patch if deemed reasonable.


Re: [PATCH] assert that deleting by pointer to base in unique_ptr does not cause UB

2021-09-22 Thread Antony Polukhin via Gcc-patches

ср, 22 сент. 2021 г. в 20:44, Jonathan Wakely :
>
> On Wed, 22 Sept 2021 at 18:09, Antony Polukhin wrote:
> >
> > std::unique_ptr allows construction from std::unique_ptr of derived
> > type as per [unique.ptr.single.asgn] and [unique.ptr.single.ctor]. If
> > std::default_delete is used with std::unique_ptr, then after such
> > construction a delete is called on a pointer to base. According to
> > [expr.delete] calling a delete on a non similar object without a
> > virtual destructor is undefined behavior.
> >
> > This patch turns that undefined behavior into static assertions inside
> > std::unique_ptr.
>
> The undefined behaviour only happens if the destructor is actually
> reached at runtime, but won't these static assertions make it
> ill-formed to instantiate these members, even if the UB never happens?
>
> For example, if you ensure that release() is called before
> destruction, the undefined delete never happens.

Ugh... I've missed that use case. Patch is just wrong, discard it
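
For the record, a minimal sketch of the use case Jonathan describes, with made-up types Base and Derived and a made-up function name: the conversion is instantiated, but ownership is taken back with release() so the problematic delete is never reached.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

struct Base { int x = 1; };             // no virtual destructor
struct Derived : Base { int y = 2; };

// Converting unique_ptr<Derived> to unique_ptr<Base> is allowed.  Letting
// the unique_ptr<Base> destroy the object would delete a Derived through a
// Base* without a virtual destructor, so ownership is handed back to a
// unique_ptr<Derived> before that can happen.
int convert_then_release() {
  std::unique_ptr<Derived> d(new Derived);
  std::unique_ptr<Base> b = std::move(d);
  Derived* raw = static_cast<Derived*>(b.release());
  std::unique_ptr<Derived> owner(raw);
  return owner->x + owner->y;           // object is deleted as a Derived
}
```

A static assertion in the converting constructor would reject this function even though it never performs the undefined delete.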

-- 
Best regards,
Antony Polukhin

Re: [PATCH] assert that deleting by pointer to base in unique_ptr does not cause UB

2021-09-22 Thread Ville Voutilainen via Gcc-patches

On Wed, 22 Sept 2021 at 20:49, Antony Polukhin  wrote:
>
> On Wed, 22 Sept 2021 at 20:23, Ville Voutilainen wrote:
> >
> > On Wed, 22 Sept 2021 at 20:09, Antony Polukhin via Libstdc++
> >  wrote:
> > >
> > > std::unique_ptr allows construction from std::unique_ptr of derived
> > > type as per [unique.ptr.single.asgn] and [unique.ptr.single.ctor]. If
> > > std::default_delete is used with std::unique_ptr, then after such
> > > construction a delete is called on a pointer to base. According to
> > > [expr.delete] calling a delete on a non similar object without a
> > > virtual destructor is an undefined behavior.
> > >
> > > This patch turns that undefined behavior into static assertions inside
> > > std::unique_ptr.
> >
> > I don't understand the sizeof(_Tp) == sizeof(_Up) part in the
> > static_assert. I fail to see how
> > a same-size check suggests that the types are similar enough that a
> > delete-expression works.
>
> I used the following logic:
> [unique.ptr.single.*] sections have the constraint that
> "unique_ptr::pointer is implicitly convertible to pointer".
> There's already a static assert that T in unique_ptr is not void,
> so U either has to be the same type T, or a type derived from T. If a
> derived type adds members, then size changes and types are not similar
> as the decompositions won't have the qualification-decompositions with
> the same n.

Right, but the delete-expression on a non-polymorphic type where the
static type and the dynamic
type are different is UB regardless of whether the derived type adds members.
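
To make the point above concrete, here is a minimal sketch with hypothetical types `A` and `B` (assumptions for illustration, not taken from the patch). It shows that a size comparison can succeed while the two types are still distinct and the base still has no virtual destructor, so the size check by itself says nothing about whether the delete-expression is well defined. The size equality holds on common ABIs but is not something the standard promises.

```cpp
#include <type_traits>

namespace similar_demo {
struct A { };
struct B : A { };   // derived, but adds no members

// The property the proposed static_assert relies on.
constexpr bool same_size = sizeof(A) == sizeof(B);

// The properties that actually matter for deleting a B through an A*.
constexpr bool same_type = std::is_same<A, B>::value;
constexpr bool base_has_virtual_dtor = std::has_virtual_destructor<A>::value;
}  // namespace similar_demo
```

With these definitions, `same_size` is expected to be true on typical implementations while `same_type` and `base_has_virtual_dtor` are both false, which is exactly the combination where deleting through the base pointer is still undefined.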

Re: [PATCH] Simplify paradoxical subreg extensions of TRUNCATE

2021-09-22 Thread Jeff Law via Gcc-patches





On 9/21/2021 6:54 AM, Roger Sayle wrote:

> That define_insn is making my eyes bleed!  I think that's the most convincing
> argument I've ever read on gcc-patches, and I can see now what Segher is so
> opposed to.

Then you haven't seen enough patterns or your eyes haven't toughened up
through the years :-)



Jeff

Re: [PATCH] assert that deleting by pointer to base in unique_ptr does not cause UB

2021-09-22 Thread Antony Polukhin via Gcc-patches

On Wed, 22 Sept 2021 at 20:23, Ville Voutilainen wrote:
>
> On Wed, 22 Sept 2021 at 20:09, Antony Polukhin via Libstdc++
>  wrote:
> >
> > std::unique_ptr allows construction from std::unique_ptr of derived
> > type as per [unique.ptr.single.asgn] and [unique.ptr.single.ctor]. If
> > std::default_delete is used with std::unique_ptr, then after such
> > construction a delete is called on a pointer to base. According to
> > [expr.delete] calling a delete on a non similar object without a
> > virtual destructor is an undefined behavior.
> >
> > This patch turns that undefined behavior into static assertions inside
> > std::unique_ptr.
>
> I don't understand the sizeof(_Tp) == sizeof(_Up) part in the
> static_assert. I fail to see how
> a same-size check suggests that the types are similar enough that a
> delete-expression works.

I used the following logic:
[unique.ptr.single.*] sections have the constraint that
"unique_ptr::pointer is implicitly convertible to pointer".
There's already a static assert that T in unique_ptr is not void,
so U either has to be the same type T, or a type derived from T. If a
derived type adds members, then size changes and types are not similar
as the decompositions won't have the qualification-decompositions with
the same n.

-- 
Best regards,
Antony Polukhin

Re: [PATCH] Simplify paradoxical subreg extensions of TRUNCATE

2021-09-22 Thread Jeff Law via Gcc-patches





On 9/21/2021 6:01 AM, Richard Sandiford via Gcc-patches wrote:

[Using this as a convenient place to reply to the thread as a whole]

Richard Biener via Gcc-patches  writes:

On Mon, Sep 6, 2021 at 12:15 PM Segher Boessenkool
 wrote:

On Sun, Sep 05, 2021 at 11:28:30PM +0100, Roger Sayle wrote:

This patch simplifies the RTX (subreg:HI (truncate:QI (reg:SI))) as
(truncate:HI (reg:SI)), and closely related variants.

Subregs of other than regs are undefined in RTL.  You will first have to
define this (in documentation as well as in other code that handles
subregs).  I doubt this is possible to do, subreg have so many
overloaded meanings already.

I suppose (subreg:MODE1 (xyz:MODE2 ..)) where xyz is not REG or MEM
is equal to

   (set (reg:MODE2) (xyz:MODE2 ..))
   (subreg:MODE1 (reg:MODE2) ...)

with 'reg' being a pseudo reg is the (only?) sensible way of defining it.

Agreed.  And I think that's already the de facto definition (and has been
for a long time).  Subreg as an operation has to have defined semantics
for all the cases that simplify_subreg handles, otherwise we have GIGO
and a lot of the function could be deleted.  We can (and do) choose
to prevent some of those operations becoming actual rtxes, but even there,
the de facto rules are quite broad.  E.g.:

- Like you said later, simplify_gen_subreg is opt-out rather than opt-in
   in terms of the subreg rtxes that it's prepared to create.

- Even rs6000.md has:

 (define_insn "*mul3_highpart"
   [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
 (subreg:GPR
   (mult: (any_extend:
   (match_operand:GPR 1 "gpc_reg_operand" "r"))
 (any_extend:
   (match_operand:GPR 2 "gpc_reg_operand" "r")))
  0))]
   "WORDS_BIG_ENDIAN && !(mode == SImode && TARGET_POWERPC64)"
   "mulh %0,%1,%2"
   [(set_attr "type" "mul")
(set_attr "size" "")])

   Many other ports have similar patterns.

The problem with “combine can generate invalid rtl but backends
should reject it” is that, generally, people write combine patterns
by looking at what combine _wants_ to generate and then writing
.md patterns to match that.  In other words, combine in practice
defines the (de facto) correct rtl representation of a combined
sequence.

Yup.  In fact, I'm having this exact concern with an internal chunk of
work right now :-)





Given:

Trying 10 -> 15:
10: r29:QI=trunc(r32:SI)
   REG_DEAD r32:SI
15: r38:HI=r29:QI#0
   REG_DEAD r29:QI
Failed to match this instruction:
(set (reg:HI 38)
 (subreg:HI (truncate:QI (reg:SI 32)) 0))

I'm sure there's a temptation to add an .md pattern that matches
the subreg. :-)

And that's why my stated position is that any subreg in a backend
pattern needs to be justified.  Obviously all kinds exist, but when I
see them I ask for a justification.
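
As a rough illustration of the simplification discussed in this thread, the sketch below models the modes with fixed-width integers. It is not GCC code: the function names are hypothetical, and it assumes the subreg selects the low part of the value. The only bits a paradoxical (subreg:HI (truncate:QI x)) defines are the low 8, and the sketch checks that (truncate:HI x) agrees with them.

```cpp
#include <cstdint>

namespace subreg_model {
// Models (truncate:QI (reg:SI)): keep the low 8 bits.
constexpr std::uint8_t truncate_qi(std::uint32_t si) {
  return static_cast<std::uint8_t>(si);
}

// Models (truncate:HI (reg:SI)): keep the low 16 bits.
constexpr std::uint16_t truncate_hi(std::uint32_t si) {
  return static_cast<std::uint16_t>(si);
}

// True when the HImode truncation matches the QImode truncation on the
// 8 bits that the paradoxical subreg actually defines.
constexpr bool low_bits_agree(std::uint32_t si) {
  return static_cast<std::uint8_t>(truncate_hi(si)) == truncate_qi(si);
}
}  // namespace subreg_model
```

The upper 8 bits of the HImode result are unconstrained in the subreg form, so any value there is acceptable; the model only demonstrates agreement on the defined bits and says nothing about whether such a subreg should be valid RTL in the first place.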


Jeff

Re: [PATCH] assert that deleting by pointer to base in unique_ptr does not cause UB

2021-09-22 Thread Jonathan Wakely via Gcc-patches

On Wed, 22 Sept 2021 at 18:09, Antony Polukhin wrote:
>
> std::unique_ptr allows construction from std::unique_ptr of derived
> type as per [unique.ptr.single.asgn] and [unique.ptr.single.ctor]. If
> std::default_delete is used with std::unique_ptr, then after such
> construction a delete is called on a pointer to base. According to
> [expr.delete] calling a delete on a non similar object without a
> virtual destructor is an undefined behavior.
>
> This patch turns that undefined behavior into static assertions inside
> std::unique_ptr.

The undefined behaviour only happens if the destructor is actually
reached at runtime, but won't these static assertions make it
ill-formed to instantiate these members, even if the UB never happens?

For example, if you ensure that release() is called before
destruction, the undefined delete never happens.
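
A minimal sketch of the use case described above, using hypothetical types `Base` and `Derived` (not part of the patch). The `unique_ptr<Base>` is constructed from a `unique_ptr<Derived>`, but ownership is released before its destructor runs, so no delete-expression is ever evaluated on a `Base*`. With the proposed static assertions this valid code would presumably be rejected at compile time.

```cpp
#include <cassert>
#include <memory>
#include <utility>

namespace release_demo {
int destroyed = 0;   // counts Derived destructor calls

struct Base { int id = 1; };          // deliberately no virtual destructor
struct Derived : Base {
  int extra = 2;
  ~Derived() { ++destroyed; }
};

// Converts to unique_ptr<Base>, then takes ownership back before the
// unique_ptr<Base> destructor could run `delete` on a Base*.
inline int run() {
  std::unique_ptr<Derived> d(new Derived);
  std::unique_ptr<Base> b(std::move(d));   // the converting constructor
  Base* raw = b.release();                 // b no longer owns the object
  assert(b.get() == nullptr);
  delete static_cast<Derived*>(raw);       // deleted through its dynamic type
  return destroyed;
}
}  // namespace release_demo
```

Calling `release_demo::run()` destroys the object exactly once, through a `Derived*`, so the behaviour is well defined even though `Base` has no virtual destructor.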

Re: [PATCH] assert that deleting by pointer to base in unique_ptr does not cause UB

2021-09-22 Thread Ville Voutilainen via Gcc-patches

On Wed, 22 Sept 2021 at 20:09, Antony Polukhin via Libstdc++
 wrote:
>
> std::unique_ptr allows construction from std::unique_ptr of derived
> type as per [unique.ptr.single.asgn] and [unique.ptr.single.ctor]. If
> std::default_delete is used with std::unique_ptr, then after such
> construction a delete is called on a pointer to base. According to
> [expr.delete] calling a delete on a non similar object without a
> virtual destructor is an undefined behavior.
>
> This patch turns that undefined behavior into static assertions inside
> std::unique_ptr.

I don't understand the sizeof(_Tp) == sizeof(_Up) part in the
static_assert. I fail to see how
a same-size check suggests that the types are similar enough that a
delete-expression works.

[PATCH] assert that deleting by pointer to base in unique_ptr does not cause UB

2021-09-22 Thread Antony Polukhin via Gcc-patches

std::unique_ptr allows construction from std::unique_ptr of derived
type as per [unique.ptr.single.asgn] and [unique.ptr.single.ctor]. If
std::default_delete is used with std::unique_ptr, then after such
construction a delete is called on a pointer to base. According to
[expr.delete] calling a delete on a non similar object without a
virtual destructor is an undefined behavior.

This patch turns that undefined behavior into static assertions inside
std::unique_ptr.

Changelog:
* include/bits/unique_ptr.h: Add static asserts that
deleting by pointer to base in unique_ptr does not cause UB
* testsuite/20_util/unique_ptr/assign/slicing_neg.cc:
New test.


-- 
Best regards,
Antony Polukhin
diff --git a/libstdc++-v3/include/bits/unique_ptr.h 
b/libstdc++-v3/include/bits/unique_ptr.h
index 6e55375..53a68f5 100644
--- a/libstdc++-v3/include/bits/unique_ptr.h
+++ b/libstdc++-v3/include/bits/unique_ptr.h
@@ -339,7 +339,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
is_convertible<_Ep, _Dp>>::type>>
unique_ptr(unique_ptr<_Up, _Ep>&& __u) noexcept
: _M_t(__u.release(), std::forward<_Ep>(__u.get_deleter()))
-   { }
+   {
+ static_assert(!is_same<_Dp, default_delete<_Tp>>::value
+   || has_virtual_destructor::type>::value
+   || sizeof(_Tp) == sizeof(_Up),
+   "type of pointer owned by __u must be similar to the type of 
pointer "
+   "owned by this object or the latter must have a virtual 
destructor");
+   }
 
 #if _GLIBCXX_USE_DEPRECATED
 #pragma GCC diagnostic push
@@ -385,6 +391,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   unique_ptr&>::type
operator=(unique_ptr<_Up, _Ep>&& __u) noexcept
{
+ static_assert(!is_same<_Dp, default_delete<_Tp>>::value
+   || has_virtual_destructor::type>::value
+   || sizeof(_Tp) == sizeof(_Up),
+   "type of pointer owned by __u must be similar to the type of 
pointer "
+   "owned by this object or the latter must have a virtual 
destructor");
+
  reset(__u.release());
  get_deleter() = std::forward<_Ep>(__u.get_deleter());
  return *this;
diff --git a/libstdc++-v3/testsuite/20_util/unique_ptr/assign/slicing_neg.cc 
b/libstdc++-v3/testsuite/20_util/unique_ptr/assign/slicing_neg.cc
new file mode 100644
index 000..e93483a
--- /dev/null
+++ b/libstdc++-v3/testsuite/20_util/unique_ptr/assign/slicing_neg.cc
@@ -0,0 +1,86 @@
+// { dg-do compile { target c++11 } }
+// { dg-prune-output "virtual destructor" }
+
+// Copyright (C) 2021 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// .
+
+#include 
+
+struct A { };
+struct B : A { };
+struct C : B { int i; };
+
+struct Ac { char c; };
+struct Bc : Ac { };
+struct Cc : Bc { short s; };
+
+
+void test01()
+{
+  std::unique_ptr upB;
+
+  std::unique_ptr cA;
+  cA = std::move(upB);
+
+  std::unique_ptr vA;
+  vA = std::move(upB);
+
+  std::unique_ptr cvA;
+  cvA = std::move(upB);
+}
+
+void test02()
+{
+  std::unique_ptr upC;
+
+  std::unique_ptr cA{std::move(upC)};  // { dg-error "required from 
here" }
+  cA = std::move(upC);  // { dg-error "required from here" }
+
+  std::unique_ptr vA{std::move(upC)};  // { dg-error "required 
from here" }
+  vA = std::move(upC);  // { dg-error "required from here" }
+
+  std::unique_ptr cvA{std::move(upC)};  // { dg-error 
"required from here" }
+  cvA = std::move(upC);  // { dg-error "required from here" }
+}
+
+void test03()
+{
+  std::unique_ptr upB;
+
+  std::unique_ptr cA;
+  cA = std::move(upB);
+
+  std::unique_ptr vA;
+  vA = std::move(upB);
+
+  std::unique_ptr cvA;
+  cvA = std::move(upB);
+}
+
+void test04()
+{
+  std::unique_ptr upC;
+
+  std::unique_ptr cA{std::move(upC)};  // { dg-error "required from 
here" }
+  cA = std::move(upC);  // { dg-error "required from here" }
+
+  std::unique_ptr vA{std::move(upC)};  // { dg-error "required 
from here" }
+  vA = std::move(upC);  // { dg-error "required from here" }
+
+  std::unique_ptr cvA{std::move(upC)};  // { dg-error 
"required from here" }
+  cvA = std::move(upC);  // { dg-error "required from here" }
+}

Re: PING**2 – Re: [Patch] Fortran: Handle allocated() with coindexed scalars [PR93834] (was: [PATCH] PR fortran/93834 - [9/10/11/12 Regression] ICE in trans_caf_is_present, at fortran/trans-intrinsic.

2021-09-22 Thread Tobias Burnus


(1) PING**2

(2) However, as it causes test-suite failures for others,* the
 https://gcc.gnu.org/pipermail/gcc-patches/2021-September/579965.html
 [Patch] Fortran: Improve -Wmissing-include-dirs warnings [PR55534]
is probably more important.

[* I don't see it locally, probably because my build uses and finds
directories from the install dir; however, others see it (hundreds of fails),
including the regression tracker at
https://gcc.gnu.org/pipermail/gcc-patches/2021-September/579963.html ]

(3) Also pending is
  https://gcc.gnu.org/pipermail/gcc-patches/2021-September/579903.html
  [Patch] Fortran: Fix assumed-size to assumed-rank passing [PR94070]
(Thanks Thomas for reviewing the auxiliary loop patch :-)

Tobias

PS: Also pending is the GFC <-> CFI descriptor conversion patch, but expect
a revised patch soon. (Fixes found issues, uses aux loop function, fixes
contiguous attribute handling, len=* with assumed-size/explicit-size arrays,
...)

On 16.09.21 14:26, Tobias Burnus wrote:

Patch PING – see comment in the follow-up email of the patch email -
and in the email(s) before in that thread.

Tobias

On 07.09.21 16:33, Tobias Burnus wrote:

Now I actually tested the patch – and fixed some issues.

OK? – It does add support for 'allocated(a[i])' by treating
it as 'allocated(a)', as 'a' must be collectively allocated
("established") on all images of the team.*

'a[i]' is (probably) an allocatable, following Malcolm in
answer to my question to the J3-list as linked below.

Tobias

* Ignoring issues related to failed images. It could
also be handled by fetching 'a' from the remote
image, but I am not sure that's better in terms of
handling failed images.

PS:
On 07.09.21 10:02, Tobias Burnus wrote:

Hi Harald,

I spent about two hours on this yesterday. Now I am still
tired but understand more. I think the confusion between the
two of us is due to wording and the directions in which our
thoughts then went:


Talking about coindexed, all of a[i], b[i]%c and c%d[i] are
coindexed and there are many constraints like "shall not be
a coindexed variable" – which then rejects all of those.
That's what I was thinking of.

I think your starting point is that while ('a' = allocatable)
  a, b%a, c[5]%d(1)%a
are ALLOCATABLE, adding a subobject reference such as
  a(:), b%a(:,:), c[5]%d(1)%a(:,:,:)
makes the variable no longer allocatable.
I think that's what you were thinking of.

We then both argued along those different lines, which caused
the confusion as we both thought we were talking about the same thing.


While those cases are clear, the question is whether
  a[i] or b%a[i]
is allocatable or not – assuming that 'a' is a scalar.
(For an array, '(:)' has to appear before the image-selector,
which in turn makes it nonallocatable.)


I tried to pinpoint the words for this in the standard – and
failed. I think I need a "how to read the Fortran standard" 101
and some long time actually reading it :-(

Malcolm has answered me – and he believes (but only offhand) that
  a[i]  and  b%a[i]
_are_ allocatable. See (6) at
https://mailman.j3-fortran.org/pipermail/j3/2021-September/013322.html


This implies that
  if ( allocated (a[i]) .and. allocated (b%a[i]) ) stop 1
is valid.

However, I do note that coarray allocatables have to be collectively
(de)allocated, therefore
  allocated (a[i]) .and. allocated (b%a[i])
is equivalent to
  allocated (a) .and. allocated (b%a)
at least assuming that no image has failed.


First: Does this answer all the questions you had and resolve the
confusion?
Secondly, do you agree about the last bits of the analysis?
Thirdly, what do you think of the attached patch?

Tobias


Re: [PATCH] rs6000: Add psabi diagnostic for C++ zero-width bit field ABI change (PR102024)

2021-09-22 Thread Jakub Jelinek via Gcc-patches

On Wed, Sep 22, 2021 at 05:02:15PM +0200, Jakub Jelinek via Gcc-patches wrote:
> > > > @@ -6298,7 +6298,8 @@ rs6000_aggregate_candidate (const_tree type, 
> > > > machine_mode *modep,
> > > >   return -1;
> > > > count = rs6000_aggregate_candidate (TREE_TYPE (type), modep,
> > > > -   empty_base_seen);
> > > > +   empty_base_seen,
> > > > +   zero_width_bf_seen);
> > > > if (count == -1
> > > > || !index
> > > > || !TYPE_MAX_VALUE (index)
> > > > @@ -6336,6 +6337,12 @@ rs6000_aggregate_candidate (const_tree type, 
> > > > machine_mode *modep,
> > > > if (TREE_CODE (field) != FIELD_DECL)
> > > >   continue;
> > > > +   if (DECL_FIELD_CXX_ZERO_WIDTH_BIT_FIELD (field))
> > > > + {
> > > > +   *zero_width_bf_seen = 1;
> > > > +   continue;
> > > > + }
> 
> So, from what you wrote, the intent in the ppc* psABIs is that :0 is not
> ignored, right?
> In that case I don't really understand the above (the continue in
> particular).  Because the continue means it is ignored for C++ and not
> ignored for C, so basically you return to the 4.5-11 ABI incompatibility
> between C and C++.
> C++ :0 will have DECL_FIELD_CXX_ZERO_WIDTH_BIT_FIELD set, C :0 will not...

To be more precise, I'd expect most targets not to want to use
DECL_FIELD_CXX_ZERO_WIDTH_BIT_FIELD at all for the actual ABI decisions.
I.e. do:
  if (TREE_CODE (field) != FIELD_DECL)
    continue;
  if (DECL_BIT_FIELD (field) && integer_zerop (DECL_SIZE (field)))
    {
      // :0
      // in some psABIs, ignore it, i.e. continue;
      // in other psABIs, take them into account, i.e. do nothing.
    }
and use DECL_FIELD_CXX_ZERO_WIDTH_BIT_FIELD only for the -Wpsabi purposes.

The only exception would be for targets that decide to keep GCC 4.5-11
compatibility, with C remaining incompatible with C++.
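
The suggestion above can be modelled outside of GCC roughly as follows. This is only a sketch: `Field`, `ZeroWidthPolicy` and `classify` are hypothetical names standing in for FIELD_DECL handling, not real GCC interfaces. The ABI decision looks only at "is a bit-field of size zero", which is the same for C and C++, and the C++-specific flag feeds nothing but the diagnostic.

```cpp
#include <vector>

namespace zwbf_model {
// Hypothetical stand-in for a FIELD_DECL; not a GCC type.
struct Field {
  bool is_bit_field;         // models DECL_BIT_FIELD
  unsigned size_in_bits;     // models DECL_SIZE
  bool cxx_zero_width_flag;  // models DECL_FIELD_CXX_ZERO_WIDTH_BIT_FIELD
};

// Per-psABI choice: either skip :0 fields or take them into account.
enum class ZeroWidthPolicy { Ignore, TakeIntoAccount };

struct Result {
  int fields_considered;     // fields that influence the ABI decision
  bool saw_cxx_zero_width;   // input for a -Wpsabi style note only
};

inline Result classify(const std::vector<Field>& fields,
                       ZeroWidthPolicy policy) {
  Result r{0, false};
  for (const Field& f : fields) {
    if (f.cxx_zero_width_flag)
      r.saw_cxx_zero_width = true;   // diagnostics only, never the decision
    const bool zero_width = f.is_bit_field && f.size_in_bits == 0;
    if (zero_width && policy == ZeroWidthPolicy::Ignore)
      continue;
    ++r.fields_considered;
  }
  return r;
}
}  // namespace zwbf_model
```

Because `fields_considered` never depends on `cxx_zero_width_flag`, the same struct yields the same answer whether it was parsed as C or as C++, which is the compatibility property the message argues for.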

Jakub

[PATCH 2/2] arm: implement -mbranch-protection command line option

2021-09-22 Thread Andrea Corallo via Gcc-patches

Hi all,

This is the second patch of a series that enables Armv8.1-M in GCC, adding
the Branch Target Identification Mechanism [1].

This patch implements the -mbranch-protection option. Possible values
are "none", "bti" and "standard".

When the provided value is "bti" or "standard" the bti pass is run.  By
default the pass is off.
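
The mapping from option value to pass state described above can be sketched as a standalone function. The names `BranchProtection` and `parse_branch_protection` below are hypothetical; the patch itself sets a global flag and reports invalid values through GCC's error machinery, which this sketch replaces with an `Invalid` enumerator.

```cpp
#include <cassert>
#include <cstring>

namespace bp_model {
enum class BranchProtection { Invalid, None, Bti };

// Maps the option argument to a state: "none" turns the pass off,
// "bti" and "standard" turn it on, anything else is rejected.
inline BranchProtection parse_branch_protection(const char* value) {
  assert(value != nullptr);
  if (std::strcmp(value, "none") == 0)
    return BranchProtection::None;
  if (std::strcmp(value, "bti") == 0 || std::strcmp(value, "standard") == 0)
    return BranchProtection::Bti;
  return BranchProtection::Invalid;
}
}  // namespace bp_model
```

Returning a distinct `Invalid` value keeps the sketch free of side effects; a caller would decide how to diagnose it.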

Regression tested and bootstrapped on arm-linux-gnu and aarch64-linux-gnu.

Best Regards

  Andrea

[1] 


>From aec6bfd6d65fc4b5675dcc89417bc2612dd719cd Mon Sep 17 00:00:00 2001
From: Andrea Corallo 
Date: Wed, 8 Sep 2021 18:10:15 +0200
Subject: [PATCH 2/2] arm: implement -mbranch-protection command line option

gcc/ChangeLog

2021-09-15  Andrea Corallo  

* doc/invoke.texi (-mbranch-protection): Document.

* config/arm/arm.opt (-mbranch-protection): Add option.

* config/arm/arm.h (TARGET_HAVE_PACBTI): New macro.

* config/arm/arm.c (arm_parse_branch_protection): New function.
(arm_configure_build_target): Invoke 'arm_parse_branch_protection'
+ verify 'arm_enable_bti'.

* config/arm/arm.c (arm_file_start): Set 'Tag_BTI_extension'
'Tag_BTI_use' attribute.
---
 gcc/config/arm/arm.c   | 30 +-
 gcc/config/arm/arm.h   |  4 
 gcc/config/arm/arm.opt |  7 +++
 gcc/doc/invoke.texi|  8 
 4 files changed, 48 insertions(+), 1 deletion(-)

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index b62db21a734..75b9b03d680 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -3172,6 +3172,25 @@ static sbitmap isa_all_fpubits_internal;
 static sbitmap isa_all_fpbits;
 static sbitmap isa_quirkbits;
 
+static void
+arm_parse_branch_protection (const char *str)
+{
+  if (!strcmp(str, "none"))
+{
+  arm_enable_bti = false;
+  return;
+}
+
+  if (!strcmp(str, "bti")
+  || !strcmp(str, "standard"))
+{
+  arm_enable_bti = true;
+  return;
+}
+
+  error ("invalid -mbranch-protection option: %qs", str);
+}
+
 /* Configure a build target TARGET from the user-specified options OPTS and
OPTS_SET.  If WARN_COMPATIBLE, emit a diagnostic if both the CPU and
architecture have been specified, but the two are not identical.  */
@@ -3200,6 +3219,9 @@ arm_configure_build_target (struct arm_build_target 
*target,
   arch_opts = strchr (opts->x_arm_arch_string, '+');
 }
 
+  if (opts->x_arm_branch_protection_string)
+arm_parse_branch_protection (opts->x_arm_branch_protection_string);
+
   if (opts->x_arm_cpu_string)
 {
   arm_selected_cpu = arm_parse_cpu_option_name (all_cores, "-mcpu",
@@ -28266,6 +28288,12 @@ arm_file_start (void)
arm_emit_eabi_attribute ("Tag_ABI_FP_16bit_format", 38,
 (int) arm_fp16_format);
 
+  if (arm_enable_bti)
+   {
+ arm_emit_eabi_attribute ("Tag_BTI_extension", 52, 1);
+ arm_emit_eabi_attribute ("Tag_BTI_use", 52, 1);
+   }
+
   if (arm_lang_output_object_attributes_hook)
arm_lang_output_object_attributes_hook();
 }
@@ -32802,7 +32830,7 @@ arm_fusion_enabled_p (tune_params::fuse_ops op)
 bool
 aarch_bti_enabled (void)
 {
-  return false; // FIXME
+  return arm_enable_bti;
 }
 
 /* Check if INSN is a BTI J insn.  */
diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index 015299c1534..31b685f081d 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -628,6 +628,10 @@ extern const int arm_arch_cde_coproc_bits[];
LOB (low overhead branch features) extension instructions.  */
 #define TARGET_HAVE_LOB (arm_arch8_1m_main)
 
+/* Nonzero if this chip provides Armv8.1-M Mainline
+   PAC-BTI extension instructions.  */
+#define TARGET_HAVE_PACBTI (arm_arch8_1m_main)
+
 /* Define this macro if it is advisable to hold scalars in registers
in a wider mode than that declared by the program.  In such cases,
the value is constrained to be within the bounds of the declared
diff --git a/gcc/config/arm/arm.opt b/gcc/config/arm/arm.opt
index af478a946b2..782d1c23484 100644
--- a/gcc/config/arm/arm.opt
+++ b/gcc/config/arm/arm.opt
@@ -54,6 +54,9 @@ Enum(arm_abi_type) String(iwmmxt) Value(ARM_ABI_IWMMXT)
 EnumValue
 Enum(arm_abi_type) String(aapcs-linux) Value(ARM_ABI_AAPCS_LINUX)
 
+TargetVariable
+bool arm_enable_bti = false
+
 mabort-on-noreturn
 Target Mask(ABORT_NORETURN)
 Generate a call to abort if a noreturn function returns.
@@ -300,6 +303,10 @@ mbranch-cost=
 Target RejectNegative Joined UInteger Var(arm_branch_cost) Init(-1)
 Cost to assume for a branch insn.
 
+mbranch-protection=
+Target RejectNegative Joined Var(arm_branch_protection_string) Save
+Use branch-protection features.
+
 mgeneral-regs-only
 Target RejectNegative Mask(GENERAL_REGS_ONLY) Save
 Generate code which uses the core registers only (r0-r14).
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi

[PATCH] c++: improve dumping of templated decls

2021-09-22 Thread Patrick Palka via Gcc-patches

This makes the dumping routines output more information for templated
decls, to help streamline debugging.

Tested on x86_64-pc-linux-gnu, does this look OK for trunk?

gcc/cp/ChangeLog:

* ptree.c (cxx_print_decl): Dump the DECL_TEMPLATE_RESULT of
a TEMPLATE_DECL.  Dump the DECL_TEMPLATE_INFO rather than just
printing its pointer.
---
 gcc/cp/ptree.c | 10 +++---
 1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/gcc/cp/ptree.c b/gcc/cp/ptree.c
index 7f140f5f06b..1dcd764af01 100644
--- a/gcc/cp/ptree.c
+++ b/gcc/cp/ptree.c
@@ -51,6 +51,7 @@ cxx_print_decl (FILE *file, tree node, int indent)
 }
   else if (TREE_CODE (node) == TEMPLATE_DECL)
 {
+  print_node (file, "result", DECL_TEMPLATE_RESULT (node), indent + 4);
   print_node (file, "parms", DECL_TEMPLATE_PARMS (node), indent + 4);
   indent_to (file, indent + 3);
   fprintf (file, " full-name \"%s\"",
@@ -115,13 +116,8 @@ cxx_print_decl (FILE *file, tree node, int indent)
   
   if (VAR_OR_FUNCTION_DECL_P (node)
   && DECL_TEMPLATE_INFO (node))
-{
-  if (need_indent)
-   indent_to (file, indent + 3);
-  fprintf (file, " template-info %p",
-  (void *) DECL_TEMPLATE_INFO (node));
-  need_indent = false;
-}
+print_node (file, "template-info", DECL_TEMPLATE_INFO (node),
+   indent + 4);
 }
 
 void
-- 
2.33.0.514.g99c99ed825

[PATCH 1/2] arm: add arm bti pass

2021-09-22 Thread Andrea Corallo via Gcc-patches

Hi all,

this patch is part of a series that enables Armv8.1-M in GCC and adds
Branch Target Identification Mechanism [1].

This patch moves and generalizes the AArch64 "bti" pass so it can
also be used by the Arm backend.

The pass iterates through the instructions and adds the necessary BTI
instructions at the beginning of every function and at every landing
pad targeted by indirect jumps.
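
The behaviour described above can be pictured with a toy model. This is a sketch under simplifying assumptions, not the pass itself: `Insn` and `insert_bti` are hypothetical, instructions are plain strings, and a single generic "bti" marker stands in for whatever BTI variant the backend would emit.

```cpp
#include <string>
#include <vector>

namespace bti_model {
// Toy instruction: its text plus whether it is a label that an
// indirect jump may target.
struct Insn {
  std::string text;
  bool indirect_jump_target;
};

// Returns the instruction stream with a "bti" marker at function entry
// and directly after every label reachable through an indirect jump.
inline std::vector<std::string> insert_bti(const std::vector<Insn>& body) {
  std::vector<std::string> out;
  out.push_back("bti");                 // function entry
  for (const Insn& insn : body) {
    out.push_back(insn.text);
    if (insn.indirect_jump_target)
      out.push_back("bti");             // landing pad
  }
  return out;
}
}  // namespace bti_model
```

For a body consisting of an indirectly targeted label followed by one ordinary instruction, the model produces a marker at entry, the label, a second marker, and then the instruction.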

Regression tested and bootstrapped on arm-linux-gnu and aarch64-linux-gnu.

Best Regards

  Andrea

[1] 


>From 94ee67dbc78c5ea15dde7114d7bffc18a5843cb7 Mon Sep 17 00:00:00 2001
From: Andrea Corallo 
Date: Wed, 28 Jul 2021 15:49:16 +0200
Subject: [PATCH 1/2] arm: add arm bti pass

gcc/ChangeLog

2021-09-15  Andrea Corallo  

* config/arm/unspecs.md (UNSPECV_BTI): Add unspec.

* config/arm/t-arm (aarch-bti-insert.o): Add rule.

* config/arm/arm.md (bti): New pattern.

* config/arm/arm.c (aarch_bti_enabled, aarch_bti_j_insn_p)
(aarch_pac_insn_p, aarch_gen_bti_c, aarch_gen_bti_j): New
functions.

* config/arm/arm-protos.h (make_pass_insert_bti): Add proto.

* config/arm/arm-passes.def: New file.

* config/arm/aarch-common-protos.h (aarch_bti_enabled)
(aarch_bti_j_insn_p, aarch_pac_insn_p, aarch_gen_bti_c)
(aarch_gen_bti_j): Add protos.

* config/arm/aarch-bti-insert.c: New file, rename from
'gcc/config/aarch64/aarch64-bti-insert.c' and generalize.

* config/aarch64/t-aarch64 (aarch-bti-insert.o): Rename from
'aarch64-bti-insert.o' and account for new folder.

* config/aarch64/aarch64.c (aarch_bti_enabled)
(aarch_bti_j_insn_p, aarch_pac_insn_p, aarch_gen_bti_c)
(aarch_gen_bti_j): New functions.
(aarch64_output_mi_thunk)
(aarch64_print_patchable_function_entry)
(aarch64_file_end_indicate_exec_stack): Rename 'aarch64_bti_enabled'
=> 'aarch_bti_enabled'.

* config/aarch64/aarch64-protos.h: Remove 'aarch64_bti_enabled'.

* config/aarch64/aarch64-c.c (aarch64_update_cpp_builtins): Rename
'aarch64_bti_enabled' into 'aarch_bti_enabled'.

* config.gcc (aarch64*-*-*): Rename 'aarch64-bti-insert.o' into
'aarch-bti-insert.o'.
(arm*-*-*): Add 'aarch-bti-insert.o'.

gcc/testsuite/ChangeLog

2021-09-15  Andrea Corallo  

* gcc.target/arm/bti1.c: New testcase.

* gcc.target/arm/bti2.c: Likewise.
---
 gcc/config.gcc|  4 +-
 gcc/config/aarch64/aarch64-c.c|  2 +-
 gcc/config/aarch64/aarch64-protos.h   |  1 -
 gcc/config/aarch64/aarch64.c  | 58 ++--
 gcc/config/aarch64/t-aarch64  |  4 +-
 .../aarch-bti-insert.c}   | 66 ---
 gcc/config/arm/aarch-common-protos.h  |  5 ++
 gcc/config/arm/arm-passes.def | 21 ++
 gcc/config/arm/arm-protos.h   |  2 +
 gcc/config/arm/arm.c  | 35 ++
 gcc/config/arm/arm.md |  6 ++
 gcc/config/arm/t-arm  | 10 +++
 gcc/config/arm/unspecs.md |  1 +
 gcc/testsuite/gcc.target/arm/bti1.c   | 12 
 gcc/testsuite/gcc.target/arm/bti2.c   | 58 
 15 files changed, 222 insertions(+), 63 deletions(-)
 rename gcc/config/{aarch64/aarch64-bti-insert.c => arm/aarch-bti-insert.c} 
(80%)
 create mode 100644 gcc/config/arm/arm-passes.def
 create mode 100644 gcc/testsuite/gcc.target/arm/bti1.c
 create mode 100644 gcc/testsuite/gcc.target/arm/bti2.c

diff --git a/gcc/config.gcc b/gcc/config.gcc
index d9bfbfdc0d2..648cf28e105 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -322,7 +322,7 @@ aarch64*-*-*)
c_target_objs="aarch64-c.o"
cxx_target_objs="aarch64-c.o"
d_target_objs="aarch64-d.o"
-   extra_objs="aarch64-builtins.o aarch-common.o aarch64-sve-builtins.o 
aarch64-sve-builtins-shapes.o aarch64-sve-builtins-base.o 
aarch64-sve-builtins-sve2.o cortex-a57-fma-steering.o aarch64-speculation.o 
falkor-tag-collision-avoidance.o aarch64-bti-insert.o aarch64-cc-fusion.o"
+   extra_objs="aarch64-builtins.o aarch-common.o aarch64-sve-builtins.o 
aarch64-sve-builtins-shapes.o aarch64-sve-builtins-base.o 
aarch64-sve-builtins-sve2.o cortex-a57-fma-steering.o aarch64-speculation.o 
falkor-tag-collision-avoidance.o aarch-bti-insert.o aarch64-cc-fusion.o"
target_gtfiles="\$(srcdir)/config/aarch64/aarch64-builtins.c 
\$(srcdir)/config/aarch64/aarch64-sve-builtins.h 
\$(srcdir)/config/aarch64/aarch64-sve-builtins.cc"
target_has_targetm_common=yes
;;
@@ -346,7 +346,7 @@ arc*-*-*)
;;
 arm*-*-*)
cpu_type=arm
-   extra_objs="arm-builtins.o aarch-common.o"
+

[PATCH] top-level configure: setup target_configdirs based on repository

2021-09-22 Thread Andrew Burgess

The top-level configure script is shared between the gcc repository
and the binutils-gdb repository.

The target_configdirs variable in the configure.ac script, defines
sub-directories that contain components that should be built for the
target using the target tools.

Some components, e.g. zlib, are built as both host and target
libraries.

This causes problems for binutils-gdb.  If we run 'make all' in the
binutils-gdb repository we end up trying to build a target version of
the zlib library, which requires the target compiler be available.
Often the target compiler isn't immediately available, and so the
build fails.

The problem with zlib impacted a previous attempt to synchronise the
top-level configure scripts from gcc to binutils-gdb, see this thread:

  https://sourceware.org/pipermail/binutils/2019-May/107094.html

And I'm in the process of importing libbacktrace into binutils-gdb,
which is also a host and target library, and triggers the same issues.

I believe that for binutils-gdb, at least at the moment, there are no
target libraries that we need to build.

My proposal then is to make the value of target_libraries change based
on which repository we are building in.  Specifically, if the source
tree has a gcc/ directory then we should set the target_libraries
variable, otherwise this variable is left empty.

I think that if someone tries to create a single unified tree (gcc +
binutils-gdb in a single source tree) and then build, this change will
not have a negative impact, the tree still has gcc/ so we'd expect the
target compiler to be built, which means building the target_libraries
should work just fine.

However, if the source tree lacks gcc/ then we assume the target
compiler isn't built/available, and so target_libraries shouldn't be
built.

There is already precedent within configure.ac for checking for the
existence of gcc/ in the source tree, see the handling of
-enable-werror around line 3658.

I've tested a build of gcc on x86-64, and the same set of target
libraries still seem to get built.  On binutils-gdb this change
resolves the issues with 'make all'.

Any thoughts?

ChangeLog:

* configure: Regenerate.
* configure.ac (target_configdirs): Only set this when building
within the gcc repository.
---
 ChangeLog|  6 ++
 configure| 12 ++--
 configure.ac | 12 ++--
 3 files changed, 26 insertions(+), 4 deletions(-)

diff --git a/configure b/configure
index 85ab9915402..3ef5c2b553f 100755
--- a/configure
+++ b/configure
@@ -2849,9 +2849,17 @@ target_tools="target-rda"
 ## We assign ${configdirs} this way to remove all embedded newlines.  This
 ## is important because configure will choke if they ever get through.
 ## ${configdirs} is directories we build using the host tools.
-## ${target_configdirs} is directories we build using the target tools.
+##
+## ${target_configdirs} is directories we build using the target
+## tools, these are only needed when working in the gcc tree.  This
+## file is also reused in the binutils-gdb tree, where building any
+## target stuff doesn't make sense.
 configdirs=`echo ${host_libs} ${host_tools}`
-target_configdirs=`echo ${target_libraries} ${target_tools}`
+if test -d ${srcdir}/gcc; then
+  target_configdirs=`echo ${target_libraries} ${target_tools}`
+else
+  target_configdirs=""
+fi
 build_configdirs=`echo ${build_libs} ${build_tools}`
 
 
diff --git a/configure.ac b/configure.ac
index 1df038b04f3..d1217e3f886 100644
--- a/configure.ac
+++ b/configure.ac
@@ -180,9 +180,17 @@ target_tools="target-rda"
 ## We assign ${configdirs} this way to remove all embedded newlines.  This
 ## is important because configure will choke if they ever get through.
 ## ${configdirs} is directories we build using the host tools.
-## ${target_configdirs} is directories we build using the target tools.
+##
+## ${target_configdirs} is directories we build using the target
+## tools, these are only needed when working in the gcc tree.  This
+## file is also reused in the binutils-gdb tree, where building any
+## target stuff doesn't make sense.
 configdirs=`echo ${host_libs} ${host_tools}`
-target_configdirs=`echo ${target_libraries} ${target_tools}`
+if test -d ${srcdir}/gcc; then
+  target_configdirs=`echo ${target_libraries} ${target_tools}`
+else
+  target_configdirs=""
+fi
 build_configdirs=`echo ${build_libs} ${build_tools}`
 
 m4_divert_text([PARSE_ARGS],
-- 
2.25.4

Re: [PATCH, Fortran] diagnostic for argument w/type parameters for assumed-type dummy

2021-09-22 Thread Tobias Burnus


On 22.09.21 16:58, Sandra Loosemore wrote:


This patch adds the missing diagnostic noted in PR fortran/101319.
OK to commit?


LGTM. Thanks!

For reference, the F2018 wording is: "If the actual argument is of a
derived type that has type parameters, type-bound procedures, or final
subroutines, the dummy argument shall not be assumed-type."

Tobias


commit 9d5b9062d728d1b1bf5acfb914e06d776bdcdb60
Author: Sandra Loosemore
Date:   Wed Sep 22 07:49:17 2021 -0700

 Fortran: diagnostic for argument w/type parameters for assumed-type dummy

 2021-09-22  Sandra Loosemore

  PR fortran/101319

 gcc/fortran/
  * interface.c (gfc_compare_actual_formal): Extend existing
  assumed-type diagnostic to also check for argument with type
  parameters.


RE: [PATCH 1/5]AArch64 sve: combine inverted masks into NOTs

2021-09-22 Thread Tamar Christina via Gcc-patches

Hi,

Sending a new version of the patch because I noticed the pattern was overriding 
the nor pattern.

A second pattern is needed to capture the nor case as combine will match the
longest sequence first.  So without this pattern we end up de-optimizing nor
and instead emit two nots.  I did not find a better way to do this.
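
As an illustration of why the nor form is worth preserving, the sketch below models predicates as plain bitmasks (one bit per lane, which is an assumption made purely for illustration) and checks that the shape matched by the nor pattern, two NOTs combined under the governing predicate, equals a single predicated NOR. The names pred_t, two_nots, one_nor and check_nor_identity are hypothetical and do not appear in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: a predicate is a bitmask, one bit per lane.  */
typedef uint32_t pred_t;

/* Shape matched by the nor pattern: (~cmp & ~p4) & pg.  */
static pred_t
two_nots (pred_t pg, pred_t cmp, pred_t p4)
{
  return (~cmp & ~p4) & pg;
}

/* A single NOR under the governing predicate: pg & ~(cmp | p4).  */
static pred_t
one_nor (pred_t pg, pred_t cmp, pred_t p4)
{
  return pg & ~(cmp | p4);
}

/* Exhaustively compare both forms on 4-lane predicates.  */
static void
check_nor_identity (void)
{
  for (pred_t pg = 0; pg < 16; pg++)
    for (pred_t cmp = 0; cmp < 16; cmp++)
      for (pred_t p4 = 0; p4 < 16; p4++)
        assert (two_nots (pg, cmp, p4) == one_nor (pg, cmp, p4));
}
```

Both forms compute the same lanes, so the difference is only in how many instructions are needed to get there.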

Note: This patch series is working incrementally towards generating the most
  efficient code for this and other loops in small steps.

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

* config/aarch64/aarch64-sve.md (*fcm_bic_combine,
*fcm_nor_combine, *fcmuo_bic_combine,
*fcmuo_nor_combine): New.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/sve/pred-not-gen.c-1: New test.
* gcc.target/aarch64/sve/pred-not-gen.c-2: New test.
* gcc.target/aarch64/sve/pred-not-gen.c-3: New test.
* gcc.target/aarch64/sve/pred-not-gen.c-4: New test.

--- inline copy of patch ---

diff --git a/gcc/config/aarch64/aarch64-sve.md 
b/gcc/config/aarch64/aarch64-sve.md
index 
359fe0e457096cf4042a774789a5c241420703d3..8fe4c721313e70592d2cf0acbfbe2f07b070b51a
 100644
--- a/gcc/config/aarch64/aarch64-sve.md
+++ b/gcc/config/aarch64/aarch64-sve.md
@@ -8126,6 +8126,160 @@ (define_insn_and_split "*fcmuo_and_combine"
  UNSPEC_COND_FCMUO))]
 )
 
+;; Similar to *fcm_and_combine, but for BIC rather than AND.
+;; In this case, we still need a separate NOT/BIC operation, but predicating
+;; the comparison on the BIC operand removes the need for a PTRUE.
+(define_insn_and_split "*fcm_bic_combine"
+  [(set (match_operand: 0 "register_operand" "=Upa")
+   (and:
+ (and:
+   (not:
+ (unspec:
+   [(match_operand: 1)
+(const_int SVE_KNOWN_PTRUE)
+(match_operand:SVE_FULL_F 2 "register_operand" "w")
+(match_operand:SVE_FULL_F 3 "aarch64_simd_reg_or_zero" "wDz")]
+   SVE_COND_FP_CMP_I0))
+   (match_operand: 4 "register_operand" "Upa"))
+ (match_dup: 1)))
+   (clobber (match_scratch: 5 "="))]
+  "TARGET_SVE"
+  "#"
+  "&& 1"
+  [(set (match_dup 5)
+   (unspec:
+ [(match_dup 4)
+  (const_int SVE_MAYBE_NOT_PTRUE)
+  (match_dup 2)
+  (match_dup 3)]
+ SVE_COND_FP_CMP_I0))
+   (set (match_dup 0)
+   (and:
+ (not:
+   (match_dup 5))
+ (match_dup 4)))]
+{
+  if (can_create_pseudo_p ())
+operands[5] = gen_reg_rtx (mode);
+}
+)
+
+;; Make sure that we expand to a nor when the operand 4 of
+;; *fcm_bic_combine is a not.
+(define_insn_and_split "*fcm_nor_combine"
+  [(set (match_operand: 0 "register_operand" "=Upa")
+   (and:
+ (and:
+   (not:
+ (unspec:
+   [(match_operand: 1)
+(const_int SVE_KNOWN_PTRUE)
+(match_operand:SVE_FULL_F 2 "register_operand" "w")
+(match_operand:SVE_FULL_F 3 "aarch64_simd_reg_or_zero" "wDz")]
+   SVE_COND_FP_CMP_I0))
+   (not:
+ (match_operand: 4 "register_operand" "Upa")))
+ (match_dup: 1)))
+   (clobber (match_scratch: 5 "="))]
+  "TARGET_SVE"
+  "#"
+  "&& 1"
+  [(set (match_dup 5)
+   (unspec:
+ [(match_dup 1)
+  (const_int SVE_KNOWN_PTRUE)
+  (match_dup 2)
+  (match_dup 3)]
+ SVE_COND_FP_CMP_I0))
+   (set (match_dup 0)
+   (and:
+ (and:
+   (not:
+ (match_dup 5))
+   (not:
+ (match_dup 4)))
+ (match_dup 1)))]
+{
+  if (can_create_pseudo_p ())
+operands[5] = gen_reg_rtx (mode);
+}
+)
+
+(define_insn_and_split "*fcmuo_bic_combine"
+  [(set (match_operand: 0 "register_operand" "=Upa")
+   (and:
+ (and:
+   (not:
+ (unspec:
+   [(match_operand: 1)
+(const_int SVE_KNOWN_PTRUE)
+(match_operand:SVE_FULL_F 2 "register_operand" "w")
+(match_operand:SVE_FULL_F 3 "aarch64_simd_reg_or_zero" "wDz")]
+   UNSPEC_COND_FCMUO))
+   (match_operand: 4 "register_operand" "Upa"))
+ (match_dup: 1)))
+   (clobber (match_scratch: 5 "="))]
+  "TARGET_SVE"
+  "#"
+  "&& 1"
+  [(set (match_dup 5)
+   (unspec:
+ [(match_dup 4)
+  (const_int SVE_MAYBE_NOT_PTRUE)
+  (match_dup 2)
+  (match_dup 3)]
+ UNSPEC_COND_FCMUO))
+   (set (match_dup 0)
+   (and:
+ (not:
+   (match_dup 5))
+ (match_dup 4)))]
+{
+  if (can_create_pseudo_p ())
+operands[5] = gen_reg_rtx (mode);
+}
+)
+
+;; Same for unordered comparisons.
+(define_insn_and_split "*fcmuo_nor_combine"
+  [(set (match_operand: 0 "register_operand" "=Upa")
+   (and:
+ (and:
+   (not:
+ (unspec:
+   [(match_operand: 1)
+(const_int SVE_KNOWN_PTRUE)
+

[Ada] Simplify contract of Ada.Strings.Fixed.Trim for proof

2021-09-22 Thread Pierre-Marie de Rodat via Gcc-patches

Use of two nested existential quantifications makes proof brittle. Use
instead explicit values for the bounds given by Index_Non_Blank, as is
done in Ada.Strings.Bounded.
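
The idea of naming the bounds explicitly, rather than asserting that some pair of bounds exists, can be sketched in C as follows. This is only a rough model of the new contract, not the Ada code: indices are 0-based, only the branch where the source contains at least one non-blank character is covered, and trim_side, bounds, first_non_blank, last_non_blank and trim_bounds are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum trim_side { TRIM_LEFT, TRIM_RIGHT, TRIM_BOTH };

/* Inclusive, 0-based bounds of the slice that Trim would return.  */
struct bounds { size_t low, high; };

/* Position of the first non-blank character, or len if there is none.  */
static size_t
first_non_blank (const char *s, size_t len)
{
  size_t i = 0;
  while (i < len && s[i] == ' ')
    i++;
  return i;
}

/* Position of the last non-blank character; the caller guarantees that
   one exists.  */
static size_t
last_non_blank (const char *s, size_t len)
{
  size_t i = len;
  while (i > 0 && s[i - 1] == ' ')
    i--;
  return i - 1;
}

static struct bounds
trim_bounds (const char *s, enum trim_side side)
{
  size_t len = strlen (s);
  struct bounds b;

  /* Only the non-empty result case is modelled here.  */
  assert (first_non_blank (s, len) < len);

  /* Trimming on the right only keeps the first index; trimming on the
     left only keeps the last index.  */
  b.low = (side == TRIM_RIGHT) ? 0 : first_non_blank (s, len);
  b.high = (side == TRIM_LEFT) ? len - 1 : last_non_blank (s, len);
  return b;
}
```

Because each bound is given by a direct computation, a checker has a concrete value to reason about instead of having to find a witness for each quantifier.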

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* libgnat/a-strfix.ads (Trim): Simplify contracts.
* libgnat/a-strfix.adb (Trim): Remove white space.

diff --git a/gcc/ada/libgnat/a-strfix.adb b/gcc/ada/libgnat/a-strfix.adb
--- a/gcc/ada/libgnat/a-strfix.adb
+++ b/gcc/ada/libgnat/a-strfix.adb
@@ -865,7 +865,7 @@ package body Ada.Strings.Fixed with SPARK_Mode is
   High, Low : Integer;
 
begin
-  Low := Index (Source, Set => Left, Test  => Outside, Going => Forward);
+  Low := Index (Source, Set => Left, Test => Outside, Going => Forward);
 
   --  Case where source comprises only characters in Left
 


diff --git a/gcc/ada/libgnat/a-strfix.ads b/gcc/ada/libgnat/a-strfix.ads
--- a/gcc/ada/libgnat/a-strfix.ads
+++ b/gcc/ada/libgnat/a-strfix.ads
@@ -1133,31 +1133,15 @@ package Ada.Strings.Fixed with SPARK_Mode is
 --  Otherwise, the returned string is a slice of Source
 
 else
-  (for some Low in Source'Range =>
- (for some High in Source'Range =>
-
---  Trim returns the slice of Source between Low and High
-
-Trim'Result = Source (Low .. High)
-
-  --  Values of Low and High and the characters at their
-  --  position depend on Side.
-
-  and then
-(if Side = Left then High = Source'Last
- else Source (High) /= ' ')
-  and then
-(if Side = Right then Low = Source'First
- else Source (Low) /= ' ')
-
-  --  All characters outside range Low .. High are
-  --  Space characters.
-
-  and then
-(for all J in Source'Range =>
-   (if J < Low then Source (J) = ' ')
-  and then
-(if J > High then Source (J) = ' '),
+  (declare
+ Low  : constant Positive :=
+   (if Side = Right then Source'First
+else Index_Non_Blank (Source, Forward));
+ High : constant Positive :=
+   (if Side = Left then Source'Last
+else Index_Non_Blank (Source, Backward));
+   begin
+ Trim'Result = Source (Low .. High))),
  Global => null;
--  Returns the string obtained by removing from Source all leading Space
--  characters (if Side = Left), all trailing Space characters (if
@@ -1203,30 +1187,13 @@ package Ada.Strings.Fixed with SPARK_Mode is
 --  Otherwise, the returned string is a slice of Source
 
 else
-  (for some Low in Source'Range =>
- (for some High in Source'Range =>
-
---  Trim returns the slice of Source between Low and High
-
-Trim'Result = Source (Low .. High)
-
-  --  Characters at the bounds of the returned string are
-  --  not contained in Left or Right.
-
-  and then not Ada.Strings.Maps.Is_In (Source (Low), Left)
-  and then not Ada.Strings.Maps.Is_In (Source (High), Right)
-
-  --  All characters before Low are contained in Left.
-  --  All characters after High are contained in Right.
-
-  and then
-(for all K in Source'Range =>
-   (if K < Low
-then
-  Ada.Strings.Maps.Is_In (Source (K), Left))
-and then
-  (if K > High then
-   Ada.Strings.Maps.Is_In (Source (K), Right)),
+   (declare
+  Low  : constant Positive :=
+Index (Source, Left, Outside, Forward);
+  High : constant Positive :=
+Index (Source, Right, Outside, Backward);
+begin
+  Trim'Result = Source (Low .. High))),
  Global => null;
--  Returns the string obtained by removing from Source all leading
--  characters in Left and all trailing characters in Right.

[Ada] Reuse routines for detecting attributes Old and Result

2021-09-22 Thread Pierre-Marie de Rodat via Gcc-patches

Code cleanup related to handling of attribute 'Old in Contract_Cases;
semantics is unaffected.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* exp_ch4.adb (Expand_N_Op_Eq): Reuse Is_Attribute_Result.
* exp_prag.adb (Expand_Attributes): Reuse Is_Attribute_Old.

diff --git a/gcc/ada/exp_ch4.adb b/gcc/ada/exp_ch4.adb
--- a/gcc/ada/exp_ch4.adb
+++ b/gcc/ada/exp_ch4.adb
@@ -8609,8 +8609,7 @@ package body Exp_Ch4 is
   --  f'Machine (expr) to eliminate surprise from extra precision.
 
   if Is_Floating_Point_Type (Typl)
-and then Nkind (Original_Node (Lhs)) = N_Attribute_Reference
-and then Attribute_Name (Original_Node (Lhs)) = Name_Result
+and then Is_Attribute_Result (Original_Node (Lhs))
   then
  --  Stick in the Typ'Machine call if not already there
 


diff --git a/gcc/ada/exp_prag.adb b/gcc/ada/exp_prag.adb
--- a/gcc/ada/exp_prag.adb
+++ b/gcc/ada/exp_prag.adb
@@ -1525,9 +1525,7 @@ package body Exp_Prag is
  begin
 --  Attribute 'Old
 
-if Nkind (N) = N_Attribute_Reference
-  and then Attribute_Name (N) = Name_Old
-then
+if Is_Attribute_Old (N) then
Pref := Prefix (N);
 
Indirect := Indirect_Temp_Needed (Etype (Pref));

[Ada] Spurious error on deferred constant with predicate

2021-09-22 Thread Pierre-Marie de Rodat via Gcc-patches

Do not insert a predicate check after a deferred constant declaration,
as the constant is not elaborated at this point, but only at the point
of its completion.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* sem_ch3.adb (Analyze_Object_Declaration): Do not insert a
predicate check after a deferred constant declaration.

diff --git a/gcc/ada/sem_ch3.adb b/gcc/ada/sem_ch3.adb
--- a/gcc/ada/sem_ch3.adb
+++ b/gcc/ada/sem_ch3.adb
@@ -4505,7 +4505,7 @@ package body Sem_Ch3 is
   --  default initial value (including via a Default_Value or
   --  Default_Component_Value aspect, see AI12-0301) and then this is not
   --  an internal declaration whose initialization comes later (as for an
-  --  aggregate expansion).
+  --  aggregate expansion) or a deferred constant.
   --  If expression is an aggregate it may be expanded into assignments
   --  and the declaration itself is marked with No_Initialization, but
   --  the predicate still applies.
@@ -4519,6 +4519,7 @@ package body Sem_Ch3 is
   (Present (E)
 or else
   Is_Partially_Initialized_Type (T, Include_Implicit => False))
+and then not (Constant_Present (N) and then No (E))
   then
  --  If the type has a static predicate and the expression is known at
  --  compile time, see if the expression satisfies the predicate.

[Ada] Fix conformance errors and erroneous code

2021-09-22 Thread Pierre-Marie de Rodat via Gcc-patches

This patch fixes many cases where a formal parameter is declared as
Node_Id on the spec, and Entity_Id on the body (and similar), which is
illegal according to the conformance rules.

It also removes some erroneous pragmas Suppress, and initializes the
uninitialized variables that were being read.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* contracts.adb, einfo-utils.adb, einfo-utils.ads, exp_ch7.adb,
exp_ch9.adb, exp_disp.adb, exp_prag.adb, exp_smem.adb,
exp_util.adb, freeze.adb, sem_aggr.adb, sem_attr.adb,
sem_ch8.adb, sem_prag.ads, sem_util.adb, sem_util.ads: Fix
conformance errors.
* errout.adb, erroutc.adb: Remove pragmas Suppress.
* err_vars.ads: Initialize variables that were previously being
read uninitialized.

diff --git a/gcc/ada/contracts.adb b/gcc/ada/contracts.adb
--- a/gcc/ada/contracts.adb
+++ b/gcc/ada/contracts.adb
@@ -3440,7 +3440,7 @@ package body Contracts is
-- Get_Postcond_Enabled --
--
 
-   function Get_Postcond_Enabled (Subp : Entity_Id) return Node_Id is
+   function Get_Postcond_Enabled (Subp : Entity_Id) return Entity_Id is
   Decl : Node_Id;
begin
   Decl :=
@@ -3465,7 +3465,7 @@ package body Contracts is

 
function Get_Result_Object_For_Postcond
- (Subp : Entity_Id) return Node_Id
+ (Subp : Entity_Id) return Entity_Id
is
   Decl : Node_Id;
begin
@@ -3490,7 +3490,7 @@ package body Contracts is
-- Get_Return_Success_For_Postcond --
-
 
-   function Get_Return_Success_For_Postcond (Subp : Entity_Id) return Node_Id
+   function Get_Return_Success_For_Postcond (Subp : Entity_Id) return Entity_Id
is
   Decl : Node_Id;
begin


diff --git a/gcc/ada/einfo-utils.adb b/gcc/ada/einfo-utils.adb
--- a/gcc/ada/einfo-utils.adb
+++ b/gcc/ada/einfo-utils.adb
@@ -701,7 +701,7 @@ package body Einfo.Utils is
-- Entry_Index_Type --
--
 
-   function Entry_Index_Type (Id : E) return N is
+   function Entry_Index_Type (Id : E) return E is
begin
   pragma Assert (Ekind (Id) = E_Entry_Family);
   return Etype (Discrete_Subtype_Definition (Parent (Id)));
@@ -1745,7 +1745,7 @@ package body Einfo.Utils is
-- Link_Entities --
---
 
-   procedure Link_Entities (First : Entity_Id; Second : Node_Id) is
+   procedure Link_Entities (First, Second : Entity_Id) is
begin
   if Present (Second) then
  Set_Prev_Entity (Second, First);  --  First <-- Second


diff --git a/gcc/ada/einfo-utils.ads b/gcc/ada/einfo-utils.ads
--- a/gcc/ada/einfo-utils.ads
+++ b/gcc/ada/einfo-utils.ads
@@ -625,7 +625,7 @@ package Einfo.Utils is
 
--  WARNING: There is a matching C declaration of this subprogram in fe.h
 
-   procedure Link_Entities (First : Entity_Id; Second : Entity_Id);
+   procedure Link_Entities (First, Second : Entity_Id);
--  Link entities First and Second in one entity chain.
--
--  NOTE: No updates are done to the First_Entity and Last_Entity fields


diff --git a/gcc/ada/err_vars.ads b/gcc/ada/err_vars.ads
--- a/gcc/ada/err_vars.ads
+++ b/gcc/ada/err_vars.ads
@@ -105,12 +105,15 @@ package Err_Vars is
--  of the following global variables to appropriate values before making a
--  call to one of the error message routines with a string containing the
--  insertion character to get the value inserted in an appropriate format.
+   --
+   --  Some of these are initialized below, because they are read before being
+   --  set by clients.
 
Error_Msg_Col : Column_Number;
--  Column for @ insertion character in message
 
Error_Msg_Uint_1 : Uint;
-   Error_Msg_Uint_2 : Uint;
+   Error_Msg_Uint_2 : Uint := No_Uint;
--  Uint values for ^ insertion characters in message
 
--  WARNING: There is a matching C declaration of these variables in fe.h
@@ -119,21 +122,21 @@ package Err_Vars is
--  Source location for # insertion character in message
 
Error_Msg_Name_1 : Name_Id;
-   Error_Msg_Name_2 : Name_Id;
-   Error_Msg_Name_3 : Name_Id;
+   Error_Msg_Name_2 : Name_Id := No_Name;
+   Error_Msg_Name_3 : Name_Id := No_Name;
--  Name_Id values for % insertion characters in message
 
Error_Msg_File_1 : File_Name_Type;
-   Error_Msg_File_2 : File_Name_Type;
-   Error_Msg_File_3 : File_Name_Type;
+   Error_Msg_File_2 : File_Name_Type := No_File;
+   Error_Msg_File_3 : File_Name_Type := No_File;
--  File_Name_Type values for { insertion characters in message
 
Error_Msg_Unit_1 : Unit_Name_Type;
-   Error_Msg_Unit_2 : Unit_Name_Type;
+   Error_Msg_Unit_2 : Unit_Name_Type := No_Unit_Name;
--  Unit_Name_Type values for $ insertion characters in message
 
Error_Msg_Node_1 : Node_Id;
-   Error_Msg_Node_2 : Node_Id;
+   Error_Msg_Node_2 : Node_Id := Empty;
--  Node_Id values for & insertion characters in message
 
Error_Msg_Warn :

[Ada] Clarify parts of Ada.Strings.Unbounded in SPARK or not

2021-09-22 Thread Pierre-Marie de Rodat via Gcc-patches

Except for the Free procedure which should not be called in SPARK code
(as it could be called on pointers to the stack), the rest of the public
API of Ada.Strings.Unbounded is valid in SPARK. Mark the private part
not in SPARK as it uses controlled types.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* libgnat/a-strunb.ads: Mark package in SPARK with private part
not in SPARK.
(Free): Mark not in SPARK.

diff --git a/gcc/ada/libgnat/a-strunb.ads b/gcc/ada/libgnat/a-strunb.ads
--- a/gcc/ada/libgnat/a-strunb.ads
+++ b/gcc/ada/libgnat/a-strunb.ads
@@ -53,6 +53,7 @@ private with Ada.Strings.Text_Buffers;
 --  and selector operations are provided.
 
 package Ada.Strings.Unbounded with
+  SPARK_Mode,
   Initial_Condition => Length (Null_Unbounded_String) = 0
 is
pragma Preelaborate;
@@ -73,7 +74,7 @@ is
--  Provides a (nonprivate) access type for explicit processing of
--  unbounded-length strings.
 
-   procedure Free (X : in out String_Access);
+   procedure Free (X : in out String_Access) with SPARK_Mode => Off;
--  Performs an unchecked deallocation of an object of type String_Access
 

@@ -732,6 +733,8 @@ is
--  strings applied to the string represented by Source's original value.
 
 private
+   pragma SPARK_Mode (Off);  --  Controlled types are not in SPARK
+
pragma Inline (Length);
 
package AF renames Ada.Finalization;

[Ada] Update status of some attributes

2021-09-22 Thread Pierre-Marie de Rodat via Gcc-patches

These attributes are no longer GNAT specific.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* snames.ads-tmpl: Update status of some attributes.

diff --git a/gcc/ada/snames.ads-tmpl b/gcc/ada/snames.ads-tmpl
--- a/gcc/ada/snames.ads-tmpl
+++ b/gcc/ada/snames.ads-tmpl
@@ -963,8 +963,8 @@ package Snames is
Name_Elaborated : constant Name_Id := N + $; -- GNAT
Name_Emax   : constant Name_Id := N + $; -- Ada 83
Name_Enabled: constant Name_Id := N + $; -- GNAT
-   Name_Enum_Rep   : constant Name_Id := N + $; -- GNAT
-   Name_Enum_Val   : constant Name_Id := N + $; -- GNAT
+   Name_Enum_Rep   : constant Name_Id := N + $; -- Ada 22
+   Name_Enum_Val   : constant Name_Id := N + $; -- Ada 22
Name_Epsilon: constant Name_Id := N + $; -- Ada 83
Name_Exponent   : constant Name_Id := N + $;
Name_External_Tag   : constant Name_Id := N + $;
@@ -1017,7 +1017,7 @@ package Snames is
Name_Modulus: constant Name_Id := N + $;
Name_Null_Parameter : constant Name_Id := N + $; -- GNAT
Name_Object_Size: constant Name_Id := N + $; -- GNAT
-   Name_Old: constant Name_Id := N + $; -- GNAT
+   Name_Old: constant Name_Id := N + $; -- Ada 12
Name_Overlaps_Storage   : constant Name_Id := N + $; -- GNAT
Name_Partition_ID   : constant Name_Id := N + $;
Name_Passed_By_Reference: constant Name_Id := N + $; -- GNAT
@@ -1028,7 +1028,7 @@ package Snames is
Name_Priority   : constant Name_Id := N + $; -- Ada 05
Name_Range  : constant Name_Id := N + $;
Name_Range_Length   : constant Name_Id := N + $; -- GNAT
-   Name_Reduce : constant Name_Id := N + $; -- GNAT
+   Name_Reduce : constant Name_Id := N + $; -- Ada 22
Name_Ref: constant Name_Id := N + $; -- GNAT
Name_Restriction_Set: constant Name_Id := N + $; -- GNAT
Name_Result : constant Name_Id := N + $; -- GNAT

[Ada] VxWorks inconsistent use of return type (STATUS)

2021-09-22 Thread Pierre-Marie de Rodat via Gcc-patches

Type STATUS is made a new int with constants OK and ERROR declared.
In code where ERROR clashes with an already defined integer value,
the clashing variable is renamed to IERR. In other clashes where
STATUS is a variable that variable is renamed.
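
For readers less familiar with the C side, the sketch below shows the convention the Ada bindings are being aligned with. It assumes STATUS is an int with OK = 0 and ERROR = -1, as in the VxWorks C headers; fake_sem_give and release_or_abort are hypothetical stand-ins, not real OS calls.

```c
#include <assert.h>

/* Assumption: STATUS is an int with OK = 0 and ERROR = -1.  */
typedef int STATUS;
enum { OK = 0, ERROR = -1 };

/* Hypothetical stand-in for an OS call such as a semaphore release.  */
static STATUS
fake_sem_give (int sem_is_valid)
{
  return sem_is_valid ? OK : ERROR;
}

/* Caller pattern described above: hold the value in a variable called
   "result", so that the name does not clash with the STATUS type, and
   assert that the call succeeded.  */
static void
release_or_abort (int sem_is_valid)
{
  STATUS result = fake_sem_give (sem_is_valid);
  assert (result == OK);
  (void) result; /* Avoid an unused warning when NDEBUG removes assert.  */
}
```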

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* libgnarl/s-interr__vxworks.adb (Interfaces.C): Remove as
unused.
(System.VxWorks.Ext): Import.
(System.VxWorks.Ext.STATUS): use type.
(STATUS): New subtype.
(OK): New constant.
(Interrupt_Connector): Return STATUS type vice int.
(Interrupt_Connect, Notify_Interrupt, Unbind_Handler,
Interrupt_Server_Task): Rename Status to Result. Assert Result =
OK.
* libgnarl/s-osinte__vxworks.adb (To_Clock_Ticks): Define constant
IERR, and return it vice ERROR.
(Binary_Semaphore_Delete): Return STATUS type vice int.
(Binary_Semaphore_Obtain): Likewise.
(Binary_Semaphore_Release): Likewise.
(Binary_Semaphore_Flush): Likewise.
* libgnarl/s-osinte__vxworks.ads (SVE): Renaming of
System.VxWorks.Ext.
(STATUS): Use SVE in declaration of subtype.
(BOOL): Likewise.
(vx_freq_t): Likewise.
(t_id): Likewise.
(gitpid): Use SVE in renaming of subprogram
(Task_Stop): Likewise.
(Task_Cont): Likewise.
(Int_Lock): Likewise.
(Int_Unlock): Likewise.
(Set_Time_Slice): Likewise.
(semDelete): Likewise.
(taskCpuAffinitySet): Likewise.
(taskMaskAffinitySet): Likewise.
(sigset_t): Use SVE in declaration of type.
(OK): Remove as unused.
(ERROR): Likewise.
(taskOptionsGet): return STATUS vice int.
(taskSuspend): Likewise.
(taskResume): Likewise.
(taskDelay): Likewise.
(taskVarAdd): Likewise.
(taskVarDelete): Likewise.
(taskVarSet): Likewise.
(tlkKeyCreate): Likewise.
(taskPrioritySet): Likewise.
(semGive): Likewise.
(semTake): Likewise.
(Binary_Semaphore_Delete): Likewise.
(Binary_Semaphore_Obtain): Likewise.
(Binary_Semaphore_Release): Likewise.
(Binary_Semaphore_Flush): Likewise.
(Interrupt_Connect): Likewise.
* libgnarl/s-taprop__vxworks.adb
(System.VxWorks.Ext.STATUS): use type.
(int): Syntactically align subtype.
(STATUS): New subtype.
(OK): New constant.
(Finalize_Lock): Check STATUS vice int. Assert OK.
(Finalize_Lock): Likewise.
(Write_Lock): Likewise.
(Write_Lock): Likewise.
(Write_Lock): Likewise.
(Unlock): Likewise.
(Unlock): Likewise.
(Unlock): Likewise.
(Unlock): Likewise.
(Sleep): Likewise.
(Sleep): Likewise.
(Sleep): Likewise.
(Timed_Sleep): Likewise and test Result.
(Timed_Delay): Likewise and test Result.
(Wakeup): Likewise.
(Yield): Likewise.
(Finalize_TCB): Likewise.
(Suspend_Until_True): Check OK.
(Stop_All_Tasks): Declare Dummy STATUS vice int.  Check OK.
(Is_Task_Context): Use OSI renaming.
(Initialize): Use STATUS vice int.
* libgnarl/s-vxwext.adb
(IERR): Renamed from ERROR.
(taskCpuAffinitySet): Return IERR (int).
(taskMaskAffinitySet): Likewise.
* libgnarl/s-vxwext.ads
(STATUS): New subtype.
(OK): New STATUS constant.
(ERROR): Likewise.
* libgnarl/s-vxwext__kernel-smp.adb
(IERR): Renamed from ERROR.
(Int_Lock): Return IERR.
(semDelete): Return STATUS.
(Task_Cont): Likewise.
(Task_Stop): Likewise.
* libgnarl/s-vxwext__kernel.adb
(IERR): Renamed from ERROR.
(semDelete): Return STATUS.
(Task_Cont): Likewise.
(Task_Stop): Likewise.
(taskCpuAffinitySet): Return IERR (int)
(taskMaskAffinitySet): Likewise.
* libgnarl/s-vxwext__kernel.ads
(STATUS): New subtype.
(OK): New STATUS constant.
(ERROR): Likewise.
(Interrupt_Connect): Return STATUS
(semDelete): Likewise.
(Task_Cont): Likewise.
(Task_Stop): Likewise.
(Set_Time_Slice): Likewise.
* libgnarl/s-vxwext__rtp-smp.adb
(IERR): Renamed from ERROR.
(Int_Lock): return IERR constant vice ERROR.
(Interrupt_Connect): Return STATUS.
(semDelete): Likewise.
(Set_Time_Slice): Likewise.
* libgnarl/s-vxwext__rtp.adb
(IERR): Renamed from ERROR.
(Int_Lock): return IERR constant vice ERROR.
(Int_Unlock): Return STATUS.
(semDelete): Likewise.
(Set_Time_Slice): Likewise.
(taskCpuAffinitySet): Return IERR (int)
(taskMaskAffinitySet): Likewise.
* libgnarl/s-vxwext__rtp.ads
(STATUS): New subtype.
(OK): New STATUS constant.

[Ada] More flexibility in preprocessor

2021-09-22 Thread Pierre-Marie de Rodat via Gcc-patches

When using -gnatd.M (or in codepeer mode), we want to be more flexible
in the preprocessor syntax, to better accommodate legacy toolchains.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* prep.adb (Preprocess): Allow for more flexibility when
Relaxed_RM_Semantics is set.

diff --git a/gcc/ada/prep.adb b/gcc/ada/prep.adb
--- a/gcc/ada/prep.adb
+++ b/gcc/ada/prep.adb
@@ -1410,7 +1410,12 @@ package body Prep is
 
  Scan.all;
 
- if Token /= Tok_If then
+ --  Ignore all recoverable errors if Relaxed_RM_Semantics
+
+ if Relaxed_RM_Semantics then
+null;
+
+ elsif Token /= Tok_If then
 Error_Msg -- CODEFIX
   ("IF expected", Token_Ptr);
 No_Error_Found := False;
@@ -1453,21 +1458,31 @@ package body Prep is
   --  Illegal preprocessor line
 
   when others =>
- No_Error_Found := False;
-
  if Pp_States.Last = 0 then
 Error_Msg -- CODEFIX
   ("IF expected", Token_Ptr);
+No_Error_Found := False;
 
- elsif
-   Pp_States.Table (Pp_States.Last).Else_Ptr = 0
+ elsif Relaxed_RM_Semantics
+   and then Get_Name_String (Token_Name) = "endif"
  then
+--  In relaxed mode, accept "endif" instead of
+--  "end if".
+
+--  Decrement the depth of the #if stack
+
+if Pp_States.Last > 0 then
+   Pp_States.Decrement_Last;
+end if;
+ elsif Pp_States.Table (Pp_States.Last).Else_Ptr = 0 then
 Error_Msg
   ("IF, ELSIF, ELSE, or `END IF` expected",
Token_Ptr);
+No_Error_Found := False;
 
  else
 Error_Msg ("IF or `END IF` expected", Token_Ptr);
+No_Error_Found := False;
  end if;
 
  --  Skip to the end of this illegal line

[Ada] Contracts written for the Ada.Strings.Bounded library

2021-09-22 Thread Pierre-Marie de Rodat via Gcc-patches

SPARK contracts have been written to describe the behaviour of all
functions of the Ada.Strings.Bounded library. All contracts are replicated
in Ada.Strings.Superbounded, the package that provides the explicit
implementation of bounded strings. The contracts (with the exception of
Trim, which uses search functions to determine the cutting points) only
use the functions Length, Element and Slice, which are expression
functions accessing the data of bounded strings. So far, all contracts
in Ada.Strings.Superbounded are proved, except the longest ones (Insert,
Overwrite, Replace_Slice), whose bodies are therefore marked SPARK_Mode
Off (absence of runtime errors was verified before turning SPARK_Mode
Off). The contracts in Ada.Strings.Bounded are proved using the contracts
in Superbounded.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* libgnat/a-strbou.adb: Turn SPARK_Mode on.
* libgnat/a-strbou.ads: Write contracts.
* libgnat/a-strfix.ads (Index): Fix grammar error in a comment.
* libgnat/a-strsea.ads (Index): Likewise.
* libgnat/a-strsup.adb: Rewrite the body to take into account
the new definition of Super_String using Relaxed_Initialization
and a predicate.
(Super_Replicate, Super_Translate, Times): Added loop
invariants, and ghost lemmas for Super_Replicate and Times.
(Super_Trim): Rewrite the body using search functions to
determine the cutting points.
(Super_Element, Super_Length, Super_Slice, Super_To_String):
Remove (now written as expression functions in a-strsup.ads).
* libgnat/a-strsup.ads: Added contracts.
(Super_Element, Super_Length, Super_Slice, Super_To_String):
Rewrite as expression functions.

patch.diff.gz
Description: application/gzip

[Ada] Add adequate guard before calling First_Rep_Item

2021-09-22 Thread Pierre-Marie de Rodat via Gcc-patches

New contracts on Ada.Strings.Bounded revealed an unprotected call to
First_Rep_Item on a possibly empty node.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* sem_ch13.adb (Build_Predicate_Functions): Add guard.

diff --git a/gcc/ada/sem_ch13.adb b/gcc/ada/sem_ch13.adb
--- a/gcc/ada/sem_ch13.adb
+++ b/gcc/ada/sem_ch13.adb
@@ -10104,7 +10104,10 @@ package body Sem_Ch13 is
  --  If the type is private, check whether full view has inherited
  --  predicates.
 
- if Is_Private_Type (Typ) and then No (Ritem) then
+ if Is_Private_Type (Typ)
+   and then No (Ritem)
+   and then Present (Full_View (Typ))
+ then
 Ritem := First_Rep_Item (Full_View (Typ));
  end if;

[Ada] VxWorks inconsistent use of return type (BOOL)

2021-09-22 Thread Pierre-Marie de Rodat via Gcc-patches

Type BOOL is made a new int to be consistent with the typedef used in
the C header files.
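
One practical consequence shows up in Is_Task_Context, which now compares against 0 instead of against 1. The C sketch below illustrates why that is the safer test, assuming BOOL is a plain int in which any nonzero value means true; fake_interrupt_context and the two is_task_context variants are hypothetical names used only for this illustration.

```c
#include <assert.h>

/* Assumption: BOOL is a plain int, with 0 meaning FALSE.  */
typedef int BOOL;

/* Hypothetical stand-in for the OS query.  A C BOOL may hold any nonzero
   value to mean "true", not only 1.  */
static BOOL
fake_interrupt_context (int raw)
{
  return raw;
}

/* Fragile: treats every value other than 1 as "task context".  */
static int
is_task_context_fragile (BOOL in_interrupt)
{
  return in_interrupt != 1;
}

/* Robust: only 0 (FALSE) means "task context".  */
static int
is_task_context (BOOL in_interrupt)
{
  return in_interrupt == 0;
}

/* The two tests agree on 0 and 1 but disagree on other nonzero values.  */
static void
check_bool_tests (void)
{
  assert (is_task_context (fake_interrupt_context (0)));
  assert (!is_task_context (fake_interrupt_context (1)));
  assert (!is_task_context (fake_interrupt_context (2)));
  assert (is_task_context_fragile (fake_interrupt_context (2)));
}
```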

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* libgnarl/s-vxwext.ads (BOOL): New int type.
(Interrupt_Context): Change return type to BOOL.
* libgnarl/s-vxwext__kernel.ads: Likewise.
* libgnarl/s-vxwext__rtp-smp.adb: Likewise.
* libgnarl/s-vxwext__rtp.adb: Likewise.
* libgnarl/s-vxwext__rtp.ads: Likewise.
* libgnarl/s-osinte__vxworks.adb (Interrupt_Context): Change
return type to BOOL.
* libgnarl/s-osinte__vxworks.ads (BOOL): New subtype.
(taskIsSuspended): Change return type to BOOL.
(Interrupt_Context): Change return type to BOOL. Adjust comments
accordingly.
* libgnarl/s-taprop__vxworks.adb (System.VxWorks.Ext.BOOL):
use type.
(Is_Task_Context): Test Interrupt_Context against 0.
* libgnat/i-vxwork.ads (BOOL): New int.
(intContext): Change return type to BOOL. Adjust comments.
* libgnat/i-vxwork__x86.ads: Likewise.

diff --git a/gcc/ada/libgnarl/s-osinte__vxworks.adb b/gcc/ada/libgnarl/s-osinte__vxworks.adb
--- a/gcc/ada/libgnarl/s-osinte__vxworks.adb
+++ b/gcc/ada/libgnarl/s-osinte__vxworks.adb
@@ -203,7 +203,7 @@ package body System.OS_Interface is
-- Interrupt_Context --
---
 
-   function Interrupt_Context return int is
+   function Interrupt_Context return BOOL is
begin
   return System.VxWorks.Ext.Interrupt_Context;
end Interrupt_Context;


diff --git a/gcc/ada/libgnarl/s-osinte__vxworks.ads b/gcc/ada/libgnarl/s-osinte__vxworks.ads
--- a/gcc/ada/libgnarl/s-osinte__vxworks.ads
+++ b/gcc/ada/libgnarl/s-osinte__vxworks.ads
@@ -57,6 +57,7 @@ package System.OS_Interface is
type unsigned_long_long is mod 2 ** long_long'Size;
type size_t is mod 2 ** Standard'Address_Size;
 
+   subtype BOOLis System.VxWorks.Ext.BOOL;
subtype vx_freq_t   is System.VxWorks.Ext.vx_freq_t;
 
---
@@ -307,7 +308,7 @@ package System.OS_Interface is
function taskResume (tid : t_id) return int;
pragma Import (C, taskResume, "taskResume");
 
-   function taskIsSuspended (tid : t_id) return int;
+   function taskIsSuspended (tid : t_id) return BOOL;
pragma Import (C, taskIsSuspended, "taskIsSuspended");
 
function taskDelay (ticks : int) return int;
@@ -489,10 +490,10 @@ package System.OS_Interface is
--  which is invoked after the OS has saved enough context for a high-level
--  language routine to be safely invoked.
 
-   function Interrupt_Context return int;
+   function Interrupt_Context return BOOL;
pragma Inline (Interrupt_Context);
-   --  Return 1 if executing in an interrupt context; return 0 if executing in
-   --  a task context.
+   --  Return 1 (TRUE) if executing in an interrupt context;
+   --  return 0 (FALSE) if executing in a task context.
 
function Interrupt_Number_To_Vector (intNum : int) return Interrupt_Vector;
pragma Inline (Interrupt_Number_To_Vector);


diff --git a/gcc/ada/libgnarl/s-taprop__vxworks.adb b/gcc/ada/libgnarl/s-taprop__vxworks.adb
--- a/gcc/ada/libgnarl/s-taprop__vxworks.adb
+++ b/gcc/ada/libgnarl/s-taprop__vxworks.adb
@@ -62,9 +62,10 @@ package body System.Task_Primitives.Operations is
use System.Tasking;
use System.OS_Interface;
use System.Parameters;
-   use type System.VxWorks.Ext.t_id;
use type Interfaces.C.int;
use type System.OS_Interface.unsigned;
+   use type System.VxWorks.Ext.t_id;
+   use type System.VxWorks.Ext.BOOL;
 
subtype int is System.OS_Interface.int;
subtype unsigned is System.OS_Interface.unsigned;
@@ -1304,7 +1305,7 @@ package body System.Task_Primitives.Operations is
 
function Is_Task_Context return Boolean is
begin
-  return System.OS_Interface.Interrupt_Context /= 1;
+  return System.OS_Interface.Interrupt_Context = 0;
end Is_Task_Context;
 



diff --git a/gcc/ada/libgnarl/s-vxwext.ads b/gcc/ada/libgnarl/s-vxwext.ads
--- a/gcc/ada/libgnarl/s-vxwext.ads
+++ b/gcc/ada/libgnarl/s-vxwext.ads
@@ -46,6 +46,9 @@ package System.VxWorks.Ext is
subtype int is Interfaces.C.int;
subtype unsigned is Interfaces.C.unsigned;
 
+   type BOOL is new int;
+   --  Equivalent of the C type BOOL
+
type vx_freq_t is new unsigned;
--  Equivalent of the C type _Vx_freq_t
 
@@ -66,7 +69,7 @@ package System.VxWorks.Ext is
   Parameter : System.Address := System.Null_Address) return int;
pragma Import (C, Interrupt_Connect, "intConnect");
 
-   function Interrupt_Context return int;
+   function Interrupt_Context return BOOL;
pragma Import (C, Interrupt_Context, "intContext");
 
function Interrupt_Number_To_Vector


diff --git a/gcc/ada/libgnarl/s-vxwext__kernel.ads b/gcc/ada/libgnarl/s-vxwext__kernel.ads
--- a/gcc/ada/libgnarl/s-vxwext__kernel.ads
+++ b/gcc/ada/libgnarl/s-vxwext__kernel.ads
@@ -45,6 +45,9 @@ package

[Ada] Add Package_Body helper routine to be used in GNATprove

2021-09-22 Thread Pierre-Marie de Rodat via Gcc-patches

We already had Subprogram_[Body|Spec|Specification] family of routines;
now we also have a symmetrical Package_[Body|Spec|Specification] family.

The added Package_Body routine is essentially moved from GNATprove, but
for simplicity it doesn't support package body entities.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* sem_aux.adb, sem_aux.ads (Package_Body): Moved from GNATprove.
* sem_elab.adb (Spec_And_Body_From_Entity): Refine type of parameter.

diff --git a/gcc/ada/sem_aux.adb b/gcc/ada/sem_aux.adb
--- a/gcc/ada/sem_aux.adb
+++ b/gcc/ada/sem_aux.adb
@@ -1401,6 +1401,31 @@ package body Sem_Aux is
   and then Has_Discriminants (Typ));
end Object_Type_Has_Constrained_Partial_View;
 
+   --
+   -- Package_Body --
+   --
+
+   function Package_Body (E : Entity_Id) return Node_Id is
+  Body_Decl : Node_Id;
+  Body_Id   : constant Opt_E_Package_Body_Id :=
+Corresponding_Body (Package_Spec (E));
+
+   begin
+  if Present (Body_Id) then
+ Body_Decl := Parent (Body_Id);
+
+ if Nkind (Body_Decl) = N_Defining_Program_Unit_Name then
+Body_Decl := Parent (Body_Decl);
+ end if;
+
+ pragma Assert (Nkind (Body_Decl) = N_Package_Body);
+
+ return Body_Decl;
+  else
+ return Empty;
+  end if;
+   end Package_Body;
+
--
-- Package_Spec --
--


diff --git a/gcc/ada/sem_aux.ads b/gcc/ada/sem_aux.ads
--- a/gcc/ada/sem_aux.ads
+++ b/gcc/ada/sem_aux.ads
@@ -377,6 +377,10 @@ package Sem_Aux is
--  derived type, and the subtype is not an unconstrained array subtype
--  (RM 3.3(23.10/3)).
 
+   function Package_Body (E : Entity_Id) return Node_Id;
+   --  Given an entity for a package, return the corresponding package body, if
+   --  any, or else Empty.
+
function Package_Spec (E : Entity_Id) return Node_Id;
--  Given an entity for a package spec, return the corresponding package
--  spec if any, or else Empty.


diff --git a/gcc/ada/sem_elab.adb b/gcc/ada/sem_elab.adb
--- a/gcc/ada/sem_elab.adb
+++ b/gcc/ada/sem_elab.adb
@@ -2070,7 +2070,7 @@ package body Sem_Elab is
--  Change the status of the elaboration phase of the compiler to Status
 
procedure Spec_And_Body_From_Entity
- (Id: Node_Id;
+ (Id: Entity_Id;
   Spec_Decl : out Node_Id;
   Body_Decl : out Node_Id);
pragma Inline (Spec_And_Body_From_Entity);
@@ -15835,7 +15835,7 @@ package body Sem_Elab is
---
 
procedure Spec_And_Body_From_Entity
- (Id: Node_Id;
+ (Id: Entity_Id;
   Spec_Decl : out Node_Id;
   Body_Decl : out Node_Id)
is

[Ada] Fix infinite loop in compilation of illegal code

2021-09-22 Thread Pierre-Marie de Rodat via Gcc-patches

When rewriting a derived type declaration into a subtype declaration,
the aspect specifications were shared in a way that made the aspect
point to a node outside of the tree as parent node. This could lead to
an infinite loop on illegal code using a non-static value for attribute
Object_Size of the type.
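
One of the accompanying fixes (Known_Esize) reorders a short-circuit test
so that validity is checked before the value is compared. The following is
a minimal sketch of that guard-first pattern, not the compiler's code: the
access type Size_Ptr is a hypothetical stand-in for a value that may be
unset, with null playing the role of "not present".

```ada
with Ada.Text_IO;

procedure Guard_Order_Demo is
   type Size_Ptr is access Integer;
   --  Hypothetical stand-in for a value that may be unset (null here)

   function Known_Size (S : Size_Ptr) return Boolean is
     (S /= null and then S.all /= 0);
   --  Validity is tested first; "and then" skips the dereference when S
   --  is null, so the comparison never touches an unset value.

   Unset : constant Size_Ptr := null;
   Set   : constant Size_Ptr := new Integer'(32);
begin
   Ada.Text_IO.Put_Line (Boolean'Image (Known_Size (Unset)));
   Ada.Text_IO.Put_Line (Boolean'Image (Known_Size (Set)));
end Guard_Order_Demo;
```

Swapping the two operands would evaluate the comparison before the guard,
which is the ordering the patch removes.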

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* atree.adb (Rewrite): Fix parent node of shared aspects.
* atree.ads (Rewrite): Add ??? comment on incorrect
documentation.
* einfo-utils.adb (Known_Esize): Fix logic.
* sem_ch13.adb (Alignment_Check_For_Size_Change,
Analyze_Attribute_Definition_Clause): Protect against unset
Size.

diff --git a/gcc/ada/atree.adb b/gcc/ada/atree.adb
--- a/gcc/ada/atree.adb
+++ b/gcc/ada/atree.adb
@@ -2025,10 +2025,16 @@ package body Atree is
 
  --  Both the old and new copies of the node will share the same list
  --  of aspect specifications if aspect specifications are present.
+ --  Restore the parent link of the aspect list to the old node, which
+ --  is the one linked in the tree.
 
  if Old_Has_Aspects then
-Set_Aspect_Specifications
-  (Sav_Node, Aspect_Specifications (Old_Node));
+declare
+   Aspects : constant List_Id := Aspect_Specifications (Old_Node);
+begin
+   Set_Aspect_Specifications (Sav_Node, Aspects);
+   Set_Parent (Aspects, Old_Node);
+end;
  end if;
   end if;
 


diff --git a/gcc/ada/atree.ads b/gcc/ada/atree.ads
--- a/gcc/ada/atree.ads
+++ b/gcc/ada/atree.ads
@@ -501,6 +501,7 @@ package Atree is
--  the contents of these two nodes fixing up the parent pointers of the
--  replaced node (we do not attempt to preserve parent pointers for the
--  original node). Neither Old_Node nor New_Node can be extended nodes.
+   --  ??? The above explanation is incorrect, instead Copy_Node is called.
--
--  Note: New_Node may not contain references to Old_Node, for example as
--  descendants, since the rewrite would make such references invalid. If


diff --git a/gcc/ada/einfo-utils.adb b/gcc/ada/einfo-utils.adb
--- a/gcc/ada/einfo-utils.adb
+++ b/gcc/ada/einfo-utils.adb
@@ -414,8 +414,7 @@ package body Einfo.Utils is
   if Use_New_Unknown_Rep then
  return not Field_Is_Initial_Zero (E, F_Esize);
   else
- return Esize (E) /= Uint_0
-   and then Present (Esize (E));
+ return Present (Esize (E)) and then Esize (E) /= Uint_0;
   end if;
end Known_Esize;
 


diff --git a/gcc/ada/sem_ch13.adb b/gcc/ada/sem_ch13.adb
--- a/gcc/ada/sem_ch13.adb
+++ b/gcc/ada/sem_ch13.adb
@@ -860,6 +860,7 @@ package body Sem_Ch13 is
 
   if Known_Alignment (Typ)
 and then not Has_Alignment_Clause (Typ)
+and then Present (Size)
 and then Size mod (Alignment (Typ) * SSU) /= 0
   then
  Reinit_Alignment (Typ);
@@ -7125,7 +7126,7 @@ package body Sem_Ch13 is
 else
Check_Size (Expr, U_Ent, Size, Biased);
 
-   if Size <= 0 then
+   if No (Size) or else Size <= 0 then
   Error_Msg_N ("Object_Size must be positive", Expr);
 
elsif Is_Scalar_Type (U_Ent) then

[Ada] Removal of technical debt

2021-09-22 Thread Pierre-Marie de Rodat via Gcc-patches

This is an iterative patch as part of a greater project to reduce the
amount of technical debt present in the frontend of the compiler.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* ali.adb, ali.ads (Scan_ALI): Remove use of deprecated
parameter Ignore_ED, and add specification for Lower in call to
Get_File_Name.
* ali-util.adb (Read_Withed_ALIs): Modify call to Scan_ALI.
* clean.adb (Clean_Executables): Likewise.
* gnatbind.adb (Add_Artificial_ALI_File, Executable section):
Likewise.
* gnatlink.adb (Executable section): Likewise.
* gnatls.adb (Executable section): Likewise.
* make.adb (Check, Wait_For_Available_Slot): Likewise.
* aspects.ads: Add Aspect_No_Controlled_Parts to
Nonoverridable_Aspect_Id.
* opt.ads: Remove function pointers used as a workaround for
ASIS.
* osint-c.adb (Executable section): Remove setting of function
pointer workarounds needed for ASIS.
* osint.adb (Read_Default_Search_Dirs): Correct behavior to
detect EOL characters.
* par_sco.adb (Output_Header): Remove comment regarding use of
First_Sloc.
(Traverse_Sync_Definition): Renamed to
Traverse_Protected_Or_Task_Definition.
* pprint.adb (Internal_List_Name): Add description about purpose,
and refactor conditional statement.
(Prepend): Removed.
* repinfo.adb (List_Rep_Info, Write_Info_Line): Remove use of
subprogram pointer.
* scng.adb (Scan): Remove CODEFIX question, and minor comment
change.
* sem_attr.adb (Analyze_Image_Attribute): Remove special
processing for 'Img.
* sem_ch6.adb (Check_Untagged_Equality): Add RM reference.
(FCE): Add comment describing behavior.
(Is_Non_Overriding_Operation): Minor comment formatting change.
* sem_type.adb (Is_Actual_Subprogram): Add comment about
Comes_From_Source test.
(Matching_Types): Describe non-matching cases.
* sem_util.adb (Is_Confirming): Add stub case for
No_Controlled_Parts.

diff --git a/gcc/ada/ali-util.adb b/gcc/ada/ali-util.adb
--- a/gcc/ada/ali-util.adb
+++ b/gcc/ada/ali-util.adb
@@ -249,7 +249,6 @@ package body ALI.Util is
 Scan_ALI
   (F => Afile,
T => Text,
-   Ignore_ED => False,
Err   => False);
 
   Free (Text);


diff --git a/gcc/ada/ali.adb b/gcc/ada/ali.adb
--- a/gcc/ada/ali.adb
+++ b/gcc/ada/ali.adb
@@ -892,7 +892,6 @@ package body ALI is
function Scan_ALI
  (F: File_Name_Type;
   T: Text_Buffer_Ptr;
-  Ignore_ED: Boolean;
   Err  : Boolean;
   Ignore_Lines : String  := "X";
   Ignore_Errors: Boolean := False;
@@ -1319,8 +1318,7 @@ package body ALI is
  exit when Nextc = ',';
 
  --  Terminate if left bracket not part of wide char
- --  sequence Note that we only recognize brackets
- --  notation so far ???
+ --  sequence.
 
  exit when Nextc = '[' and then T (P + 1) /= '"';
 
@@ -2938,9 +2936,7 @@ package body ALI is
 
 --  Store AD indication unless ignore required
 
-if not Ignore_ED then
-   Withs.Table (Withs.Last).Elab_All_Desirable := True;
-end if;
+Withs.Table (Withs.Last).Elab_All_Desirable := True;
 
  elsif Nextc = 'E' then
 P := P + 1;
@@ -2957,12 +2953,9 @@ package body ALI is
Checkc ('D');
Check_At_End_Of_Field;
 
-   --  Store ED indication unless ignore required
+   --  Store ED indication
 
-   if not Ignore_ED then
-  Withs.Table (Withs.Last).Elab_Desirable :=
-True;
-   end if;
+   Withs.Table (Withs.Last).Elab_Desirable := True;
 end if;
 
  else
@@ -3213,13 +3206,10 @@ package body ALI is
 Skip_Space;
 Sdep.Increment_Last;
 
---  In the following call, Lower is not set to True, this is either
---  a bug, or it deserves a special comment as to why this is so???
-
 --  The file/path name may be quoted
 
 Sdep.Table (Sdep.Last).Sfile :=
-  Get_File_Name (May_Be_Quoted => True);
+  Get_File_Name (Lower => True, May_Be_Quoted => True);
 
 Sdep.Table (Sdep.Last).Stamp := Get_Stamp;
 Sdep.Table (Sdep.Last).Dummy_Entry :=


diff --git

[Ada] More precise analysis of function renamings in GNATprove

2021-09-22 Thread Pierre-Marie de Rodat via Gcc-patches

When a function renaming has a contract, it is important for GNATprove
that it is not treated as a simple wrapper, otherwise the link between
the renamed function and its renaming is lost for proof. Instead, it
should be treated as an expression function.

There is no impact on compilation.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* freeze.adb (Build_Renamed_Body): Special case for GNATprove.
* sem_ch6.adb (Analyze_Expression_Function): Remove useless test
for a node to come from source, which becomes harmful otherwise.

diff --git a/gcc/ada/freeze.adb b/gcc/ada/freeze.adb
--- a/gcc/ada/freeze.adb
+++ b/gcc/ada/freeze.adb
@@ -636,13 +636,26 @@ package body Freeze is
  Next (Param_Spec);
   end loop;
 
-  Body_Node :=
-Make_Subprogram_Body (Loc,
-  Specification => Spec,
-  Declarations => New_List,
-  Handled_Statement_Sequence =>
-Make_Handled_Sequence_Of_Statements (Loc,
-  Statements => New_List (Call_Node)));
+  --  In GNATprove, prefer to generate an expression function whenever
+  --  possible, to benefit from the more precise analysis in that case
+  --  (as if an implicit postcondition had been generated).
+
+  if GNATprove_Mode
+and then Nkind (Call_Node) = N_Simple_Return_Statement
+  then
+ Body_Node :=
+   Make_Expression_Function (Loc,
+ Specification => Spec,
+ Expression=> Expression (Call_Node));
+  else
+ Body_Node :=
+   Make_Subprogram_Body (Loc,
+ Specification  => Spec,
+ Declarations   => New_List,
+ Handled_Statement_Sequence =>
+   Make_Handled_Sequence_Of_Statements (Loc,
+ Statements => New_List (Call_Node)));
+  end if;
 
   if Nkind (Decl) /= N_Subprogram_Declaration then
  Rewrite (N,


diff --git a/gcc/ada/sem_ch6.adb b/gcc/ada/sem_ch6.adb
--- a/gcc/ada/sem_ch6.adb
+++ b/gcc/ada/sem_ch6.adb
@@ -385,15 +385,9 @@ package body Sem_Ch6 is
  Analyze (New_Body);
  Set_Is_Inlined (Prev);
 
-  --  If the expression function is a completion, the previous declaration
-  --  must come from source. We know already that it appears in the current
-  --  scope. The entity itself may be internally created if within a body
-  --  to be inlined.
-
   elsif Present (Prev)
 and then Is_Overloadable (Prev)
 and then not Is_Formal_Subprogram (Prev)
-and then Comes_From_Source (Parent (Prev))
   then
  Set_Has_Completion (Prev, False);
  Set_Is_Inlined (Prev);

[Ada] Fix access to predicated parent in Itype

2021-09-22 Thread Pierre-Marie de Rodat via Gcc-patches

The getter function Predicated_Parent expects to be called on subtypes only,
but this was not always enforced, which could lead to assertion failures in
a compiler built with assertions enabled.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* sem_ch13.adb (Build_Predicate_Functions): Access
Predicated_Parent only on subtypes.

diff --git a/gcc/ada/sem_ch13.adb b/gcc/ada/sem_ch13.adb
--- a/gcc/ada/sem_ch13.adb
+++ b/gcc/ada/sem_ch13.adb
@@ -10191,6 +10191,9 @@ package body Sem_Ch13 is
 or else
   (Is_Itype (Typ)
and then not Comes_From_Source (Typ)
+   and then Ekind (Typ) in E_Array_Subtype
+ | E_Record_Subtype
+ | E_Record_Subtype_With_Private
and then Present (Predicated_Parent (Typ)))
   then
  return;

[Ada] Allow more cases of import with Relaxed_RM_Semantics

2021-09-22 Thread Pierre-Marie de Rodat via Gcc-patches

With Relaxed_RM_Semantics, accept more cases of import, such as importing a
package as was accepted by JGNAT, for use in analyzers such as SPARK or
CodePeer.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* sem_prag.adb (Process_Import_Or_Interface): Relax error when
Relaxed_RM_Semantics.

diff --git a/gcc/ada/sem_prag.adb b/gcc/ada/sem_prag.adb
--- a/gcc/ada/sem_prag.adb
+++ b/gcc/ada/sem_prag.adb
@@ -9532,7 +9532,11 @@ package body Sem_Prag is
 
 Process_Import_Predefined_Type;
 
- else
+ --  Emit an error unless Relaxed_RM_Semantics since some legacy Ada
+ --  compilers may accept more cases, e.g. JGNAT allowed importing
+ --  a Java package.
+
+ elsif not Relaxed_RM_Semantics then
 if From_Aspect_Specification (N) then
Error_Pragma_Arg
   ("entity for aspect% must be object, subprogram "

[Ada] Improve performance for case-insensitive regular expressions

2021-09-22 Thread Pierre-Marie de Rodat via Gcc-patches

An existing optimization substantially improves performance in the case
of checking for pattern matches where all possible matches are known to
start with the same character. Generalize this optimization to also
apply in the case of a case-insensitive comparison (so that there are
two possible initial characters to check for, e.g. 'z' and 'Z').
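
The idea can be sketched outside of System.Regpat as follows. This is a
simplified, standalone illustration under stated assumptions: the local
Index function and the Data string are hypothetical, and the real matcher
calls Try at each candidate position instead of printing it.

```ada
with Ada.Characters.Handling;
with Ada.Text_IO;

procedure First_Char_Scan_Demo is

   function Index
     (Data : String; Start : Positive; C : Character) return Natural is
   begin
      --  Position of the first C at or after Start, or 0 if none
      for J in Start .. Data'Last loop
         if Data (J) = C then
            return J;
         end if;
      end loop;

      return 0;
   end Index;

   function Case_Insensitive_Index
     (Data : String; Start : Positive; Lc : Character) return Natural
   is
      Uc       : constant Character :=
        Ada.Characters.Handling.To_Upper (Lc);
      Lc_Index : constant Natural := Index (Data, Start, Lc);
      Uc_Index : constant Natural := Index (Data, Start, Uc);
   begin
      --  Return the smaller nonzero position, or 0 if both are 0
      if Lc_Index = 0 then
         return Uc_Index;
      elsif Uc_Index = 0 then
         return Lc_Index;
      else
         return Natural'Min (Lc_Index, Uc_Index);
      end if;
   end Case_Insensitive_Index;

   Data : constant String := "fooZbarz";
   Next : Natural := Case_Insensitive_Index (Data, Data'First, 'z');
begin
   --  Only positions holding 'z' or 'Z' are candidates for a match
   while Next /= 0 loop
      Ada.Text_IO.Put_Line ("candidate at" & Natural'Image (Next));
      Next := Case_Insensitive_Index (Data, Next + 1, 'z');
   end loop;
end First_Char_Scan_Demo;
```

Each step scans the remaining data twice, once per casing, but it still
skips every position that cannot start a match.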

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* libgnat/s-regpat.adb (Match): Handle the case where Self.First
is not NUL (so we know the first character we are looking for),
but case-insensitive matching has been specified.
(Optimize): In the case of an EXACTF Op, set Self.First as is
done in the EXACT case, except with the addition of a call to
Lower_Case.

diff --git a/gcc/ada/libgnat/s-regpat.adb b/gcc/ada/libgnat/s-regpat.adb
--- a/gcc/ada/libgnat/s-regpat.adb
+++ b/gcc/ada/libgnat/s-regpat.adb
@@ -3463,18 +3463,58 @@ package body System.Regpat is
  end;
 
   elsif Self.First /= ASCII.NUL then
- --  We know what char it must start with
+ --  We know what char (modulo casing) it must start with
 
- declare
-Next_Try : Natural := Index (First_In_Data, Self.First);
+ if (Self.Flags and Case_Insensitive) = 0
+   or else Self.First not in 'a' .. 'z'
+ then
+declare
+   Next_Try : Natural := Index (First_In_Data, Self.First);
+begin
+   while Next_Try /= 0 loop
+  Matched := Try (Next_Try);
+  exit when Matched;
+  Next_Try := Index (Next_Try + 1, Self.First);
+   end loop;
+end;
+ else
+declare
+   Uc_First : constant Character := To_Upper (Self.First);
+
+   function Case_Insensitive_Index
+ (Start : Positive) return Natural;
+   --  Search for both Self.First and To_Upper (Self.First).
+   --  If both are nonzero, return the smaller one; if exactly
+   --  one is nonzero, return it; if both are zero, return zero.
+
+   ---
+   -- Case_Insensitive_Index --
+   ---
+
+   function Case_Insensitive_Index
+ (Start : Positive) return Natural
+   is
+  Lc_Index : constant Natural := Index (Start, Self.First);
+  Uc_Index : constant Natural := Index (Start, Uc_First);
+   begin
+  if Lc_Index = 0 then
+ return Uc_Index;
+  elsif Uc_Index = 0 then
+ return Lc_Index;
+  else
+ return Natural'Min (Lc_Index, Uc_Index);
+  end if;
+   end Case_Insensitive_Index;
 
- begin
-while Next_Try /= 0 loop
-   Matched := Try (Next_Try);
-   exit when Matched;
-   Next_Try := Index (Next_Try + 1, Self.First);
-end loop;
- end;
+   Next_Try : Natural := Case_Insensitive_Index (First_In_Data);
+begin
+   while Next_Try /= 0 loop
+  Matched := Try (Next_Try);
+  exit when Matched;
+  Next_Try := Case_Insensitive_Index (Next_Try + 1);
+   end loop;
+end;
+ end if;
 
   else
  --  Messy cases: try all locations (including for the empty string)
@@ -3634,6 +3674,9 @@ package body System.Regpat is
   if Program (Scan) = EXACT then
  Self.First := Program (String_Operand (Scan));
 
+  elsif Program (Scan) = EXACTF then
+ Self.First := To_Lower (Program (String_Operand (Scan)));
+
   elsif Program (Scan) = BOL
 or else Program (Scan) = SBOL
 or else Program (Scan) = MBOL

[Ada] Remove System.Img_Enum_New unit

2021-09-22 Thread Pierre-Marie de Rodat via Gcc-patches

It was still needed only because of bootstrap path considerations that
are obsolete after the recent overhaul of the bootstrap process.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* libgnat/s-imenne.ads, libgnat/s-imenne.adb: Delete.
* gcc-interface/Make-lang.in (GNAT_ADA_OBJS): Remove s-imenne.o.
(GNATBIND_OBJS): Likewise.

diff --git a/gcc/ada/gcc-interface/Make-lang.in b/gcc/ada/gcc-interface/Make-lang.in
--- a/gcc/ada/gcc-interface/Make-lang.in
+++ b/gcc/ada/gcc-interface/Make-lang.in
@@ -517,7 +517,6 @@ GNAT_ADA_OBJS+= \
  ada/libgnat/s-excmac.o	\
  ada/libgnat/s-exctab.o	\
  ada/libgnat/s-htable.o	\
- ada/libgnat/s-imenne.o	\
  ada/libgnat/s-imgint.o	\
  ada/libgnat/s-mastop.o	\
  ada/libgnat/s-memory.o	\
@@ -684,7 +683,6 @@ GNATBIND_OBJS +=  \
  ada/libgnat/s-excmac.o   \
  ada/libgnat/s-exctab.o   \
  ada/libgnat/s-htable.o   \
- ada/libgnat/s-imenne.o   \
  ada/libgnat/s-imgint.o   \
  ada/libgnat/s-mastop.o   \
  ada/libgnat/s-memory.o   \


diff --git a/gcc/ada/libgnat/s-imenne.adb /dev/null
deleted file mode 100644
--- a/gcc/ada/libgnat/s-imenne.adb
+++ /dev/null
@@ -1,170 +0,0 @@
---
---  --
--- GNAT RUN-TIME COMPONENTS --
---  --
---  S Y S T E M . I M G _ E N U M _ N E W   --
---  --
--- B o d y  --
---  --
---  Copyright (C) 2000-2021, Free Software Foundation, Inc. --
---  --
--- GNAT is free software;  you can  redistribute it  and/or modify it under --
--- terms of the  GNU General Public License as published  by the Free Soft- --
--- ware  Foundation;  either version 3,  or (at your option) any later ver- --
--- sion.  GNAT is distributed in the hope that it will be useful, but WITH- --
--- OUT ANY WARRANTY;  without even the  implied warranty of MERCHANTABILITY --
--- or FITNESS FOR A PARTICULAR PURPOSE. --
---  --
--- As a special exception under Section 7 of GPL version 3, you are granted --
--- additional permissions described in the GCC Runtime Library Exception,   --
--- version 3.1, as published by the Free Software Foundation.   --
---  --
--- You should have received a copy of the GNU General Public License and--
--- a copy of the GCC Runtime Library Exception along with this program; --
--- see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see--
--- .  --
---  --
--- GNAT was originally developed  by the GNAT team at  New York University. --
--- Extensive contributions were provided by Ada Core Technologies Inc.  --
---  --
---
-
-pragma Compiler_Unit_Warning;
-
-with Ada.Unchecked_Conversion;
-
-package body System.Img_Enum_New is
-
-   -
-   -- Image_Enumeration_8 --
-   -
-
-   procedure Image_Enumeration_8
- (Pos : Natural;
-  S   : in out String;
-  P   : out Natural;
-  Names   : String;
-  Indexes : System.Address)
-   is
-  pragma Assert (S'First = 1);
-
-  type Natural_8 is range 0 .. 2 ** 7 - 1;
-  subtype Names_Index is
-Natural_8 range Natural_8 (Names'First)
-  .. Natural_8 (Names'Last) + 1;
-  subtype Index is Natural range Natural'First .. Names'Length;
-  type Index_Table is array (Index) of Names_Index;
-  type Index_Table_Ptr is access Index_Table;
-
-  function To_Index_Table_Ptr is
-new Ada.Unchecked_Conversion (System.Address, Index_Table_Ptr);
-
-  IndexesT : constant Index_Table_Ptr := To_Index_Table_Ptr (Indexes);
-
-  pragma Assert (Pos in IndexesT'Range);
-  pragma Assert (Pos + 1 in IndexesT'Range);
-
-  Start : constant Natural := Natural (IndexesT (Pos));
-  Next  : constant Natural := Natural (IndexesT (Pos + 1));
-
-  pragma Assert (Next - 1 >= Start);
-  pragma Assert (Start >= Names'First);
-  pragma Assert (Next - 1 <= Names'Last);
-
-  pragma Assert (Next - Start <= S'Last);
-  --  The caller should guarantee that S is large

[Ada] Fix obsolete comments/name referring to girder discriminants

2021-09-22 Thread Pierre-Marie de Rodat via Gcc-patches

This name was changed in 2002 to stored discriminants, but some comments
and variable names were not converted.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* einfo.ads: Fix comments.
* exp_aggr.adb: Fix variable name.
* exp_util.adb: Fix comments.
* sem_ch13.adb: Fix comments.
* sem_ch3.adb: Fix comments and variable name.

diff --git a/gcc/ada/einfo.ads b/gcc/ada/einfo.ads
--- a/gcc/ada/einfo.ads
+++ b/gcc/ada/einfo.ads
@@ -746,9 +746,9 @@ package Einfo is
 
 --Corresponding_Record_Component
 --   Defined in components of a derived untagged record type, including
---   discriminants. For a regular component or a girder discriminant,
+--   discriminants. For a regular component or a stored discriminant,
 --   points to the corresponding component in the parent type. Set to
---   Empty for a non-girder discriminant. It is used by the back end to
+--   Empty for a non-stored discriminant. It is used by the back end to
 --   ensure the layout of the derived type matches that of the parent
 --   type when there is no representation clause on the derived type.
 
@@ -2400,11 +2400,11 @@ package Einfo is
 --   parent, we do not consider them to be separate units for this flag).
 
 --Is_Completely_Hidden
---   Defined on discriminants. Only set on girder discriminants of
---   untagged types. When set, the entity is a girder discriminant of a
+--   Defined on discriminants. Only set on stored discriminants of
+--   untagged types. When set, the entity is a stored discriminant of a
 --   derived untagged type which is not directly visible in the derived
 --   type because the derived type or one of its ancestors have renamed the
---   discriminants in the root type. Note: there are girder discriminants
+--   discriminants in the root type. Note: there are stored discriminants
 --   which are not Completely_Hidden (e.g. discriminants of a root type).
 
 --Is_Composite_Type (synthesized)
@@ -3652,7 +3652,7 @@ package Einfo is
 
 --Next_Discriminant (synthesized)
 --   Applies to discriminants returned by First/Next_Discriminant. Returns
---   the next language-defined (i.e. perhaps non-girder) discriminant by
+--   the next language-defined (i.e. perhaps non-stored) discriminant by
 --   following the chain of declared entities as long as the kind of the
 --   entity corresponds to a discriminant. Note that the discriminants
 --   might be the only components of the record. Returns Empty if there
@@ -3842,8 +3842,8 @@ package Einfo is
 --Rec_Ext.Comp -> Rec_Ext.Parent. ... .Parent.Comp
 --
 --   In base untagged types:
--- Always points to itself except for non-girder discriminants, where
--- it points to the girder discriminant it renames.
+-- Always points to itself except for non-stored discriminants, where
+-- it points to the stored discriminant it renames.
 --
 --   In subtypes (tagged and untagged):
 -- Points to the component in the base type.


diff --git a/gcc/ada/exp_aggr.adb b/gcc/ada/exp_aggr.adb
--- a/gcc/ada/exp_aggr.adb
+++ b/gcc/ada/exp_aggr.adb
@@ -8046,7 +8046,7 @@ package body Exp_Aggr is
Discr: Entity_Id;
Decl : Node_Id;
Num_Disc : Nat := 0;
-   Num_Gird : Nat := 0;
+   Num_Stor : Nat := 0;
 
 --  Start of processing for Generate_Aggregate_For_Derived_Type
 
@@ -8082,13 +8082,13 @@ package body Exp_Aggr is
 
Discr := First_Stored_Discriminant (Base_Type (Typ));
while Present (Discr) loop
-  Num_Gird := Num_Gird + 1;
+  Num_Stor := Num_Stor + 1;
   Next_Stored_Discriminant (Discr);
end loop;
 
--  Case of more stored discriminants than new discriminants
 
-   if Num_Gird > Num_Disc then
+   if Num_Stor > Num_Disc then
 
   --  Create a proper subtype of the parent type, which is the
   --  proper implementation type for the aggregate, and convert


diff --git a/gcc/ada/exp_util.adb b/gcc/ada/exp_util.adb
--- a/gcc/ada/exp_util.adb
+++ b/gcc/ada/exp_util.adb
@@ -10645,7 +10645,7 @@ package body Exp_Util is
   end if;
 
--  Otherwise the constraint denotes a reference to some name
-   --  which results in a Girder discriminant:
+   --  which results in a Stored discriminant:
 
--
--Name : ...;
@@ -10666,7 +10666,7 @@ package body Exp_Util is
return Find_Constraint_Value (Entity (Constr));
 
 --  Otherwise the current constraint is an expression which yields
---  a Girder discriminant:
+--  a Stored discriminant:
 
 --type Typ (D1 : ...; DN : ...) is

[Ada] VxWorks inconsistent use of return type (Int_Unlock)

2021-09-22 Thread Pierre-Marie de Rodat via Gcc-patches

Int_Unlock/intUnlock is incorrectly declared as a function. In the
VxWorks headers intUnlock is declared as returning void.
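
For illustration only, the general rule is that a C routine returning void
is imported as an Ada procedure, while one returning a value is imported as
a function. The sketch below uses srand and rand from the C standard
library as stand-ins, since intUnlock is only available on VxWorks; it
assumes a hosted target whose C library provides both.

```ada
with Ada.Text_IO;
with Interfaces.C;

procedure Void_Import_Demo is
   use Interfaces.C;

   procedure srand (Seed : unsigned);
   pragma Import (C, srand, "srand");
   --  C prototype: void srand (unsigned int seed);
   --  A void return maps to a procedure, as Int_Unlock now does.

   function rand return int;
   pragma Import (C, rand, "rand");
   --  C prototype: int rand (void);
   --  A value-returning routine maps to a function.
begin
   srand (42);
   Ada.Text_IO.Put_Line (int'Image (rand));
end Void_Import_Demo;
```

Declaring srand as a function here would read a return value the C side
never produces, which is the mismatch this patch removes for intUnlock.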

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* libgnarl/s-osinte__vxworks.ads: Make Int_Unlock a procedure
instead of a function.
* libgnarl/s-vxwext.ads: Likewise.
* libgnarl/s-vxwext__kernel-smp.adb: Likewise.
* libgnarl/s-vxwext__kernel.adb: Likewise.
* libgnarl/s-vxwext__kernel.ads: Likewise.
* libgnarl/s-vxwext__rtp-smp.adb: Likewise.
* libgnarl/s-vxwext__rtp.adb: Likewise.
* libgnarl/s-vxwext__rtp.ads: Likewise.
* libgnarl/s-taprop__vxworks.adb (Stop_All_Tasks): Call
Int_Unlock as a procedure.

diff --git a/gcc/ada/libgnarl/s-osinte__vxworks.ads b/gcc/ada/libgnarl/s-osinte__vxworks.ads
--- a/gcc/ada/libgnarl/s-osinte__vxworks.ads
+++ b/gcc/ada/libgnarl/s-osinte__vxworks.ads
@@ -232,8 +232,7 @@ package System.OS_Interface is
--  If we are in the kernel space, lock interrupts. It typically maps to
--  intLock.
 
-   function Int_Unlock (Old : int) return int
- renames System.VxWorks.Ext.Int_Unlock;
+   procedure Int_Unlock (Old : int) renames System.VxWorks.Ext.Int_Unlock;
--  If we are in the kernel space, unlock interrupts. It typically maps to
--  intUnlock. The parameter Old is only used on PowerPC where it contains
--  the returned value from Int_Lock (the old MPSR).


diff --git a/gcc/ada/libgnarl/s-taprop__vxworks.adb b/gcc/ada/libgnarl/s-taprop__vxworks.adb
--- a/gcc/ada/libgnarl/s-taprop__vxworks.adb
+++ b/gcc/ada/libgnarl/s-taprop__vxworks.adb
@@ -1268,7 +1268,7 @@ package body System.Task_Primitives.Operations is
  C := C.Common.All_Tasks_Link;
   end loop;
 
-  Dummy := Int_Unlock (Old);
+  Int_Unlock (Old);
end Stop_All_Tasks;
 
---


diff --git a/gcc/ada/libgnarl/s-vxwext.ads b/gcc/ada/libgnarl/s-vxwext.ads
--- a/gcc/ada/libgnarl/s-vxwext.ads
+++ b/gcc/ada/libgnarl/s-vxwext.ads
@@ -57,7 +57,7 @@ package System.VxWorks.Ext is
function Int_Lock return int;
pragma Import (C, Int_Lock, "intLock");
 
-   function Int_Unlock (Old : int) return int;
+   procedure Int_Unlock (Old : int);
pragma Import (C, Int_Unlock, "intUnlock");
 
function Interrupt_Connect


diff --git a/gcc/ada/libgnarl/s-vxwext__kernel-smp.adb b/gcc/ada/libgnarl/s-vxwext__kernel-smp.adb
--- a/gcc/ada/libgnarl/s-vxwext__kernel-smp.adb
+++ b/gcc/ada/libgnarl/s-vxwext__kernel-smp.adb
@@ -48,10 +48,10 @@ package body System.VxWorks.Ext is
-- Int_Unlock --

 
-   function Int_Unlock (Old : int) return int is
+   procedure Int_Unlock (Old : int) is
   pragma Unreferenced (Old);
begin
-  return ERROR;
+  null;
end Int_Unlock;
 
---


diff --git a/gcc/ada/libgnarl/s-vxwext__kernel.adb b/gcc/ada/libgnarl/s-vxwext__kernel.adb
--- a/gcc/ada/libgnarl/s-vxwext__kernel.adb
+++ b/gcc/ada/libgnarl/s-vxwext__kernel.adb
@@ -49,10 +49,10 @@ package body System.VxWorks.Ext is
-- Int_Unlock --

 
-   function intUnlock (Old : int) return int;
+   procedure intUnlock (Old : int);
pragma Import (C, intUnlock, "intUnlock");
 
-   function Int_Unlock (Old : int) return int renames intUnlock;
+   procedure Int_Unlock (Old : int) renames intUnlock;
 
---
-- semDelete --


diff --git a/gcc/ada/libgnarl/s-vxwext__kernel.ads b/gcc/ada/libgnarl/s-vxwext__kernel.ads
--- a/gcc/ada/libgnarl/s-vxwext__kernel.ads
+++ b/gcc/ada/libgnarl/s-vxwext__kernel.ads
@@ -56,7 +56,7 @@ package System.VxWorks.Ext is
function Int_Lock return int;
pragma Convention (C, Int_Lock);
 
-   function Int_Unlock (Old : int) return int;
+   procedure Int_Unlock (Old : int);
pragma Convention (C, Int_Unlock);
 
function Interrupt_Connect


diff --git a/gcc/ada/libgnarl/s-vxwext__rtp-smp.adb b/gcc/ada/libgnarl/s-vxwext__rtp-smp.adb
--- a/gcc/ada/libgnarl/s-vxwext__rtp-smp.adb
+++ b/gcc/ada/libgnarl/s-vxwext__rtp-smp.adb
@@ -48,10 +48,10 @@ package body System.VxWorks.Ext is
-- Int_Unlock --

 
-   function Int_Unlock (Old : int) return int is
+   procedure Int_Unlock (Old : int) is
   pragma Unreferenced (Old);
begin
-  return ERROR;
+  null;
end Int_Unlock;
 
---


diff --git a/gcc/ada/libgnarl/s-vxwext__rtp.adb b/gcc/ada/libgnarl/s-vxwext__rtp.adb
--- a/gcc/ada/libgnarl/s-vxwext__rtp.adb
+++ b/gcc/ada/libgnarl/s-vxwext__rtp.adb
@@ -48,10 +48,10 @@ package body System.VxWorks.Ext is
-- Int_Unlock --

 
-   function Int_Unlock (Old : int) return int is
+   procedure Int_Unlock (Old : int) is
   pragma Unreferenced (Old);
begin
-  return ERROR;
+  null;
end Int_Unlock;
 
---


diff --git a/gcc/ada/libgnarl/s-vxwext__rtp.ads b/gcc/ada/libgnarl/s-vxwext__rtp.ads
--- a/gcc/ada/libgnarl/s-vxwext__rtp.ads
+++ b/gcc/ada/libgnarl/s-vxwext__rtp.ads
@@ -56,7 +56,7 @@ package

[Ada] VxWorks inconsistent use of return type (vx_freq_t)

2021-09-22 Thread Pierre-Marie de Rodat via Gcc-patches

Several inconsistencies were found. They will be submitted as separate
proposed fixes. The approach is to make them "types" rather than
"subtypes" in order to catch inconsistencies in their usage, and to
declare them in a central package. System.VxWorks.Ext was chosen; the
only difficulty with this choice is that there's a version for each
runtime and that it's a libgnarl package, which leaves a few outliers
in libgnat. System.VxWorks was also considered, but there the difficulty
is that it's architecture/libgnarl specific. That left a new package as
the only other alternative, which seemed like overkill.

This change is for vx_freq_t, a friendlier type name for _Vx_freq_t. This
type was changed to C type unsigned in the VxWorks headers, but wasn't
changed in our code, and was incorrectly specified as an int.
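
As a rough illustration of why the signedness matters, here is a minimal C sketch. The names vx_freq_demo_t and demo_clk_rate_get are hypothetical stand-ins, not VxWorks declarations; the only point is that an unsigned return value can exceed what an int binding is able to represent.

```c
#include <limits.h>

/* Hypothetical stand-in for the C type _Vx_freq_t (assumed unsigned).  */
typedef unsigned int vx_freq_demo_t;

/* Hypothetical stand-in for a C routine returning an unsigned frequency.
   The value is deliberately one past INT_MAX to expose the mismatch.  */
static vx_freq_demo_t
demo_clk_rate_get (void)
{
  return (vx_freq_demo_t) INT_MAX + 1u;
}

/* Nonzero when V could be represented by a binding declared as int.  */
static int
fits_in_int (vx_freq_demo_t v)
{
  return v <= (vx_freq_demo_t) INT_MAX;
}
```

Realistic clock rates are small, so the practical benefit is more likely the type consistency than the extra range.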

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* libgnarl/s-osinte__vxworks.ads (SVE): New package renaming
(vx_freq_t): New subtype.
(sysClkRateGet): Return vx_freq_t.
* libgnarl/s-vxwext.ads (vx_freq_t): New type.
* libgnarl/s-vxwext__kernel.ads: Likewise.
* libgnarl/s-vxwext__rtp.ads: Likewise.

diff --git a/gcc/ada/libgnarl/s-osinte__vxworks.ads b/gcc/ada/libgnarl/s-osinte__vxworks.ads
--- a/gcc/ada/libgnarl/s-osinte__vxworks.ads
+++ b/gcc/ada/libgnarl/s-osinte__vxworks.ads
@@ -57,6 +57,8 @@ package System.OS_Interface is
type unsigned_long_long is mod 2 ** long_long'Size;
type size_t is mod 2 ** Standard'Address_Size;
 
+   subtype vx_freq_t   is System.VxWorks.Ext.vx_freq_t;
+
---
-- Errno --
---
@@ -312,7 +314,7 @@ package System.OS_Interface is
function taskDelay (ticks : int) return int;
pragma Import (C, taskDelay, "taskDelay");
 
-   function sysClkRateGet return int;
+   function sysClkRateGet return vx_freq_t;
pragma Import (C, sysClkRateGet, "sysClkRateGet");
 
--  VxWorks 5.x specific functions


diff --git a/gcc/ada/libgnarl/s-vxwext.ads b/gcc/ada/libgnarl/s-vxwext.ads
--- a/gcc/ada/libgnarl/s-vxwext.ads
+++ b/gcc/ada/libgnarl/s-vxwext.ads
@@ -46,6 +46,9 @@ package System.VxWorks.Ext is
subtype int is Interfaces.C.int;
subtype unsigned is Interfaces.C.unsigned;
 
+   type vx_freq_t is new unsigned;
+   --  Equivalent of the C type _Vx_freq_t
+
type Interrupt_Handler is access procedure (parameter : System.Address);
pragma Convention (C, Interrupt_Handler);
 


diff --git a/gcc/ada/libgnarl/s-vxwext__kernel.ads b/gcc/ada/libgnarl/s-vxwext__kernel.ads
--- a/gcc/ada/libgnarl/s-vxwext__kernel.ads
+++ b/gcc/ada/libgnarl/s-vxwext__kernel.ads
@@ -45,6 +45,9 @@ package System.VxWorks.Ext is
subtype int is Interfaces.C.int;
subtype unsigned is Interfaces.C.unsigned;
 
+   type vx_freq_t is new unsigned;
+   --  Equivalent of the C type _Vx_freq_t
+
type Interrupt_Handler is access procedure (parameter : System.Address);
pragma Convention (C, Interrupt_Handler);
 


diff --git a/gcc/ada/libgnarl/s-vxwext__rtp.ads b/gcc/ada/libgnarl/s-vxwext__rtp.ads
--- a/gcc/ada/libgnarl/s-vxwext__rtp.ads
+++ b/gcc/ada/libgnarl/s-vxwext__rtp.ads
@@ -45,6 +45,9 @@ package System.VxWorks.Ext is
subtype int is Interfaces.C.int;
subtype unsigned is Interfaces.C.unsigned;
 
+   type vx_freq_t is new unsigned;
+   --  Equivalent of the C type _Vx_freq_t
+
type Interrupt_Handler is access procedure (parameter : System.Address);
pragma Convention (C, Interrupt_Handler);

[Ada] Replace use of 'Image with use of Error_Msg_Uint

2021-09-22 Thread Pierre-Marie de Rodat via Gcc-patches

'Image is too recent for bootstrapping and shouldn't be used when
emitting error messages in any case.
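
For readers unfamiliar with the '^' insertion character used in the new messages, the following C sketch mimics the idea: the first '^' is replaced by the first stored number and the second '^' by the second. expand_carets is a hypothetical helper written for illustration only; it is not GNAT code and ignores everything else the real error machinery does.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper, not GNAT code: expand up to two '^' markers in
   TEMPLATE_MSG with V1 and V2 (in that order) and write the result to OUT.
   Returns the number of markers that were replaced.  */
static int
expand_carets (const char *template_msg, long v1, long v2,
               char *out, size_t out_size)
{
  size_t pos = 0;
  int seen = 0;

  if (out_size == 0)
    return 0;

  for (const char *p = template_msg; *p != '\0' && pos + 1 < out_size; p++)
    {
      if (*p == '^' && seen < 2)
        {
          long value = (seen == 0) ? v1 : v2;
          size_t room = out_size - pos;
          int n = snprintf (out + pos, room, "%ld", value);
          if (n < 0)
            break;
          /* snprintf reports the untruncated length; clamp it.  */
          pos += ((size_t) n < room) ? (size_t) n : room - 1;
          seen++;
        }
      else
        out[pos++] = *p;
    }
  out[pos] = '\0';
  return seen;
}
```

Calling it with the first message from the patch and the values 3 and 5 yields a string with both numbers spliced in, without any string concatenation at the call site.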

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* sem_case.adb (Composite_Case_Ops): Replace 'Image with
Error_Msg_Uint.

diff --git a/gcc/ada/sem_case.adb b/gcc/ada/sem_case.adb
--- a/gcc/ada/sem_case.adb
+++ b/gcc/ada/sem_case.adb
@@ -1717,12 +1717,12 @@ package body Sem_Case is
 and then List_Length (Expressions (Expr))
/= Nat (Part_Id'Last)
  then
+Error_Msg_Uint_1 := UI_From_Int
+  (List_Length (Expressions (Expr)));
+Error_Msg_Uint_2 := UI_From_Int (Int (Part_Id'Last));
 Error_Msg_N
-  ("Array aggregate length"
-& List_Length (Expressions (Expr))'Image
-& " does not match length of"
-& " statically constrained case selector"
-& Part_Id'Last'Image, Expr);
+  ("array aggregate length ^ does not match length " &
+   "of statically constrained case selector ^", Expr);
 return;
  end if;
 
@@ -1761,12 +1761,13 @@ package body Sem_Case is
 if not Unconstrained_Array_Case
and then Strlen /= Nat (Part_Id'Last)
 then
+   Error_Msg_Uint_1 := UI_From_Int (Strlen);
+   Error_Msg_Uint_2 := UI_From_Int
+ (Int (Part_Id'Last));
Error_Msg_N
- ("String literal length"
-  & Strlen'Image
-  & " does not match length of"
-  & " statically constrained case selector"
-  & Part_Id'Last'Image, Expr);
+ ("String literal length ^ does not match length" &
+  " of statically constrained case selector ^",
+  Expr);
return;
 end if;

[Ada] Generate temporary for if-expression with -fpreserve-control-flow

2021-09-22 Thread Pierre-Marie de Rodat via Gcc-patches

When an if-expression is a condition in an outer decision, the compiler
may entirely encode the interplay between the two decisions, i.e. the
if-expression and the outer one, in the control-flow graph, effectively
creating branches that are shared between the two decisions.

This makes it very hard for external tools to map the control-flow graph
back to the source code, so the change instructs the compiler to generate
an intermediate temporary in this case with -fpreserve-control-flow.
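
The shape of the transformation can be sketched in C, under the assumption that a C conditional expression behaves like the Ada if-expression for this purpose. Both functions are hypothetical and compute the same result; the second evaluates the inner decision into a temporary first, mirroring the "Cnn : constant Typ := N" expansion in the patch, so the inner and outer decisions no longer need to share branches.

```c
/* Form 1: the conditional expression is itself a condition of the outer
   decision, so a compiler is free to share branches between the two.  */
static int
decide_nested (int a, int b, int c, int d)
{
  if ((a ? b : c) && d)
    return 1;
  return 0;
}

/* Form 2: the conditional expression is evaluated into a temporary first,
   so the outer decision only tests a plain value.  */
static int
decide_with_temp (int a, int b, int c, int d)
{
  const int cnn = a ? b : c;
  if (cnn && d)
    return 1;
  return 0;
}
```

Whether a given compiler actually emits distinct branches for the second form depends on the optimization settings; the sketch only shows the source-level shape.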

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* exp_ch4.adb (Expand_N_If_Expression): Generate an intermediate
temporary when the expression is a condition in an outer decision
and control-flow optimizations are suppressed.

diff --git a/gcc/ada/exp_ch4.adb b/gcc/ada/exp_ch4.adb
--- a/gcc/ada/exp_ch4.adb
+++ b/gcc/ada/exp_ch4.adb
@@ -6253,6 +6253,46 @@ package body Exp_Ch4 is
 return;
  end if;
 
+  --  For the sake of GNATcoverage, generate an intermediate temporary in
+  --  the case where the if-expression is a condition in an outer decision,
+  --  in order to make sure that no branch is shared between the decisions.
+
+  elsif Opt.Suppress_Control_Flow_Optimizations
+and then Nkind (Original_Node (Parent (N))) in N_Case_Expression
+ | N_Case_Statement
+ | N_If_Expression
+ | N_If_Statement
+ | N_Goto_When_Statement
+ | N_Loop_Statement
+ | N_Return_When_Statement
+ | N_Short_Circuit
+  then
+ declare
+Cnn  : constant Entity_Id := Make_Temporary (Loc, 'C');
+Acts : List_Id;
+
+ begin
+--  Generate:
+--do
+--   Cnn : constant Typ := N;
+--in Cnn end
+
+Acts := New_List (
+  Make_Object_Declaration (Loc,
+Defining_Identifier => Cnn,
+Constant_Present=> True,
+Object_Definition   => New_Occurrence_Of (Typ, Loc),
+Expression  => Relocate_Node (N)));
+
+Rewrite (N,
+  Make_Expression_With_Actions (Loc,
+Expression => New_Occurrence_Of (Cnn, Loc),
+Actions=> Acts));
+
+Analyze_And_Resolve (N, Typ);
+return;
+ end;
+
   --  If no actions then no expansion needed, gigi will handle it using the
   --  same approach as a C conditional expression.

[Ada] Add -gnatX support for casing on array values

2021-09-22 Thread Pierre-Marie de Rodat via Gcc-patches

Improve existing support for the Ada extension feature of casing on
composite values to handle casing on array values; in particular, casing
on an array value whose subtype is unconstrained (or dynamically
constrained) so that choices in the case statement may have differing
lengths. This commit does not include support for pattern-bound
identifiers in such cases.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* exp_ch5.adb (Expand_General_Case_Statement.Pattern_Match): Add
new function Indexed_Element to handle array element
comparisons. Handle case choices that are array aggregates,
string literals, or names denoting constants.
* sem_case.adb (Composite_Case_Ops.Array_Case_Ops): New package
providing utilities needed for casing on arrays.
(Composite_Case_Ops.Choice_Analysis): If necessary, include
array length as a "component" (like a discriminant) when
traversing components. We do not (yet) partition choice analysis
to deal with unequal length choices separately. Instead, we
embed everything in the minimum-dimensionality Cartesian product
space needed to handle all choices properly; this is determined
by the length of the longest choice pattern.
(Composite_Case_Ops.Choice_Analysis.Traverse_Discrete_Parts):
Include length as a "component" in the traversal if necessary.
(Composite_Case_Ops.Choice_Analysis.Parse_Choice.Traverse_Choice):
Add support for case choices that are string literals or names
denoting constants.
(Composite_Case_Ops.Choice_Analysis): Include length as a
"component" in the analysis if necessary.

(Check_Choices.Check_Case_Pattern_Choices.Ops.Value_Sets.Value_Index_Count):
Improve error message when capacity exceeded.
* doc/gnat_rm/implementation_defined_pragmas.rst: Update
documentation to reflect current implementation status.
* gnat_rm.texi: Regenerate.

patch.diff.gz
Description: application/gzip

[Ada] Fix imprecise wording for error on scalar storage order

2021-09-22 Thread Pierre-Marie de Rodat via Gcc-patches

The current error message does not distinguish the component clause case
from the packed case, which can be confusing in the latter case.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* freeze.adb (Check_Component_Storage_Order): Give a specific error
message for non-byte-aligned component in the packed case.  Replace
"composite" with "record" in both cases.

diff --git a/gcc/ada/freeze.adb b/gcc/ada/freeze.adb
--- a/gcc/ada/freeze.adb
+++ b/gcc/ada/freeze.adb
@@ -1352,9 +1352,15 @@ package body Freeze is
 elsif Is_Record_Type (Encl_Base)
   and then not Comp_Byte_Aligned
 then
-   Error_Msg_N
- ("type of non-byte-aligned component must have same scalar "
-  & "storage order as enclosing composite", Err_Node);
+   if Present (Component_Clause (Comp)) then
+  Error_Msg_N
+("type of non-byte-aligned component must have same scalar"
+ & " storage order as enclosing record", Err_Node);
+   else
+  Error_Msg_N
+("type of packed component must have same scalar"
+ & " storage order as enclosing record", Err_Node);
+   end if;
 
 --  Warn if specified only for the outer composite

[Ada] Make Ada.Task_Initialization compatible with No_Elaboration_Code_All

2021-09-22 Thread Pierre-Marie de Rodat via Gcc-patches

So that this unit can be used and called even before elaboration has
started, to ensure very early registration via e.g. C code.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* libgnarl/a-tasini.ads, libgnarl/a-tasini.adb: Make compatible
with No_Elaboration_Code_All.
* libgnarl/s-taskin.ads, libgnarl/s-tassta.adb: Adjust
accordingly.

diff --git a/gcc/ada/libgnarl/a-tasini.adb b/gcc/ada/libgnarl/a-tasini.adb
--- a/gcc/ada/libgnarl/a-tasini.adb
+++ b/gcc/ada/libgnarl/a-tasini.adb
@@ -26,13 +26,13 @@
 --  --
 --
 
-with Ada.Unchecked_Conversion;
-with System.Tasking;
-
 package body Ada.Task_Initialization is
 
-   function To_STIH is new Ada.Unchecked_Conversion
- (Initialization_Handler, System.Tasking.Initialization_Handler);
+   Global_Initialization_Handler : Initialization_Handler := null;
+   pragma Atomic (Global_Initialization_Handler);
+   pragma Export (Ada, Global_Initialization_Handler,
+  "__gnat_global_initialization_handler");
+   --  Global handler called when each task initializes.
 

-- Set_Initialization_Handler --
@@ -40,7 +40,7 @@ package body Ada.Task_Initialization is
 
procedure Set_Initialization_Handler (Handler : Initialization_Handler) is
begin
-  System.Tasking.Global_Initialization_Handler := To_STIH (Handler);
+  Global_Initialization_Handler := Handler;
end Set_Initialization_Handler;
 
 end Ada.Task_Initialization;


diff --git a/gcc/ada/libgnarl/a-tasini.ads b/gcc/ada/libgnarl/a-tasini.ads
--- a/gcc/ada/libgnarl/a-tasini.ads
+++ b/gcc/ada/libgnarl/a-tasini.ads
@@ -30,7 +30,8 @@
 --  when tasks start.
 
 package Ada.Task_Initialization is
-   pragma Preelaborate (Task_Initialization);
+   pragma Preelaborate;
+   pragma No_Elaboration_Code_All;
 
type Initialization_Handler is access procedure;
 


diff --git a/gcc/ada/libgnarl/s-taskin.ads b/gcc/ada/libgnarl/s-taskin.ads
--- a/gcc/ada/libgnarl/s-taskin.ads
+++ b/gcc/ada/libgnarl/s-taskin.ads
@@ -368,14 +368,6 @@ package System.Tasking is
--  Used to represent protected procedures to be executed when task
--  terminates.
 
-   type Initialization_Handler is access procedure;
-   pragma Favor_Top_Level (Initialization_Handler);
-   --  Use to represent procedures to be executed at task initialization.
-
-   Global_Initialization_Handler : Initialization_Handler := null;
-   pragma Atomic (Global_Initialization_Handler);
-   --  Global handler called when each task initializes.
-

-- Dispatching domain definitions --



diff --git a/gcc/ada/libgnarl/s-tassta.adb b/gcc/ada/libgnarl/s-tassta.adb
--- a/gcc/ada/libgnarl/s-tassta.adb
+++ b/gcc/ada/libgnarl/s-tassta.adb
@@ -35,6 +35,7 @@ pragma Partition_Elaboration_Policy (Concurrent);
 
 with Ada.Exceptions;
 with Ada.Unchecked_Deallocation;
+with Ada.Task_Initialization;
 
 with System.Interrupt_Management;
 with System.Tasking.Debug;
@@ -1177,6 +1178,14 @@ package body System.Tasking.Stages is
  Debug.Signal_Debug_Event (Debug.Debug_Event_Run, Self_ID);
   end if;
 
+  declare
+ use Ada.Task_Initialization;
+
+ Global_Initialization_Handler : Initialization_Handler;
+ pragma Atomic (Global_Initialization_Handler);
+ pragma Import (Ada, Global_Initialization_Handler,
+"__gnat_global_initialization_handler");
+
   begin
  --  We are separating the following portion of the code in order to
  --  place the exception handlers in a different block. In this way,

[Ada] Change message format on missing return

2021-09-22 Thread Pierre-Marie de Rodat via Gcc-patches

To make it compatible with -gnatwE so that this warning is easier to
spot when needed.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* sem_ch6.adb (Check_Returns): Change message on missing return.

diff --git a/gcc/ada/sem_ch6.adb b/gcc/ada/sem_ch6.adb
--- a/gcc/ada/sem_ch6.adb
+++ b/gcc/ada/sem_ch6.adb
@@ -7760,7 +7760,7 @@ package body Sem_Ch6 is
  ("RETURN statement missing following this statement<

[Ada] Mark gnatfind and gnatxref obsolete

2021-09-22 Thread Pierre-Marie de Rodat via Gcc-patches

Before removing them completely.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* gnatfind.adb, gnatxref.adb: Mark these tools as obsolete
before removing them completely.

diff --git a/gcc/ada/gnatfind.adb b/gcc/ada/gnatfind.adb
--- a/gcc/ada/gnatfind.adb
+++ b/gcc/ada/gnatfind.adb
@@ -347,6 +347,11 @@ procedure Gnatfind is
 --  Start of processing for Gnatfind
 
 begin
+   Put_Line
+ ("WARNING: gnatfind is obsolete and will be removed in the next release");
+   Put_Line
+ ("Consider using Libadalang or GNAT Studio python scripting instead");
+
Parse_Cmd_Line;
 
if not Have_Entity then


diff --git a/gcc/ada/gnatxref.adb b/gcc/ada/gnatxref.adb
--- a/gcc/ada/gnatxref.adb
+++ b/gcc/ada/gnatxref.adb
@@ -299,6 +299,11 @@ procedure Gnatxref is
end Write_Usage;
 
 begin
+   Put_Line
+ ("WARNING: gnatxref is obsolete and will be removed in the next release");
+   Put_Line
+ ("Consider using Libadalang or GNAT Studio python scripting instead");
+
Parse_Cmd_Line;
 
if not Have_File then

Re: [PATCH] Enable auto-vectorization at O2 with very-cheap cost model.

2021-09-22 Thread Martin Sebor via Gcc-patches


On 9/22/21 8:21 AM, Martin Sebor wrote:

On 9/21/21 7:38 PM, Hongtao Liu wrote:

On Mon, Sep 20, 2021 at 4:13 AM Martin Sebor  wrote:

...
diff --git a/gcc/testsuite/c-c++-common/Wstringop-overflow-2.c 
b/gcc/testsuite/c-c++-common/Wstringop-overflow-2.c

index 1d79930cd58..9351f7e7a1a 100644
--- a/gcc/testsuite/c-c++-common/Wstringop-overflow-2.c
+++ b/gcc/testsuite/c-c++-common/Wstringop-overflow-2.c
@@ -1,7 +1,7 @@
   /* PR middle-end/91458 - inconsistent warning for writing past 
the end

  of an array member
  { dg-do compile }
-   { dg-options "-O2 -Wall -Wno-array-bounds -fno-ipa-icf" } */
+   { dg-options "-O2 -Wall -Wno-array-bounds -fno-ipa-icf 
-fno-tree-vectorize" } */


The testcase is large - what part requires this change?  Given the
testcase was added for inconsistent warnings do they now become
inconsistent again as we enable vectorization at -O2?

That said, the testcase adjustments need some explaining - I suppose
you didn't just slap -fno-tree-vectorize to all of those changing
behavior?


void ga1_ (void)
{
    a1_.a[0] = 0;
    a1_.a[1] = 1; // { dg-warning "\\\[-Wstringop-overflow" }
    a1_.a[2] = 2; // { dg-warning "\\\[-Wstringop-overflow" }

    struct A1 a;
    a.a[0] = 0;
    a.a[1] = 1;   // { dg-warning "\\\[-Wstringop-overflow" }
    a.a[2] = 2;   // { dg-warning "\\\[-Wstringop-overflow" }

    sink ();
}

There are supposed to be two warnings, for a.a[1] = 1 and a.a[2] = 2, since
there are two accesses; but after enabling vectorization there is only
one access, so one warning is missing, which causes the failure.


With the stores vectorized, is the warning on the correct line or
does it point to the first store, the one that's in bounds, as
it does with -O3?  The latter would be a regression at -O2.



I would find it preferable to change the test code over disabling
optimizations that are on by default.  My concern is that the test
would no longer exercise the default behavior.  (The same goes for
the -fno-ipa-icf option.)

Hmm, it's a middle-end test; on some backends it may not vectorize
at all (that depends on TARGET_VECTOR_MODE_SUPPORTED_P and the
relevant cost model).


Yes, there are quite a few warning tests like that.  Their main
purpose is to verify that in common GCC invocations (i.e., without
any special options) warnings are a) issued when expected and b)
not issued when not expected.  Otherwise, middle end warnings are
known to have both false positives and false negatives in some
invocations, depending on what optimizations are in effect.
Indiscriminately disabling common optimizations for these large
tests and invoking them under artificial conditions would
compromise this goal and hide the problems.

If enabling vectorization at -O2 causes regressions in the quality
of diagnostics (as the test failure above indicates seems to be
happening) we should investigate these and open bugs for them so
they can be fixed.  We can then tweak the specific failing test
cases to avoid the failures until they are fixed.


To expand on the last part: in my tests with -O2 and -O3 the failure
is specific to char stores and doesn't affect stores of larger types
because they're not detected by -Wstringop-overflow.  Those accesses
are handled by -Warray-bounds.  The former runs as part of the strlen
and after vectorization, while the latter runs in vrp1 and before
vectorization.  So the "quick and dirty" solution here, to keep
the warnings on the right lines, might be to move the char store
handling from the strlen pass to vrp1.  A more robust solution is
to avoid vectorizing (or merging) out of bounds stores.
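
To make the "one access instead of two" effect concrete, here is a small C sketch using only in-bounds stores. The struct and function names are hypothetical, and store_merged is a hand-written stand-in for what a vectorizer or store-merging pass might produce, not actual compiler output. Both functions leave the buffer in the same state, but the second performs a single two-byte access, which is why a pass that runs afterwards sees one store to diagnose rather than two.

```c
#include <string.h>

/* Hypothetical buffer; large enough that every store below is in bounds.  */
struct demo_buf { char a[4]; };

/* Two separate one-byte stores, as written in the test source.  */
static void
store_separately (struct demo_buf *p)
{
  p->a[1] = 1;
  p->a[2] = 2;
}

/* The same effect expressed as one two-byte store.  */
static void
store_merged (struct demo_buf *p)
{
  const char pair[2] = { 1, 2 };
  memcpy (&p->a[1], pair, sizeof pair);
}
```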



Martin

Re: [PATCH] rs6000: Add psabi diagnostic for C++ zero-width bit field ABI change (PR102024)

2021-09-22 Thread Jakub Jelinek via Gcc-patches

On Wed, Sep 22, 2021 at 09:43:23AM -0500, Bill Schmidt wrote:
> > How previously?  is this one that will need all the backports?
> 
> No, the change happened recently on trunk.

It is actually more complex.
Both C and C++ FEs thought they were removing zero bit fields, but
neither did that, then the non-working code from C FE has been removed,
and finally in 4.5 the C++ FE has been "fixed" to remove zero bit fields
"correctly".  And 12 is going to remove the removal, but marks FIELD_DECLs
that were in 4.5-11 removed and now aren't for -Wpsabi purposes.
So, for most backends, C and C++ was ABI compatible in presence of :0
initially, then for several got incompatible in 4.5 and now is time to
decide for each backend what to do according to their psABI, if :0 should be
ignored or not during the function arg/return value passing decisions.
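
For reference, the kind of aggregate under discussion looks like the following C sketch (hypothetical type and function names). The two structs contain the same two float members; the only difference is the unnamed :0 bit field. Their memory layout is typically identical, so the question in this thread is purely whether the :0 member stops the second struct from being classified as a homogeneous aggregate for argument passing.

```c
#include <stddef.h>

/* Two floats and nothing else: a candidate homogeneous aggregate.  */
struct plain_pair { float a; float b; };

/* The same members separated by a zero-width bit field.  */
struct split_pair { float a; int : 0; float b; };

/* Offsets of the second member, for comparing the two layouts.  */
static size_t
second_offset_plain (void)
{
  return offsetof (struct plain_pair, b);
}

static size_t
second_offset_split (void)
{
  return offsetof (struct split_pair, b);
}
```

Comparing the two offsets on a given target shows whether the :0 member changes the layout at all; how the struct is passed in registers is a separate, ABI-level decision that this sketch does not probe.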

> > > --- a/gcc/config/rs6000/rs6000-call.c
> > > +++ b/gcc/config/rs6000/rs6000-call.c
> > > @@ -6227,7 +6227,7 @@ const struct altivec_builtin_types 
> > > altivec_overloaded_builtins[] = {
> > >   static int
> > >   rs6000_aggregate_candidate (const_tree type, machine_mode *modep,
> > > - int *empty_base_seen)
> > > + int *empty_base_seen, int *zero_width_bf_seen)
> > >   {
> > > machine_mode mode;
> > > HOST_WIDE_INT size;
> > > @@ -6298,7 +6298,8 @@ rs6000_aggregate_candidate (const_tree type, 
> > > machine_mode *modep,
> > > return -1;
> > >   count = rs6000_aggregate_candidate (TREE_TYPE (type), modep,
> > > - empty_base_seen);
> > > + empty_base_seen,
> > > + zero_width_bf_seen);
> > >   if (count == -1
> > >   || !index
> > >   || !TYPE_MAX_VALUE (index)
> > > @@ -6336,6 +6337,12 @@ rs6000_aggregate_candidate (const_tree type, 
> > > machine_mode *modep,
> > >   if (TREE_CODE (field) != FIELD_DECL)
> > > continue;
> > > + if (DECL_FIELD_CXX_ZERO_WIDTH_BIT_FIELD (field))
> > > +   {
> > > + *zero_width_bf_seen = 1;
> > > + continue;
> > > +   }

So, from what you wrote, the intent of the ppc* psABIs is that :0 is not
ignored, right?
In that case I don't really understand the above (the continue in
particular).  Because the continue means it is ignored for C++ and not
ignored for C, so basically you return to the 4.5-11 ABI incompatibility
between C and C++.
C++ :0 will have DECL_FIELD_CXX_ZERO_WIDTH_BIT_FIELD set, C :0 will not...

Jakub

[PATCH, Fortran] diagnostic for argument w/type parameters for assumed-type dummy

2021-09-22 Thread Sandra Loosemore

This patch adds the missing diagnostic noted in PR fortran/101319. 
OK to commit?


-Sandra
commit 9d5b9062d728d1b1bf5acfb914e06d776bdcdb60
Author: Sandra Loosemore 
Date:   Wed Sep 22 07:49:17 2021 -0700

Fortran: diagnostic for argument w/type parameters for assumed-type dummy

2021-09-22  Sandra Loosemore  

	PR fortran/101319

gcc/fortran/
	* interface.c (gfc_compare_actual_formal): Extend existing
	assumed-type diagnostic to also check for argument with type
	parameters.

gcc/testsuite/
	* gfortran.dg/c-interop/assumed-type-dummy.f90: Remove xfail.

diff --git a/gcc/fortran/interface.c b/gcc/fortran/interface.c
index f9a7c9c..dae4b95 100644
--- a/gcc/fortran/interface.c
+++ b/gcc/fortran/interface.c
@@ -3183,21 +3183,21 @@ gfc_compare_actual_formal (gfc_actual_arglist **ap, gfc_formal_arglist *formal,
 			  is_elemental, where))
 	return false;
 
-  /* TS 29113, 6.3p2.  */
+  /* TS 29113, 6.3p2; F2018 15.5.2.4.  */
   if (f->sym->ts.type == BT_ASSUMED
 	  && (a->expr->ts.type == BT_DERIVED
 	  || (a->expr->ts.type == BT_CLASS && CLASS_DATA (a->expr))))
 	{
-	  gfc_namespace *f2k_derived;
-
-	  f2k_derived = a->expr->ts.type == BT_DERIVED
-			? a->expr->ts.u.derived->f2k_derived
-			: CLASS_DATA (a->expr)->ts.u.derived->f2k_derived;
-
-	  if (f2k_derived
-	  && (f2k_derived->finalizers || f2k_derived->tb_sym_root))
+	  gfc_symbol *derived = (a->expr->ts.type == BT_DERIVED
+ ? a->expr->ts.u.derived
+ : CLASS_DATA (a->expr)->ts.u.derived);
+	  gfc_namespace *f2k_derived = derived->f2k_derived;
+	  if (derived->attr.pdt_type
+	  || (f2k_derived
+		  && (f2k_derived->finalizers || f2k_derived->tb_sym_root)))
 	{
-	  gfc_error ("Actual argument at %L to assumed-type dummy is of "
+	  gfc_error ("Actual argument at %L to assumed-type dummy "
+			 "has type parameters or is of "
 			 "derived type with type-bound or FINAL procedures",
 			 &a->expr->where);
 	  return false;
diff --git a/gcc/testsuite/gfortran.dg/c-interop/assumed-type-dummy.f90 b/gcc/testsuite/gfortran.dg/c-interop/assumed-type-dummy.f90
index a14c9a5..24bdf2b 100644
--- a/gcc/testsuite/gfortran.dg/c-interop/assumed-type-dummy.f90
+++ b/gcc/testsuite/gfortran.dg/c-interop/assumed-type-dummy.f90
@@ -73,7 +73,7 @@ contains
 type(t4) :: a4
 
 call s1 (a1)  ! OK
-call s1 (a2)  ! { dg-error "assumed-type dummy" "pr101319" { xfail *-*-* } }
+call s1 (a2)  ! { dg-error "assumed-type dummy" }
 call s1 (a3)  ! { dg-error "assumed-type dummy" }
 call s1 (a4)  ! { dg-error "assumed-type dummy" }
   end subroutine

Re: [PATCH] rs6000: Add psabi diagnostic for C++ zero-width bit field ABI change (PR102024)

2021-09-22 Thread Bill Schmidt via Gcc-patches





On 9/22/21 9:35 AM, will schmidt wrote:

On Tue, 2021-09-21 at 17:35 -0500, Bill Schmidt wrote:

Hi!

Previously zero-width bit fields were removed from structs, so that otherwise
homogeneous aggregates were treated as such and passed in FPRs and VSRs.
This was incorrect behavior per the ELFv2 ABI.  Now that these fields are no
longer being removed, we generate the correct parameter passing code.  Alert
the unwary user in the rare cases where this behavior changes.

As noted in the PR, once the GCC 12 Changes page has text describing this issue,
we can update the diagnostic message to reference that URL.  I'll handle that
in a follow-up patch.

Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no regressions.
Is this okay for trunk?

How previously?  is this one that will need all the backports?


No, the change happened recently on trunk.

Thanks very much for the review!
Bill



Thanks!
Bill


2021-09-21  Bill Schmidt  

gcc/
PR target/102024
* config/rs6000/rs6000-call.c (rs6000_aggregate_candidate): Detect
zero-width bit fields and return indicator.
(rs6000_discover_homogeneous_aggregate): Diagnose when the
presence of a zero-width bit field changes parameter passing in
GCC 12.

gcc/testsuite/
PR target/102024
* g++.target/powerpc/pr102024.C: New.


ok


---
  gcc/config/rs6000/rs6000-call.c | 39 ++---
  gcc/testsuite/g++.target/powerpc/pr102024.C | 23 
  2 files changed, 57 insertions(+), 5 deletions(-)
  create mode 100644 gcc/testsuite/g++.target/powerpc/pr102024.C

diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index 7d485480225..c02b202b0cd 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -6227,7 +6227,7 @@ const struct altivec_builtin_types 
altivec_overloaded_builtins[] = {
  
  static int

  rs6000_aggregate_candidate (const_tree type, machine_mode *modep,
-   int *empty_base_seen)
+   int *empty_base_seen, int *zero_width_bf_seen)
  {
machine_mode mode;
HOST_WIDE_INT size;
@@ -6298,7 +6298,8 @@ rs6000_aggregate_candidate (const_tree type, machine_mode 
*modep,
  return -1;
  
  	count = rs6000_aggregate_candidate (TREE_TYPE (type), modep,

-   empty_base_seen);
+   empty_base_seen,
+   zero_width_bf_seen);
if (count == -1
|| !index
|| !TYPE_MAX_VALUE (index)
@@ -6336,6 +6337,12 @@ rs6000_aggregate_candidate (const_tree type, 
machine_mode *modep,
if (TREE_CODE (field) != FIELD_DECL)
  continue;
  
+	if (DECL_FIELD_CXX_ZERO_WIDTH_BIT_FIELD (field))

+ {
+   *zero_width_bf_seen = 1;
+   continue;
+ }
+

Noting that the definition comes from tree.h and is
#define SET_DECL_FIELD_CXX_ZERO_WIDTH_BIT_FIELD(NODE, VAL) \
   do { \
 gcc_checking_assert (DECL_BIT_FIELD (NODE));   \
 FIELD_DECL_CHECK (NODE)->decl_common.decl_flag_0 = (VAL);   \
   } while (0)

ok.




if (DECL_FIELD_ABI_IGNORED (field))
  {
if (lookup_attribute ("no_unique_address",
@@ -6347,7 +6354,8 @@ rs6000_aggregate_candidate (const_tree type, machine_mode 
*modep,
  }
  
  	sub_count = rs6000_aggregate_candidate (TREE_TYPE (field), modep,

-   empty_base_seen);
+   empty_base_seen,
+   zero_width_bf_seen);
if (sub_count < 0)
  return -1;
count += sub_count;
@@ -6381,7 +6389,8 @@ rs6000_aggregate_candidate (const_tree type, machine_mode 
*modep,
  continue;
  
  	sub_count = rs6000_aggregate_candidate (TREE_TYPE (field), modep,

-   empty_base_seen);
+   empty_base_seen,
+   zero_width_bf_seen);
if (sub_count < 0)
  return -1;
count = count > sub_count ? count : sub_count;
@@ -6423,8 +6432,10 @@ rs6000_discover_homogeneous_aggregate (machine_mode 
mode, const_tree type,
  {
machine_mode field_mode = VOIDmode;
int empty_base_seen = 0;
+  int zero_width_bf_seen = 0;
int field_count = rs6000_aggregate_candidate (type, &field_mode,
-   &empty_base_seen);
+   &empty_base_seen,
+   &zero_width_bf_seen);
  

That appears to be all of the callers of rs6000_aggregate_candidate.
(ok).


if (field_count > 0)

Re: [PATCH] c++: concept-ids and value-dependence [PR102412]

2021-09-22 Thread Patrick Palka via Gcc-patches

On Tue, 21 Sep 2021, Jason Merrill wrote:

> On 9/21/21 09:30, Patrick Palka wrote:
> >   case TEMPLATE_ID_EXPR:
> > -  return concept_definition_p (TREE_OPERAND (expression, 0));
> > +  return concept_definition_p (TREE_OPERAND (expression, 0))
> > +   && any_dependent_template_arguments_p (TREE_OPERAND (expression, 1));
> 
> Hmm, do we even need to check concept_definition_p?  Even if other
> template-ids don't get here, if they did they would also be dependent if they
> had dependent template arguments.

Ah yeah, the concept_definition_p check doesn't seem to be needed.  IIUC
other template-ids can get here but for them we should always return
false at this point since we already checked for type-dependence earlier
in the function (which also checks a_d_t_a_p).

Though to be extra safe I'm inclined to keep the check to avoid
potentially affecting non-concepts code when backporting the patch.

> 
> OK either way.
> 
> Jason
> 
>

Re: [PATCH] rs6000: Add psabi diagnostic for C++ zero-width bit field ABI change (PR102024)

2021-09-22 Thread will schmidt via Gcc-patches

On Tue, 2021-09-21 at 17:35 -0500, Bill Schmidt wrote:
> Hi!
> 
> Previously zero-width bit fields were removed from structs, so that otherwise
> homogeneous aggregates were treated as such and passed in FPRs and VSRs.
> This was incorrect behavior per the ELFv2 ABI.  Now that these fields are no
> longer being removed, we generate the correct parameter passing code.  Alert
> the unwary user in the rare cases where this behavior changes.
> 
> As noted in the PR, once the GCC 12 Changes page has text describing this 
> issue,
> we can update the diagnostic message to reference that URL.  I'll handle that
> in a follow-up patch.
> 
> Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no regressions.
> Is this okay for trunk?

How previously?  is this one that will need all the backports? 

> 
> Thanks!
> Bill
> 
> 
> 2021-09-21  Bill Schmidt  
> 
> gcc/
>   PR target/102024
>   * config/rs6000/rs6000-call.c (rs6000_aggregate_candidate): Detect
>   zero-width bit fields and return indicator.
>   (rs6000_discover_homogeneous_aggregate): Diagnose when the
>   presence of a zero-width bit field changes parameter passing in
>   GCC 12.
> 
> gcc/testsuite/
>   PR target/102024
>   * g++.target/powerpc/pr102024.C: New.


ok

> ---
>  gcc/config/rs6000/rs6000-call.c | 39 ++---
>  gcc/testsuite/g++.target/powerpc/pr102024.C | 23 
>  2 files changed, 57 insertions(+), 5 deletions(-)
>  create mode 100644 gcc/testsuite/g++.target/powerpc/pr102024.C
> 
> diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
> index 7d485480225..c02b202b0cd 100644
> --- a/gcc/config/rs6000/rs6000-call.c
> +++ b/gcc/config/rs6000/rs6000-call.c
> @@ -6227,7 +6227,7 @@ const struct altivec_builtin_types 
> altivec_overloaded_builtins[] = {
>  
>  static int
>  rs6000_aggregate_candidate (const_tree type, machine_mode *modep,
> - int *empty_base_seen)
> + int *empty_base_seen, int *zero_width_bf_seen)
>  {
>machine_mode mode;
>HOST_WIDE_INT size;
> @@ -6298,7 +6298,8 @@ rs6000_aggregate_candidate (const_tree type, 
> machine_mode *modep,
> return -1;
>  
>   count = rs6000_aggregate_candidate (TREE_TYPE (type), modep,
> - empty_base_seen);
> + empty_base_seen,
> + zero_width_bf_seen);
>   if (count == -1
>   || !index
>   || !TYPE_MAX_VALUE (index)
> @@ -6336,6 +6337,12 @@ rs6000_aggregate_candidate (const_tree type, 
> machine_mode *modep,
>   if (TREE_CODE (field) != FIELD_DECL)
> continue;
>  
> + if (DECL_FIELD_CXX_ZERO_WIDTH_BIT_FIELD (field))
> +   {
> + *zero_width_bf_seen = 1;
> + continue;
> +   }
> +

Noting that the definition comes from tree.h and is 
#define SET_DECL_FIELD_CXX_ZERO_WIDTH_BIT_FIELD(NODE, VAL) \
  do {  \
gcc_checking_assert (DECL_BIT_FIELD (NODE));\
FIELD_DECL_CHECK (NODE)->decl_common.decl_flag_0 = (VAL);   \
  } while (0)

ok.



>   if (DECL_FIELD_ABI_IGNORED (field))
> {
>   if (lookup_attribute ("no_unique_address",
> @@ -6347,7 +6354,8 @@ rs6000_aggregate_candidate (const_tree type, 
> machine_mode *modep,
> }
>  
>   sub_count = rs6000_aggregate_candidate (TREE_TYPE (field), modep,
> - empty_base_seen);
> + empty_base_seen,
> + zero_width_bf_seen);
>   if (sub_count < 0)
> return -1;
>   count += sub_count;
> @@ -6381,7 +6389,8 @@ rs6000_aggregate_candidate (const_tree type, 
> machine_mode *modep,
> continue;
>  
>   sub_count = rs6000_aggregate_candidate (TREE_TYPE (field), modep,
> - empty_base_seen);
> + empty_base_seen,
> + zero_width_bf_seen);
>   if (sub_count < 0)
> return -1;
>   count = count > sub_count ? count : sub_count;
> @@ -6423,8 +6432,10 @@ rs6000_discover_homogeneous_aggregate (machine_mode 
> mode, const_tree type,
>  {
>machine_mode field_mode = VOIDmode;
>int empty_base_seen = 0;
> +  int zero_width_bf_seen = 0;
>int field_count = rs6000_aggregate_candidate (type, &field_mode,
> - &empty_base_seen);
> + &empty_base_seen,
> + &zero_width_bf_seen);
>  

That appears to be all of the callers of rs6000_aggregate_candidate. 
(ok).

>if (field_count > 0)
>   {
>
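
For readers unfamiliar with the case behind PR target/102024, here is a
minimal sketch of the kind of aggregate involved.  The type and function
names are made up for illustration and are not taken from pr102024.C.
A zero-width bit field occupies no storage, so on typical targets both
structs have the same layout; per the patch description above, only the
first should be passed as a homogeneous aggregate under ELFv2.

```cpp
#include <cassert>

// Hypothetical types for illustration; not taken from pr102024.C.
struct Plain {
  float a;
  float b;
};

struct WithZeroWidth {
  float a;
  int : 0;   // zero-width bit field: takes no storage of its own
  float b;
};

// Both functions compute the same value.  What may differ between
// compiler versions is only how the argument is passed on
// powerpc64le (FPRs/VSRs versus the generic aggregate rules).
float sum_plain (Plain p) { return p.a + p.b; }
float sum_zero_width (WithZeroWidth p) { return p.a + p.b; }
```

Code like this is only affected when such a struct crosses a boundary
between objects built with different GCC versions, which is presumably
why the patch warns rather than changing behavior silently.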

Re: [PATCH] Enable auto-vectorization at O2 with very-cheap cost model.

2021-09-22 Thread Martin Sebor via Gcc-patches


On 9/21/21 7:38 PM, Hongtao Liu wrote:

On Mon, Sep 20, 2021 at 4:13 AM Martin Sebor  wrote:

...

diff --git a/gcc/testsuite/c-c++-common/Wstringop-overflow-2.c 
b/gcc/testsuite/c-c++-common/Wstringop-overflow-2.c
index 1d79930cd58..9351f7e7a1a 100644
--- a/gcc/testsuite/c-c++-common/Wstringop-overflow-2.c
+++ b/gcc/testsuite/c-c++-common/Wstringop-overflow-2.c
@@ -1,7 +1,7 @@
   /* PR middle-end/91458 - inconsistent warning for writing past the end
  of an array member
  { dg-do compile }
-   { dg-options "-O2 -Wall -Wno-array-bounds -fno-ipa-icf" } */
+   { dg-options "-O2 -Wall -Wno-array-bounds -fno-ipa-icf -fno-tree-vectorize" 
} */


The testcase is large - what part requires this change?  Given the
testcase was added for inconsistent warnings do they now become
inconsistent again as we enable vectorization at -O2?

That said, the testcase adjustments need some explaining - I suppose
you didn't just slap -fno-tree-vectorize to all of those changing
behavior?


void ga1_ (void)
{
a1_.a[0] = 0;
a1_.a[1] = 1; // { dg-warning "\\\[-Wstringop-overflow" }
a1_.a[2] = 2; // { dg-warning "\\\[-Wstringop-overflow" }

struct A1 a;
a.a[0] = 0;
a.a[1] = 1;   // { dg-warning "\\\[-Wstringop-overflow" }
a.a[2] = 2;   // { dg-warning "\\\[-Wstringop-overflow" }
sink ();
}

There are supposed to be 2 warnings, for a.a[1] = 1 and a.a[2] = 2, since
there are 2 accesses, but after enabling vectorization, there's only
one access, so one warning is missing which causes the failure.


With the stores vectorized, is the warning on the correct line or
does it point to the first store, the one that's in bounds, as
it does with -O3?  The latter would be a regression at -O2.



I would find it preferable to change the test code over disabling
optimizations that are on by default.  My concern is that the test
would no longer exercise the default behavior.  (The same goes for
the -fno-ipa-icf option.)

Hmm, it's a middle-end test; some backends may not vectorize it
(that depends on TARGET_VECTOR_MODE_SUPPORTED_P and the related
cost model).


Yes, there are quite a few warning tests like that.  Their main
purpose is to verify that in common GCC invocations (i.e., without
any special options) warnings are a) issued when expected and b)
not issued when not expected.  Otherwise, middle end warnings are
known to have both false positives and false negatives in some
invocations, depending on what optimizations are in effect.
Indiscriminately disabling common optimizations for these large
tests and invoking them under artificial conditions would
compromise this goal and hide the problems.

If enabling vectorization at -O2 causes regressions in the quality
of diagnostics (as the test failure above seems to indicate),
we should investigate these and open bugs for them so
they can be fixed.  We can then tweak the specific failing test
cases to avoid the failures until they are fixed.

Martin

PING [PATCH] doc: improve -fsanitize=undefined description

2021-09-22 Thread Diane Meirowitz via Gcc-patches

Please review my patch. It is tiny. Thank you.

Diane

On 9/15/21, 5:02 PM, "Diane Meirowitz"  wrote:


doc: improve -fsanitize=undefined description

gcc/ChangeLog:
* doc/invoke.texi: add link to UndefinedBehaviorSanitizer 
documentation,
mention UBSAN_OPTIONS, similar to what is done for 
AddressSanitizer.

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 78cfc100ac2..f022885edf8 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -15200,7 +15200,8 @@ The option cannot be combined with 
@option{-fsanitize=thread}.
@opindex fsanitize=undefined
Enable UndefinedBehaviorSanitizer, a fast undefined behavior detector.
Various computations are instrumented to detect undefined behavior
-at runtime.  Current suboptions are:
+at runtime.  See 
@uref{https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html} for more 
details.   The run-time behavior can be influenced using the
+@env{UBSAN_OPTIONS} environment variable.  Current suboptions are:

@table @gcctabopt

Re: [COMMITTED] Use EDGE_EXECUTABLE in ranger and return UNDEFINED for those edges.

2021-09-22 Thread Andrew MacLeod via Gcc-patches


On 9/22/21 4:25 AM, Richard Biener wrote:

On Tue, Sep 21, 2021 at 3:50 PM Andrew MacLeod  wrote:

On 9/21/21 9:32 AM, Richard Biener wrote:

On Tue, Sep 21, 2021 at 2:57 PM Andrew MacLeod  wrote:

On 9/21/21 2:14 AM, Richard Biener wrote:

On Tue, Sep 21, 2021 at 8:09 AM Richard Biener
 wrote:

On Tue, Sep 21, 2021 at 12:01 AM Andrew MacLeod via Gcc-patches
 wrote:

The patch sets the EXECUTABLE property on edges like VRP does, and then
removes that flag when an edge is determined to be un-executable.

This information is then used to return UNDEFINED for any requests on
un-executable edges, and to register equivalencies if all executable
edges of a PHI node are the same SSA_NAME.

This catches up a number of the cases VRP gets that ranger was missing,
and reduces the EVRP discrepancies to almost 0.

On a side note,  is there any interest/value in reversing the meaning of
that flag?  It seems to me that we could assume edges are EXECUTABLE by
default, then set a NON_EXECUTABLE flag when a pass determines the edge
cannot be executed.  This would prevent a number of passes from having
to loop through all the edges and set the EXECUTABLE property...   It
just seems backwards to me.

The flag is simply not kept up-to-date and it's the pass's responsibility to
make use of it (aka install a default state upon entry).

To me not having EDGE_EXECUTABLE set on entry is more natural
for optimistic propagation passes, but yes, if you do on-demand greedy
processing then you need a conservative default.  But then how do you
denote a 'VARYING' (executable) state that may not drop back to 'CONSTANT'
(not executable)?  For optimistic propagation EDGE_EXECUTABLE set is
simply the varying state and since we never clear it again there's no chance
of oscillation.

Different model, we don't have a lattice whereby we track state and move
from one to another.. we just track currently best known values for
everything and recalculate them when the old values are stale.   We move
the edge to unexecutable when those values allow us to rewrite a branch
such that an edge can no longer be taken. everything else is executable.
 Any values on an unexecutable edge are then considered UNDEFINED when
combined with other values..


Btw, I fail to see how the patch makes ranger assure a sane initial state of
EDGE_EXECUTABLE (or make its use conditional).  Is the code you patched
not also used on-demand?

The constructor for a ranger makes everything executable to start.
Calls the same routine VRP does.

gimple_ranger::gimple_ranger () : tracer ("")
{
@@ -41,6 +42,7 @@ gimple_ranger::gimple_ranger () : tracer ("")
  m_oracle = m_cache.oracle ();
  if (dump_file && (param_evrp_mode & EVRP_MODE_TRACE))
tracer.enable_trace ();
+  set_all_edges_as_executable (cfun);
}

Ah, I see.  I had the impression that with ranger we can now
do a cheap query everywhere on the range of an SSA name.  But then
the above is O(CFG size)...

One of the reasons I'd like to see it persistent :-)  We could
alternatively add another new one, something like EDGE_NEVER_EXECUTED
which is cleared by default when created and only ranger/other
interested passes utilize it and it is kept persistent.   Just seems
more appropriate to "fix" the current flag. I took a quick look at that,
but it seemed like one or more of the propagation passes may use the
flag for other nefarious purposes. It would require fixing everyone to
maintain the value properly.

   Queries are still "cheap", but there are varying amounts of lookups
and allocations that are done.  If the lack of a persistent EXECUTABLE
edge flag continues, I may make some further tweaks and make it
sensitive to whether EXECUTABLE is to be looked at or not and perhaps
only have the VRPs initiate that.  I prefer avoiding different modes
when possible tho.

Currently most/all uses of ranger are instantiated and used for the
duration of a pass, so the O(cfg) is pretty minimal with all the CFG
traversing and caching required.

Btw, there's auto_edge_flag (fun) that gets you a new flag
allocated and it's supposed to be cleared on all edges
(but I don't think we actually verify that - I suppose we should).
The downside is you have to clear it after use - but it would
in theory be possible to elide that by keeping a set of
"dirty" flags and only clear all of those when we run out of
non-dirty free flags.

Richard.


Huh, I did not know that.

Yeah, that looks like it might be a decent solution..  I'll take a 
closer look today.


Thanks

Andrew
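
The model discussed in this thread can be sketched with a toy data
structure.  Nothing below is GCC code: ToyEdge, ToyRange, range_on_edge
and phi_equivalence are invented names, and a real ranger range is far
richer than a two-value enum.  The sketch only illustrates the two ideas
from the original patch description: a query on a non-executable edge
yields UNDEFINED, and a PHI whose executable incoming edges all carry the
same name can be treated as equivalent to that name.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-ins, assumed for illustration only.
enum class ToyRange { Undefined, Varying };

struct ToyEdge {
  bool executable;   // cleared once a pass proves the edge cannot be taken
  std::string arg;   // SSA-like name flowing into the PHI along this edge
};

// A value requested on a non-executable edge contributes nothing.
ToyRange range_on_edge (const ToyEdge &e)
{
  return e.executable ? ToyRange::Varying : ToyRange::Undefined;
}

// Return the common argument name if every executable edge carries the
// same one; return an empty string otherwise (or if no edge is executable).
std::string phi_equivalence (const std::vector<ToyEdge> &edges)
{
  bool have_name = false;
  std::string name;
  for (const ToyEdge &e : edges)
    {
      if (!e.executable)
        continue;                 // ignore edges that can never be taken
      if (!have_name)
        {
          name = e.arg;
          have_name = true;
        }
      else if (name != e.arg)
        return std::string ();    // two different names: no equivalence
    }
  return name;
}
```

In this toy form the cost Richard points out is visible too: someone has
to initialize the executable flag on every edge before the first query.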

Re: [PATCH] Obsolete hppa[12]--hpux10* and hppa[12]--hpux11*

2021-09-22 Thread Richard Biener via Gcc-patches

On Wed, 22 Sep 2021, John David Anglin wrote:

> On 2021-09-20 3:00 a.m., Richard Biener wrote:
> > As discussed, I'm going to push this (and a changes.html entry) when
> > it was included in a bootstrap/regtest cycle.
> GCC 12 still builds on hppa2.0w-hp-hpux11.11 with --enable-obsolete:
> https://gcc.gnu.org/pipermail/gcc-testresults/2021-September/722961.html

Yes, that's how it's supposed to be, but it will lose the ability to
emit (stabs) debug info when GCC 13 stage1 opens and we get along to
remove stabs support.  At that point maintaining the configuration
might cease to make sense.

Richard.

Re: [PATCH] Obsolete hppa[12]--hpux10* and hppa[12]--hpux11*

2021-09-22 Thread John David Anglin

On 2021-09-20 3:00 a.m., Richard Biener wrote:
> As discussed, I'm going to push this (and a changes.html entry) when
> it was included in a bootstrap/regtest cycle.
GCC 12 still builds on hppa2.0w-hp-hpux11.11 with --enable-obsolete:
https://gcc.gnu.org/pipermail/gcc-testresults/2021-September/722961.html

-- 
John David Anglin  dave.ang...@bell.net

[COMMITTED] Check for BB before calling register_outgoing_edges.

2021-09-22 Thread Aldy Hernandez via Gcc-patches

We may be asked to fold an artificial statement not in the CFG.  Since
there are no outgoing edges from those, avoid calling
register_outgoing_edges.

Tested on x86-64 Linux.

gcc/ChangeLog:

* gimple-range-fold.cc (fold_using_range::range_of_range_op):
Move check for non-empty BB here.
(fur_source::register_outgoing_edges): ...from here.
---
 gcc/gimple-range-fold.cc | 8 +++-
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/gcc/gimple-range-fold.cc b/gcc/gimple-range-fold.cc
index d7fa0f2c86e..1da1befa9a2 100644
--- a/gcc/gimple-range-fold.cc
+++ b/gcc/gimple-range-fold.cc
@@ -650,7 +650,9 @@ fold_using_range::range_of_range_op (irange &r, gimple *s, 
fur_source &src)
src.register_relation (s, rel, lhs, op2);
}
}
- else if (is_a<gcond *> (s))
+ // Check for an existing BB, as we maybe asked to fold an
+ // artificial statement not in the CFG.
+ else if (is_a<gcond *> (s) && gimple_bb (s))
{
  basic_block bb = gimple_bb (s);
  edge e0 = EDGE_SUCC (bb, 0);
@@ -1404,10 +1406,6 @@ fur_source::register_outgoing_edges (gcond *s, irange 
&lhs_range, edge e0, edge
   range_operator *handler;
   basic_block bb = gimple_bb (s);
 
-  // We may get asked to fold an artificial statement not in the CFG.
-  if (!bb)
-return;
-
   if (e0)
 {
   // If this edge is never taken, ignore it.
-- 
2.31.1

[COMMITTED] path solver: Use range_on_path_entry instead of looking at equivalences.

2021-09-22 Thread Aldy Hernandez via Gcc-patches

[Thanks for spotting this Andrew.]

Cycling through equivalences to improve a range is nowhere near as
efficient as asking the ranger what the range on entry is.

Testing on a hybrid VRP threader shows that this improves our VRP
threading benefit from 14.5% to 18.5% and our overall jump threads from
0.85% to 1.28%.

Tested on x86-64 Linux.

gcc/ChangeLog:

* gimple-range-path.cc (path_range_query::internal_range_of_expr):
Remove call to improve_range_with_equivs.
(path_range_query::improve_range_with_equivs): Remove
* gimple-range-path.h: Remove improve_range_with_equivs.
---
 gcc/gimple-range-path.cc | 33 +
 gcc/gimple-range-path.h  |  1 -
 2 files changed, 1 insertion(+), 33 deletions(-)

diff --git a/gcc/gimple-range-path.cc b/gcc/gimple-range-path.cc
index e65c7996bb7..d052ebd81fc 100644
--- a/gcc/gimple-range-path.cc
+++ b/gcc/gimple-range-path.cc
@@ -163,10 +163,6 @@ path_range_query::internal_range_of_expr (irange &r, tree 
name, gimple *stmt)
   if (m_resolve && defined_outside_path (name))
 {
   range_on_path_entry (r, name);
-
-  if (r.varying_p ())
-   improve_range_with_equivs (r, name);
-
   set_cache (r, name);
   return true;
 }
@@ -178,7 +174,7 @@ path_range_query::internal_range_of_expr (irange &r, tree 
name, gimple *stmt)
r.intersect (gimple_range_global (name));
 
   if (m_resolve && r.varying_p ())
-   improve_range_with_equivs (r, name);
+   range_on_path_entry (r, name);
 
   set_cache (r, name);
   return true;
@@ -201,33 +197,6 @@ path_range_query::range_of_expr (irange &r, tree name, 
gimple *stmt)
   return false;
 }
 
-// Improve the range of NAME with the range of any of its equivalences.
-
-void
-path_range_query::improve_range_with_equivs (irange &r, tree name)
-{
-  if (TREE_CODE (name) != SSA_NAME)
-return;
-
-  basic_block entry = entry_bb ();
-  relation_oracle *oracle = m_ranger.oracle ();
-
-  if (const bitmap_head *equivs = oracle->equiv_set (name, entry))
-{
-  int_range_max t;
-  bitmap_iterator bi;
-  unsigned i;
-
-  EXECUTE_IF_SET_IN_BITMAP (equivs, 0, i, bi)
-   if (i != SSA_NAME_VERSION (name) && r.varying_p ())
- {
-   tree equiv = ssa_name (i);
-   range_on_path_entry (t, equiv);
-   r.intersect (t);
- }
-}
-}
-
 bool
 path_range_query::unreachable_path_p ()
 {
diff --git a/gcc/gimple-range-path.h b/gcc/gimple-range-path.h
index 6f81f21d42f..f7d9832ac8c 100644
--- a/gcc/gimple-range-path.h
+++ b/gcc/gimple-range-path.h
@@ -63,7 +63,6 @@ private:
   void ssa_range_in_phi (irange , gphi *phi);
   void precompute_relations (const vec &);
   void precompute_phi_relations (basic_block bb, basic_block prev);
-  void improve_range_with_equivs (irange , tree name);
   void add_copies_to_imports ();
   bool add_to_imports (tree name, bitmap imports);
 
-- 
2.31.1

Re: [committed] Make test names unique for a couple of goacc tests

2021-09-22 Thread Thomas Schwinge

Hi!

On 2021-09-19T11:35:00-0600, Jeff Law via Gcc-patches  
wrote:
> A couple of goacc tests do not have unique names.

Thanks for fixing this up, and sorry, largely my "fault", I suppose.  ;-|

> This causes problems
> for the test comparison script when one of the test passes and the other
> fails -- in this scenario the test comparison script claims there is a
> regression.

So I understand correctly that this is a problem not just for actual
mixed PASS vs. FAIL (which we'd like you to report anyway!) that appear
for the same line, but also for mixed PASS vs. XFAIL?  (Because, the
latter appears to be what you're addressing with your commit here.)

> This slipped through for a while because I had turned off x86_64 testing
> (others test it regularly and I was revamping the tester's hardware
> requirements).  Now that I've acquired more x86_64 resources and turned
> on native x86 testing again, it's been flagged.

(I don't follow that argument -- these test cases should be all generic?
Anyway, not important, I guess.)

> This patch just adds a numeric suffix to the TODO string to disambiguate
> them.

So, instead of doing this manually (always error-prone!), like you've...

> Committed to the trunk,

> commit f75b237254f32d5be32c9d9610983b777abea633
> Author: Jeff Law 
> Date:   Sun Sep 19 13:31:32 2021 -0400
>
> [committed] Make test names unique for a couple of goacc tests

> --- a/gcc/testsuite/gfortran.dg/goacc/privatization-1-compute.f90
> +++ b/gcc/testsuite/gfortran.dg/goacc/privatization-1-compute.f90
> @@ -39,9 +39,9 @@ contains
>!$acc atomic write ! ... to force 'TREE_ADDRESSABLE'.
>y = a
>  !$acc end parallel
> -! { dg-note {variable 'i' in 'private' clause potentially has improper 
> OpenACC privatization level: 'parm_decl'} "TODO" { xfail *-*-* } 
> l_compute$c_compute }
> -! { dg-note {variable 'j' in 'private' clause potentially has improper 
> OpenACC privatization level: 'parm_decl'} "TODO" { xfail *-*-* } 
> l_compute$c_compute }
> -! { dg-note {variable 'a' in 'private' clause potentially has improper 
> OpenACC privatization level: 'parm_decl'} "TODO" { xfail *-*-* } 
> l_compute$c_compute }
> +! { dg-note {variable 'i' in 'private' clause potentially has improper 
> OpenACC privatization level: 'parm_decl'} "TODO2" { xfail *-*-* } 
> l_compute$c_compute }
> +! { dg-note {variable 'j' in 'private' clause potentially has improper 
> OpenACC privatization level: 'parm_decl'} "TODO3" { xfail *-*-* } 
> l_compute$c_compute }
> +! { dg-note {variable 'a' in 'private' clause potentially has improper 
> OpenACC privatization level: 'parm_decl'} "TODO4" { xfail *-*-* } 
> l_compute$c_compute }

... etc. (also similarly in a handful of earlier commits, if I remember
correctly), why don't we do that programmatically, like in the attached
"Make sure that we get unique test names if several DejaGnu directives
refer to the same line", once and for all?  OK to push after proper
testing?


Regards
 Thomas


From 6e3ae5784888be70056ccc3bb7d379fa8e7f6fc0 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Wed, 22 Sep 2021 12:42:41 +0200
Subject: [PATCH] Make sure that we get unique test names if several DejaGnu
 directives refer to the same line

	gcc/testsuite/
	* lib/gcc-dg.exp (process-message): Make sure that we get unique
	test names.
---
 gcc/testsuite/lib/gcc-dg.exp | 14 --
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/lib/gcc-dg.exp b/gcc/testsuite/lib/gcc-dg.exp
index 7edd070d71d..78a6c3651ef 100644
--- a/gcc/testsuite/lib/gcc-dg.exp
+++ b/gcc/testsuite/lib/gcc-dg.exp
@@ -1191,8 +1191,18 @@ proc process-message { msgproc msgprefix dgargs } {
 upvar dg-messages dg-messages
 
 if { [llength $dgargs] == 5 } {
-	set num [get-absolute-line [lindex $dgargs 0] [lindex $dgargs 4]]
-	set dgargs [lreplace $dgargs 4 4 $num]
+	set useline [lindex $dgargs 0]
+
+	# Resolve absolute line number.
+	set line [get-absolute-line $useline [lindex $dgargs 4]]
+	set dgargs [lreplace $dgargs 4 4 $line]
+
+	if { $line != $useline } {
+	# Make sure that we get unique test names if different USELINEs
+	# refer to the same LINE.
+	set comment "[lindex $dgargs 2] at line $useline"
+	set dgargs [lreplace $dgargs 2 2 $comment]
+	}
 }
 
 # Process the dg- directive, including adding the regular expression
-- 
2.33.0
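
The naming rule in the attached patch can be restated outside Tcl as a
small sketch.  This is my reading of the process-message hunk above, not
a drop-in replacement: the function name and parameter names are
hypothetical, and the real code operates on the dgargs list rather than
on separate arguments.

```cpp
#include <cassert>
#include <string>

// directive_line: line on which the dg- directive itself appears
//                 (USELINE in the patch).
// target_line:    line the diagnostic is expected on, after resolving
//                 relative or symbolic line references (LINE in the patch).
// Hypothetical helper for illustration only.
std::string unique_comment (const std::string &comment,
                            int directive_line, int target_line)
{
  // Several directives may point at the same target line.  Appending
  // the directive's own line keeps the resulting test names distinct.
  if (target_line != directive_line)
    return comment + " at line " + std::to_string (directive_line);
  return comment;
}
```

With this rule, three "TODO" directives on lines 42, 43 and 44 that all
refer to line 39 would get the comments "TODO at line 42", "TODO at
line 43" and "TODO at line 44", so manual suffixes such as TODO2 and
TODO3 would no longer be needed.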

Re: [PATCH 59/62] AVX512FP16: Support load/store/abs intrinsics.

2021-09-22 Thread Hongtao Liu via Gcc-patches

I'm going to check in 4 patches.

[PATCH 59/62] AVX512FP16: Support load/store/abs intrinsics.
[PATCH 60/62] AVX512FP16: Add reduce operators(add/mul/min/max).
[PATCH 61/62] AVX512FP16: Add complex conjugation intrinsic instructions.
[PATCH 62/62] AVX512FP16: Add permutation and mask blend intrinsics.

  Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
  Newly added runtime tests passed on sde{-m32,}.

On Thu, Jul 1, 2021 at 2:18 PM liuhongt  wrote:
>
> From: dianhong xu 
>
> gcc/ChangeLog:
>
> * config/i386/avx512fp16intrin.h (__m512h_u, __m256h_u,
> __m128h_u): New typedef.
> (_mm512_load_ph): New intrinsic.
> (_mm256_load_ph): Ditto.
> (_mm_load_ph): Ditto.
> (_mm512_loadu_ph): Ditto.
> (_mm256_loadu_ph): Ditto.
> (_mm_loadu_ph): Ditto.
> (_mm512_store_ph): Ditto.
> (_mm256_store_ph): Ditto.
> (_mm_store_ph): Ditto.
> (_mm512_storeu_ph): Ditto.
> (_mm256_storeu_ph): Ditto.
> (_mm_storeu_ph): Ditto.
> (_mm512_abs_ph): Ditto.
> * config/i386/avx512fp16vlintrin.h
> (_mm_abs_ph): Ditto.
> (_mm256_abs_ph): Ditto.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/i386/avx512fp16-13.c: New test.
> ---
>  gcc/config/i386/avx512fp16intrin.h|  97 
>  gcc/config/i386/avx512fp16vlintrin.h  |  16 ++
>  gcc/testsuite/gcc.target/i386/avx512fp16-13.c | 143 ++
>  3 files changed, 256 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16-13.c
>
> diff --git a/gcc/config/i386/avx512fp16intrin.h 
> b/gcc/config/i386/avx512fp16intrin.h
> index 39c10beb1de..b8ca9201828 100644
> --- a/gcc/config/i386/avx512fp16intrin.h
> +++ b/gcc/config/i386/avx512fp16intrin.h
> @@ -45,6 +45,11 @@ typedef _Float16 __m128h __attribute__ ((__vector_size__ 
> (16), __may_alias__));
>  typedef _Float16 __m256h __attribute__ ((__vector_size__ (32), 
> __may_alias__));
>  typedef _Float16 __m512h __attribute__ ((__vector_size__ (64), 
> __may_alias__));
>
> +/* Unaligned version of the same type.  */
> +typedef _Float16 __m128h_u __attribute__ ((__vector_size__ (16), 
> __may_alias__, __aligned__ (1)));
> +typedef _Float16 __m256h_u __attribute__ ((__vector_size__ (32), 
> __may_alias__, __aligned__ (1)));
> +typedef _Float16 __m512h_u __attribute__ ((__vector_size__ (64), 
> __may_alias__, __aligned__ (1)));
> +
>  extern __inline __m128h
>  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
>  _mm_set_ph (_Float16 __A7, _Float16 __A6, _Float16 __A5,
> @@ -362,6 +367,48 @@ _mm_load_sh (void const *__P)
>  *(_Float16 const *) __P);
>  }
>
> +extern __inline __m512h
> +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
> +_mm512_load_ph (void const *__P)
> +{
> +  return *(const __m512h *) __P;
> +}
> +
> +extern __inline __m256h
> +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
> +_mm256_load_ph (void const *__P)
> +{
> +  return *(const __m256h *) __P;
> +}
> +
> +extern __inline __m128h
> +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
> +_mm_load_ph (void const *__P)
> +{
> +  return *(const __m128h *) __P;
> +}
> +
> +extern __inline __m512h
> +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
> +_mm512_loadu_ph (void const *__P)
> +{
> +  return *(const __m512h_u *) __P;
> +}
> +
> +extern __inline __m256h
> +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
> +_mm256_loadu_ph (void const *__P)
> +{
> +  return *(const __m256h_u *) __P;
> +}
> +
> +extern __inline __m128h
> +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
> +_mm_loadu_ph (void const *__P)
> +{
> +  return *(const __m128h_u *) __P;
> +}
> +
>  /* Stores the lower _Float16 value.  */
>  extern __inline void
>  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
> @@ -370,6 +417,56 @@ _mm_store_sh (void *__P, __m128h __A)
>*(_Float16 *) __P = ((__v8hf)__A)[0];
>  }
>
> +extern __inline void
> +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
> +_mm512_store_ph (void *__P, __m512h __A)
> +{
> +   *(__m512h *) __P = __A;
> +}
> +
> +extern __inline void
> +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
> +_mm256_store_ph (void *__P, __m256h __A)
> +{
> +   *(__m256h *) __P = __A;
> +}
> +
> +extern __inline void
> +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
> +_mm_store_ph (void *__P, __m128h __A)
> +{
> +   *(__m128h *) __P = __A;
> +}
> +
> +extern __inline void
> +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
> +_mm512_storeu_ph (void *__P, __m512h __A)
> +{
> +   *(__m512h_u *) __P = __A;
> +}
> +
> +extern __inline void
> +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
> +_mm256_storeu_ph (void *__P, __m256h __A)
> +{
> +   *(__m256h_u *) __P = __A;
> +}
> +
> +extern __inline
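
The aligned versus unaligned typedef pattern used by the load/store
intrinsics quoted above can be tried without AVX512FP16 hardware.  The
sketch below is an assumption-laden stand-in: it uses float instead of
_Float16, a 16-byte vector instead of the 128/256/512-bit HF types, and
invented names (v4sf, v4sf_u, toy_loadu, toy_storeu).  It relies on the
GCC/Clang vector extension attributes, so it is not portable to other
compilers.

```cpp
#include <cassert>
#include <cstring>

// Naturally aligned vector type (stand-in for __m128h).
typedef float v4sf __attribute__ ((__vector_size__ (16), __may_alias__));

// Same type with alignment lowered to 1 byte (stand-in for __m128h_u).
// Dereferencing a pointer to this type tells the compiler that the
// address may be misaligned, so it must emit an unaligned access.
typedef float v4sf_u
  __attribute__ ((__vector_size__ (16), __may_alias__, __aligned__ (1)));

// Hypothetical counterparts of _mm_loadu_ph / _mm_storeu_ph.
static inline v4sf toy_loadu (void const *p)
{
  return *(const v4sf_u *) p;
}

static inline void toy_storeu (void *p, v4sf a)
{
  *(v4sf_u *) p = a;
}
```

The aligned variants in the patch differ only in casting to the
naturally aligned type, which makes passing a misaligned pointer to them
undefined behavior.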

[Committed] IBM Z: TPF: Add cc clobber to profiling expanders

2021-09-22 Thread Andreas Krebbel via Gcc-patches

The code sequence emitted uses CC internally.

gcc/ChangeLog:

* config/s390/tpf.md (prologue_tpf, epilogue_tpf): Add cc clobber.
---
 gcc/config/s390/tpf.md | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/gcc/config/s390/tpf.md b/gcc/config/s390/tpf.md
index 297e9d1f755..35b37190705 100644
--- a/gcc/config/s390/tpf.md
+++ b/gcc/config/s390/tpf.md
@@ -21,7 +21,8 @@ (define_insn "prologue_tpf"
   [(unspec_volatile [(match_operand 0 "const_int_operand" "J")
 (match_operand 1 "const_int_operand" "J")]
UNSPECV_TPF_PROLOGUE)
-   (clobber (reg:DI 1))]
+   (clobber (reg:DI 1))
+   (clobber (reg:CC CC_REGNUM))]
   "TARGET_TPF_PROFILING"
   "larl\t%%r1,.+14\;tm\t%0,255\;bnz\t%1"
   [(set_attr "length"   "14")])
@@ -31,7 +32,8 @@ (define_insn "epilogue_tpf"
   [(unspec_volatile [(match_operand 0 "const_int_operand" "J")
 (match_operand 1 "const_int_operand" "J")]
UNSPECV_TPF_EPILOGUE)
-   (clobber (reg:DI 1))]
+   (clobber (reg:DI 1))
+   (clobber (reg:CC CC_REGNUM))]
   "TARGET_TPF_PROFILING"
   "larl\t%%r1,.+14\;tm\t%0,255\;bnz\t%1"
   [(set_attr "length"   "14")])
-- 
2.31.1

Re: [PATCH 3/N] Come up with casm global state.

2021-09-22 Thread Richard Biener via Gcc-patches

On Thu, Sep 16, 2021 at 3:12 PM Martin Liška  wrote:
>
> This patch comes up with asm_out_state and a new global variable casm.
>
> Tested on all cross compilers.
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>
> Ready to be installed?

* output.h (struct asm_out_file): New struct that replaces a
^^^
asm_out_state?

You replace a lot of asm_out_file - do we still need the macro then?
(I'd have simplified the patch leaving that replacement out at first)

You leave quite some global state out of the struct that is obviously
related, in the diff I see object_block_htab for example.  Basically
everything initialized in init_varasm_once is a candidate (which
then shows const_desc_htab and shared_constant_pool as well
as pending_assemble_externals_set).  For the goal of outputting
early DWARF to another file the state CTOR could have a mode
to not initialize those parts or we could have asm-out-state-with-sections
as base of asm-out-state.

In the end there will be a target part of the state so I think
construction needs to be deferred to the target which can
derive from asm-out-state and initialize the part it needs.
That's currently what targetm.asm_out.init_sections () does
and we'd transform that to a targetm.asm_out.create () or so.
That might already be necessary for the DWARF stuff.

That said, dealing with the target stuff piecemeal is OK, but maybe
try to make sure that init_varasm_once is actually identical
to what the CTOR does?

Richard.

> Thanks,
> Martin

Re: [PATCH][GCC] arm: Add Cortex-R52+ multilib

2021-09-22 Thread Richard Earnshaw via Gcc-patches

I think the RTEMS multilibs are based on the products that RTEMS 
supports, so this is really the RTEMS maintainers' call.


Joel?

On 22/09/2021 09:46, Przemyslaw Wirkus via Gcc-patches wrote:

Patch is adding multilib entries for `cortex-r52plus` CPU.

See: https://www.arm.com/products/silicon-ip-cpu/cortex-r/cortex-r52-plus

OK for master?

gcc/ChangeLog:

2021-09-16  Przemyslaw Wirkus  

* config/arm/t-rtems: Add "-mthumb -mcpu=cortex-r52plus
-mfloat-abi=hard" multilib.

Re: [PATCH 2/N] Do not hide asm_out_file in ASM_OUTPUT_ASCII.

2021-09-22 Thread Richard Biener via Gcc-patches

On Thu, Sep 16, 2021 at 12:01 PM Martin Liška  wrote:
>
> Again a preparation patch that was tested on all cross compilers.
>
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>
> Ready to be installed?

I think you want to retain

-FILE *_hide_asm_out_file = (MYFILE);

and use _hide_asm_out_file to preserve MYFILE execution counts in case
it contains side-effects.

OK with that change.

Richard.

> Thanks,
> Martin
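
The point about retaining _hide_asm_out_file can be shown with a toy
macro.  The sketch below is not the real ASM_OUTPUT_ASCII: the macro name,
the counted helper and the loop body are assumptions made for
illustration.  It only demonstrates why copying the file argument into a
local once matters when that argument has side effects.

```cpp
#include <cassert>
#include <stdio.h>

// Hypothetical macro.  MYFILE is evaluated exactly once and cached in
// _hide_file, so side effects in the argument do not repeat per byte.
#define TOY_OUTPUT_ASCII(MYFILE, STR, LEN)              \
  do {                                                  \
    FILE *_hide_file = (MYFILE);                        \
    for (int _i = 0; _i < (LEN); ++_i)                  \
      fputc ((STR)[_i], _hide_file);                    \
  } while (0)

// Helper with a visible side effect: counts how often the macro
// argument expression is evaluated.
static int toy_eval_count = 0;

static FILE *counted (FILE *f)
{
  ++toy_eval_count;
  return f;
}
```

Had the loop used (MYFILE) directly instead of _hide_file, a call such as
TOY_OUTPUT_ASCII (counted (stdout), "abc\n", 4) would bump the counter
once per character rather than once per use of the macro.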

Re: [PATCH Take 2] More NEGATE_EXPR folding in match.pd

2021-09-22 Thread Richard Biener via Gcc-patches

On Fri, Sep 10, 2021 at 11:22 AM Roger Sayle  wrote:
>
>
> Hi Richard,
> Thanks for the suggestion, which cleanly solves the problem I was encountering.
> This revised patch adds a Boolean simplify argument to tree-ssa-sccvn.c's
> vn_nary_build_or_lookup_1 to control whether simplification should be
> performed before value numbering, updating the callers, but then
> avoiding simplification when constructing/value-numbering NEGATE_EXPR.
> This avoids the regression of gcc.dg/tree-ssa/ssa-free-88.c, and enables the
> new test case(s) to pass.  Brilliant, thank you.
>
> This patch has been tested on x86_64-pc-linux-gnu with a "make bootstrap"
> and "make -k check" with no new failures.  Ok for mainline?

OK (sorry for the delay).

> 
>
> 2021-09-10  Roger Sayle  
> Richard Biener  
>
> gcc/ChangeLog
> * match.pd (negation simplifications): Implement some negation
> folding transformations from fold-const.c's fold_negate_expr.
> * tree-ssa-sccvn.c (vn_nary_build_or_lookup_1): Add a SIMPLIFY
> argument, to control whether the op should be simplified prior
> to looking up/assigning a value number.
> (vn_nary_build_or_lookup): Update call to vn_nary_build_or_lookup_1.
> (vn_nary_simplify): Likewise.
> (visit_nary_op): Likewise, but when constructing a NEGATE_EXPR
> now call vn_nary_build_or_lookup_1 disabling simplification.
>
> gcc/testsuite/ChangeLog
> * gcc.dg/fold-negate-1.c: New test case.
>
>
> One potential enhancement request that might be useful to file in Bugzilla
> (I'm not familiar enough with sccvn to investigate this myself): there's
> a missed optimization opportunity when we recognize one value-number
> as the negation of another (and can therefore materialize one result from
> the other using a single negation instruction).  The opportunity is that we
> currently always select the first value number as the parent, and derive
> the second from it, ignoring the expressions themselves.   Sometimes, it
> may be profitable to use the second (negated) occurrence as the parent,
> and instead negate that to obtain the first.  One could use negate_expr_p
> to decide whether one expression is cheaper to negate than the other.

Note that what VN does is generally limited by the order in which the
expressions are computed; we cannot generally derive something from an
expression computed later.

If we did the matching at elimination time we could possibly play
more tricks, like materializing the whole computation and hoping to
elide the redundant later one but that's currently not done.

Thanks,
Richard.

> Both examples in gcc.dg/tree-ssa/ssa-fre-88.c would benefit from this:
> Firstly:
> void bar (double x, double z) {
>   y[0] = -z / x;
>   y[1] = z / x;
> }
> if we select "z / x" as the parent, and derive -(z/x) from it, we can avoid/
> eliminate a negation, over the current code that calculates "(-z)/x" and
> then derives "-((-z)/x)" from it.
> Secondly:
> void foo (double x) {
>   y[0] = x * -3.;
>   y[1] = x * 3.;
> }
> Following Richard's solution/workaround to PR 19988, we'd prefer to keep
> positive real constants in the constant pool, hence selecting "x * 3.0" as the
> parent and deriving "-(x * 3.0)" from it, would be slightly preferred over the
> current behaviour of placing -3 in the constant pool.
>
> Thanks again,
> --
> Roger
>
> -----Original Message-----
> From: Richard Biener 
> Sent: 09 September 2021 13:05
> To: Roger Sayle 
> Cc: GCC Patches 
> Subject: Re: [PATCH] More NEGATE_EXPR folding in match.pd
>
> On Thu, Sep 9, 2021 at 12:08 PM Roger Sayle wrote:
> >
> >
> > As observed by Jakub in comment #2 of PR 98865, the expression
> > -(a>>63) is optimized in GENERIC but not in GIMPLE.  Investigating
> > further it turns out that this is one of a few transformations
> > performed by fold_negate_expr in fold-const.c that aren't yet performed by match.pd.
> > This patch moves/duplicates them there, and should be relatively safe
> > as these transformations are already performed by the compiler, but
> > just in different passes.
> >
> > Alas the one minor complication is that some of these transformations
> > are only wins if the intermediate result (of the multiplication or
> > division) is only used once, to avoid duplication/performing them again.
> > See gcc.dg/tree-ssa/ssa-fre-88.c.  Normally, this is the perfect
> > usage of match's single_use (aka SSA's has_single_use).  Alas,
> > single_use is not always accurate in match.pd, as some passes will
> > construct and simplify an expression/stmt before inserting it into
> > GIMPLE, and folding during this process sees the temporary undercount from the data-flow.
> > To solve this, this patch introduces a new single_use_is_op_p that
> > double checks that the single_use has the expected tree_code/operation
> > and skips the transformation if we can tell single_use might be invalid.
> >
> > A follow-up patch might be to investigate whether
