date:20220920

Ping [PATCH v3, rs6000] Change mode and insn condition for VSX scalar extract/insert instructions

2022-09-20 Thread HAO CHEN GUI via Gcc-patches

Hi,
 Gentle ping this:
https://gcc.gnu.org/pipermail/gcc-patches/2022-September/601196.html
Thanks.

On 7/9/2022 3:44 PM, HAO CHEN GUI wrote:
> Hi,
> 
>   For scalar extract/insert instructions, the exponent field can be stored in a
> 32-bit register, so this patch changes the mode of the exponent field from DI
> to SI. The instructions that use DI registers can be invoked with -mpowerpc64
> in a 32-bit environment, so the patch changes their insn condition from
> TARGET_64BIT to TARGET_POWERPC64.
> 
>   This patch also changes the prototypes of the relevant built-ins and the
> effective target of the test cases.
> 
>   Bootstrapped and tested on powerpc64-linux BE and LE with no regressions.
> Is this okay for trunk? Any recommendations? Thanks a lot.
> 
> ChangeLog
> 2022-09-07  Haochen Gui  
> 
> gcc/
>   * config/rs6000/rs6000-builtins.def
>   (__builtin_vsx_scalar_extract_exp): Set return type to const unsigned
>   int.
>   (__builtin_vsx_scalar_extract_sig): Set return type to const unsigned
>   long long.
>   (__builtin_vsx_scalar_insert_exp): Set type of second argument to
>   unsigned int.
>   (__builtin_vsx_scalar_insert_exp_dp): Likewise.
>   * config/rs6000/vsx.md (xsxexpdp): Set mode of first operand to
>   SImode.  Remove TARGET_64BIT from insn condition.
>   (xsxsigdp): Change insn condition from TARGET_64BIT to TARGET_POWERPC64.
>   (xsiexpdp): Change insn condition from TARGET_64BIT to
>   TARGET_POWERPC64.  Set mode of third operand to SImode.
>   (xsiexpdpf): Set mode of third operand to SImode.  Remove TARGET_64BIT
>   from insn condition.
> 
> gcc/testsuite/
>   * gcc.target/powerpc/bfp/scalar-extract-exp-0.c: Change effective
>   target from lp64 to has_arch_ppc64.
>   * gcc.target/powerpc/bfp/scalar-extract-exp-6.c: Likewise.
>   * gcc.target/powerpc/bfp/scalar-extract-sig-0.c: Likewise.
>   * gcc.target/powerpc/bfp/scalar-extract-sig-6.c: Likewise.
>   * gcc.target/powerpc/bfp/scalar-insert-exp-0.c: Likewise.
>   * gcc.target/powerpc/bfp/scalar-insert-exp-12.c: Likewise.
>   * gcc.target/powerpc/bfp/scalar-insert-exp-13.c: Likewise.
>   * gcc.target/powerpc/bfp/scalar-insert-exp-3.c: Likewise.
> 
> patch.diff
> diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
> index f76f54793d7..ca2a1d7657e 100644
> --- a/gcc/config/rs6000/rs6000-builtins.def
> +++ b/gcc/config/rs6000/rs6000-builtins.def
> @@ -2847,17 +2847,17 @@
>pure vsc __builtin_vsx_lxvl (const void *, signed long);
>  LXVL lxvl {}
> 
> -  const signed long __builtin_vsx_scalar_extract_exp (double);
> +  const unsigned int __builtin_vsx_scalar_extract_exp (double);
>  VSEEDP xsxexpdp {}
> 
> -  const signed long __builtin_vsx_scalar_extract_sig (double);
> +  const unsigned long long __builtin_vsx_scalar_extract_sig (double);
>  VSESDP xsxsigdp {}
> 
>const double __builtin_vsx_scalar_insert_exp (unsigned long long, \
> -unsigned long long);
> + unsigned int);
>  VSIEDP xsiexpdp {}
> 
> -  const double __builtin_vsx_scalar_insert_exp_dp (double, unsigned long long);
> +  const double __builtin_vsx_scalar_insert_exp_dp (double, unsigned int);
>  VSIEDPF xsiexpdpf {}
> 
>pure vsc __builtin_vsx_xl_len_r (void *, signed long);
> diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
> index e226a93bbe5..9d3a2340a79 100644
> --- a/gcc/config/rs6000/vsx.md
> +++ b/gcc/config/rs6000/vsx.md
> @@ -5095,10 +5095,10 @@ (define_insn "xsxexpqp_"
> 
>  ;; VSX Scalar Extract Exponent Double-Precision
>  (define_insn "xsxexpdp"
> -  [(set (match_operand:DI 0 "register_operand" "=r")
> - (unspec:DI [(match_operand:DF 1 "vsx_register_operand" "wa")]
> +  [(set (match_operand:SI 0 "register_operand" "=r")
> + (unspec:SI [(match_operand:DF 1 "vsx_register_operand" "wa")]
>UNSPEC_VSX_SXEXPDP))]
> -  "TARGET_P9_VECTOR && TARGET_64BIT"
> +  "TARGET_P9_VECTOR"
>"xsxexpdp %0,%x1"
>[(set_attr "type" "integer")])
> 
> @@ -5116,7 +5116,7 @@ (define_insn "xsxsigdp"
>[(set (match_operand:DI 0 "register_operand" "=r")
>   (unspec:DI [(match_operand:DF 1 "vsx_register_operand" "wa")]
>UNSPEC_VSX_SXSIG))]
> -  "TARGET_P9_VECTOR && TARGET_64BIT"
> +  "TARGET_P9_VECTOR && TARGET_POWERPC64"
>"xsxsigdp %0,%x1"
>[(set_attr "type" "integer")])
> 
> @@ -5145,9 +5145,9 @@ (define_insn "xsiexpqp_"
>  (define_insn "xsiexpdp"
>[(set (match_operand:DF 0 "vsx_register_operand" "=wa")
>   (unspec:DF [(match_operand:DI 1 "register_operand" "r")
> - (match_operand:DI 2 "register_operand" "r")]
> + (match_operand:SI 2 "register_operand" "r")]
>UNSPEC_VSX_SIEXPDP))]
> -  "TARGET_P9_VECTOR && TARGET_64BIT"
> +  "TARGET_P9_VECTOR && TARGET_POWERPC64"
>"xsiexpdp %x0,%1,%2"
>[(set_attr "type" "fpsimple")])
> 
> @@
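
For readers who do not have the ISA description at hand, here is a portable C
model of what the extract/insert operations discussed above compute. It is only
a sketch: the helper names are made up for illustration, they are not the GCC
built-ins, and the treatment of the implicit leading bit is an assumption based
on my reading of the instruction descriptions. The point it makes is that the
binary64 exponent field is 11 bits wide, so an unsigned int is wide enough for
it, while the significand still needs 64 bits.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers modelling the operations on IEEE 754 binary64.
   These are illustrations, not the GCC built-ins themselves.  */

static uint64_t
double_bits (double d)
{
  uint64_t bits;
  memcpy (&bits, &d, sizeof bits);
  return bits;
}

/* Biased exponent: an 11-bit field, so unsigned int is wide enough.  */
static unsigned int
model_extract_exp (double d)
{
  return (unsigned int) ((double_bits (d) >> 52) & 0x7ff);
}

/* Significand: 52 fraction bits.  The implicit leading 1 is assumed to be
   set for normal numbers only.  The result needs 64 bits.  */
static unsigned long long
model_extract_sig (double d)
{
  uint64_t bits = double_bits (d);
  unsigned int exp = (unsigned int) ((bits >> 52) & 0x7ff);
  uint64_t sig = bits & 0xfffffffffffffULL;

  if (exp != 0 && exp != 0x7ff)
    sig |= 1ULL << 52;
  return sig;
}

/* Replace the exponent field of SIG's bit pattern with the low 11 bits
   of EXP and reinterpret the result as a double.  */
static double
model_insert_exp (unsigned long long sig, unsigned int exp)
{
  uint64_t bits = (sig & ~(0x7ffULL << 52))
		  | ((uint64_t) (exp & 0x7ff) << 52);
  double d;

  memcpy (&d, &bits, sizeof d);
  return d;
}
```

With this model, extracting the exponent and significand of 8.0 and feeding
them back through model_insert_exp gives 8.0 again.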

Ping^3 [PATCH v2, rs6000] Use CC for BCD operations [PR100736]

2022-09-20 Thread HAO CHEN GUI via Gcc-patches

 Hi,
 Gentle ping this:
https://gcc.gnu.org/pipermail/gcc-patches/2022-June/597020.html
Thanks.

On 1/8/2022 10:02 AM, HAO CHEN GUI wrote:
> Hi,
> Gentle ping this:
> https://gcc.gnu.org/pipermail/gcc-patches/2022-June/597020.html
> Thanks.
> 
> On 4/7/2022 2:33 PM, HAO CHEN GUI wrote:
>> Hi,
>>Gentle ping this:
>> https://gcc.gnu.org/pipermail/gcc-patches/2022-June/597020.html
>> Thanks.
>>
>> On 22/6/2022 4:26 PM, HAO CHEN GUI wrote:
>>> Hi,
>>>   This patch uses CC instead of CCFP for all BCD operations, so the finite
>>> math flag has no impact on BCD operations. To support BCD overflow and
>>> invalid coding, an UNSPEC is defined to move the bit to a general register.
>>> The patterns for conditional branch and return on the overflow bit are
>>> defined so that the UNSPEC and the branch/return can be combined into one
>>> jump insn. The split pattern for overflow bit extension is defined for
>>> optimization.
>>>
>>>   This patch also replaces bcdadd with bcdsub in the expand for BCD invalid
>>> coding.
>>>
>>> ChangeLog
>>> 2022-06-22 Haochen Gui 
>>>
>>> gcc/
>>> PR target/100736
>>> * config/rs6000/altivec.md (BCD_TEST): Remove unordered.
>>> (bcd_): Replace CCFP with CC.
>>> (*bcd_test_): Replace CCFP with CC.  Generate
>>> condition insn with CC mode.
>>> (bcd_overflow_): New.
>>> (*bcdoverflow_): New.
>>> (*bcdinvalid_): Removed.
>>> (bcdinvalid_): Implement by UNSPEC_BCDSUB and UNSPEC_BCD_OVERFLOW.
>>> (nuun): New.
>>> (*overflow_cbranch): New.
>>> (*overflow_creturn): New.
>>> (*overflow_extendsidi): New.
>>> (bcdshift_v16qi): Replace CCFP with CC.
>>> (bcdmul10_v16qi): Likewise.
>>> (bcddiv10_v16qi): Likewise.
>>> (peephole for bcd_add/sub): Likewise.
>>> * config/rs6000/rs6000-builtins.def (__builtin_bcdadd_ov_v1ti): Set
>>> pattern to bcdadd_overflow_v1ti.
>>> (__builtin_bcdadd_ov_v16qi): Set pattern to bcdadd_overflow_v16qi.
>>> (__builtin_bcdsub_ov_v1ti): Set pattern to bcdsub_overflow_v1ti.
>>> (__builtin_bcdsub_ov_v16qi): Set pattern to bcdsub_overflow_v16qi.
>>>
>>> gcc/testsuite/
>>> PR target/100736
>>> * gcc.target/powerpc/bcd-4.c: Adjust number of bcdadd and bcdsub.
>>> Scan no cror insns.
>>>
>>> patch.diff
>>> diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
>>> index efc8ae35c2e..26f131e61ea 100644
>>> --- a/gcc/config/rs6000/altivec.md
>>> +++ b/gcc/config/rs6000/altivec.md
>>> @@ -4370,7 +4370,7 @@ (define_int_iterator UNSPEC_BCD_ADD_SUB [UNSPEC_BCDADD UNSPEC_BCDSUB])
>>>  (define_int_attr bcd_add_sub [(UNSPEC_BCDADD "add")
>>>   (UNSPEC_BCDSUB "sub")])
>>>
>>> -(define_code_iterator BCD_TEST [eq lt le gt ge unordered])
>>> +(define_code_iterator BCD_TEST [eq lt le gt ge])
>>>  (define_mode_iterator VBCD [V1TI V16QI])
>>>
>>>  (define_insn "bcd_"
>>> @@ -4379,7 +4379,7 @@ (define_insn "bcd_"
>>>   (match_operand:VBCD 2 "register_operand" "v")
>>>   (match_operand:QI 3 "const_0_to_1_operand" "n")]
>>>  UNSPEC_BCD_ADD_SUB))
>>> -   (clobber (reg:CCFP CR6_REGNO))]
>>> +   (clobber (reg:CC CR6_REGNO))]
>>>"TARGET_P8_VECTOR"
>>>"bcd. %0,%1,%2,%3"
>>>[(set_attr "type" "vecsimple")])
>>> @@ -4389,9 +4389,9 @@ (define_insn "bcd_"
>>>  ;; UNORDERED test on an integer type (like V1TImode) is not defined.  The type
>>>  ;; probably should be one that can go in the VMX (Altivec) registers, so we
>>>  ;; can't use DDmode or DFmode.
>>> -(define_insn "*bcd_test_"
>>> -  [(set (reg:CCFP CR6_REGNO)
>>> -   (compare:CCFP
>>> +(define_insn "bcd_test_"
>>> +  [(set (reg:CC CR6_REGNO)
>>> +   (compare:CC
>>>  (unspec:V2DF [(match_operand:VBCD 1 "register_operand" "v")
>>>(match_operand:VBCD 2 "register_operand" "v")
>>>(match_operand:QI 3 "const_0_to_1_operand" "i")]
>>> @@ -4408,8 +4408,8 @@ (define_insn "*bcd_test2_"
>>>   (match_operand:VBCD 2 "register_operand" "v")
>>>   (match_operand:QI 3 "const_0_to_1_operand" "i")]
>>>  UNSPEC_BCD_ADD_SUB))
>>> -   (set (reg:CCFP CR6_REGNO)
>>> -   (compare:CCFP
>>> +   (set (reg:CC CR6_REGNO)
>>> +   (compare:CC
>>>  (unspec:V2DF [(match_dup 1)
>>>(match_dup 2)
>>>(match_dup 3)]
>>> @@ -4502,8 +4502,8 @@ (define_insn "vclrrb"
>>> [(set_attr "type" "vecsimple")])
>>>
>>>  (define_expand "bcd__"
>>> -  [(parallel [(set (reg:CCFP CR6_REGNO)
>>> -  (compare:CCFP
>>> +  [(parallel [(set (reg:CC CR6_REGNO)
>>> +  (compare:CC
>>> (unspec:V2DF [(match_operand:VBCD 1 "register_operand")
>>>   (match_operand:VBCD 2 "register_operand")
>>>   (match_operand:QI 3 "const_0_to_1_operand")]
>>> @@ -4511,46 +4511,138 @@ (define_expand "bcd__"
>>> (match_dup 4)))
>>>   (clobber (match_scratch:VBCD 5))])
>>>

Ping^3 [PATCH v6, rs6000] Implemented f[min/max]_optab by xs[min/max]dp [PR103605]

2022-09-20 Thread HAO CHEN GUI via Gcc-patches

Hi,
Gentle ping this:
https://gcc.gnu.org/pipermail/gcc-patches/2022-June/597158.html
Thanks.

On 1/8/2022 10:03 AM, HAO CHEN GUI wrote:
> Hi,
>Gentle ping this:
> https://gcc.gnu.org/pipermail/gcc-patches/2022-June/597158.html
> Thanks.
> 
> 
> On 4/7/2022 2:32 PM, HAO CHEN GUI wrote:
>> Hi,
>>Gentle ping this:
>> https://gcc.gnu.org/pipermail/gcc-patches/2022-June/597158.html
>> Thanks.
>>
>> On 24/6/2022 10:02 AM, HAO CHEN GUI wrote:
>>> Hi,
>>>   This patch implements optab f[min/max]_optab by xs[min/max]dp on rs6000.
>>> Tests show that outputs of xs[min/max]dp are consistent with the standard
>>> of C99 fmin/max.
>>>
>>>   This patch also binds __builtin_vsx_xs[min/max]dp to fmin/max instead
>>> of smin/max. So the builtins always generate xs[min/max]dp on all
>>> platforms.
>>>
>>>   Bootstrapped and tested on ppc64 Linux BE and LE with no regressions.
>>> Is this okay for trunk? Any recommendations? Thanks a lot.
>>>
>>> ChangeLog
>>> 2022-06-24 Haochen Gui 
>>>
>>> gcc/
>>> PR target/103605
>>> * config/rs6000/rs6000.md (FMINMAX): New.
>>> (minmax_op): New.
>>> (f3): New pattern by UNSPEC_FMAX and UNSPEC_FMIN.
>>> * config/rs6000/rs6000-builtins.def (__builtin_vsx_xsmaxdp): Set
>>> pattern to fmaxdf3.
>>> (__builtin_vsx_xsmindp): Set pattern to fmindf3.
>>>
>>> gcc/testsuite/
>>> PR target/103605
>>> * gcc.dg/powerpc/pr103605.c: New.
>>>
>>>
>>> patch.diff
>>> diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
>>> index f4a9f24bcc5..8b735493b40 100644
>>> --- a/gcc/config/rs6000/rs6000-builtins.def
>>> +++ b/gcc/config/rs6000/rs6000-builtins.def
>>> @@ -1613,10 +1613,10 @@
>>>  XSCVSPDP vsx_xscvspdp {}
>>>
>>>const double __builtin_vsx_xsmaxdp (double, double);
>>> -XSMAXDP smaxdf3 {}
>>> +XSMAXDP fmaxdf3 {}
>>>
>>>const double __builtin_vsx_xsmindp (double, double);
>>> -XSMINDP smindf3 {}
>>> +XSMINDP fmindf3 {}
>>>
>>>const double __builtin_vsx_xsrdpi (double);
>>>  XSRDPI vsx_xsrdpi {}
>>> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
>>> index bf85baa5370..ae0dd98f0f9 100644
>>> --- a/gcc/config/rs6000/rs6000.md
>>> +++ b/gcc/config/rs6000/rs6000.md
>>> @@ -158,6 +158,8 @@ (define_c_enum "unspec"
>>> UNSPEC_HASHCHK
>>> UNSPEC_XXSPLTIDP_CONST
>>> UNSPEC_XXSPLTIW_CONST
>>> +   UNSPEC_FMAX
>>> +   UNSPEC_FMIN
>>>])
>>>
>>>  ;;
>>> @@ -5341,6 +5343,22 @@ (define_insn_and_split "*s3_fpr"
>>>DONE;
>>>  })
>>>
>>> +
>>> +(define_int_iterator FMINMAX [UNSPEC_FMAX UNSPEC_FMIN])
>>> +
>>> +(define_int_attr  minmax_op [(UNSPEC_FMAX "max")
>>> +(UNSPEC_FMIN "min")])
>>> +
>>> +(define_insn "f3"
>>> +  [(set (match_operand:SFDF 0 "vsx_register_operand" "=wa")
>>> +   (unspec:SFDF [(match_operand:SFDF 1 "vsx_register_operand" "wa")
>>> + (match_operand:SFDF 2 "vsx_register_operand" "wa")]
>>> +FMINMAX))]
>>> +  "TARGET_VSX && !flag_finite_math_only"
>>> +  "xsdp %x0,%x1,%x2"
>>> +  [(set_attr "type" "fp")]
>>> +)
>>> +
>>>  (define_expand "movcc"
>>> [(set (match_operand:GPR 0 "gpc_reg_operand")
>>>  (if_then_else:GPR (match_operand 1 "comparison_operator")
>>> diff --git a/gcc/testsuite/gcc.target/powerpc/pr103605.c b/gcc/testsuite/gcc.target/powerpc/pr103605.c
>>> new file mode 100644
>>> index 000..1c938d40e61
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.target/powerpc/pr103605.c
>>> @@ -0,0 +1,37 @@
>>> +/* { dg-do compile } */
>>> +/* { dg-require-effective-target powerpc_vsx_ok } */
>>> +/* { dg-options "-O2 -mvsx" } */
>>> +/* { dg-final { scan-assembler-times {\mxsmaxdp\M} 3 } } */
>>> +/* { dg-final { scan-assembler-times {\mxsmindp\M} 3 } } */
>>> +
>>> +#include <math.h>
>>> +
>>> +double test1 (double d0, double d1)
>>> +{
>>> +  return fmin (d0, d1);
>>> +}
>>> +
>>> +float test2 (float d0, float d1)
>>> +{
>>> +  return fmin (d0, d1);
>>> +}
>>> +
>>> +double test3 (double d0, double d1)
>>> +{
>>> +  return fmax (d0, d1);
>>> +}
>>> +
>>> +float test4 (float d0, float d1)
>>> +{
>>> +  return fmax (d0, d1);
>>> +}
>>> +
>>> +double test5 (double d0, double d1)
>>> +{
>>> +  return __builtin_vsx_xsmindp (d0, d1);
>>> +}
>>> +
>>> +double test6 (double d0, double d1)
>>> +{
>>> +  return __builtin_vsx_xsmaxdp (d0, d1);
>>> +}
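
As background for the fmin/fmax binding discussed above, here is a small C
reference model of the C99 behaviour the mail refers to: when exactly one
argument is a NaN, fmin and fmax return the other argument. The function names
are hypothetical and this is not code from the patch; it is only meant to show
the property that a plain "a < b ? a : b" style minimum does not guarantee.

```c
#include <math.h>

/* Hypothetical reference models of C99 fmin/fmax for double.  If exactly
   one argument is a NaN, the other argument is returned; if both are NaN,
   a NaN is returned.  Signed zeros are not treated specially here.  */

static double
ref_fmin (double a, double b)
{
  if (isnan (a))
    return b;
  if (isnan (b))
    return a;
  return a < b ? a : b;
}

static double
ref_fmax (double a, double b)
{
  if (isnan (a))
    return b;
  if (isnan (b))
    return a;
  return a > b ? a : b;
}
```

For example, ref_fmin (NAN, 1.0) is 1.0 and ref_fmax (2.0, NAN) is 2.0.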

[PATCH, rs6000] Splat vector small V2DI constants with ISA 2.07 instructions [PR104124]

2022-09-20 Thread HAO CHEN GUI via Gcc-patches

Hi,
  This patch adds a new insn for vector splat with small V2DI constants on P8.
If the value of the constant is in RANGE (-16, 15) and is not 0 or -1, it can be
loaded with vspltisw and vupkhsw on P8. This should be more efficient than
loading the vector from the TOC.
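
The condition described above can be sketched in plain C as follows. This is
an illustration only, not the patch's vspltisw_constant_p (that function is not
reproduced here): the names are hypothetical, and the bounds assume the 5-bit
signed immediate of vspltisw, that is -16 through 15 inclusive. The second
helper models the two-step load: splat the immediate into 32-bit words, then
sign-extend two of those words to 64 bits.

```c
#include <stdbool.h>

/* Hypothetical check: both 64-bit elements hold the same value, the value
   fits a 5-bit signed immediate, and 0 and -1 are excluded as stated in
   the mail.  */
static bool
small_v2di_splat_p (long long elt0, long long elt1)
{
  if (elt0 != elt1)
    return false;
  if (elt0 < -16 || elt0 > 15)
    return false;
  return elt0 != 0 && elt0 != -1;
}

/* Model of the sequence: a word splat of IMM followed by a signed unpack
   of two words into two doublewords.  */
static void
model_splat_then_unpack (int imm, long long out[2])
{
  int words[4];

  for (int i = 0; i < 4; i++)
    words[i] = imm;		/* word splat of the immediate */
  out[0] = (long long) words[0];	/* sign-extend to 64 bits */
  out[1] = (long long) words[1];
}
```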

  Bootstrapped and tested on powerpc64-linux BE and LE with no regressions.
Is this okay for trunk? Any recommendations? Thanks a lot.

ChangeLog
2022-09-21 Haochen Gui 

gcc/
PR target/104124
* config/rs6000/altivec.md (*altivec_vupkhs_direct): Renamed
to...
(altivec_vupkhs_direct): ...this.
* config/rs6000/constraints.md (wT constraint): New constant for a
vector constraint that can be loaded with vspltisw and vupkhsw.
* config/rs6000/predicates.md (vspltisw_constant_split): New
predicate for wT constraint.
* config/rs6000/rs6000-protos.h (vspltisw_constant_p): Add declaration.
* config/rs6000/rs6000.cc (easy_altivec_constant): Call
vspltisw_constant_p to judge if a V2DI constant can be synthesized with
a vspltisw and a vupkhsw.
* (vspltisw_constant_p): New function to return true if OP mode is
V2DI and can be synthesized with ISA 2.07 instruction vupkhsw and
vspltisw.
* gcc/config/rs6000/vsx.md (*vspltisw_v2di_split): New insn to load up
constants with vspltisw and vupkhsw.

gcc/testsuite/
PR target/104124
* gcc.target/powerpc/p8-splat.c: New.

patch.diff
diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index 2c4940f2e21..185414df021 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -2542,7 +2542,7 @@ (define_insn "altivec_vupkhs"
 }
   [(set_attr "type" "vecperm")])

-(define_insn "*altivec_vupkhs_direct"
+(define_insn "altivec_vupkhs_direct"
   [(set (match_operand:VP 0 "register_operand" "=v")
(unspec:VP [(match_operand: 1 "register_operand" "v")]
 UNSPEC_VUNPACK_HI_SIGN_DIRECT))]
diff --git a/gcc/config/rs6000/constraints.md b/gcc/config/rs6000/constraints.md
index 5a44a92142e..f65dea6e0c7 100644
--- a/gcc/config/rs6000/constraints.md
+++ b/gcc/config/rs6000/constraints.md
@@ -150,6 +150,10 @@ (define_constraint "wS"
   "@internal Vector constant that can be loaded with XXSPLTIB & sign 
extension."
   (match_test "xxspltib_constant_split (op, mode)"))

+(define_constraint "wT"
+  "@internal Vector constant that can be loaded with vspltisw & vupkhsw."
+  (match_test "vspltisw_constant_split (op, mode)"))
+
 ;; ISA 3.0 DS-form instruction that has the bottom 2 bits 0 and no update form.
 ;; Used by LXSD/STXSD/LXSSP/STXSSP.  In contrast to "Y", the multiple-of-four
 ;; offset is enforced for 32-bit too.
diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index b1fcc69bb60..00cf60bbe58 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -694,6 +694,19 @@ (define_predicate "xxspltib_constant_split"
   return num_insns > 1;
 })

+;; Return true if the operand is a constant that can be loaded with a vspltisw
+;; instruction and then a vupkhsw instruction.
+
+(define_predicate "vspltisw_constant_split"
+  (match_code "const_vector,vec_duplicate")
+{
+  int value = 32;
+
+  if (!vspltisw_constant_p (op, mode, &value))
+return false;
+
+  return true;
+})

 ;; Return 1 if the operand is constant that can loaded directly with a XXSPLTIB
 ;; instruction.
diff --git a/gcc/config/rs6000/rs6000-protos.h b/gcc/config/rs6000/rs6000-protos.h
index b3c16e7448d..45f3d044eee 100644
--- a/gcc/config/rs6000/rs6000-protos.h
+++ b/gcc/config/rs6000/rs6000-protos.h
@@ -32,6 +32,7 @@ extern void init_cumulative_args (CUMULATIVE_ARGS *, tree, rtx, int, int, int,

 extern int easy_altivec_constant (rtx, machine_mode);
 extern bool xxspltib_constant_p (rtx, machine_mode, int *, int *);
+extern bool vspltisw_constant_p (rtx, machine_mode, int *);
 extern int vspltis_shifted (rtx);
 extern HOST_WIDE_INT const_vector_elt_as_int (rtx, unsigned int);
 extern bool macho_lo_sum_memory_operand (rtx, machine_mode);
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index df491bee2ea..984624026c2 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -6292,6 +6292,12 @@ easy_altivec_constant (rtx op, machine_mode mode)
  && INTVAL (CONST_VECTOR_ELT (op, 1)) == -1)
return 8;

+  /* If V2DI constant is within RANGE (-16, 15), it can be synthesized with
+a vspltisw and a vupkhsw.  */
+  int value = 32;
+  if (vspltisw_constant_p (op, mode, &value))
+   return 8;
+
   return 0;
 }

@@ -6494,6 +6500,69 @@ xxspltib_constant_p (rtx op,
   return true;
 }

+/* Return true if OP mode is V2DI and can be synthesized with ISA 2.07
+   instructions vupkhsw and vspltisw.
+
+   Return the constant that is being split via CONSTANT_PTR.  */
+
+bool
+vspltisw_constant_p (rtx op, machine_mode mode, int *constant_ptr)
+{
+  HOST_WIDE_INT

Re: [PATCH] RISC-V modified add3 for large stack frame optimization [PR105733]

2022-09-20 Thread Kevin Lee

The proposed patch only makes a difference if operand 1 is an eliminable
register and operand 2 is a splittable const int. Otherwise, it follows the
original add3 pattern.
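
As a side note on the constants involved: the li/addi pairs in the listings in
this thread show the usual way an offset that does not fit a 12-bit signed
immediate is built from a 4096-aligned part plus a small remainder. The C
sketch below reproduces that arithmetic. The struct and function names are
hypothetical, and whether this matches the exact predicate the patch uses for
"splittable" is an assumption on my part.

```c
#include <stdint.h>

/* Hypothetical pair: HIGH is a multiple of 4096, LOW fits a 12-bit signed
   immediate, and HIGH + LOW is the original value.  */
struct split_imm
{
  int64_t high;
  int64_t low;
};

static struct split_imm
split_li_addi (int64_t value)
{
  struct split_imm s;

  /* Sign-extend the low 12 bits so LOW lies in [-2048, 2047].  */
  s.low = ((value & 0xfff) ^ 0x800) - 0x800;
  s.high = value - s.low;
  return s;
}
```

For instance, split_li_addi (-40112) gives high = -40960 and low = 848, which
is the "li t0,-40960" / "addi t0,t0,848" pair seen in the saxpy prologue
before the patch.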

Besides the example from PR105733 shown in the first post, the following test case
#define BUF_SIZE 5012
void saxpy( float a )
{
  volatile float x[BUF_SIZE];
  volatile float y[BUF_SIZE];

  for (int i = 0; i < BUF_SIZE; ++i)
  y[i] = a*x[i] + y[i];
}
generates
Before:
saxpy:
li t0,-40960
li a2,40960
addi t0,t0,848
add sp,sp,t0
li a4,-40960
addi a3,a2,-864
add a3,a3,a4
addi a4,sp,16
add a4,a3,a4
sd a4,0(sp)
addi a3,a2,-864
li a4,-20480
add a3,a3,a4
addi a4,sp,16
add a4,a3,a4
li a2,4096
li a5,0
sd a4,8(sp)
addi a2,a2,916
.L2:
ld a4,8(sp)
ld a3,0(sp)
sh2add a4,a5,a4
sh2add a3,a5,a3
flw fa5,864(a3)
flw fa4,432(a4)
addiw a5,a5,1
fmadd.s fa5,fa5,fa0,fa4
fsw fa5,432(a4)
bne a5,a2,.L2
li t0,40960
addi t0,t0,-848
add sp,sp,t0
jr ra

After:
saxpy:
li t0,-40960
addi t0,t0,864
li a2,4096
add sp,sp,t0
li a5,0
addi a2,a2,916
.L2:
li a4,20480
addi a4,a4,-864
add a4,a4,sp
addi a3,sp,-864
sh2add a4,a5,a4
sh2add a3,a5,a3
flw fa5,864(a3)
flw fa4,432(a4)
addiw a5,a5,1
fmadd.s fa5,fa5,fa0,fa4
fsw fa5,432(a4)
bne a5,a2,.L2
li t0,40960
addi t0,t0,-864
add sp,sp,t0
jr ra

The number of instructions before .L2 is reduced from 19 to 6 after the
patch.
Moreover, the following example
#define limit 4096
void foo()
{
volatile int temp = 0;
volatile int buf[limit];
for(int i = 0; i < limit; ++i){
for(int j = 0; j < limit; ++j){
temp += buf[(i * 1234 + j) % limit];
}
}
}
generates
before:
foo:
li t0,-16384
addi t0,t0,-32
li a4,16384
add sp,sp,t0
li a5,-16384
addi a4,a4,16
add a4,a4,a5
addi a5,sp,16
add a5,a4,a5
li a1,4096
sd a5,8(sp)
sw zero,-4(a5)
li a7,-4096
addi a0,a1,-1
li a6,5058560
.L2:
addw a5,a7,a1
.L3:
ld a3,8(sp)
and a4,a5,a0
addiw a5,a5,1
sh2add a4,a4,a3
lw a2,0(a4)
lw a4,-4(a3)
addw a4,a4,a2
ld a2,8(sp)
sw a4,-4(a2)
bne a5,a1,.L3
addiw a1,a5,1234
bne a1,a6,.L2
li t0,16384
addi t0,t0,32
add sp,sp,t0
jr ra

After:
foo:
li t0,-16384
addi t0,t0,-16
add sp,sp,t0
li a1,4096
sw zero,12(sp)
li a7,-4096
addi a0,a1,-1
li a6,5058560
.L2:
addw a5,a7,a1
.L3:
and a4,a5,a0
addi a3,sp,16
sh2add a4,a4,a3
lw a2,0(a4)
lw a4,12(sp)
addiw a5,a5,1
addw a4,a4,a2
sw a4,12(sp)
bne a5,a1,.L3
addiw a1,a5,1234
bne a1,a6,.L2
li t0,16384
addi t0,t0,16
add sp,sp,t0
jr ra

This example also shows that the number of instructions before .L2 is reduced
from 15 to 8 after the patch.

On Mon, Sep 19, 2022 at 3:16 PM Kito Cheng  wrote:

> Could you provide some data including code size and performance? add is a
> frequently used pattern, so we should be more careful when changing that.
>
> On Mon, Sep 19, 2022 at 18:07, Kevin Lee wrote:
>
>> Hello GCC,
>>  Starting from Jim Wilson's patch in
>>
>> https://github.com/riscv-admin/riscv-code-speed-optimization/blob/main/projects/gcc-optimizations.adoc
>> for the large stack frame optimization problem, this augmented patch
>> generates fewer instructions for cases such as
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105733.
>> Original:
>> foo:
>> li t0,-4096
>> addi t0,t0,2016
>> li a4,4096
>> add sp,sp,t0
>> li a5,-4096
>> addi a4,a4,-2032
>> add a4,a4,a5
>> addi a5,sp,16
>> add a5,a4,a5
>> add a0,a5,a0
>> li t0,4096
>> sd a5,8(sp)
>> sb zero,2032(a0)
>> addi t0,t0,-2016
>> add sp,sp,t0
>> jr ra
>> After Patch:
>> foo:
>> li t0,-4096
>> addi t0,t0,2032
>> add sp,sp,t0
>> addi a5,sp,-2032
>> add a0,a5,a0
>> li t0,4096
>> sb zero,2032(a0)
>> addi t0,t0,-2032
>> add sp,sp,t0
>> jr ra
>>
>>= Summary of gcc testsuite =
>> | # of unexpected case / # of unique
>>

[PATCH] Don't check can_vec_perm_const_p for nonlinear iv_init when it's constant.

2022-09-20 Thread liuhongt via Gcc-patches

When init_expr is INTEGER_CST or REAL_CST, can_vec_perm_const_p is not
necessary since no real vec_perm is needed, but
vect_gen_perm_mask_checked will gcc_assert (can_vec_perm_const_p). So
it's better to use vect_gen_perm_mask_any in
vect_create_nonlinear_iv_init.

Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
Ok for trunk?

gcc/ChangeLog:

PR tree-optimization/106963
* tree-vect-loop.cc (vect_create_nonlinear_iv_init): Use
vect_gen_perm_mask_any instead of vect_gen_perm_mask_checked.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr106963.c: New test.
---
 gcc/testsuite/gcc.target/i386/pr106963.c | 14 ++
 gcc/tree-vect-loop.cc|  5 -
 2 files changed, 18 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr106963.c

diff --git a/gcc/testsuite/gcc.target/i386/pr106963.c b/gcc/testsuite/gcc.target/i386/pr106963.c
new file mode 100644
index 000..9f2d20e2523
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr106963.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mavx -mno-avx2" } */
+
+void
+foo_neg_const (int *a)
+{
+  int i, b = 1;
+
+  for (i = 0; i < 1000; i++)
+{
+  a[i] = b;
+  b = -b;
+}
+}
diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index 9c434b66c5b..aabdc6f2d81 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -8356,8 +8356,11 @@ vect_create_nonlinear_iv_init (gimple_seq* stmts, tree init_expr,
sel[2 * i + 1] = i + nunits;
  }
vec_perm_indices indices (sel, 2, nunits);
+   /* Don't use vect_gen_perm_mask_checked since can_vec_perm_const_p may
+  fail when vec_init is const vector. In that situation vec_perm is not
+  really needed.  */
tree perm_mask_even
- = vect_gen_perm_mask_checked (vectype, indices);
+ = vect_gen_perm_mask_any (vectype, indices);
vec_init = gimple_build (stmts, VEC_PERM_EXPR,
 vectype,
 vec_init, vec_neg,
-- 
2.18.1
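
To make the permutation in the hunk above easier to picture, here is a
standalone C model of an interleaving two-input permute applied to a vector of
init values and a vector of their negations. It is a sketch, not GCC code: the
function names are hypothetical, and the even-index assignment sel[2 * i] = i
is an assumption, since only the odd-index line is visible in the quoted hunk.
With constant inputs the result is itself a constant vector, which is why no
real permute instruction is needed in that case.

```c
#include <assert.h>
#include <stddef.h>

/* Model of a two-input permute: an index below NUNITS selects from V0,
   an index of NUNITS or more selects from V1.  */
static void
model_vec_perm (const int *v0, const int *v1, const unsigned *sel,
		size_t nunits, int *out)
{
  for (size_t i = 0; i < nunits; i++)
    out[i] = sel[i] < nunits ? v0[sel[i]] : v1[sel[i] - nunits];
}

/* Build the interleaving selector.  The even-index line is assumed; the
   odd-index line mirrors the quoted hunk.  */
static void
build_even_interleave_sel (unsigned *sel, size_t nunits)
{
  assert (nunits % 2 == 0);
  for (size_t i = 0; i < nunits / 2; i++)
    {
      sel[2 * i] = (unsigned) i;
      sel[2 * i + 1] = (unsigned) (i + nunits);
    }
}
```

For nunits = 4, v0 = {1, 1, 1, 1} and v1 = {-1, -1, -1, -1}, the selector is
{0, 4, 1, 5} and the result is {1, -1, 1, -1}, matching the first four values
that the scalar loop in the test case stores.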

Re: [PATCH, rs6000] Eliminate TARGET_CTZ, TARGET_FCTIDZ, FCTIWUZ defines

2022-09-20 Thread Segher Boessenkool

On Tue, Sep 20, 2022 at 05:01:53PM -0500, will schmidt wrote:
> On Tue, 2022-09-20 at 16:14 -0500, Segher Boessenkool wrote:
> > > TARGET_FCTIWUZ has a low number of uses, and can be directly
> > > replaced with TARGET_POPCNTD.
> > 
> > It is a p7 (ISA 2.06) insn.  Please make a TARGET_P7 or such?
> 
> Yes.  I do have a change later in the (unposted) series to replace
> POPCNTD with POWER7, at a glance that's #17 down the line. In review I
> agree with your comment that the in-between changes aren't the best
> choices. I'll see about skipping the in-between values and going
> straight for POPCNTD->POWER7.

First make new TARGET_Px and OPTION_MASK_Px for all "x" you want,
and do nothing else than enabling it in the respective CPUs in
rs6000-cpus.def .  This can be just one patch of course, it is
a) bloody simple and b) all is the same.  Have that as the very first
patch.  After that most things will be simple and obvious.  But please
do keep most later things split out, it is much easier to review.
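
A toy C model of that arrangement, for illustration only: every name below is
hypothetical (the real masks and CPU table live in GCC's rs6000 option
machinery and look different). It only shows the distinction being argued for,
one flag per ISA level enabled in the CPU table, kept separate from flags that
name a single instruction.

```c
#include <stdbool.h>

/* Hypothetical masks: one per ISA level, plus one per-instruction flag.  */
enum
{
  MODEL_MASK_P7 = 1 << 0,
  MODEL_MASK_P8 = 1 << 1,
  MODEL_MASK_P9 = 1 << 2,
  MODEL_MASK_POPCNTD = 1 << 3
};

struct model_cpu
{
  const char *name;
  unsigned flags;
};

/* Hypothetical CPU table: each CPU enables its own level and all older
   ones.  */
static const struct model_cpu model_cpus[] = {
  { "power7", MODEL_MASK_P7 | MODEL_MASK_POPCNTD },
  { "power8", MODEL_MASK_P7 | MODEL_MASK_P8 | MODEL_MASK_POPCNTD },
  { "power9", MODEL_MASK_P7 | MODEL_MASK_P8 | MODEL_MASK_P9
	      | MODEL_MASK_POPCNTD },
};

/* "Do we have everything that was new on p7?"  */
static bool
model_target_p7 (const struct model_cpu *cpu)
{
  return (cpu->flags & MODEL_MASK_P7) != 0;
}

/* "Do we have everything that was new on p9?"  */
static bool
model_target_p9 (const struct model_cpu *cpu)
{
  return (cpu->flags & MODEL_MASK_P9) != 0;
}
```

A pattern that needs an insn introduced with a given level would then test
the level predicate rather than borrowing an unrelated per-instruction flag.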

> I am looking at the TARGET_POWER10 notation as the target style, versus
> TARGET_P7, but I can go that direction if we think that would be
> preferred.   Maybe it is since this is a retro-fix versus new. :-)

I think TARGET_P7 is a nicely shorter name.  It adds up :-)  The
existing TARGET_P10_SOMETHING do not write it out either btw (and same
for P9 and P8).

But this is not very important of course.  It helps to pick good names
from the get go of course, much less work than fixing things later.

> > (Don't let me discourage you btw, most is pretty straightforward).
> 
> Absolutely..   I do have this mostly covered locally, I just need to
> refine a few parts.  :-)

Looking forward to it!

Segher

Re: [PATCH, rs6000] Eliminate TARGET_CTZ,TARGET_FCTIDZ,FCTIWUZ defines

2022-09-20 Thread will schmidt via Gcc-patches

On Tue, 2022-09-20 at 16:14 -0500, Segher Boessenkool wrote:
> Hi!
> 
> On Mon, Sep 19, 2022 at 06:19:15PM -0500, will schmidt wrote:
> >   This is the first of a batch of changes that eliminate a number
> > of define TARGET_foo entries we have collected over time.
> 
> Good good :-)
> 
> > TARGET_CTZ is defined as TARGET_MODULO, and has a low number
> > of uses.  References to TARGET_CTZ should be safe to replace
> > with TARGET_MODULO throughout.
> 
> No, please don't.  This has nothing to do with "modulo".  If you want to
> say this is just whether we have ISA 3.0 or p9, make a new target macro
> for *that* and use that everywhere.
> 
> This is a general issue, that will make the code much more sane if you
> can fix it!

> 
> > TARGET_FCTIDZ is entirely unused, and safe to remove.
> 
> Please make separate patches for separate issues.  This makes it much
> easier to review, and MUCH easier for all other ways we need to handle
> it (backports, reverts, everything else).  With Git it is *easier* to
> keep separate patches separate than it is to lump it all together.  So,
> the trick is to keep things in separate commits during development
> already (and you will find more benefits doing that, too!)

Yup, I actually developed these three (plus a bunch more) separately,
but combined the first three for posting.   I'll split them back out
and repost after a bit. 

> 
> TARGET_FCTIDZ was never used, it always used TARGET_FCFID directly.
> 
> The original PEM mistakenly said this insn is "64-bit only".  This was
> fixed in ISA 2.01 .
> 
> > TARGET_FCTIWUZ has a low number of uses, and can be directly
> > replaced with TARGET_POPCNTD.
> 
> It is a p7 (ISA 2.06) insn.  Please make a TARGET_P7 or such?


Yes.  I do have a change later in the (unposted) series to replace
POPCNTD with POWER7, at a glance that's #17 down the line. In review I
agree with your comment that the in-between changes aren't the best
choices. I'll see about skipping the in-between values and going
straight for POPCNTD->POWER7.

I am looking at the TARGET_POWER10 notation as the target style, versus
TARGET_P7, but I can go that direction if we think that would be
preferred.   Maybe it is since this is a retro-fix versus new. :-)


> 
> In the current situation target macros like TARGET_POPCNTD are abused to
> mean either "can we use the popcntd insn", or to mean "can we use insn
> new on p7".  Or sometimes something in between, or something in this
> general neighbourhood.  It is never clear which is meant, which makes it
> very hard to untangle this.  But thanks for trying!  :-)
>
> (Don't let me discourage you btw, most is pretty straightforward).

Absolutely..   I do have this mostly covered locally, I just need to
refine a few parts.  :-)

> 
> 
> > * config/rs6000/rs6000.h (TARGET_CTZ): Replace with
> > TARGET_MODULO.
> 
> Changelogs are indented with tabs, and this fits on one line.
> 
> So, please make TARGET_P7 and such, and OPTION_MASKs for those in
> rs6000-cpus.def?

Will do,
thanks
-Will


> 
> 
> Segher

Re: [PATCH, rs6000] Eliminate TARGET_CTZ, TARGET_FCTIDZ, FCTIWUZ defines

2022-09-20 Thread Segher Boessenkool

Hi!

On Mon, Sep 19, 2022 at 06:19:15PM -0500, will schmidt wrote:
>   This is the first of a batch of changes that eliminate a number
> of define TARGET_foo entries we have collected over time.

Good good :-)

> TARGET_CTZ is defined as TARGET_MODULO, and has a low number
> of uses.  References to TARGET_CTZ should be safe to replace
> with TARGET_MODULO throughout.

No, please don't.  This has nothing to do with "modulo".  If you want to
say this is just whether we have ISA 3.0 or p9, make a new target macro
for *that* and use that everywhere.

This is a general issue, that will make the code much more sane if you
can fix it!

> TARGET_FCTIDZ is entirely unused, and safe to remove.

Please make separate patches for separate issues.  This makes it much
easier to review, and MUCH easier for all other ways we need to handle
it (backports, reverts, everything else).  With Git it is *easier* to
keep separate patches separate than it is to lump it all together.  So,
the trick is to keep things in separate commits during development
already (and you will find more benefits doing that, too!)

TARGET_FCTIDZ was never used, it always used TARGET_FCFID directly.

The original PEM mistakenly said this insn is "64-bit only".  This was
fixed in ISA 2.01 .

> TARGET_FCTIWUZ has a low number of uses, and can be directly
> replaced with TARGET_POPCNTD.

It is a p7 (ISA 2.06) insn.  Please make a TARGET_P7 or such?

In the current situation target macros like TARGET_POPCNTD are abused to
mean either "can we use the popcntd insn", or to mean "can we use insn
new on p7".  Or sometimes something in between, or something in this
general neighbourhood.  It is never clear which is meant, which makes it
very hard to untangle this.  But thanks for trying!  :-)

(Don't let me discourage you btw, most is pretty straightforward).

> * config/rs6000/rs6000.h (TARGET_CTZ): Replace with
> TARGET_MODULO.

Changelogs are indented with tabs, and this fits on one line.

So, please make TARGET_P7 and such, and OPTION_MASKs for those in
rs6000-cpus.def?

Segher

Re: [PATCH 09/10] fortran: Support clobbering of variable subreferences [PR88364]

2022-09-20 Thread Harald Anlauf via Gcc-patches


On 19.09.22 at 22:50, Mikael Morin wrote:

On 19/09/2022 at 21:46, Harald Anlauf wrote:

On 18.09.22 at 22:55, Mikael Morin wrote:

On 18/09/2022 at 20:32, Harald Anlauf wrote:


Assumed shape will be on the easy side,
while assumed size likely needs to be excluded for clobbering.


Isn’t it the converse that is true?
Assumed-shape arrays can be non-contiguous, so they have to be excluded, but
assumed-size arrays are contiguous, so they are valid candidates for
clobbering. No?


I really was referring here to *dummies*, as in the following example:

program p
   integer :: a(4)
   a = 1
   call sub (a(1), 2)
   print *, a
contains
   subroutine sub (b, k)
      integer, intent(in)  :: k
      integer, intent(out) :: b(*)
!     integer, intent(out) :: b(k)
      if (k > 2) b(k) = k
   end subroutine sub
end program p

Assumed size (*) is just a contiguous hunk of memory of possibly
unknown size, which can be zero.  So you couldn't set a clobber
for the a(1) actual argument.
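
A rough C++ analogue of the example above may help readers less familiar with assumed size. It is only a sketch: a bare pointer stands in for the assumed-size dummy b(*), and C++ has no counterpart to INTENT(OUT), so this illustrates just the "unknown extent" part of the argument. The function name run is made up for illustration.

```cpp
#include <array>

// Stand-in for the assumed-size dummy b(*): a bare pointer carries no
// extent, so the callee cannot tell how many elements it was handed.
void sub(int* b, int k) {
  if (k > 2)
    b[k - 1] = k;  // Fortran's b(k) is 1-based
}

// Mirrors "call sub (a(1), 2)": with k == 2 the callee writes nothing.
std::array<int, 4> run() {
  std::array<int, 4> a{1, 1, 1, 1};
  sub(&a[0], 2);
  return a;
}
```

Here the caller passes the address of the first element only, and nothing at the call site says how much of a the callee may touch.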


Couldn't you clobber A entirely?  If no element of B is initialized in
SUB, well, A has undefined values on return from SUB.  That's how
INTENT(OUT) works.



I think I understand much of what is said, but I feel that I do
not really understand what *clobber* means for the different
beasts we are discussing (although I have an impression of what
it means for a scalar object).

[PATCH] c++: ICE-on-invalid with designated initializer [PR106983]

2022-09-20 Thread Marek Polacek via Gcc-patches

We ICE in the code added in r12-7117: type_build_dtor_call gets
the error_mark_node because the type of 'prev' wasn't declared.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

PR c++/106983

gcc/cp/ChangeLog:

* typeck2.cc (split_nonconstant_init_1): Check TYPE_P.

gcc/testsuite/ChangeLog:

* g++.dg/other/error36.C: New test.
---
 gcc/cp/typeck2.cc|  2 +-
 gcc/testsuite/g++.dg/other/error36.C | 13 +
 2 files changed, 14 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/other/error36.C

diff --git a/gcc/cp/typeck2.cc b/gcc/cp/typeck2.cc
index 688e9c15326..75fd0e2a9bf 100644
--- a/gcc/cp/typeck2.cc
+++ b/gcc/cp/typeck2.cc
@@ -597,7 +597,7 @@ split_nonconstant_init_1 (tree dest, tree init, bool last,
if (prev == field_index)
  break;
tree ptype = TREE_TYPE (prev);
-   if (type_build_dtor_call (ptype))
+   if (TYPE_P (ptype) && type_build_dtor_call (ptype))
  {
tree pcref = build3 (COMPONENT_REF, ptype, dest, prev,
 NULL_TREE);
diff --git a/gcc/testsuite/g++.dg/other/error36.C b/gcc/testsuite/g++.dg/other/error36.C
new file mode 100644
index 000..556287816fd
--- /dev/null
+++ b/gcc/testsuite/g++.dg/other/error36.C
@@ -0,0 +1,13 @@
+// PR c++/106983
+// { dg-do compile { target c++20 } }
+
+typedef unsigned long long A;
+typedef union
+{
+  struct B s; // { dg-error "incomplete" }
+  A a;
+} U;
+void f (A x, unsigned int b)
+{
+  const U y = {.a = x};
+}

base-commit: be60aa5b608b5f09fadfeff852a46589ac311a42
-- 
2.37.3

[PATCH, committed] Fortran: error recovery on invalid ARRAY argument to FINDLOC [PR106986]

2022-09-20 Thread Harald Anlauf via Gcc-patches

Dear all,

we ICE'd in the simplification of FINDLOC when the passed
ARRAY argument had an invalid declaration.  The reason was
a reference to array->shape which was NULL.

Obvious solution: then just don't attempt to simplify.
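
For illustration, the shape of the guard is roughly the following. This is a reduced standalone sketch, not gfortran code: the expr struct and element_count are hypothetical stand-ins that model only the one field the check looks at.

```cpp
#include <cstddef>

// Hypothetical, heavily reduced stand-in for gfc_expr.
struct expr {
  const std::size_t* shape;  // null when the declaration was invalid
  int rank;
};

// Returns the number of elements, or -1 to signal "do not simplify".
long element_count(const expr& array) {
  if (array.shape == nullptr)  // bail out before touching shape[i]
    return -1;
  long n = 1;
  for (int i = 0; i < array.rank; ++i)
    n *= static_cast<long>(array.shape[i]);
  return n;
}
```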

Regtested on x86_64-pc-linux-gnu and pushed to mainline as

https://gcc.gnu.org/g:5976fbf9d5dd9542fcb82eebb2185886fd52d000

The PR is marked as a 10/11/12/13 regression, thus I plan to
backport.

Thanks,
Harald

From 5976fbf9d5dd9542fcb82eebb2185886fd52d000 Mon Sep 17 00:00:00 2001
From: Harald Anlauf 
Date: Tue, 20 Sep 2022 22:41:48 +0200
Subject: [PATCH] Fortran: error recovery on invalid ARRAY argument to FINDLOC
 [PR106986]

gcc/fortran/ChangeLog:

	PR fortran/106986
	* simplify.cc (gfc_simplify_findloc): Do not try to simplify
	intrinsic FINDLOC when the ARRAY argument has a NULL shape.

gcc/testsuite/ChangeLog:

	PR fortran/106986
	* gfortran.dg/pr106986.f90: New test.
---
 gcc/fortran/simplify.cc| 1 +
 gcc/testsuite/gfortran.dg/pr106986.f90 | 8 
 2 files changed, 9 insertions(+)
 create mode 100644 gcc/testsuite/gfortran.dg/pr106986.f90

diff --git a/gcc/fortran/simplify.cc b/gcc/fortran/simplify.cc
index 140c17721a7..c0fbd0ed7c2 100644
--- a/gcc/fortran/simplify.cc
+++ b/gcc/fortran/simplify.cc
@@ -5895,6 +5895,7 @@ gfc_simplify_findloc (gfc_expr *array, gfc_expr *value, gfc_expr *dim,
   bool back_val = false;

   if (!is_constant_array_expr (array)
+  || array->shape == NULL
   || !gfc_is_constant_expr (dim))
 return NULL;

diff --git a/gcc/testsuite/gfortran.dg/pr106986.f90 b/gcc/testsuite/gfortran.dg/pr106986.f90
new file mode 100644
index 000..a309b25d181
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr106986.f90
@@ -0,0 +1,8 @@
+! { dg-do compile }
+! PR fortran/106986 - ICE in simplify_findloc_nodim
+! Contributed by G.Steinmetz
+
+program p
+  integer, parameter :: a(:) = [1] ! { dg-error "deferred shape" }
+  print *, findloc (a, 1)
+end
--
2.35.3

[PATCH, committed] Fortran: NULL pointer dereference in invalid simplification [PR106985]

2022-09-20 Thread Harald Anlauf via Gcc-patches

Dear all,

Gerhard found a NULL pointer dereference in a PARAMETER declaration
that referenced the same declared parameter.

Simple & obvious enough, see attached patch.

Regtested on x86_64-pc-linux-gnu, and pushed to mainline:

https://gcc.gnu.org/g:8dbb15bc2d019488240c1e69d93121b0347ac092

Thanks,
Harald

From 8dbb15bc2d019488240c1e69d93121b0347ac092 Mon Sep 17 00:00:00 2001
From: Harald Anlauf 
Date: Tue, 20 Sep 2022 22:23:43 +0200
Subject: [PATCH] Fortran: NULL pointer dereference in invalid simplification
 [PR106985]

gcc/fortran/ChangeLog:

	PR fortran/106985
	* expr.cc (gfc_simplify_expr): Avoid NULL pointer dereference.

gcc/testsuite/ChangeLog:

	PR fortran/106985
	* gfortran.dg/pr106985.f90: New test.
---
 gcc/fortran/expr.cc| 3 ++-
 gcc/testsuite/gfortran.dg/pr106985.f90 | 8 
 2 files changed, 10 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gfortran.dg/pr106985.f90

diff --git a/gcc/fortran/expr.cc b/gcc/fortran/expr.cc
index be94c18c836..290ddf360c8 100644
--- a/gcc/fortran/expr.cc
+++ b/gcc/fortran/expr.cc
@@ -2287,7 +2287,8 @@ gfc_simplify_expr (gfc_expr *p, int type)
 	 initialization expression, or we want a subsection.  */
   if (p->symtree->n.sym->attr.flavor == FL_PARAMETER
 	  && (gfc_init_expr_flag || p->ref
-	  || p->symtree->n.sym->value->expr_type != EXPR_ARRAY))
+	  || (p->symtree->n.sym->value
+		  && p->symtree->n.sym->value->expr_type != EXPR_ARRAY)))
 	{
 	  if (!simplify_parameter_variable (p, type))
 	return false;
diff --git a/gcc/testsuite/gfortran.dg/pr106985.f90 b/gcc/testsuite/gfortran.dg/pr106985.f90
new file mode 100644
index 000..f4ed92577a3
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr106985.f90
@@ -0,0 +1,8 @@
+! { dg-do compile }
+! PR fortran/106985 - ICE in gfc_simplify_expr
+! Contributed by G.Steinmetz
+
+program p
+  integer, parameter :: a(2) = 1
+  integer, parameter :: b = a(2) + b ! { dg-error "before its definition is complete" }
+end
--
2.35.3

Re: [PATCH 2/2] c++: xtreme-header modules tests cleanups

2022-09-20 Thread Nathan Sidwell via Gcc-patches


On 9/20/22 15:54, Patrick Palka wrote:

This adds some recently implemented C++20/23 library headers to the
xtreme-header tests as appropriate.  Also, it looks like we can safely
re-add  and remove the NO_ASSOCIATED_LAMBDA workaround.

Tested on x86_64-pc-linux-gnu, does this look OK for trunk?


cool, more bits working.  thanks!



gcc/testsuite/ChangeLog:

* g++.dg/modules/xtreme-header-2.h: Include .
* g++.dg/modules/xtreme-header-6.h: Include , ,
,  and .
* g++.dg/modules/xtreme-header.h: Likewise.  Remove
NO_ASSOCIATED_LAMBDA workaround.  Include implemented C++23
library headers.
---
  .../g++.dg/modules/xtreme-header-2.h  |  3 +-
  .../g++.dg/modules/xtreme-header-6.h  | 10 ++--
  gcc/testsuite/g++.dg/modules/xtreme-header.h  | 60 +++
  3 files changed, 29 insertions(+), 44 deletions(-)

diff --git a/gcc/testsuite/g++.dg/modules/xtreme-header-2.h b/gcc/testsuite/g++.dg/modules/xtreme-header-2.h
index ded093e533c..dfe94aa6988 100644
--- a/gcc/testsuite/g++.dg/modules/xtreme-header-2.h
+++ b/gcc/testsuite/g++.dg/modules/xtreme-header-2.h
@@ -1,8 +1,7 @@
  // Everything that transitively includes 
  
  #include 

-// FIXME: PR 97549
-// #include 
+#include 
  #include 
  #include 
  #include 
diff --git a/gcc/testsuite/g++.dg/modules/xtreme-header-6.h b/gcc/testsuite/g++.dg/modules/xtreme-header-6.h
index 85894b2b20a..8d024b69bac 100644
--- a/gcc/testsuite/g++.dg/modules/xtreme-header-6.h
+++ b/gcc/testsuite/g++.dg/modules/xtreme-header-6.h
@@ -1,22 +1,22 @@
  // C++20 headers
  #if __cplusplus > 201703
  #include 
+#include 
  #include 
  #include 
  #include 
  #if __cpp_coroutines
  #include 
  #endif
+#include 
  #include 
+#include 
+#include 
  #include 
  #include 
+#include 
  #if 0
  // Unimplemented
-#include 
  #include 
-#include 
-#include 
-#include 
-#include 
  #endif
  #endif
diff --git a/gcc/testsuite/g++.dg/modules/xtreme-header.h b/gcc/testsuite/g++.dg/modules/xtreme-header.h
index 41302c780b5..124e2f82277 100644
--- a/gcc/testsuite/g++.dg/modules/xtreme-header.h
+++ b/gcc/testsuite/g++.dg/modules/xtreme-header.h
@@ -1,17 +1,8 @@
  // All the headers!
  
-#if __cplusplus > 201703L

-// FIXME: if we include everything, something goes wrong with location
-// information.  We used to not handle lambdas attached to global
-// vars, and this is a convienient flag to stop including everything.
-#define NO_ASSOCIATED_LAMBDA 1
-#endif
-
  // C++ 17 and below
  #if 1
-#if !NO_ASSOCIATED_LAMBDA
  #include 
-#endif
  #include 
  #include 
  #include 
@@ -26,19 +17,12 @@
  #include 
  #include 
  #include 
-#if !NO_ASSOCIATED_LAMBDA
-// FIXME: PR 97549
-//#include 
-#endif
+#include 
  #include 
  #include 
  #include 
-#if !NO_ASSOCIATED_LAMBDA
  #include 
-#endif
-#if !NO_ASSOCIATED_LAMBDA
  #include 
-#endif
  #include 
  #include 
  #include 
@@ -49,12 +33,8 @@
  #include 
  #include 
  #include 
-#if !NO_ASSOCIATED_LAMBDA
  #include 
-#endif
-#if !NO_ASSOCIATED_LAMBDA
  #include 
-#endif
  #include 
  #include 
  #include 
@@ -63,12 +43,8 @@
  #include 
  #include 
  #include 
-#if !NO_ASSOCIATED_LAMBDA
  #include 
-#endif
-#if !NO_ASSOCIATED_LAMBDA
  #include 
-#endif
  #include 
  #include 
  #include 
@@ -78,9 +54,7 @@
  #include 
  #include 
  #include 
-#if !NO_ASSOCIATED_LAMBDA
  #include 
-#endif
  #include 
  #include 
  #include 
@@ -88,9 +62,7 @@
  #include 
  #include 
  #include 
-#if !NO_ASSOCIATED_LAMBDA
  #include 
-#endif
  #include 
  #include 
  #endif
@@ -119,26 +91,40 @@
  #if __cplusplus > 201703
  #if 1
  #include 
+#include 
  #include 
  #include 
  #include 
  #if __cpp_coroutines
  #include 
  #endif
-#if !NO_ASSOCIATED_LAMBDA
-#include 
-#endif
+#include 
  #include 
+#include 
+#include 
+#include 
  #include 
  #include 
+#include 
  #if 0
  // Unimplemented
-#include 
  #include 
-#include 
-#include 
-#include 
-#include 
  #endif
  #endif
  #endif
+
+// C++23
+#if __cplusplus > 202002L
+#include 
+#include 
+#include 
+#if 0
+// Unimplemented
+#include 
+#include 
+#include 
+#include 
+#include 
+#endif
+#endif
+


--
Nathan Sidwell

Re: [PATCH 1/2] c++: modules and non-dependent auto deduction

2022-09-20 Thread Nathan Sidwell via Gcc-patches


On 9/20/22 15:54, Patrick Palka wrote:

The modules streaming code seems to rely on the invariant that a
TEMPLATE_DECL and its DECL_TEMPLATE_RESULT have the same TREE_TYPE.


It does indeed.


But for a templated VAR_DECL with deduced non-dependent type, the two
TREE_TYPEs end up diverging: cp_finish_decl deduces the type of the
initializer ahead of time and updates the TREE_TYPE of the VAR_DECL, but
neglects to update the corresponding TEMPLATE_DECL as well, which leads
to a "conflicting global module declaration" error for each of the
__phase_alignment decls in the below testcase (and for the xtreme-header
testcases if we try including ).

This patch makes cp_finish_decl update the TREE_TYPE of the corresponding
TEMPLATE_DECL so that the invariant is maintained.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?


Ok, thanks




gcc/cp/ChangeLog:

* decl.cc (cp_finish_decl): After updating the deduced type of a
VAR_DECL, also update the corresponding TEMPLATE_DECL if there
is one.

gcc/testsuite/ChangeLog:

* g++.dg/modules/auto-3.h: New test.
* g++.dg/modules/auto-3_a.H: New test.
* g++.dg/modules/auto-3_b.C: New test.
---
  gcc/cp/decl.cc  |  6 ++
  gcc/testsuite/g++.dg/modules/auto-3.h   | 10 ++
  gcc/testsuite/g++.dg/modules/auto-3_a.H |  4 
  gcc/testsuite/g++.dg/modules/auto-3_b.C |  4 
  4 files changed, 24 insertions(+)
  create mode 100644 gcc/testsuite/g++.dg/modules/auto-3.h
  create mode 100644 gcc/testsuite/g++.dg/modules/auto-3_a.H
  create mode 100644 gcc/testsuite/g++.dg/modules/auto-3_b.C

diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc
index 070f673c3a2..80467c19254 100644
--- a/gcc/cp/decl.cc
+++ b/gcc/cp/decl.cc
@@ -8180,6 +8180,12 @@ cp_finish_decl (tree decl, tree init, bool init_const_expr_p,
  return;
}
cp_apply_type_quals_to_decl (cp_type_quals (type), decl);
+
+  /* Update the type of the corresponding TEMPLATE_DECL to match.  */
+  if (DECL_LANG_SPECIFIC (decl)
+ && DECL_TEMPLATE_INFO (decl)
+ && DECL_TEMPLATE_RESULT (DECL_TI_TEMPLATE (decl)) == decl)
+   TREE_TYPE (DECL_TI_TEMPLATE (decl)) = type;
  }
  
if (ensure_literal_type_for_constexpr_object (decl) == error_mark_node)

diff --git a/gcc/testsuite/g++.dg/modules/auto-3.h b/gcc/testsuite/g++.dg/modules/auto-3.h
new file mode 100644
index 000..f129433cbcb
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/auto-3.h
@@ -0,0 +1,10 @@
+template
+struct __tree_barrier {
+  static const auto __phase_alignment_1 = 0;
+
+  template
+  static const auto __phase_alignment_2 = 0;
+};
+
+template
+inline auto __phase_alignment_3 = 0;
diff --git a/gcc/testsuite/g++.dg/modules/auto-3_a.H b/gcc/testsuite/g++.dg/modules/auto-3_a.H
new file mode 100644
index 000..25a7a73e73e
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/auto-3_a.H
@@ -0,0 +1,4 @@
+// { dg-additional-options "-fmodule-header" }
+// { dg-module-cmi {} }
+
+#include "auto-3.h"
diff --git a/gcc/testsuite/g++.dg/modules/auto-3_b.C b/gcc/testsuite/g++.dg/modules/auto-3_b.C
new file mode 100644
index 000..03b6d46f476
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/auto-3_b.C
@@ -0,0 +1,4 @@
+// { dg-additional-options "-fmodules-ts -fno-module-lazy" }
+
+#include "auto-3.h"
+import "auto-3_a.H";


--
Nathan Sidwell

Re: [Patch] Fortran: F2018 type(),dimension() with scalars [PR104143]

2022-09-20 Thread Harald Anlauf via Gcc-patches


On 20.09.22 at 13:51, Tobias Burnus wrote:

In several cases, one just wants to have the address where an object starts
without requiring the detour via 'c_loc' and the (locally) required 'target'
attribute.

In principle, type(*),dimension(*) of TS29113 permits this, except that
'dimension(*)' only permits arrays and array elements but not scalars.

Fortran 2018 modified this such that with 'type(*)' also scalars are
permitted. (See PR for the quotes.)

This patch implements this simple change. Before, implementations like MPI
had to use '!GCC$ attribute NO_ARG_CHECK ::' in addition to
type(*),dimension(*) to achieve this. In GCC, we do likewise, but that's at
least inside the compiler, cf. libgomp/openacc{.f90,_lib.h}.

OK for mainline?


LGTM.

Thanks for the patch!

[PATCH 2/2] c++: xtreme-header modules tests cleanups

2022-09-20 Thread Patrick Palka via Gcc-patches

This adds some recently implemented C++20/23 library headers to the
xtreme-header tests as appropriate.  Also, it looks like we can safely
re-add  and remove the NO_ASSOCIATED_LAMBDA workaround.

Tested on x86_64-pc-linux-gnu, does this look OK for trunk?

gcc/testsuite/ChangeLog:

* g++.dg/modules/xtreme-header-2.h: Include .
* g++.dg/modules/xtreme-header-6.h: Include , ,
,  and .
* g++.dg/modules/xtreme-header.h: Likewise.  Remove
NO_ASSOCIATED_LAMBDA workaround.  Include implemented C++23
library headers.
---
 .../g++.dg/modules/xtreme-header-2.h  |  3 +-
 .../g++.dg/modules/xtreme-header-6.h  | 10 ++--
 gcc/testsuite/g++.dg/modules/xtreme-header.h  | 60 +++
 3 files changed, 29 insertions(+), 44 deletions(-)

diff --git a/gcc/testsuite/g++.dg/modules/xtreme-header-2.h b/gcc/testsuite/g++.dg/modules/xtreme-header-2.h
index ded093e533c..dfe94aa6988 100644
--- a/gcc/testsuite/g++.dg/modules/xtreme-header-2.h
+++ b/gcc/testsuite/g++.dg/modules/xtreme-header-2.h
@@ -1,8 +1,7 @@
 // Everything that transitively includes 
 
 #include 
-// FIXME: PR 97549
-// #include 
+#include 
 #include 
 #include 
 #include 
diff --git a/gcc/testsuite/g++.dg/modules/xtreme-header-6.h b/gcc/testsuite/g++.dg/modules/xtreme-header-6.h
index 85894b2b20a..8d024b69bac 100644
--- a/gcc/testsuite/g++.dg/modules/xtreme-header-6.h
+++ b/gcc/testsuite/g++.dg/modules/xtreme-header-6.h
@@ -1,22 +1,22 @@
 // C++20 headers
 #if __cplusplus > 201703
 #include 
+#include 
 #include 
 #include 
 #include 
 #if __cpp_coroutines
 #include 
 #endif
+#include 
 #include 
+#include 
+#include 
 #include 
 #include 
+#include 
 #if 0
 // Unimplemented
-#include 
 #include 
-#include 
-#include 
-#include 
-#include 
 #endif
 #endif
diff --git a/gcc/testsuite/g++.dg/modules/xtreme-header.h b/gcc/testsuite/g++.dg/modules/xtreme-header.h
index 41302c780b5..124e2f82277 100644
--- a/gcc/testsuite/g++.dg/modules/xtreme-header.h
+++ b/gcc/testsuite/g++.dg/modules/xtreme-header.h
@@ -1,17 +1,8 @@
 // All the headers!
 
-#if __cplusplus > 201703L
-// FIXME: if we include everything, something goes wrong with location
-// information.  We used to not handle lambdas attached to global
-// vars, and this is a convienient flag to stop including everything.
-#define NO_ASSOCIATED_LAMBDA 1
-#endif
-
 // C++ 17 and below
 #if 1
-#if !NO_ASSOCIATED_LAMBDA
 #include 
-#endif
 #include 
 #include 
 #include 
@@ -26,19 +17,12 @@
 #include 
 #include 
 #include 
-#if !NO_ASSOCIATED_LAMBDA
-// FIXME: PR 97549
-//#include 
-#endif
+#include 
 #include 
 #include 
 #include 
-#if !NO_ASSOCIATED_LAMBDA
 #include 
-#endif
-#if !NO_ASSOCIATED_LAMBDA
 #include 
-#endif
 #include 
 #include 
 #include 
@@ -49,12 +33,8 @@
 #include 
 #include 
 #include 
-#if !NO_ASSOCIATED_LAMBDA
 #include 
-#endif
-#if !NO_ASSOCIATED_LAMBDA
 #include 
-#endif
 #include 
 #include 
 #include 
@@ -63,12 +43,8 @@
 #include 
 #include 
 #include 
-#if !NO_ASSOCIATED_LAMBDA
 #include 
-#endif
-#if !NO_ASSOCIATED_LAMBDA
 #include 
-#endif
 #include 
 #include 
 #include 
@@ -78,9 +54,7 @@
 #include 
 #include 
 #include 
-#if !NO_ASSOCIATED_LAMBDA
 #include 
-#endif
 #include 
 #include 
 #include 
@@ -88,9 +62,7 @@
 #include 
 #include 
 #include 
-#if !NO_ASSOCIATED_LAMBDA
 #include 
-#endif
 #include 
 #include 
 #endif
@@ -119,26 +91,40 @@
 #if __cplusplus > 201703
 #if 1
 #include 
+#include 
 #include 
 #include 
 #include 
 #if __cpp_coroutines
 #include 
 #endif
-#if !NO_ASSOCIATED_LAMBDA
-#include 
-#endif
+#include 
 #include 
+#include 
+#include 
+#include 
 #include 
 #include 
+#include 
 #if 0
 // Unimplemented
-#include 
 #include 
-#include 
-#include 
-#include 
-#include 
 #endif
 #endif
 #endif
+
+// C++23
+#if __cplusplus > 202002L
+#include 
+#include 
+#include 
+#if 0
+// Unimplemented
+#include 
+#include 
+#include 
+#include 
+#include 
+#endif
+#endif
+
-- 
2.38.0.rc0.52.gdda7228a83

[PATCH 1/2] c++: modules and non-dependent auto deduction

2022-09-20 Thread Patrick Palka via Gcc-patches

The modules streaming code seems to rely on the invariant that a
TEMPLATE_DECL and its DECL_TEMPLATE_RESULT have the same TREE_TYPE.
But for a templated VAR_DECL with deduced non-dependent type, the two
TREE_TYPEs end up diverging: cp_finish_decl deduces the type of the
initializer ahead of time and updates the TREE_TYPE of the VAR_DECL, but
neglects to update the corresponding TEMPLATE_DECL as well, which leads
to a "conflicting global module declaration" error for each of the
__phase_alignment decls in the below testcase (and for the xtreme-header
testcases if we try including ).

This patch makes cp_finish_decl update the TREE_TYPE of the corresponding
TEMPLATE_DECL so that the invariant is maintained.
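
For readers who want to see what "a templated VAR_DECL with deduced non-dependent type" looks like from the user's side, here is a minimal standalone sketch. The names phase_alignment and barrier are made up for illustration and only loosely follow the testcase; the sketch shows the language-level behaviour, not the compiler internals.

```cpp
#include <type_traits>

// The initializer 0 does not depend on T, so 'auto' deduces to int for
// every specialization of the variable template.
template <typename T>
inline auto phase_alignment = 0;

template <typename T>
struct barrier {
  static const auto alignment = 0;  // deduced as const int
};

static_assert(std::is_same_v<decltype(phase_alignment<char>), int>);
static_assert(std::is_same_v<decltype(barrier<long>::alignment), const int>);
```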

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?

gcc/cp/ChangeLog:

* decl.cc (cp_finish_decl): After updating the deduced type of a
VAR_DECL, also update the corresponding TEMPLATE_DECL if there
is one.

gcc/testsuite/ChangeLog:

* g++.dg/modules/auto-3.h: New test.
* g++.dg/modules/auto-3_a.H: New test.
* g++.dg/modules/auto-3_b.C: New test.
---
 gcc/cp/decl.cc  |  6 ++
 gcc/testsuite/g++.dg/modules/auto-3.h   | 10 ++
 gcc/testsuite/g++.dg/modules/auto-3_a.H |  4 
 gcc/testsuite/g++.dg/modules/auto-3_b.C |  4 
 4 files changed, 24 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/modules/auto-3.h
 create mode 100644 gcc/testsuite/g++.dg/modules/auto-3_a.H
 create mode 100644 gcc/testsuite/g++.dg/modules/auto-3_b.C

diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc
index 070f673c3a2..80467c19254 100644
--- a/gcc/cp/decl.cc
+++ b/gcc/cp/decl.cc
@@ -8180,6 +8180,12 @@ cp_finish_decl (tree decl, tree init, bool init_const_expr_p,
  return;
}
   cp_apply_type_quals_to_decl (cp_type_quals (type), decl);
+
+  /* Update the type of the corresponding TEMPLATE_DECL to match.  */
+  if (DECL_LANG_SPECIFIC (decl)
+ && DECL_TEMPLATE_INFO (decl)
+ && DECL_TEMPLATE_RESULT (DECL_TI_TEMPLATE (decl)) == decl)
+   TREE_TYPE (DECL_TI_TEMPLATE (decl)) = type;
 }
 
   if (ensure_literal_type_for_constexpr_object (decl) == error_mark_node)
diff --git a/gcc/testsuite/g++.dg/modules/auto-3.h b/gcc/testsuite/g++.dg/modules/auto-3.h
new file mode 100644
index 000..f129433cbcb
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/auto-3.h
@@ -0,0 +1,10 @@
+template
+struct __tree_barrier {
+  static const auto __phase_alignment_1 = 0;
+
+  template
+  static const auto __phase_alignment_2 = 0;
+};
+
+template
+inline auto __phase_alignment_3 = 0;
diff --git a/gcc/testsuite/g++.dg/modules/auto-3_a.H b/gcc/testsuite/g++.dg/modules/auto-3_a.H
new file mode 100644
index 000..25a7a73e73e
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/auto-3_a.H
@@ -0,0 +1,4 @@
+// { dg-additional-options "-fmodule-header" }
+// { dg-module-cmi {} }
+
+#include "auto-3.h"
diff --git a/gcc/testsuite/g++.dg/modules/auto-3_b.C b/gcc/testsuite/g++.dg/modules/auto-3_b.C
new file mode 100644
index 000..03b6d46f476
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/auto-3_b.C
@@ -0,0 +1,4 @@
+// { dg-additional-options "-fmodules-ts -fno-module-lazy" }
+
+#include "auto-3.h"
+import "auto-3_a.H";
-- 
2.38.0.rc0.52.gdda7228a83

Re: [PATCH] btf: Add support to BTF_KIND_ENUM64 type

2022-09-20 Thread Guillermo E. Martinez via Gcc-patches


ping


On 8/29/22 16:11, Guillermo E. Martinez wrote:

Hello GCC team,

The following patch updates the BTF/CTF backend to support the
BTF_KIND_ENUM64 type.

Comments are welcome and appreciated!

Kind regards,
guillermo
--

BTF supports 64-bit enumerators with the following encoding:

  struct btf_type:
    name_off: 0 or offset to a valid C identifier
    info.kind_flag: 0 for unsigned, 1 for signed
    info.kind: BTF_KIND_ENUM64
    info.vlen: number of enum values
    size: 1/2/4/8

The btf_type is followed by info.vlen number of:

  struct btf_enum64
  {
    uint32_t name_off;   /* Offset in string section of enumerator name.  */
    uint32_t val_lo32;   /* Lower 32 bits of the 64-bit enumerator value.  */
    uint32_t val_hi32;   /* Upper 32 bits of the 64-bit enumerator value.  */
  };

So, a new btf_enum64 structure was added to represent BTF_KIND_ENUM64,
and a new flags field was added to ctf_dtdef to record type-specific
properties.  For CTF enums, it records whether the enumerator values are
signed or unsigned; that information is later used to encode the BTF
enum type.
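
As a rough illustration of the encoding described above, the split of a 64-bit enumerator value into val_lo32 and val_hi32 could look like the sketch below. It is standalone code, not taken from the patch: encode_enum64 and decode_enum64 are hypothetical helpers, and only the three struct fields come from the description.

```cpp
#include <cstdint>

// Same three fields as in the btf_enum64 layout described above.
struct btf_enum64 {
  std::uint32_t name_off;
  std::uint32_t val_lo32;
  std::uint32_t val_hi32;
};

// Split a 64-bit enumerator value into its two 32-bit halves.
// Hypothetical helper, for illustration only.
constexpr btf_enum64 encode_enum64(std::uint32_t name_off, std::int64_t value) {
  const auto bits = static_cast<std::uint64_t>(value);
  return {name_off,
          static_cast<std::uint32_t>(bits & 0xffffffffu),
          static_cast<std::uint32_t>(bits >> 32)};
}

// Reassemble the raw 64 bits; whether they are read as signed or unsigned
// is decided by info.kind_flag of the enclosing btf_type.
constexpr std::uint64_t decode_enum64(const btf_enum64& e) {
  return (static_cast<std::uint64_t>(e.val_hi32) << 32) | e.val_lo32;
}
```

A negative enumerator simply has all upper bits set in val_hi32; the sign is recovered from the kind flag, not from the record itself.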

gcc/ChangeLog:

* btfout.cc (btf_calc_num_vbytes): Compute enumeration size depending on
enumerator type btf_enum{,64}.
(btf_asm_type): Update btf_kflag according to enumerators sign,
using correct BPF type in BTF_KIND_ENUM{,64}.
(btf_asm_enum_const): New argument to represent the size of
the BTF enum type.
* ctfc.cc (ctf_add_enum): Use and initialize the flags field to
CTF_ENUM_F_NONE.
(ctf_add_enumerator): New argument to represent CTF flags,
updating the comment and flag value according to enumerators
sign.
* ctfc.h (ctf_dmdef): Update dmd_value to HOST_WIDE_INT to allow
use of 32/64-bit enumerators.
(ctf_dtdef): Add flags to describe specific type's properties.
* dwarf2ctf.cc (gen_ctf_enumeration_type): Update flags field
depending on whether a signed enumerator value is found.
* include/btf.h (btf_enum64): Add new definition and new symbolic
constants BTF_KIND_ENUM64 and BTF_KF_ENUM_{UN,}SIGNED.

gcc/testsuite/ChangeLog:

* gcc.dg/debug/btf/btf-enum-1.c: Update testcase with correct
info.kflags encoding.
* gcc.dg/debug/btf/btf-enum64-1.c: New testcase.
---
  gcc/btfout.cc | 24 ---
  gcc/ctfc.cc   | 14 ---
  gcc/ctfc.h|  9 +++-
  gcc/dwarf2ctf.cc  |  9 +++-
  gcc/testsuite/gcc.dg/debug/btf/btf-enum-1.c   |  2 +-
  gcc/testsuite/gcc.dg/debug/btf/btf-enum64-1.c | 41 +++
  include/btf.h | 19 +++--
  7 files changed, 99 insertions(+), 19 deletions(-)
  create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-enum64-1.c

diff --git a/gcc/btfout.cc b/gcc/btfout.cc
index 997a33fa089..4b11c867c23 100644
--- a/gcc/btfout.cc
+++ b/gcc/btfout.cc
@@ -223,7 +223,9 @@ btf_calc_num_vbytes (ctf_dtdef_ref dtd)
break;
  
  case BTF_KIND_ENUM:

-  vlen_bytes += vlen * sizeof (struct btf_enum);
+  vlen_bytes += (dtd->dtd_data.ctti_size == 0x8)
+   ? vlen * sizeof (struct btf_enum64)
+   : vlen * sizeof (struct btf_enum);
break;
  
  case BTF_KIND_FUNC_PROTO:

@@ -622,6 +624,15 @@ btf_asm_type (ctf_container_ref ctfc, ctf_dtdef_ref dtd)
btf_size_type = 0;
  }
  
+ if (btf_kind == BTF_KIND_ENUM)

+   {
+ btf_kflag = (dtd->flags & CTF_ENUM_F_ENUMERATORS_SIGNED)
+   ? BTF_KF_ENUM_SIGNED
+   : BTF_KF_ENUM_UNSIGNED;
+ if (dtd->dtd_data.ctti_size == 0x8)
+   btf_kind = BTF_KIND_ENUM64;
+   }
+
dw2_asm_output_data (4, dtd->dtd_data.ctti_name, "btt_name");
dw2_asm_output_data (4, BTF_TYPE_INFO (btf_kind, btf_kflag, btf_vlen),
   "btt_info: kind=%u, kflag=%u, vlen=%u",
@@ -634,6 +645,7 @@ btf_asm_type (ctf_container_ref ctfc, ctf_dtdef_ref dtd)
  case BTF_KIND_UNION:
  case BTF_KIND_ENUM:
  case BTF_KIND_DATASEC:
+case BTF_KIND_ENUM64:
dw2_asm_output_data (4, dtd->dtd_data.ctti_size, "btt_size: %uB",
   dtd->dtd_data.ctti_size);
return;
@@ -707,13 +719,13 @@ btf_asm_sou_member (ctf_container_ref ctfc, ctf_dmdef_t * dmd)
  }
  }
  
-/* Asm'out an enum constant following a BTF_KIND_ENUM.  */

+/* Asm'out an enum constant following a BTF_KIND_ENUM{,64}.  */
  
  static void

-btf_asm_enum_const (ctf_dmdef_t * dmd)
+btf_asm_enum_const (unsigned int size, ctf_dmdef_t * dmd)
  {
dw2_asm_output_data (4, dmd->dmd_name_offset, "bte_name");
-  dw2_asm_output_data (4, dmd->dmd_value, "bte_value");
+  dw2_asm_output_data (size, dmd->dmd_value, "bte_value");
  }
  
  /* Asm'out a function parameter description following a BTF_KIND_FUNC_PROTO.  */

@@ -871,7

[COMMITTED] frange::maybe_isnan() should return FALSE for undefined ranges.

2022-09-20 Thread Aldy Hernandez via Gcc-patches

Undefined ranges have undefined NAN bits.  We can't depend on them,
as they may contain garbage.  This patch returns false from
maybe_isnan() for undefined ranges (the empty set).
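
The intended behaviour can be sketched with a toy model. This is not the real frange class: toy_frange and its constructors are made up for illustration, and only the early return mirrors the change.

```cpp
// Toy model of a floating-point range; not the real frange class.
class toy_frange {
public:
  // Undefined range (the empty set).  The NaN bits are deliberately set to
  // arbitrary values to mimic "may contain garbage".
  toy_frange() : m_undefined(true), m_pos_nan(true), m_neg_nan(true) {}

  // Defined range with explicit NaN bits.
  toy_frange(bool pos_nan, bool neg_nan)
      : m_undefined(false), m_pos_nan(pos_nan), m_neg_nan(neg_nan) {}

  bool undefined_p() const { return m_undefined; }

  bool maybe_isnan() const {
    if (undefined_p())  // never consult the NaN bits of an undefined range
      return false;
    return m_pos_nan || m_neg_nan;
  }

private:
  bool m_undefined;
  bool m_pos_nan;
  bool m_neg_nan;
};
```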

gcc/ChangeLog:

* value-range.h (frange::maybe_isnan): Return false for
undefined ranges.
---
 gcc/value-range.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/gcc/value-range.h b/gcc/value-range.h
index 7d5584a9294..325ed08f290 100644
--- a/gcc/value-range.h
+++ b/gcc/value-range.h
@@ -1210,6 +1210,8 @@ frange::known_isinf () const
 inline bool
 frange::maybe_isnan () const
 {
+  if (undefined_p ())
+return false;
   return m_pos_nan || m_neg_nan;
 }
 
-- 
2.37.1

Re: [PATCH] c++: Implement C++23 P2266R1, Simpler implicit move [PR101165]

2022-09-20 Thread Marek Polacek via Gcc-patches

On Mon, Sep 12, 2022 at 04:27:27PM -0400, Jason Merrill wrote:
> On 9/8/22 18:54, Marek Polacek wrote:
> > On Tue, Sep 06, 2022 at 10:38:12PM -0400, Jason Merrill wrote:
> > > On 9/3/22 12:42, Marek Polacek wrote:
> > > > This patch implements https://wg21.link/p2266, which, once again,
> > > > changes the implicit move rules.  Here's a brief summary of various
> > > > changes in this area:
> > > > 
> > > > r125211: Introduced moving from certain lvalues when returning them
> > > > r171071: CWG 1148, enable move from value parameter on return
> > > > r212099: CWG 1579, it's OK to call a converting ctor taking an rvalue
> > > > r251035: CWG 1579, do maybe-rvalue overload resolution twice
> > > > r11-2411: Avoid calling const copy ctor on implicit move
> > > > r11-2412: C++20 implicit move changes, remove the fallback overload
> > > > resolution, allow move on throw of parameters and implicit
> > > >   move of rvalue references
> > > > 
> > > > P2266 enables the implicit move for functions that return references.
> > > > This was a one-line change: check TYPE_REF_P.  That is, we will now
> > > > perform a move in
> > > > 
> > > > X&& foo (X&& x) {
> > > >   return x;
> > > > }
> > > > 
> > > > P2266 also removes the fallback overload resolution, but this was
> > > > resolved by r11-2412: we only do convert_for_initialization with
> > > > LOOKUP_PREFER_RVALUE in C++17 and older.
> > > 
> > > I wonder if we want to extend the current C++20 handling to the older
> > > modes for GCC 13?  Not in this patch, but as a followup.
> > > 
> > > > P2266 also says that a returned move-eligible id-expression is always an
> > > > xvalue.  This required some further short, but nontrivial changes,
> > > > especially when it comes to deduction, because we have to pay attention
> > > > to whether we have auto, auto&& (which is like T&&), or decltype(auto)
> > > > with (un)parenthesized argument.  In C++23,
> > > > 
> > > > decltype(auto) f(int&& x) { return (x); }
> > > > auto&& f(int x) { return x; }
> > > > 
> > > > both should deduce to 'int&&' but
> > > > 
> > > > decltype(auto) f(int x) { return x; }
> > > > 
> > > > should deduce to 'int'.  A cornucopia of tests attached.  I've also
> > > > verified that we behave like clang++.
> > > > 
> > > > xvalue_p seemed to be broken: since the introduction of
> > > > clk_implicit_rval, it cannot use '==' when checking for clk_rvalueref.
> > > > 
> > > > Since this change breaks code, it's only enabled in C++23.  In
> > > > particular, this code will not compile in C++23:
> > > > 
> > > > int& g(int&& x) { return x; }
> > > 
> > > Nice that the C++20 compatibility is so simple!
> > > 
> > > > because x is now treated as an rvalue, and you can't bind a non-const
> > > > lvalue reference to an rvalue.
> > > > 
> > > > There's one FIXME in elision1.C:five, which we should compile but reject
> > > > with "passing 'Mutt' as 'this' argument discards qualifiers".  That
> > > > looks bogus to me, I think I'll open a PR for it.
> > > 
> > > Let's fix that now, I think.
> > 
> > Can of worms.   The test is
> > 
> >struct Mutt {
> >operator int*() &&;
> >};
> > 
> >int* five(Mutt x) {
> >return x;  // OK since C++20 because P1155
> >}
> > 
> > 'x' should be treated as an rvalue, therefore the operator fn taking
> > an rvalue ref to Mutt should be used to convert 'x' to int*.  We fail
> > because we don't treat 'x' as an rvalue because the function doesn't
> > return a class.  So the patch should be just
> > 
> > --- a/gcc/cp/typeck.cc
> > +++ b/gcc/cp/typeck.cc
> > @@ -10875,10 +10875,7 @@ check_return_expr (tree retval, bool *no_warning)
> >Note that these conditions are similar to, but not as strict as,
> >   the conditions for the named return value optimization.  */
> > bool converted = false;
> > -  tree moved;
> > -  /* This is only interesting for class type.  */
> > -  if (CLASS_TYPE_P (functype)
> > - && (moved = treat_lvalue_as_rvalue_p (retval, /*return*/true)))
> > +  if (tree moved = treat_lvalue_as_rvalue_p (retval, /*return*/true))
> >  {
> >if (cxx_dialect < cxx20)
> >  {
> > 
> > which fixes the test, but breaks a lot of middle-end warnings.  For instance
> > g++.dg/warn/nonnull3.C, where the patch above changes .gimple:
> > 
> > >   bool A::foo (struct A * const this, <<< Unknown tree: offset_type >>> p)
> >   {
> > -  bool D.2146;
> > +  bool D.2150;
> > -  D.2146 = p != -1;
> > -  return D.2146;
> > +  p.0_1 = p;
> > +  D.2150 = p.0_1 != -1;
> > +  return D.2150;
> >   }
> > 
> > and we no longer get the warning.  I thought maybe I could undo the implicit
> > > rvalue conversion in cp_fold, when it sees implicit_rvalue_p, but that didn't
> > > work.  So currently I'm stuck.  Should we try to figure this out or push
> > > aside?
> 
> Can you undo the implicit rvalue conversion within

[PATCH v2] c++: Implement C++23 P2266R1, Simpler implicit move [PR101165]

2022-09-20 Thread Marek Polacek via Gcc-patches

On Tue, Sep 06, 2022 at 10:38:12PM -0400, Jason Merrill wrote:
> On 9/3/22 12:42, Marek Polacek wrote:
> > This patch implements https://wg21.link/p2266, which, once again,
> > changes the implicit move rules.  Here's a brief summary of various
> > changes in this area:
> > 
> > r125211: Introduced moving from certain lvalues when returning them
> > r171071: CWG 1148, enable move from value parameter on return
> > r212099: CWG 1579, it's OK to call a converting ctor taking an rvalue
> > r251035: CWG 1579, do maybe-rvalue overload resolution twice
> > r11-2411: Avoid calling const copy ctor on implicit move
> > r11-2412: C++20 implicit move changes, remove the fallback overload
> >           resolution, allow move on throw of parameters and implicit
> >           move of rvalue references
> > 
> > P2266 enables the implicit move for functions that return references.  This
> > was a one-line change: check TYPE_REF_P.  That is, we will now perform
> > a move in
> > 
> >   X&& foo (X&& x) {
> >     return x;
> >   }
> > 
> > P2266 also removes the fallback overload resolution, but this was
> > resolved by r11-2412: we only do convert_for_initialization with
> > LOOKUP_PREFER_RVALUE in C++17 and older.
> 
> I wonder if we want to extend the current C++20 handling to the older modes
> for GCC 13?  Not in this patch, but as a followup.

Yes, I think that would be very nice if we removed that code.
 
> > P2266 also says that a returned move-eligible id-expression is always an
> > xvalue.  This required some further short, but nontrivial changes,
> > especially when it comes to deduction, because we have to pay attention
> > to whether we have auto, auto&& (which is like T&&), or decltype(auto)
> > with (un)parenthesized argument.  In C++23,
> > 
> >   decltype(auto) f(int&& x) { return (x); }
> >   auto&& f(int x) { return x; }
> > 
> > both should deduce to 'int&&' but
> > 
> >   decltype(auto) f(int x) { return x; }
> > 
> > should deduce to 'int'.  A cornucopia of tests attached.  I've also
> > verified that we behave like clang++.
> > 
> > xvalue_p seemed to be broken: since the introduction of clk_implicit_rval,
> > it cannot use '==' when checking for clk_rvalueref.
> > 
> > Since this change breaks code, it's only enabled in C++23.  In
> > particular, this code will not compile in C++23:
> > 
> >   int& g(int&& x) { return x; }
> 
> Nice that the C++20 compatibility is so simple!
> 
> > because x is now treated as an rvalue, and you can't bind a non-const lvalue
> > reference to an rvalue.
> > 
> > There's one FIXME in elision1.C:five, which we should compile but reject
> > with "passing 'Mutt' as 'this' argument discards qualifiers".  That
> > looks bogus to me, I think I'll open a PR for it.
> 
> Let's fix that now, I think.

OK, copypasting this bit from the other email so that we can have one
thread:

> Can of worms.   The test is
> 
>   struct Mutt {
>     operator int*() &&;
>   };
> 
>   int* five(Mutt x) {
>     return x;  // OK since C++20 because P1155
>   }
> 
> 'x' should be treated as an rvalue, therefore the operator fn taking
> an rvalue ref to Mutt should be used to convert 'x' to int*.  We fail
> because we don't treat 'x' as an rvalue because the function doesn't
> return a class.  So the patch should be just
> 
> --- a/gcc/cp/typeck.cc
> +++ b/gcc/cp/typeck.cc
> @@ -10875,10 +10875,7 @@ check_return_expr (tree retval, bool *no_warning)
>Note that these conditions are similar to, but not as strict as,
>   the conditions for the named return value optimization.  */
> bool converted = false;
> -  tree moved;
> -  /* This is only interesting for class type.  */
> -  if (CLASS_TYPE_P (functype)
> - && (moved = treat_lvalue_as_rvalue_p (retval, /*return*/true)))
> +  if (tree moved = treat_lvalue_as_rvalue_p (retval, /*return*/true))
>  {
>if (cxx_dialect < cxx20)
>  {
> 
> which fixes the test, but breaks a lot of middle-end warnings.  For instance
> g++.dg/warn/nonnull3.C, where the patch above changes .gimple:
> 
>   bool A::foo (struct A * const this, <<< Unknown tree: offset_type >>> p)
>   {
> -  bool D.2146;
> +  bool D.2150;
>   
> -  D.2146 = p != -1;
> -  return D.2146;
> +  p.0_1 = p;
> +  D.2150 = p.0_1 != -1;
> +  return D.2150;
>   }
> 
> and we no longer get the warning.  I thought maybe I could undo the implicit
> rvalue conversion in cp_fold, when it sees implicit_rvalue_p, but that didn't
> work.  So currently I'm stuck.  Should we try to figure this out or push aside?

> Can you undo the implicit rvalue conversion within check_return_expr, 
> where we can still refer back to the original expression?

Unfortunately no, one problem is that treat_lvalue_as_rvalue_p modifies
the underlying decl by setting TREE_ADDRESSABLE, which then presumably
breaks warnings.  That is, treat_ can get 'VCE(x)' and produce
'*NLE<(X&) >' where 'x' flags have been modified, since we're taking
x's address.

> Or avoid
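
As a small illustration of the oldest rule in the history listed above (moving from a by-value parameter on return, CWG 1148), here is a minimal sketch. The Tracer type, the counters, and the function names are hypothetical and are not taken from the patch or its testsuite. The reference-returning and rvalue-reference-parameter cases discussed in the thread depend on the -std= mode and the compiler version, so they are deliberately left out.

```cpp
#include <cassert>

// Hypothetical counters and tracer type, used only for this illustration.
static int copies = 0;
static int moves = 0;

struct Tracer {
  Tracer() = default;
  Tracer(const Tracer&) { ++copies; }      // counts copy constructions
  Tracer(Tracer&&) noexcept { ++moves; }   // counts move constructions
};

// Returning a by-value parameter: the parameter is move-eligible, so the
// return value is initialized with the move constructor, not the copy
// constructor.
Tracer return_param(Tracer t) { return t; }

// Resets the counters, performs one call, and reports whether the return
// used at least one move and no copy.
bool return_param_moves() {
  copies = 0;
  moves = 0;
  Tracer result = return_param(Tracer());
  (void)result;
  return copies == 0 && moves >= 1;
}
```

Calling return_param_moves() should yield true in any mode from C++11 on; the exact number of moves can differ with copy elision, which is why the check only requires at least one.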

Re: [PATCH] [PR68097] frange::set_nonnegative should not contain -NAN.

2022-09-20 Thread Aldy Hernandez via Gcc-patches

On Tue, Sep 20, 2022 at 5:10 PM Jakub Jelinek  wrote:
>
> On Tue, Sep 20, 2022 at 04:58:38PM +0200, Aldy Hernandez wrote:
> > > > > > deal with NaNs just fine and is required to correctly capture the sign of
> > > > > > 'x'.  If frange::set_nonnegative is supposed to be used in such contexts
> > > > > > (and I think it's a good idea if that were the case), then set_nonnegative
> > > > > > does _not_ imply no-NaN.
> > > > > >
> > > > > > In particular I would assume that, given a VARYING frange FR, that
> > > > > > FR.set_nonnegative() would result in an frange {[+0.0,+inf],+nan} .
> > > >
> > > > That was my understanding as well, and what my original patch did.
> > > > But again, I'm just the messenger.
> > >
> > > Ah, I obviously haven't followed the thread carefully then.  If that's
> > > what it was doing then IMO it was the right thing.
> >
> > This brings me back to my original patch :).
> >
> > Richard, do you agree nonnegative should be [0.0, +INF] U +NAN.
>
> I agree with that.  And similarly if there is negative that does the
> opposite [-INF, -0.0] U -NAN.
> Though, in most other places when we see that something may be a NaN, I
> think we need to set both +NAN and -NAN, because at least the 2008 version
> of IEEE 754 says:

Yeah, every other place does update_nan() with no arguments which sets
+-NAN.  The only use of update_nan(bool signbit) is this patch.

>
> "When either an input or result is NaN, this standard does not interpret the
> sign of a NaN. Note, however, that operations on bit strings — copy, negate,
> abs, copySign — specify the sign bit of a NaN result, sometimes based upon
> the sign bit of a NaN operand. The logical predicate totalOrder is also
> affected by the sign bit of a NaN operand. For all other operations, this
> standard does not specify the sign bit of a NaN result, even when there is
> only one input NaN, or when the NaN is produced from an invalid operation."

Ughh, that means that my upcoming PLUS_EXPR implementation will have
to keep better track of NAN signs.

Pushed original patch.

Thanks.
Aldy

>
> So not sure if we should count on what NaN sign bit we get normally and what
> we get for canonical NaN.  If we could rely on it, then the rule is
> that if at least one input to binary operation is NaN, then that NaN is
> copied to result, but if both are NaNs, which one is picked isn't specified,
> so we might need just union the +NAN and -NAN bits from the operands.
> But there are still sNaNs and those ought to be turned into some qNaN and
> dunno if that can change the NaN bit (say turn the sNaN into canonical
> qNaN).
> If neither operand is NaN, but result is NaN because of invalid operation
> (0/0, inf-inf, inf+-inf, sqrt (-1) and the like),
> the result is qNaN, but dunno if we can rely that it will be one with
> positive sign.
>
> Jakub
>

Re: [PATCH] Add attribute hot judgement for INLINE_HINT_known_hot hint.

2022-09-20 Thread Jan Hubicka via Gcc-patches

> Hi Honza,
> 
> This patch is to add attribute hot judgement for INLINE_HINT_known_hot hint.
> 
> We set up INLINE_HINT_known_hot hint only when we have profile feedback,
> now add function attribute judgement for it, when both caller and callee
> have __attribute__((hot)), we will also set up INLINE_HINT_known_hot hint
> for it.
> 
> With this patch applied
>                                    Ratio   Codesize
> ADL Multi-copy:    538.imagic_r    16.7%   1.6%
> SPR Multi-copy:    538.imagic_r    15%     1.7%
> ICX Multi-copy:    538.imagic_r    15.2%   1.7%
> CLX Multi-copy:    538.imagic_r    12.7%   1.7%
> Znver3 Multi-copy: 538.imagic_r    10.6%   1.5%
> 
> Bootstrap and regtest pending on x86_64-unknown-linux-gnu.
> OK for trunk?
> 
> Thanks,
> Lili.
> 
> gcc/ChangeLog
> 
>   * ipa-inline-analysis.cc (do_estimate_edge_time): Add function attribute
>   judgement for INLINE_HINT_known_hot hint.

Thank you.  Can you please also add a testcase that tests for this?
So you modify imagemagick, marking attribute hot on the specific inline?
I will also try to look again at your earlier patch - I had a very busy
summer and unfortunately lost track of this one.

Honza
> ---
>  gcc/ipa-inline-analysis.cc | 13 +
>  1 file changed, 9 insertions(+), 4 deletions(-)
> 
> diff --git a/gcc/ipa-inline-analysis.cc b/gcc/ipa-inline-analysis.cc
> index 1ca685d1b0e..7bd29c36590 100644
> --- a/gcc/ipa-inline-analysis.cc
> +++ b/gcc/ipa-inline-analysis.cc
> @@ -48,6 +48,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "ipa-utils.h"
>  #include "cfgexpand.h"
>  #include "gimplify.h"
> +#include "attribs.h"
>  
>  /* Cached node/edge growths.  */
>  fast_call_summary *edge_growth_cache = NULL;
> @@ -249,15 +250,19 @@ do_estimate_edge_time (struct cgraph_edge *edge, sreal *ret_nonspec_time)
>hints = estimates.hints;
>  }
>  
> -  /* When we have profile feedback, we can quite safely identify hot
> - edges and for those we disable size limits.  Don't do that when
> - probability that caller will call the callee is low however, since it
> +  /* When we have profile feedback or function attribute, we can quite safely
> + identify hot edges and for those we disable size limits.  Don't do that
> + when probability that caller will call the callee is low however, since it
>   may hurt optimization of the caller's hot path.  */
> -  if (edge->count.ipa ().initialized_p () && edge->maybe_hot_p ()
> +  if ((edge->count.ipa ().initialized_p () && edge->maybe_hot_p ()
>&& (edge->count.ipa () * 2
> > (edge->caller->inlined_to
>? edge->caller->inlined_to->count.ipa ()
>: edge->caller->count.ipa (
> +  || (lookup_attribute ("hot", DECL_ATTRIBUTES (edge->caller->decl))
> +   != NULL
> +  && lookup_attribute ("hot", DECL_ATTRIBUTES (edge->callee->decl))
> +   != NULL))
>  hints |= INLINE_HINT_known_hot;
>  
>gcc_checking_assert (size >= 0);
> -- 
> 2.17.1
>
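
For context, the kind of source the patch targets can be sketched as follows: a caller and a callee that both carry the GNU "hot" attribute, which is the condition the new lookup_attribute checks test for. The function names and bodies are hypothetical, and this is not the requested testsuite file; whether the call is actually inlined still depends on the GCC version, the optimization level, and the inliner's other heuristics.

```cpp
#include <cassert>

// Hypothetical callee, marked hot with the GNU attribute (GCC/Clang extension).
__attribute__((hot)) static int scale(int v) { return v * 3 + 1; }

// Hypothetical caller, also marked hot.  With the patch applied, the edge
// accumulate -> scale would get INLINE_HINT_known_hot even without profile
// feedback, because both declarations carry the attribute.
__attribute__((hot)) int accumulate(const int *data, int n) {
  int sum = 0;
  for (int i = 0; i < n; ++i)
    sum += scale(data[i]);
  return sum;
}
```

The attribute only influences optimization decisions; the computed result is the same with or without it.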

Re: Make 'autoreconf' work for 'gcc', 'libobjc' (was: [PATCH] regenerate configure files and config.h.in files)

2022-09-20 Thread Martin Liška

On 9/20/22 15:37, Thomas Schwinge wrote:
> Then, make it simply call 'autoreconf' for all 'config_folders'.  (Also, I'm
> not running into the issue you've stated in the script that "apparently
> automake is somehow unstable -> skip it for gotools".)

I do support that as well.

What will be the only command invocation that will be needed once you're done?

Thanks and cheers,
Martin

Re: [PATCH] [PR68097] frange::set_nonnegative should not contain -NAN.

2022-09-20 Thread Jakub Jelinek via Gcc-patches

On Tue, Sep 20, 2022 at 04:58:38PM +0200, Aldy Hernandez wrote:
> > > > > deal with NaNs just fine and is required to correctly capture the sign of
> > > > > 'x'.  If frange::set_nonnegative is supposed to be used in such contexts
> > > > > (and I think it's a good idea if that were the case), then set_nonnegative
> > > > > does _not_ imply no-NaN.
> > > > >
> > > > > In particular I would assume that, given a VARYING frange FR, that
> > > > > FR.set_nonnegative() would result in an frange {[+0.0,+inf],+nan} .
> > >
> > > That was my understanding as well, and what my original patch did.
> > > But again, I'm just the messenger.
> >
> > Ah, I obviously haven't followed the thread carefully then.  If that's
> > what it was doing then IMO it was the right thing.
> 
> This brings me back to my original patch :).
> 
> Richard, do you agree nonnegative should be [0.0, +INF] U +NAN.

I agree with that.  And similarly if there is negative that does the
opposite [-INF, -0.0] U -NAN.
Though, in most other places when we see that something may be a NaN, I
think we need to set both +NAN and -NAN, because at least the 2008 version
of IEEE 754 says:

"When either an input or result is NaN, this standard does not interpret the
sign of a NaN. Note, however, that operations on bit strings — copy, negate,
abs, copySign — specify the sign bit of a NaN result, sometimes based upon
the sign bit of a NaN operand. The logical predicate totalOrder is also
affected by the sign bit of a NaN operand. For all other operations, this
standard does not specify the sign bit of a NaN result, even when there is
only one input NaN, or when the NaN is produced from an invalid operation."

So not sure if we should count on what NaN sign bit we get normally and what
we get for canonical NaN.  If we could rely on it, then the rule is
that if at least one input to binary operation is NaN, then that NaN is
copied to result, but if both are NaNs, which one is picked isn't specified,
so we might need just union the +NAN and -NAN bits from the operands.
But there are still sNaNs and those ought to be turned into some qNaN and
dunno if that can change the NaN bit (say turn the sNaN into canonical
qNaN).
If neither operand is NaN, but result is NaN because of invalid operation
(0/0, inf-inf, inf+-inf, sqrt (-1) and the like),
the result is qNaN, but dunno if we can rely that it will be one with
positive sign.

Jakub
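
The bookkeeping discussed above can be sketched with a toy model. This is NOT GCC's frange class: ToyRange, NanSigns, and all function names below are hypothetical, and the binary-operation rule is only one conservative reading of the message (union the operands' NaN sign bits, and allow both signs when the operation itself may produce a NaN, since the standard does not specify that sign).

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical toy model, not GCC's frange.
struct NanSigns {
  bool pos;  // value may be a NaN with the sign bit clear
  bool neg;  // value may be a NaN with the sign bit set
};

struct ToyRange {
  double lo;
  double hi;
  NanSigns nan;
};

// Everything is possible: [-inf, +inf] plus NaNs of either sign.
ToyRange make_varying() {
  const double inf = std::numeric_limits<double>::infinity();
  ToyRange r = {-inf, inf, {true, true}};
  return r;
}

// Nonnegative as agreed in the thread: [+0.0, +inf] U +NAN.
// The sign is constrained, the NaN-ness is not.
ToyRange set_nonnegative(ToyRange) {
  const double inf = std::numeric_limits<double>::infinity();
  ToyRange r = {0.0, inf, {true, false}};
  return r;
}

// Conservative NaN-sign tracking for a binary operation (an assumption,
// see the lead-in): an invalid operation may yield a NaN of either sign,
// otherwise only the operands' NaN signs can show up in the result.
NanSigns nan_signs_after_binary_op(const ToyRange &a, const ToyRange &b,
                                   bool may_be_invalid) {
  if (may_be_invalid) {
    NanSigns both = {true, true};
    return both;
  }
  NanSigns merged = {a.nan.pos || b.nan.pos, a.nan.neg || b.nan.neg};
  return merged;
}
```

Keeping the two NaN sign flags separate from the numeric bounds mirrors the point made earlier in the thread that sign and NaN-ness are orthogonal.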

Re: [PATCH] [PR68097] frange::set_nonnegative should not contain -NAN.

2022-09-20 Thread Aldy Hernandez via Gcc-patches

On Tue, Sep 20, 2022 at 2:51 PM Michael Matz  wrote:
>
> Hello,
>
> On Tue, 20 Sep 2022, Aldy Hernandez wrote:
>
> > > > FWIW, in IEEE, 'abs' (like 'copy', 'copysign' and 'negate') are not
> > > > arithmetic, they are quiet-computational.  Hence they don't raise
> > > anything, not even for sNaNs; they copy the input bits and appropriately
> > > modify the bit pattern according to the specification (i.e. fiddle the
> > > sign bit).
> > >
> > > That also means that a predicate like negative_p(x) that would be
> > > implemented ala
> > >
> > >   copysign(1.0, x) < 0.0
> >
> > I suppose this means -0.0 is not considered negative,
>
> It would be considered negative if the predicate is implemented like
> above:
>copysign(1.0, -0.0) == -1.0
>
> But really, that depends on what _our_ definition of negative_p is
> supposed to be.  I think the most reasonable definition is indeed similar
> to above, which in turn is equivalent to simply looking at the sign bit
> (which is what copysign() does), i.e. ...
>
> > though it has
> > the signbit set?  FWIW, on real_value's real_isneg() returns true for
> > -0.0 because it only looks at the sign.
>
> ... this seems the sensible thing.  I just wanted to argue the case that
> set_negative (or the like) which "sets" the sign bit does not make the
> nan-ness go away.  They are orthogonal.
>
> > > deal with NaNs just fine and is required to correctly capture the sign of
> > > 'x'.  If frange::set_nonnegative is supposed to be used in such contexts
> > > (and I think it's a good idea if that were the case), then set_nonnegative
> > > does _not_ imply no-NaN.
> > >
> > > > In particular I would assume that, given a VARYING frange FR, that
> > > FR.set_nonnegative() would result in an frange {[+0.0,+inf],+nan} .
> >
> > That was my understanding as well, and what my original patch did.
> > But again, I'm just the messenger.
>
> Ah, I obviously haven't followed the thread carefully then.  If that's
> what it was doing then IMO it was the right thing.

This brings me back to my original patch :).

Richard, do you agree nonnegative should be [0.0, +INF] U +NAN.

Thanks.
Aldy
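
The copysign-based predicate quoted above can be tried out directly. The following is a minimal sketch assuming an IEEE 754 double and default floating-point options (no -ffast-math); negative_p and negative_nan are illustrative names, not GCC functions.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Predicate from the quoted message: true exactly when the sign bit of x
// is set, because copysign only transfers the sign bit.
bool negative_p(double x) { return std::copysign(1.0, x) < 0.0; }

// Builds a NaN whose sign bit is set.  copysign is a quiet-computational
// operation, so the result is still a NaN.
double negative_nan() {
  return std::copysign(std::numeric_limits<double>::quiet_NaN(), -1.0);
}
```

With these definitions, negative_p(-0.0) is true and negative_p(0.0) is false, and negative_p(negative_nan()) is true while std::isnan(negative_nan()) still holds, which matches the argument that setting the sign does not make the NaN-ness go away.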

[pushed] aarch64: Fix GTY markup for arm_sve.h [PR106491]

2022-09-20 Thread Richard Sandiford via Gcc-patches

It turns out that GTY(()) markers in definitions like:

  GTY(()) tree scalar_types[NUM_VECTOR_TYPES];

are not effective and are silently ignored.  The GTY(()) has
to come after an extern or static.

The externs associated with the SVE ACLE GTY variables are in
aarch64-sve-builtins.h.  This file is not in tm_include_list because
we don't want every target-facing file to include it.  It therefore
isn't in the list of GC header files either.

In this case that's a blessing in disguise, since the variables
belong to a namespace and gengtype doesn't understand namespaces.
I think the fix is instead to add an extra extern before each
variable declaration, similarly to varasm.cc and vtable-verify.cc.
(This works due to a "using namespace" at the end of the file.)

Tested on aarch64-linux-gnu & pushed.  I'll backport to branches
over the next few days.

Richard


gcc/
PR target/106491
* config/aarch64/aarch64-sve-builtins.cc (scalar_types)
(acle_vector_types, acle_svpattern, acle_svprfop): Add GTY
markup to (new) extern declarations instead of to the main
definition.
---
 gcc/config/aarch64/aarch64-sve-builtins.cc | 12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc b/gcc/config/aarch64/aarch64-sve-builtins.cc
index 9d78b270e47..12d9beee4da 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins.cc
+++ b/gcc/config/aarch64/aarch64-sve-builtins.cc
@@ -530,7 +530,8 @@ static CONSTEXPR const function_group_info function_groups[] = {
 };
 
 /* The scalar type associated with each vector type.  */
-GTY(()) tree scalar_types[NUM_VECTOR_TYPES];
+extern GTY(()) tree scalar_types[NUM_VECTOR_TYPES];
+tree scalar_types[NUM_VECTOR_TYPES];
 
 /* The single-predicate and single-vector types, with their built-in
"__SV..._t" name.  Allow an index of NUM_VECTOR_TYPES, which always
@@ -538,13 +539,16 @@ GTY(()) tree scalar_types[NUM_VECTOR_TYPES];
 static GTY(()) tree abi_vector_types[NUM_VECTOR_TYPES + 1];
 
 /* Same, but with the arm_sve.h "sv..._t" name.  */
-GTY(()) tree acle_vector_types[MAX_TUPLE_SIZE][NUM_VECTOR_TYPES + 1];
+extern GTY(()) tree acle_vector_types[MAX_TUPLE_SIZE][NUM_VECTOR_TYPES + 1];
+tree acle_vector_types[MAX_TUPLE_SIZE][NUM_VECTOR_TYPES + 1];
 
 /* The svpattern enum type.  */
-GTY(()) tree acle_svpattern;
+extern GTY(()) tree acle_svpattern;
+tree acle_svpattern;
 
 /* The svprfop enum type.  */
-GTY(()) tree acle_svprfop;
+extern GTY(()) tree acle_svprfop;
+tree acle_svprfop;
 
 /* The list of all registered function decls, indexed by code.  */
 static GTY(()) vec *registered_functions;
-- 
2.25.1

[COMMITTED] [PR106970] New test for PR that has already been fixed.

2022-09-20 Thread Aldy Hernandez via Gcc-patches

PR tree-optimization/106970

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/pr106970.c: New test.
---
 gcc/testsuite/gcc.dg/tree-ssa/pr106970.c | 9 +
 1 file changed, 9 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr106970.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr106970.c b/gcc/testsuite/gcc.dg/tree-ssa/pr106970.c
new file mode 100644
index 000..cda9bd46910
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr106970.c
@@ -0,0 +1,9 @@
+// { dg-do compile }
+// { dg-options "-O1 -fno-signed-zeros" }
+
+void
+foo (double x, double y)
+{
+  if (!x == !y * -1.0)
+__builtin_trap ();
+}
-- 
2.37.1

Re: [PATCH] c++: stream PACK_EXPANSION_EXTRA_ARGS [PR106761]

2022-09-20 Thread Nathan Sidwell via Gcc-patches


On 9/20/22 10:08, Patrick Palka wrote:

On Tue, 20 Sep 2022, Nathan Sidwell wrote:


On 9/19/22 09:52, Patrick Palka wrote:

It looks like some xtreme-header-* tests are failing after the libstdc++
change r13-2158-g02f6b405f0e9dc ultimately because we're neglecting
to stream PACK_EXPANSION_EXTRA_ARGS, which leads to false equivalences
of different partial instantiations of _TupleConstraints::__constructible.

Tested on x86_64-pc-linux-gnu, does this look OK for trunk?

PR c++/106761

gcc/cp/ChangeLog:

* module.cc (trees_out::type_node) :
Stream PACK_EXPANSION_EXTRA_ARGS.
(trees_in::tree_node) : Likewise.



Looks good, I wonder why I missed that.  (I guess extracting a testcase out of
the headers was too tricky?)


Many thanks.  I managed to produce a small testcase which mirrors the
format of the xtreme-header-2* testcase.  Does the following look OK?


yup, thanks for extracting that!

nathan


-- >8 --

PR c++/106761

gcc/cp/ChangeLog:

* module.cc (trees_out::type_node) :
Stream PACK_EXPANSION_EXTRA_ARGS.
(trees_in::tree_node) : Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/modules/pr106761.h: New test.
* g++.dg/modules/pr106761_a.H: New test.
* g++.dg/modules/pr106761_b.C: New test.
---
  gcc/cp/module.cc  |  3 +++
  gcc/testsuite/g++.dg/modules/pr106761.h   | 22 ++
  gcc/testsuite/g++.dg/modules/pr106761_a.H |  5 +
  gcc/testsuite/g++.dg/modules/pr106761_b.C |  7 +++
  4 files changed, 37 insertions(+)
  create mode 100644 gcc/testsuite/g++.dg/modules/pr106761.h
  create mode 100644 gcc/testsuite/g++.dg/modules/pr106761_a.H
  create mode 100644 gcc/testsuite/g++.dg/modules/pr106761_b.C

diff --git a/gcc/cp/module.cc b/gcc/cp/module.cc
index 1a1ff5be574..9a9ef4e3332 100644
--- a/gcc/cp/module.cc
+++ b/gcc/cp/module.cc
@@ -8922,6 +8922,7 @@ trees_out::type_node (tree type)
if (streaming_p ())
u (PACK_EXPANSION_LOCAL_P (type));
tree_node (PACK_EXPANSION_PARAMETER_PACKS (type));
+  tree_node (PACK_EXPANSION_EXTRA_ARGS (type));
break;
  
  case TYPENAME_TYPE:

@@ -9455,12 +9456,14 @@ trees_in::tree_node (bool is_use)
{
  bool local = u ();
  tree param_packs = tree_node ();
+ tree extra_args = tree_node ();
  if (!get_overrun ())
{
  tree expn = cxx_make_type (TYPE_PACK_EXPANSION);
  SET_TYPE_STRUCTURAL_EQUALITY (expn);
  PACK_EXPANSION_PATTERN (expn) = res;
  PACK_EXPANSION_PARAMETER_PACKS (expn) = param_packs;
+ PACK_EXPANSION_EXTRA_ARGS (expn) = extra_args;
  PACK_EXPANSION_LOCAL_P (expn) = local;
  res = expn;
}
diff --git a/gcc/testsuite/g++.dg/modules/pr106761.h b/gcc/testsuite/g++.dg/modules/pr106761.h
new file mode 100644
index 000..9f22a22a45d
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/pr106761.h
@@ -0,0 +1,22 @@
+// PR c++/106761
+
+template
+struct __and_;
+
+template
+struct is_convertible;
+
+template
+struct _TupleConstraints {
+  template
+  using __constructible = __and_...>;
+};
+
+template
+struct tuple {
+  template
+  using __constructible
+= typename _TupleConstraints::template __constructible;
+};
+
+tuple t;
diff --git a/gcc/testsuite/g++.dg/modules/pr106761_a.H b/gcc/testsuite/g++.dg/modules/pr106761_a.H
new file mode 100644
index 000..8ad116412af
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/pr106761_a.H
@@ -0,0 +1,5 @@
+// PR c++/106761
+// { dg-additional-options -fmodule-header }
+
+// { dg-module-cmi {} }
+#include "pr106761.h"
diff --git a/gcc/testsuite/g++.dg/modules/pr106761_b.C b/gcc/testsuite/g++.dg/modules/pr106761_b.C
new file mode 100644
index 000..336cb12757e
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/pr106761_b.C
@@ -0,0 +1,7 @@
+// PR c++/106761
+// { dg-additional-options -fmodules-ts }
+
+#include "pr106761.h"
+import "pr106761_a.H";
+
+tuple u = t;


--
Nathan Sidwell

Re: [PATCH] c++: stream PACK_EXPANSION_EXTRA_ARGS [PR106761]

2022-09-20 Thread Patrick Palka via Gcc-patches

On Tue, 20 Sep 2022, Nathan Sidwell wrote:

> On 9/19/22 09:52, Patrick Palka wrote:
> > It looks like some xtreme-header-* tests are failing after the libstdc++
> > change r13-2158-g02f6b405f0e9dc ultimately because we're neglecting
> > to stream PACK_EXPANSION_EXTRA_ARGS, which leads to false equivalences
> > of different partial instantiations of _TupleConstraints::__constructible.
> > 
> > Tested on x86_64-pc-linux-gnu, does this look OK for trunk?
> > 
> > PR c++/106761
> > 
> > gcc/cp/ChangeLog:
> > 
> > * module.cc (trees_out::type_node) :
> > Stream PACK_EXPANSION_EXTRA_ARGS.
> > (trees_in::tree_node) : Likewise.
> 
> 
> Looks good, I wonder why I missed that.  (I guess extracting a testcase out of
> the headers was too tricky?)

Many thanks.  I managed to produce a small testcase which mirrors the
format of the xtreme-header-2* testcase.  Does the following look OK?

-- >8 --

PR c++/106761

gcc/cp/ChangeLog:

* module.cc (trees_out::type_node) :
Stream PACK_EXPANSION_EXTRA_ARGS.
(trees_in::tree_node) : Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/modules/pr106761.h: New test.
* g++.dg/modules/pr106761_a.H: New test.
* g++.dg/modules/pr106761_b.C: New test.
---
 gcc/cp/module.cc  |  3 +++
 gcc/testsuite/g++.dg/modules/pr106761.h   | 22 ++
 gcc/testsuite/g++.dg/modules/pr106761_a.H |  5 +
 gcc/testsuite/g++.dg/modules/pr106761_b.C |  7 +++
 4 files changed, 37 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/modules/pr106761.h
 create mode 100644 gcc/testsuite/g++.dg/modules/pr106761_a.H
 create mode 100644 gcc/testsuite/g++.dg/modules/pr106761_b.C

diff --git a/gcc/cp/module.cc b/gcc/cp/module.cc
index 1a1ff5be574..9a9ef4e3332 100644
--- a/gcc/cp/module.cc
+++ b/gcc/cp/module.cc
@@ -8922,6 +8922,7 @@ trees_out::type_node (tree type)
   if (streaming_p ())
u (PACK_EXPANSION_LOCAL_P (type));
   tree_node (PACK_EXPANSION_PARAMETER_PACKS (type));
+  tree_node (PACK_EXPANSION_EXTRA_ARGS (type));
   break;
 
 case TYPENAME_TYPE:
@@ -9455,12 +9456,14 @@ trees_in::tree_node (bool is_use)
{
  bool local = u ();
  tree param_packs = tree_node ();
+ tree extra_args = tree_node ();
  if (!get_overrun ())
{
  tree expn = cxx_make_type (TYPE_PACK_EXPANSION);
  SET_TYPE_STRUCTURAL_EQUALITY (expn);
  PACK_EXPANSION_PATTERN (expn) = res;
  PACK_EXPANSION_PARAMETER_PACKS (expn) = param_packs;
+ PACK_EXPANSION_EXTRA_ARGS (expn) = extra_args;
  PACK_EXPANSION_LOCAL_P (expn) = local;
  res = expn;
}
diff --git a/gcc/testsuite/g++.dg/modules/pr106761.h b/gcc/testsuite/g++.dg/modules/pr106761.h
new file mode 100644
index 000..9f22a22a45d
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/pr106761.h
@@ -0,0 +1,22 @@
+// PR c++/106761
+
+template
+struct __and_;
+
+template
+struct is_convertible;
+
+template
+struct _TupleConstraints {
+  template
+  using __constructible = __and_...>;
+};
+
+template
+struct tuple {
+  template
+  using __constructible
+= typename _TupleConstraints::template __constructible;
+};
+
+tuple t;
diff --git a/gcc/testsuite/g++.dg/modules/pr106761_a.H b/gcc/testsuite/g++.dg/modules/pr106761_a.H
new file mode 100644
index 000..8ad116412af
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/pr106761_a.H
@@ -0,0 +1,5 @@
+// PR c++/106761
+// { dg-additional-options -fmodule-header }
+
+// { dg-module-cmi {} }
+#include "pr106761.h"
diff --git a/gcc/testsuite/g++.dg/modules/pr106761_b.C b/gcc/testsuite/g++.dg/modules/pr106761_b.C
new file mode 100644
index 000..336cb12757e
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/pr106761_b.C
@@ -0,0 +1,7 @@
+// PR c++/106761
+// { dg-additional-options -fmodules-ts }
+
+#include "pr106761.h"
+import "pr106761_a.H";
+
+tuple u = t;
-- 
2.38.0.rc0.52.gdda7228a83

Re: Make 'autoreconf' work for 'gcc', 'libobjc' (was: [PATCH] regenerate configure files and config.h.in files)

2022-09-20 Thread Iain Sandoe via Gcc-patches

Hi!

> On 20 Sep 2022, at 14:37, Thomas Schwinge  wrote:

> On 2022-08-25T11:42:01+0200, Martin Liška  wrote:
>> I wrote a script that runs autoconf in all folders that have configure.ac
>> file and same for autoheader (where AC_CONFIG_HEADERS is present) and
>> this is the output.
>> 
>> The script can be seen here:
>> https://github.com/marxin/script-misc/blob/master/gcc-autoconf-all.py
> 
> That's similar to what I maintain at
> .
> However, I now found that both of our approaches are incomplete.  ;-)
> Yours is missing calling 'aclocal', mine 'aclocal' and 'autoheader' (for
> GCC subpackages not using Automake).
> 
> What we really should be doing, in my opinion, is making 'autoreconf'
> work for all GCC subpackages, and that's exactly what the attached patch
> "Make 'autoreconf' work for 'gcc', 'libobjc'" does.  OK to push?

+1 from me; I have been maintaining something similar locally.
Iain

> 
>> I'm going to add the script to my daily Buildbot tester.
> 
> Then, make it simply call 'autoreconf' for all 'config_folders'.
> 
> (Also, I'm not running into the issue you've stated in the script that
> "apparently automake is somehow unstable -> skip it for gotools".)
> 
> 
>> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>> 
>> Ready to be installed?
> 
> That said, I do confirm the changes of your recent
> commit r13-2200-gb1a3d2b778168341c617aaee6541c66239a198d2
> "regenerate configure files and config.h.in files", and per the attached
> patch "'autoreconf' all of GCC", I'll then be changing a few additional
> pieces.
> 
> 
> Grüße
> Thomas
> 
> 
>> fixincludes/ChangeLog:
>> 
>>  * config.h.in: Regenerate.
>>  * configure: Regenerate.
>> 
>> libada/ChangeLog:
>> 
>>  * configure: Regenerate.
>> 
>> libiberty/ChangeLog:
>> 
>>  * configure: Regenerate.
>> 
>> libobjc/ChangeLog:
>> 
>>  * configure: Regenerate.
>> 
>> liboffloadmic/ChangeLog:
>> 
>>  * configure: Regenerate.
>>  * plugin/configure: Regenerate.
>> 
>> libquadmath/ChangeLog:
>> 
>>  * configure: Regenerate.
>> 
>> libssp/ChangeLog:
>> 
>>  * configure: Regenerate.
>> 
>> libvtv/ChangeLog:
>> 
>>  * configure: Regenerate.
>> 
>> zlib/ChangeLog:
>> 
>>  * configure: Regenerate.
>> ---
>> fixincludes/config.h.in| 204 -
>> fixincludes/configure  |   2 +-
>> libada/configure   |   3 +
>> libiberty/configure|   3 +
>> libobjc/configure  |   6 +-
>> liboffloadmic/configure|  46 +++-
>> liboffloadmic/plugin/configure |  46 +++-
>> libquadmath/configure  |   6 +-
>> libssp/configure   |   6 +-
>> libvtv/configure   |  18 +--
>> zlib/configure |   6 +-
>> 11 files changed, 61 insertions(+), 285 deletions(-)
>> 
>> diff --git a/fixincludes/config.h.in b/fixincludes/config.h.in
>> index 3f6cf1e574e..69a67f5f116 100644
>> --- a/fixincludes/config.h.in
>> +++ b/fixincludes/config.h.in
>> @@ -1,397 +1,211 @@
>> /* config.h.in.  Generated from configure.ac by autoheader.  */
>> 
>> /* Defined to the executable file extension on the host system */
>> -#ifndef USED_FOR_TARGET
>> #undef EXE_EXT
>> -#endif
>> -
>> 
>> /* Define to 1 if you have the `clearerr_unlocked' function. */
>> -#ifndef USED_FOR_TARGET
>> #undef HAVE_CLEARERR_UNLOCKED
>> -#endif
>> -
>> 
>> /* Define to 1 if you have the declaration of `abort', and to 0 if you don't.
>>*/
>> -#ifndef USED_FOR_TARGET
>> #undef HAVE_DECL_ABORT
>> -#endif
>> -
>> 
>> /* Define to 1 if you have the declaration of `asprintf', and to 0 if you
>>don't. */
>> -#ifndef USED_FOR_TARGET
>> #undef HAVE_DECL_ASPRINTF
>> -#endif
>> -
>> 
>> /* Define to 1 if you have the declaration of `basename(char *)', and to 0 if
>>you don't. */
>> -#ifndef USED_FOR_TARGET
>> #undef HAVE_DECL_BASENAME
>> -#endif
>> -
>> 
>> /* Define to 1 if you have the declaration of `clearerr_unlocked', and to 0 
>> if
>>you don't. */
>> -#ifndef USED_FOR_TARGET
>> #undef HAVE_DECL_CLEARERR_UNLOCKED
>> -#endif
>> -
>> 
>> /* Define to 1 if you have the declaration of `errno', and to 0 if you don't.
>>*/
>> -#ifndef USED_FOR_TARGET
>> #undef HAVE_DECL_ERRNO
>> -#endif
>> -
>> 
>> /* Define to 1 if you have the declaration of `feof_unlocked', and to 0 if 
>> you
>>don't. */
>> -#ifndef USED_FOR_TARGET
>> #undef HAVE_DECL_FEOF_UNLOCKED
>> -#endif
>> -
>> 
>> /* Define to 1 if you have the declaration of `ferror_unlocked', and to 0 if
>>you don't. */
>> -#ifndef USED_FOR_TARGET
>> #undef HAVE_DECL_FERROR_UNLOCKED
>> -#endif
>> -
>> 
>> /* Define to 1 if you have the declaration of `fflush_unlocked', and to 0 if
>>you don't. */
>> -#ifndef USED_FOR_TARGET
>> #undef HAVE_DECL_FFLUSH_UNLOCKED
>> -#endif
>> -
>> 
>> /* Define to 1 if you have the declaration of `fgetc_unlocked', and to 0 if
>>you don't.

Re: [PATCH][pushed] fortran: remove 2 dead links [PR106636]

2022-09-20 Thread Tobias Burnus


Hi Martin,

On 20.09.22 14:24, Martin Liška wrote:

On 9/20/22 14:17, Tobias Burnus wrote:

Instead of removing the links, can we rather replace it by an updated link?
[...]

Thanks for the archeological work you did.
Sure, what about the suggested patch?


LGTM. Thanks!

Tobias

-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955

Make 'autoreconf' work for 'gcc', 'libobjc' (was: [PATCH] regenerate configure files and config.h.in files)

2022-09-20 Thread Thomas Schwinge

Hi!

On 2022-08-25T11:42:01+0200, Martin Liška  wrote:
> I wrote a script that runs autoconf in all folders that have configure.ac
> file and same for autoheader (where AC_CONFIG_HEADERS is present) and
> this is the output.
>
> The script can be seen here:
> https://github.com/marxin/script-misc/blob/master/gcc-autoconf-all.py

That's similar to what I maintain at
.
However, I now found that both of our approaches are incomplete.  ;-)
Yours is missing the call to 'aclocal'; mine is missing 'aclocal' and
'autoheader' (for GCC subpackages not using Automake).

What we really should be doing, in my opinion, is making 'autoreconf'
work for all GCC subpackages, and that's exactly what the attached patch
"Make 'autoreconf' work for 'gcc', 'libobjc'" does.  OK to push?

> I'm going to add the script to my daily Buildbot tester.

Then, make it simply call 'autoreconf' for all 'config_folders'.

(Also, I'm not running into the issue you've stated in the script that
"apparently automake is somehow unstable -> skip it for gotools".)


> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>
> Ready to be installed?

That said, I do confirm the changes of your recent
commit r13-2200-gb1a3d2b778168341c617aaee6541c66239a198d2
"regenerate configure files and config.h.in files", and per the attached
patch "'autoreconf' all of GCC", I'll then be changing a few additional
pieces.


Grüße
 Thomas


> fixincludes/ChangeLog:
>
>   * config.h.in: Regenerate.
>   * configure: Regenerate.
>
> libada/ChangeLog:
>
>   * configure: Regenerate.
>
> libiberty/ChangeLog:
>
>   * configure: Regenerate.
>
> libobjc/ChangeLog:
>
>   * configure: Regenerate.
>
> liboffloadmic/ChangeLog:
>
>   * configure: Regenerate.
>   * plugin/configure: Regenerate.
>
> libquadmath/ChangeLog:
>
>   * configure: Regenerate.
>
> libssp/ChangeLog:
>
>   * configure: Regenerate.
>
> libvtv/ChangeLog:
>
>   * configure: Regenerate.
>
> zlib/ChangeLog:
>
>   * configure: Regenerate.
> ---
>  fixincludes/config.h.in| 204 -
>  fixincludes/configure  |   2 +-
>  libada/configure   |   3 +
>  libiberty/configure|   3 +
>  libobjc/configure  |   6 +-
>  liboffloadmic/configure|  46 +++-
>  liboffloadmic/plugin/configure |  46 +++-
>  libquadmath/configure  |   6 +-
>  libssp/configure   |   6 +-
>  libvtv/configure   |  18 +--
>  zlib/configure |   6 +-
>  11 files changed, 61 insertions(+), 285 deletions(-)
>
> diff --git a/fixincludes/config.h.in b/fixincludes/config.h.in
> index 3f6cf1e574e..69a67f5f116 100644
> --- a/fixincludes/config.h.in
> +++ b/fixincludes/config.h.in
> @@ -1,397 +1,211 @@
>  /* config.h.in.  Generated from configure.ac by autoheader.  */
>
>  /* Defined to the executable file extension on the host system */
> -#ifndef USED_FOR_TARGET
>  #undef EXE_EXT
> -#endif
> -
>
>  /* Define to 1 if you have the `clearerr_unlocked' function. */
> -#ifndef USED_FOR_TARGET
>  #undef HAVE_CLEARERR_UNLOCKED
> -#endif
> -
>
>  /* Define to 1 if you have the declaration of `abort', and to 0 if you don't.
> */
> -#ifndef USED_FOR_TARGET
>  #undef HAVE_DECL_ABORT
> -#endif
> -
>
>  /* Define to 1 if you have the declaration of `asprintf', and to 0 if you
> don't. */
> -#ifndef USED_FOR_TARGET
>  #undef HAVE_DECL_ASPRINTF
> -#endif
> -
>
>  /* Define to 1 if you have the declaration of `basename(char *)', and to 0 if
> you don't. */
> -#ifndef USED_FOR_TARGET
>  #undef HAVE_DECL_BASENAME
> -#endif
> -
>
>  /* Define to 1 if you have the declaration of `clearerr_unlocked', and to 0 
> if
> you don't. */
> -#ifndef USED_FOR_TARGET
>  #undef HAVE_DECL_CLEARERR_UNLOCKED
> -#endif
> -
>
>  /* Define to 1 if you have the declaration of `errno', and to 0 if you don't.
> */
> -#ifndef USED_FOR_TARGET
>  #undef HAVE_DECL_ERRNO
> -#endif
> -
>
>  /* Define to 1 if you have the declaration of `feof_unlocked', and to 0 if 
> you
> don't. */
> -#ifndef USED_FOR_TARGET
>  #undef HAVE_DECL_FEOF_UNLOCKED
> -#endif
> -
>
>  /* Define to 1 if you have the declaration of `ferror_unlocked', and to 0 if
> you don't. */
> -#ifndef USED_FOR_TARGET
>  #undef HAVE_DECL_FERROR_UNLOCKED
> -#endif
> -
>
>  /* Define to 1 if you have the declaration of `fflush_unlocked', and to 0 if
> you don't. */
> -#ifndef USED_FOR_TARGET
>  #undef HAVE_DECL_FFLUSH_UNLOCKED
> -#endif
> -
>
>  /* Define to 1 if you have the declaration of `fgetc_unlocked', and to 0 if
> you don't. */
> -#ifndef USED_FOR_TARGET
>  #undef HAVE_DECL_FGETC_UNLOCKED
> -#endif
> -
>
>  /* Define to 1 if you have the declaration of `fgets_unlocked', and to 0 if
> you don't. */
> -#ifndef USED_FOR_TARGET
>  #undef HAVE_DECL_FGETS_UNLOCKED
> -#endif
> -
>
>  /* Define to 1 if you have the declaration

[PATCH v2] testsuite: Only run test on target if VMA == LMA

2022-09-20 Thread Torbjörn SVENSSON via Gcc-patches

Checking that the triplet matches arm*-*-eabi (or msp430-*-*) is not
enough to know if the execution will enter an endless loop, or if it
will give a meaningful result. As the execution test only works when
VMA and LMA are equal, make sure that this condition is met.

2022-09-16  Torbjörn SVENSSON  

gcc/testsuite/ChangeLog:

* lib/target-supports.exp (check_effective_target_vma_equals_lma): New.
* c-c++-common/torture/attr-noinit-1.c: Require VMA == LMA to run.
* c-c++-common/torture/attr-noinit-2.c: Likewise.
* c-c++-common/torture/attr-noinit-3.c: Likewise.
* c-c++-common/torture/attr-persistent-1.c: Likewise.
* c-c++-common/torture/attr-persistent-3.c: Likewise.

Co-Authored-By: Yvan ROUX  
Signed-off-by: Torbjörn SVENSSON  
---
 .../c-c++-common/torture/attr-noinit-1.c  |  3 +-
 .../c-c++-common/torture/attr-noinit-2.c  |  3 +-
 .../c-c++-common/torture/attr-noinit-3.c  |  3 +-
 .../c-c++-common/torture/attr-persistent-1.c  |  3 +-
 .../c-c++-common/torture/attr-persistent-3.c  |  3 +-
 gcc/testsuite/lib/target-supports.exp | 49 +++
 6 files changed, 59 insertions(+), 5 deletions(-)

diff --git a/gcc/testsuite/c-c++-common/torture/attr-noinit-1.c b/gcc/testsuite/c-c++-common/torture/attr-noinit-1.c
index 877e7647ac9..f84eba0b649 100644
--- a/gcc/testsuite/c-c++-common/torture/attr-noinit-1.c
+++ b/gcc/testsuite/c-c++-common/torture/attr-noinit-1.c
@@ -1,4 +1,5 @@
-/* { dg-do run } */
+/* { dg-do link } */
+/* { dg-do run { target { vma_equals_lma } } } */
 /* { dg-require-effective-target noinit } */
 /* { dg-skip-if "data LMA != VMA" { msp430-*-* } { "-mlarge" } } */
 /* { dg-options "-save-temps" } */
diff --git a/gcc/testsuite/c-c++-common/torture/attr-noinit-2.c b/gcc/testsuite/c-c++-common/torture/attr-noinit-2.c
index befa2a0bd52..4528b9e3cfa 100644
--- a/gcc/testsuite/c-c++-common/torture/attr-noinit-2.c
+++ b/gcc/testsuite/c-c++-common/torture/attr-noinit-2.c
@@ -1,4 +1,5 @@
-/* { dg-do run } */
+/* { dg-do link } */
+/* { dg-do run { target { vma_equals_lma } } } */
 /* { dg-require-effective-target noinit } */
 /* { dg-options "-fdata-sections -save-temps" } */
 /* { dg-skip-if "data LMA != VMA" { msp430-*-* } { "-mlarge" } } */
diff --git a/gcc/testsuite/c-c++-common/torture/attr-noinit-3.c b/gcc/testsuite/c-c++-common/torture/attr-noinit-3.c
index 519e88a59a6..2f1745694c9 100644
--- a/gcc/testsuite/c-c++-common/torture/attr-noinit-3.c
+++ b/gcc/testsuite/c-c++-common/torture/attr-noinit-3.c
@@ -1,4 +1,5 @@
-/* { dg-do run } */
+/* { dg-do link } */
+/* { dg-do run { target { vma_equals_lma } } } */
 /* { dg-require-effective-target noinit } */
 /* { dg-options "-flto -save-temps" } */
 /* { dg-skip-if "data LMA != VMA" { msp430-*-* } { "-mlarge" } } */
diff --git a/gcc/testsuite/c-c++-common/torture/attr-persistent-1.c b/gcc/testsuite/c-c++-common/torture/attr-persistent-1.c
index 72dc3c27192..b11a515cef8 100644
--- a/gcc/testsuite/c-c++-common/torture/attr-persistent-1.c
+++ b/gcc/testsuite/c-c++-common/torture/attr-persistent-1.c
@@ -1,4 +1,5 @@
-/* { dg-do run } */
+/* { dg-do link } */
+/* { dg-do run { target { vma_equals_lma } } } */
 /* { dg-require-effective-target persistent } */
 /* { dg-skip-if "data LMA != VMA" { msp430-*-* } { "-mlarge" } } */
 /* { dg-options "-save-temps" } */
diff --git a/gcc/testsuite/c-c++-common/torture/attr-persistent-3.c b/gcc/testsuite/c-c++-common/torture/attr-persistent-3.c
index 3e4fd28618d..068a72af5c8 100644
--- a/gcc/testsuite/c-c++-common/torture/attr-persistent-3.c
+++ b/gcc/testsuite/c-c++-common/torture/attr-persistent-3.c
@@ -1,4 +1,5 @@
-/* { dg-do run } */
+/* { dg-do link } */
+/* { dg-do run { target { vma_equals_lma } } } */
 /* { dg-require-effective-target persistent } */
 /* { dg-options "-flto -save-temps" } */
 /* { dg-skip-if "data LMA != VMA" { msp430-*-* } { "-mlarge" } } */
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index 703aba412a6..df8141a15d8 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -370,6 +370,55 @@ proc check_weak_override_available { } {
 return [check_weak_available]
 }
 
+# Return 1 if VMA is equal to LMA for the .data section, 0
+# otherwise.  Cache the result.
+
+proc check_effective_target_vma_equals_lma { } {
+global tool
+
+return [check_cached_effective_target vma_equals_lma {
+   set src vma_equals_lma[pid].c
+   set exe vma_equals_lma[pid].exe
+   verbose "check_effective_target_vma_equals_lma  compiling testfile $src" 2
+   set f [open $src "w"]
+   puts $f "#ifdef __cplusplus\nextern \"C\"\n#endif\n"
+   puts $f "int foo = 42; void main() {}"
+   close $f
+   set lines [${tool}_target_compile $src $exe executable ""]
+   file delete $src
+
+   if [string match "" $lines] then {
+   # No error messages
+
+set objdump_name [find_binutils_prog

[PATCH v2] testsuite: Skip intrinsics test if arm

2022-09-20 Thread Torbjörn SVENSSON via Gcc-patches

In the test cases, it's clearly stated that these intrinsics are not
implemented on arm*. A simple xfail does not help, since there are
link errors, and those would cause an UNRESOLVED test case rather than
an XFAIL.
By changing to dg-skip-if, the entire test case is omitted.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/advsimd-intrinsics/vld1x2.c: Replace
dg-xfail-if with dg-skip-if.
* gcc.target/aarch64/advsimd-intrinsics/vld1x3.c: Likewise.
* gcc.target/aarch64/advsimd-intrinsics/vld1x4.c: Likewise.

Co-Authored-By: Yvan ROUX  
Signed-off-by: Torbjörn SVENSSON  
---
 gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x2.c | 2 +-
 gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x3.c | 2 +-
 gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x4.c | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x2.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x2.c
index 92a139bc523..f933102be47 100644
--- a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x2.c
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x2.c
@@ -1,6 +1,6 @@
 /* We haven't implemented these intrinsics for arm yet.  */
-/* { dg-xfail-if "" { arm*-*-* } } */
 /* { dg-do run } */
+/* { dg-skip-if "unsupported" { arm*-*-* } } */
 /* { dg-options "-O3" } */
 
 #include 
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x3.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x3.c
index 6ddd507d9cf..b20dec061b5 100644
--- a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x3.c
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x3.c
@@ -1,6 +1,6 @@
 /* We haven't implemented these intrinsics for arm yet.  */
-/* { dg-xfail-if "" { arm*-*-* } } */
 /* { dg-do run } */
+/* { dg-skip-if "unsupported" { arm*-*-* } } */
 /* { dg-options "-O3" } */
 
 #include 
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x4.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x4.c
index 451a0afc6aa..e59f845880e 100644
--- a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x4.c
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x4.c
@@ -1,6 +1,6 @@
 /* We haven't implemented these intrinsics for arm yet.  */
-/* { dg-xfail-if "" { arm*-*-* } } */
 /* { dg-do run } */
+/* { dg-skip-if "unsupported" { arm*-*-* } } */
 /* { dg-options "-O3" } */
 
 #include 
-- 
2.25.1

[PATCH][pushed] replace "the the" typos

2022-09-20 Thread Martin Liška

gcc/ada/ChangeLog:

* exp_ch6.adb: Replace "the the" with "the".
* sem_ch6.adb: Likewise.
* sem_disp.ads: Likewise.

gcc/ChangeLog:

* ctfc.cc (ctf_add_string): Replace "the the" with "the".
* doc/md.texi: Likewise.
* gimple-range-infer.cc (non_null_loadstore): Likewise.

gcc/fortran/ChangeLog:

* gfortran.texi: Replace "the the" with "the".

gcc/testsuite/ChangeLog:

* g++.dg/warn/Wclass-memaccess.C: Replace "the the" with "the".
* g++.dg/warn/Wconversion-real-integer2.C: Likewise.
* gcc.target/powerpc/p9-extract-1.c: Likewise.
* gcc.target/s390/s390.exp: Likewise.
* gcc.target/s390/zvector/vec-cmp-2.c: Likewise.
* gdc.dg/torture/simd_store.d: Likewise.
* gfortran.dg/actual_array_offset_1.f90: Likewise.
* gfortran.dg/pdt_15.f03: Likewise.
* gfortran.dg/pointer_array_8.f90: Likewise.
---
 gcc/ada/exp_ch6.adb   | 2 +-
 gcc/ada/sem_ch6.adb   | 2 +-
 gcc/ada/sem_disp.ads  | 2 +-
 gcc/ctfc.cc   | 2 +-
 gcc/doc/md.texi   | 2 +-
 gcc/fortran/gfortran.texi | 2 +-
 gcc/gimple-range-infer.cc | 2 +-
 gcc/testsuite/g++.dg/warn/Wclass-memaccess.C  | 2 +-
 gcc/testsuite/g++.dg/warn/Wconversion-real-integer2.C | 2 +-
 gcc/testsuite/gcc.target/powerpc/p9-extract-1.c   | 2 +-
 gcc/testsuite/gcc.target/s390/s390.exp| 2 +-
 gcc/testsuite/gcc.target/s390/zvector/vec-cmp-2.c | 2 +-
 gcc/testsuite/gdc.dg/torture/simd_store.d | 2 +-
 gcc/testsuite/gfortran.dg/actual_array_offset_1.f90   | 2 +-
 gcc/testsuite/gfortran.dg/pdt_15.f03  | 2 +-
 gcc/testsuite/gfortran.dg/pointer_array_8.f90 | 4 ++--
 16 files changed, 17 insertions(+), 17 deletions(-)

diff --git a/gcc/ada/exp_ch6.adb b/gcc/ada/exp_ch6.adb
index 0873191bf47..ce1a7525fa2 100644
--- a/gcc/ada/exp_ch6.adb
+++ b/gcc/ada/exp_ch6.adb
@@ -6582,7 +6582,7 @@ package body Exp_Ch6 is
 
  --  but optimize the case where the result is a function call that
  --  also needs finalization. In this case the result can directly be
- --  allocated on the the return stack of the caller and no further
+ --  allocated on the return stack of the caller and no further
  --  processing is required.
 
  if Present (Utyp)
diff --git a/gcc/ada/sem_ch6.adb b/gcc/ada/sem_ch6.adb
index 045905825ad..7db0cb7c08f 100644
--- a/gcc/ada/sem_ch6.adb
+++ b/gcc/ada/sem_ch6.adb
@@ -505,7 +505,7 @@ package body Sem_Ch6 is
  --  this because it is not part of the original source.
  --  If this is an ignored Ghost entity, analysis of the generated
  --  body is needed to hide external references (as is done in
- --  Analyze_Subprogram_Body) after which the the subprogram profile
+ --  Analyze_Subprogram_Body) after which the subprogram profile
  --  can be frozen, which is needed to expand calls to such an ignored
  --  Ghost subprogram.
 
diff --git a/gcc/ada/sem_disp.ads b/gcc/ada/sem_disp.ads
index 563b7f34e7f..841fc741dfc 100644
--- a/gcc/ada/sem_disp.ads
+++ b/gcc/ada/sem_disp.ads
@@ -63,7 +63,7 @@ package Sem_Disp is
--  the inherited subprogram will have been hidden by the current one at
--  the point of the type derivation, so it does not appear in the list
--  of primitive operations of the type, and this procedure inserts the
-   --  overriding subprogram in the the full type's list of primitives by
+   --  overriding subprogram in the full type's list of primitives by
--  iterating over the list for the parent type. If instead Subp is a new
--  primitive, then it's simply appended to the primitive list.
 
diff --git a/gcc/ctfc.cc b/gcc/ctfc.cc
index 9773358a475..09645436fdd 100644
--- a/gcc/ctfc.cc
+++ b/gcc/ctfc.cc
@@ -324,7 +324,7 @@ ctf_add_string (ctf_container_ref ctfc, const char * name,
   return ctfc_strtable_add_str (str_table, name, name_offset);
 }
 
-/* Add the compilation unit (CU) name string to the the CTF string table.  The
+/* Add the compilation unit (CU) name string to the CTF string table.  The
CU name has a prepended pwd string if it is a relative path.  Also set the
CU name offset in the CTF container.  */
 
diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 34825549ed4..d46963f468c 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -2282,7 +2282,7 @@ This constraint won't match unless @option{-mprefer-short-insn-regs} is
 in effect.
 
 @item Rsc
-The the register class of registers that can be used to hold a
+The register class of registers that can be used to hold a
 sibcall call address.  I.e., a caller-saved register.
 
 @item Rct
diff --git a/gcc/fortran/gfortran.texi b/gcc/fortran/gfortran.texi
index

Re: [PATCH] Remove legacy -gz=zlib-gnu

2022-09-20 Thread Martin Liška

On 7/1/22 09:20, Fangrui Song via Gcc-patches wrote:
> On 2022-07-01, Andrew Pinski wrote:
>> On Thu, Jun 30, 2022 at 11:58 PM Fangrui Song via Gcc-patches
>>  wrote:
>>>
>>> From: Fangrui Song 
>>>
>>> SHF_COMPRESSED style zlib has been supported since binutils 2.26
>>> and the legacy zlib-gnu option hasn't gained adoption.
>>> According to Debian Code Search (`gz=zlib-gnu`), no project uses
>>> -gz=zlib-gnu (valgrind has a configure to use -gz=zlib).
>>> Remove support for the legacy zlib-gnu and simplify configure.ac by
>>> removing zlib-gnu ld/as check.
>>
>> A couple of things, you are missing a changelog.
> 
> Sorry.
> 
>> Second, why remove something which is still working?

Hi.

I do support removing the option, although rather than removing it outright
I would emit a warning saying that no compression will be used.

> 
> It's unused and its existence causes confusion: the paradox of choice.
> People may assume the support may be good but newer DWARF consumers may
> not support the legacy format.

Agree, the compression format is legacy. I verified all openSUSE packages (15k)
and there's no project actively using it.

> 
> The other motivation is to clean it up a bit.  I foresee that someone
> will add --compress-debug-sections=zstd to binutils and configure.ac and
> gcc/gcc.cc would become more messy.

The argument makes sense; it would become an even bigger mess.

@Richi: Is it something we can deprecate for GCC 13?

Martin

> 
>> Third, why not just make gz=zlib-gnu as an alias to gz=zlib instead so
>> if someone used it before it will still work. we try not to remove
>> options; have them emit a warning and be ignored (or moved over to the
>> closed option).
> 
> Changing the semantics of -gz=zlib-gnu would be even more confusing.
> 
>> Thanks,
>> Andrew
>>
>>> ---
>>>  gcc/common.opt  |  3 ---
>>>  gcc/configure   | 33 ++---
>>>  gcc/configure.ac    | 29 -
>>>  gcc/doc/invoke.texi | 11 +--
>>>  gcc/gcc.cc  | 22 ++
>>>  5 files changed, 17 insertions(+), 81 deletions(-)
>>>
>>> diff --git a/gcc/common.opt b/gcc/common.opt
>>> index e7a51e882ba..8754d93d545 100644
>>> --- a/gcc/common.opt
>>> +++ b/gcc/common.opt
>>> @@ -3424,9 +3424,6 @@ Enum(compressed_debug_sections) String(none) Value(0)
>>>  EnumValue
>>>  Enum(compressed_debug_sections) String(zlib) Value(1)
>>>
>>> -EnumValue
>>> -Enum(compressed_debug_sections) String(zlib-gnu) Value(2)
>>> -
>>>  gz
>>>  Common Driver
>>>  Generate compressed debug sections.
>>> diff --git a/gcc/configure b/gcc/configure
>>> index 62872d132ea..ca87e875e9d 100755
>>> --- a/gcc/configure
>>> +++ b/gcc/configure
>>> @@ -19674,7 +19674,7 @@ else
>>>    lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
>>>    lt_status=$lt_dlunknown
>>>    cat > conftest.$ac_ext <<_LT_EOF
>>> -#line 19679 "configure"
>>> +#line 19677 "configure"
>>>  #include "confdefs.h"
>>>
>>>  #if HAVE_DLFCN_H
>>> @@ -19780,7 +19780,7 @@ else
>>>    lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
>>>    lt_status=$lt_dlunknown
>>>    cat > conftest.$ac_ext <<_LT_EOF
>>> -#line 19785 "configure"
>>> +#line 19783 "configure"
>>>  #include "confdefs.h"
>>>
>>>  #if HAVE_DLFCN_H
>>> @@ -29711,20 +29711,13 @@ else
>>>     if $gcc_cv_as --compress-debug-sections -o conftest.o conftest.s 2>&1 | 
>>> grep -i warning > /dev/null
>>>     then
>>>   gcc_cv_as_compress_debug=0
>>> -   # Since binutils 2.26, gas supports --compress-debug-sections=type,
>>> +   # Since binutils 2.26, gas supports --compress-debug-sections=zlib,
>>>     # defaulting to the ELF gABI format.
>>> -   elif $gcc_cv_as --compress-debug-sections=zlib-gnu -o conftest.o 
>>> conftest.s > /dev/null 2>&1
>>> +   elif $gcc_cv_as --compress-debug-sections=zlib -o conftest.o conftest.s 
>>> > /dev/null 2>&1
>>>     then
>>>   gcc_cv_as_compress_debug=2
>>>   gcc_cv_as_compress_debug_option="--compress-debug-sections"
>>>   gcc_cv_as_no_compress_debug_option="--nocompress-debug-sections"
>>> -   # Before binutils 2.26, gas only supported --compress-debug-options and
>>> -   # emitted the traditional GNU format.
>>> -   elif $gcc_cv_as --compress-debug-sections -o conftest.o conftest.s > 
>>> /dev/null 2>&1
>>> -   then
>>> - gcc_cv_as_compress_debug=1
>>> - gcc_cv_as_compress_debug_option="--compress-debug-sections"
>>> - gcc_cv_as_no_compress_debug_option="--nocompress-debug-sections"
>>>     else
>>>   gcc_cv_as_compress_debug=0
>>>     fi
>>> @@ -30238,42 +30231,28 @@ $as_echo "$gcc_cv_ld_eh_gc_sections_bug" >&6; }
>>>
>>>  { $as_echo "$as_me:${as_lineno-$LINENO}: checking linker for compressed 
>>> debug sections" >&5
>>>  $as_echo_n "checking linker for compressed debug sections... " >&6; }
>>> -# gold/gld support compressed debug sections since binutils 2.19/2.21
>>> -# In binutils 2.26, gld gained support for the ELF gABI format.
>>> +# GNU ld/gold support --compressed-debug-sections=zlib since binutils 2.26.
>>>  if

Re: [PATCH] [PR68097] frange::set_nonnegative should not contain -NAN.

2022-09-20 Thread Michael Matz via Gcc-patches

Hello,

On Tue, 20 Sep 2022, Aldy Hernandez wrote:

> > FWIW, in IEEE, 'abs' (like 'copy', 'copysign' and 'negate') are not
> > arithmetic, they are quiet-computational.  Hence they don't raise
> > anything, not even for sNaNs; they copy the input bits and appropriately
> > modify the bit pattern according to the specification (i.e. fiddle the
> > sign bit).
> >
> > That also means that a predicate like negative_p(x) that would be
> > implemented ala
> >
> >   copysign(1.0, x) < 0.0
> 
> I suppose this means -0.0 is not considered negative,

It would be considered negative if the predicate is implemented like 
above:
   copysign(1.0, -0.0) == -1.0

But really, that depends on what _our_ definition of negative_p is 
supposed to be.  I think the most reasonable definition is indeed similar 
to above, which in turn is equivalent to simply looking at the sign bit 
(which is what copysign() does), i.e. ...

> though it has
> the signbit set?  FWIW, real_value's real_isneg() returns true for
> -0.0 because it only looks at the sign.

... this seems the sensible thing.  I just wanted to argue the case that 
set_negative (or the like) which "sets" the sign bit does not make the 
nan-ness go away.  They are orthogonal.
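
To make the point concrete, here is a minimal C sketch of the two ideas
above.  The helper names negative_p and set_negative are hypothetical and
are not part of GCC's frange API; the sketch only assumes the C99 <math.h>
functions copysign and isnan.

```c
#include <math.h>
#include <stdbool.h>

/* Hypothetical predicate, not GCC's frange API: true if the sign bit
   of X is set.  This is the copysign formulation quoted above, so it
   also reports -0.0 as negative.  */
static bool negative_p(double x)
{
    return copysign(1.0, x) < 0.0;
}

/* Hypothetical helper: "set" the sign bit of X and nothing else.
   A NaN input stays a NaN; only its sign bit changes.  */
static double set_negative(double x)
{
    return copysign(x, -1.0);
}
```

With these definitions, negative_p(-0.0) is true even though -0.0 < 0.0 is
false, and set_negative(NAN) is still a NaN for which negative_p returns
true, i.e. the sign and the nan-ness are independent properties.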

> > deal with NaNs just fine and is required to correctly capture the sign of
> > 'x'.  If frange::set_nonnegative is supposed to be used in such contexts
> > (and I think it's a good idea if that were the case), then set_nonnegative
> > does _not_ imply no-NaN.
> >
> > In particular I would assume that, given a VARYING frange FR, that
> > FR.set_nonnegative() would result in an frange {[+0.0,+inf],+nan} .
> 
> That was my understanding as well, and what my original patch did.
> But again, I'm just the messenger.

Ah, I obviously haven't followed the thread carefully then.  If that's 
what it was doing then IMO it was the right thing.

Ciao,
Michael.

Re: Extend fold_vec_perm to fold VEC_PERM_EXPR in VLA manner

2022-09-20 Thread Richard Sandiford via Gcc-patches

Prathamesh Kulkarni  writes:
> On Mon, 12 Sept 2022 at 19:57, Richard Sandiford
>  wrote:
>>
>> Prathamesh Kulkarni  writes:
>> >> The VLA encoding encodes the first N patterns explicitly.  The
>> >> npatterns/nelts_per_pattern values then describe how to extend that
>> >> initial sequence to an arbitrary number of elements.  So when performing
>> >> an operation on (potentially) variable-length vectors, the questions is:
>> >>
>> >> * Can we work out an initial sequence and npatterns/nelts_per_pattern
>> >>   pair that will be correct for all elements of the result?
>> >>
>> >> This depends on the operation that we're performing.  E.g. it's
>> >> different for unary operations (vector_builder::new_unary_operation)
>> >> and binary operations (vector_builder::new_binary_operation).  It also
>> >> varies between unary operations and between binary operations, hence
>> >> the allow_stepped_p parameters.
>> >>
>> >> For VEC_PERM_EXPR, I think the key requirement is that:
>> >>
>> >> (R) Each individual selector pattern must always select from the same 
>> >> vector.
>> >>
>> >> Whether this condition is met depends both on the pattern itself and on
>> >> the number of patterns that it's combined with.
>> >>
>> >> E.g. suppose we had the selector pattern:
>> >>
>> >>   { 0, 1, 4, ... }   i.e. 3x - 2 for x > 0
>> >>
>> >> If the arguments and selector are n elements then this pattern on its
>> >> own would select from more than one argument if 3(n-1) - 2 >= n.
>> >> This is clearly true for large enough n.  So if n is variable then
>> >> we cannot represent this.
>> >>
>> >> If the pattern above is one of two patterns, so interleaved as:
>> >>
>> >>  { 0, _, 1, _, 4, _, ... }  o=0
>> >>   or { _, 0, _, 1, _, 4, ... }  o=1
>> >>
>> >> then the pattern would select from more than one argument if
>> >> 3(n/2-1) - 2 + o >= n.  This too would be a problem for variable n.
>> >>
>> >> But if the pattern above is one of four patterns then it selects
>> >> from more than one argument if 3(n/4-1) - 2 + o >= n.  This is not
>> >> true for any valid n or o, so the pattern is OK.
>> >>
>> >> So let's define some ad hoc terminology:
>> >>
>> >> * Px is the number of patterns in x
>> >> * Ex is the number of elements per pattern in x
>> >>
>> >> where x can be:
>> >>
>> >> * 1: first argument
>> >> * 2: second argument
>> >> * s: selector
>> >> * r: result
>> >>
>> >> Then:
>> >>
>> >> (1) The number of elements encoded explicitly for x is Ex*Px
>> >>
>> >> (2) The explicit encoding can be used to produce a sequence of N*Ex*Px
>> >> elements for any integer N.  This extended sequence can be reencoded
>> >> as having N*Px patterns, with Ex staying the same.
>> >>
>> >> (3) If Ex < 3, Ex can be increased by 1 by repeating the final Px elements
>> >> of the explicit encoding.
>> >>
>> >> So let's assume (optimistically) that we can produce the result
>> >> by calculating the first Pr*Er elements and using the Pr,Er encoding
>> >> to imply the rest.  Then:
>> >>
>> >> * (2) means that, when combining multiple input operands with potentially
>> >>   different encodings, we can set the number of patterns in the result
>> >>   to the least common multiple of the number of patterns in the inputs.
>> >>   In this case:
>> >>
>> >>   Pr = least_common_multiple(P1, P2, Ps)
>> >>
>> >>   is a valid number of patterns.
>> >>
>> >> * (3) means that the number of elements per pattern of the result can
>> >>   be the maximum of the number of elements per pattern in the inputs.
>> >>   (Alternatively, we could always use 3.)  In this case:
>> >>
>> >>   Er = max(E1, E2, Es)
>> >>
>> >>   is a valid number of elements per pattern.
>> >>
>> >> So if (R) holds we can compute the result -- for both VLA and VLS -- by
>> >> calculating the first Pr*Er elements of the result and using the
>> >> encoding to derive the rest.  If (R) doesn't hold then we need the
>> >> selector to be constant-length.  We should then fill in the result
>> >> based on:
>> >>
>> >> - Pr == number of elements in the result
>> >> - Er == 1
>> >>
>> >> But this should be the fallback option, even for VLS.
>> >>
>> >> As far as the arguments go: we should reject CONSTRUCTORs for
>> >> variable-length types.  After doing that, we can treat a CONSTRUCTOR
>> >> for an N-element vector type by setting the number of patterns to N
>> >> and the number of elements per pattern to 1.
>> > Hi Richard,
>> > Thanks for the suggestions, and sorry for the late response.
>> > I have a couple of very elementary questions:
>> >
>> > 1: Consider following inputs to VEC_PERM_EXPR:
>> > op1: P_op1 == 4, E_op1 == 1
>> > {1, 2, 3, 4, ...}
>> >
>> > op2: P_op2 == 2, E_op2 == 2
>> > {11, 21, 12, 22, ...}
>> >
>> > sel: P_sel == 3, E_sel == 1
>> > {0, 4, 5, ...}
>> >
>> > What shall be the result in this case ?
>> > P_res = lcm(4, 2, 3) == 12
>> > E_res = max(1, 2, 1) == 2.
>>
>> Yeah, that looks right.  Of course, since sel is just repeating
>> every three elements, it could just be
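
The arithmetic in the quoted question (Pr as the least common multiple of
the input pattern counts, Er as the maximum of the elements per pattern) can
be sketched in C as follows.  The struct and function names are made up for
illustration and are not GCC's vector_builder interface; the sketch also
assumes that condition (R) already holds.

```c
/* Hypothetical description of a vector encoding: the number of
   patterns and the number of elements encoded per pattern.  */
struct encoding {
    unsigned npatterns;
    unsigned nelts_per_pattern;
};

/* Greatest common divisor (Euclid), used to build the LCM.  */
static unsigned gcd(unsigned a, unsigned b)
{
    while (b != 0) {
        unsigned t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Least common multiple of two nonzero pattern counts.  */
static unsigned lcm(unsigned a, unsigned b)
{
    return a / gcd(a, b) * b;
}

static unsigned max2(unsigned a, unsigned b)
{
    return a > b ? a : b;
}

/* Result encoding for a permute of OP1 and OP2 by selector SEL:
   Pr = lcm(P1, P2, Ps) and Er = max(E1, E2, Es).  */
static struct encoding result_encoding(struct encoding op1,
                                       struct encoding op2,
                                       struct encoding sel)
{
    struct encoding res;
    res.npatterns = lcm(lcm(op1.npatterns, op2.npatterns), sel.npatterns);
    res.nelts_per_pattern = max2(max2(op1.nelts_per_pattern,
                                      op2.nelts_per_pattern),
                                 sel.nelts_per_pattern);
    return res;
}
```

For the inputs in the question (op1 with 4 patterns of 1 element, op2 with
2 patterns of 2 elements, sel with 3 patterns of 1 element) this yields 12
patterns of 2 elements each, matching the values worked out by hand.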

[committed] libstdc++: Fix typo in for freestanding

2022-09-20 Thread Jonathan Wakely via Gcc-patches

I messed this up last week. Tested x86_64-linux, pushed to trunk.

-- >8 --

libstdc++-v3/ChangeLog:

* include/c_global/cstdlib [!_GLIBCXX_HOSTED] (quick_exit): Fix
missing space.
---
 libstdc++-v3/include/c_global/cstdlib | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libstdc++-v3/include/c_global/cstdlib b/libstdc++-v3/include/c_global/cstdlib
index 0f7362e263f..917dbe2e47a 100644
--- a/libstdc++-v3/include/c_global/cstdlib
+++ b/libstdc++-v3/include/c_global/cstdlib
@@ -63,7 +63,7 @@ namespace std
   extern "C" int at_quick_exit(void (*)(void)) _GLIBCXX_NOTHROW;
 # endif
 # ifdef _GLIBCXX_HAVE_QUICK_EXIT
-  extern "C" void quick_exit(int) _GLIBCXX_NOTHROW_GLIBCXX_NORETURN;
+  extern "C" void quick_exit(int) _GLIBCXX_NOTHROW _GLIBCXX_NORETURN;
 # endif
 #if _GLIBCXX_USE_C99_STDLIB
   extern "C" void _Exit(int) _GLIBCXX_NOTHROW _GLIBCXX_NORETURN;
-- 
2.37.3

Re: [PATCH][pushed] fortran: remove 2 dead links [PR106636]

2022-09-20 Thread Martin Liška

On 9/20/22 14:17, Tobias Burnus wrote:
> Hi Martin,
> 
> On 20.09.22 14:02, Martin Liška wrote:
>> diff --git a/gcc/fortran/gfortran.texi b/gcc/fortran/gfortran.texi
>> @@ -455,9 +455,7 @@ version 2.6, @uref{https://www.openacc.org/}).  See
>>   The Fortran 95 standard specifies in Part 2 (ISO/IEC 1539-2:2000)
>>   varying length character strings.  While GNU Fortran currently does not
>>   support such strings directly, there exist two Fortran implementations
>> -for them, which work with GNU Fortran.  They can be found at
>> -@uref{https://www.fortran.com/@/iso_varying_string.f95} and at
>> -@uref{ftp://ftp.nag.co.uk/@/sc22wg5/@/ISO_VARYING_STRING/}.
>> +for them, which work with GNU Fortran.
> 
> Instead of removing the links, could we replace them with an updated link?
> 
> Richard Townsend implemented the fortran.com version; the web archive
> shows that it was version 1.3-F, which is still available from his own
> homepage at:
> 
> http://user.astro.wisc.edu/~townsend/static.php?ref=iso-varying-string
> (BTW: I have now also added Richard's website to web.archive.org.)
> 
> I don't think most users need this module, but if it is mentioned,
> it does no harm to have the link available.

Thanks for the archeological work you did.
Sure, what about the suggested patch?

Martin

> 
> Tobias
> 
> -
> Siemens Electronic Design Automation GmbH; Address: Arnulfstraße 201, 80634 
> München; limited liability company; Managing Directors: Thomas 
> Heurung, Frank Thürauf; Registered office: München; Registration court: 
> München, HRB 106955
From 64376d0f450b03e2bdaad486aaa5df8141b2dde1 Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Tue, 20 Sep 2022 14:23:29 +0200
Subject: [PATCH] fortran: add link to ISO_VARYING_STRING module [PR106636]

	PR fortran/106636

gcc/fortran/ChangeLog:

	* gfortran.texi: Add back link to ISO_VARYING_STRING.
---
 gcc/fortran/gfortran.texi | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/fortran/gfortran.texi b/gcc/fortran/gfortran.texi
index 25410e6088d..4ab67700362 100644
--- a/gcc/fortran/gfortran.texi
+++ b/gcc/fortran/gfortran.texi
@@ -455,7 +455,8 @@ version 2.6, @uref{https://www.openacc.org/}).  See
 The Fortran 95 standard specifies in Part 2 (ISO/IEC 1539-2:2000)
 varying length character strings.  While GNU Fortran currently does not
 support such strings directly, there exist two Fortran implementations
-for them, which work with GNU Fortran.
+for them, which work with GNU Fortran. One can be found at
+@uref{http://user.astro.wisc.edu/~townsend/static.php?ref=iso-varying-string}.
 
 Deferred-length character strings of Fortran 2003 supports part of
 the features of @code{ISO_VARYING_STRING} and should be considered as
-- 
2.37.3

Re: [PATCH][pushed] fortran: remove 2 dead links [PR106636]

2022-09-20 Thread Tobias Burnus


Hi Martin,

On 20.09.22 14:02, Martin Liška wrote:

diff --git a/gcc/fortran/gfortran.texi b/gcc/fortran/gfortran.texi
@@ -455,9 +455,7 @@ version 2.6, @uref{https://www.openacc.org/}).  See
  The Fortran 95 standard specifies in Part 2 (ISO/IEC 1539-2:2000)
  varying length character strings.  While GNU Fortran currently does not
  support such strings directly, there exist two Fortran implementations
-for them, which work with GNU Fortran.  They can be found at
-@uref{https://www.fortran.com/@/iso_varying_string.f95} and at
-@uref{ftp://ftp.nag.co.uk/@/sc22wg5/@/ISO_VARYING_STRING/}.
+for them, which work with GNU Fortran.


Instead of removing the links, could we replace them with an updated link?

Richard Townsend implemented the fortran.com version; the web archive
shows that it was version 1.3-F, which is still available from his own
homepage at:

http://user.astro.wisc.edu/~townsend/static.php?ref=iso-varying-string
(BTW: I have now also added Richard's website to web.archive.org.)

I don't think most users need this module, but if it is mentioned,
it does no harm to have the link available.

Tobias


[PATCH][pushed] fortran: remove 2 dead links [PR106636]

2022-09-20 Thread Martin Liška

PR fortran/106636

gcc/fortran/ChangeLog:

* gfortran.texi: Remove 2 dead links.
---
 gcc/fortran/gfortran.texi | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/gcc/fortran/gfortran.texi b/gcc/fortran/gfortran.texi
index 59d673bfc03..25410e6088d 100644
--- a/gcc/fortran/gfortran.texi
+++ b/gcc/fortran/gfortran.texi
@@ -455,9 +455,7 @@ version 2.6, @uref{https://www.openacc.org/}).  See
 The Fortran 95 standard specifies in Part 2 (ISO/IEC 1539-2:2000)
 varying length character strings.  While GNU Fortran currently does not
 support such strings directly, there exist two Fortran implementations
-for them, which work with GNU Fortran.  They can be found at
-@uref{https://www.fortran.com/@/iso_varying_string.f95} and at
-@uref{ftp://ftp.nag.co.uk/@/sc22wg5/@/ISO_VARYING_STRING/}.
+for them, which work with GNU Fortran.
 
 Deferred-length character strings of Fortran 2003 supports part of
 the features of @code{ISO_VARYING_STRING} and should be considered as
-- 
2.37.3

[Patch] Fortran: F2018 type(*),dimension(*) with scalars [PR104143]

2022-09-20 Thread Tobias Burnus


In several cases, one just wants to have the address where an object starts
without requiring the detour via 'c_loc' and the (locally) required 'target'
attribute.

In principle,  type(*),dimension(*)  of TS29113 permits this, except that
'dimension(*)' only permits arrays and array elements but not scalars.

Fortran 2018 modified this such that with 'type(*)' also scalars are permitted.
(See PR for the quotes.)

This patch implements this simple change. Previously, implementations such as MPI
had to use '!GCC$ attributes NO_ARG_CHECK ::' in addition to type(*),dimension(*)
to achieve this. GCC itself does likewise, but at least that is inside the compiler;
cf. libgomp/openacc{.f90,_lib.h}.

OK for mainline?

Tobias

PS: I know that there are still patches to be reviewed; I am not sure wrt IEEE,
but I think most of the clobber patches still need a review and likely also some
of Harald's patches. I think we also need to take care of some more of the ready
or nearly ready patches by José. (I have a list somewhere that I could dig out
quickly, if someone wants to do some work on this. However, some were already
handled by Harald.)

Unfortunately, I am currently too busy with other things (OpenMP, looking at
issues in mostly OpenMP-related testsuites, OpenMP spec issues, a bunch of odd
things) to really work on Fortran, especially as too many of the other listed
items are likewise non-primary work items and I shouldn't really keep increasing
the time spent on work-related-but-not-to-be-focused-on items...

Otherwise: Last weekend was the GNU Tools Cauldron,
https://gcc.gnu.org/wiki/cauldron2022
A few slides are already online (including mine) and the recordings should
become available soon, in case you are interested.


Fortran: F2018 type(*),dimension(*) with scalars [PR104143]

Assumed-size dummy arguments accept arrays and array elements as actual
arguments. There are also a few exceptions when real scalars are permitted.
Since F2018, this includes scalar arguments to assumed-type dummies; while
type(*) was added in TS29113, this change is only in F2018 itself.

	PR fortran/104143

gcc/fortran/ChangeLog:

	* interface.cc (compare_parameter): Permit scalar args to
	'type(*), dimension(*)'.

gcc/testsuite/ChangeLog:

	* gfortran.dg/c-interop/c407b-2.f90: Remove dg-error.
	* gfortran.dg/assumed_type_16.f90: New test.
	* gfortran.dg/assumed_type_17.f90: New test.

 gcc/fortran/interface.cc| 11 ++-
 gcc/testsuite/gfortran.dg/assumed_type_16.f90   | 14 ++
 gcc/testsuite/gfortran.dg/assumed_type_17.f90   | 18 ++
 gcc/testsuite/gfortran.dg/c-interop/c407b-2.f90 |  2 +-
 4 files changed, 43 insertions(+), 2 deletions(-)

diff --git a/gcc/fortran/interface.cc b/gcc/fortran/interface.cc
index 71eec78259b..d3e199535b3 100644
--- a/gcc/fortran/interface.cc
+++ b/gcc/fortran/interface.cc
@@ -2692,7 +2692,8 @@ compare_parameter (gfc_symbol *formal, gfc_expr *actual,
  - if the actual argument is (a substring of) an element of a
non-assumed-shape/non-pointer/non-polymorphic array; or
  - (F2003) if the actual argument is of type character of default/c_char
-   kind.  */
+   kind.
+ - (F2018) if the dummy argument is type(*).  */
 
   is_pointer = actual->expr_type == EXPR_VARIABLE
 	   ? actual->symtree->n.sym->attr.pointer : false;
@@ -2759,6 +2760,14 @@ compare_parameter (gfc_symbol *formal, gfc_expr *actual,
 
   if (ref == NULL && actual->expr_type != EXPR_NULL)
 {
+  if (actual->rank == 0
+	  && formal->ts.type == BT_ASSUMED
+	  && formal->as
+	  && formal->as->type == AS_ASSUMED_SIZE)
+	/* This is new in F2018, type(*) is new in TS29113, but gfortran does
+	   not differentiate.  Thus, if type(*) exists, it is valid;
+	   otherwise, type(*) is already rejected.  */
+	return true;
   if (where
 	  && (!formal->attr.artificial || (!formal->maybe_array
 	   && !maybe_dummy_array_arg (actual
diff --git a/gcc/testsuite/gfortran.dg/assumed_type_16.f90 b/gcc/testsuite/gfortran.dg/assumed_type_16.f90
new file mode 100644
index 000..52d8ef5ea20
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/assumed_type_16.f90
@@ -0,0 +1,14 @@
+! { dg-do compile }
+! { dg-additional-options "-std=f2008" }
+!
+! PR fortran/104143
+!
+ interface
+   subroutine foo(x)
+ type(*) :: x(*)  ! { dg-error "Fortran 2018: Assumed type" }
+   end
+ end interface
+ integer :: a
+ call foo(a)  ! { dg-error "Type mismatch in argument" }
+ call foo((a))  ! { dg-error "Type mismatch in argument" }
+end
diff --git a/gcc/testsuite/gfortran.dg/assumed_type_17.f90 b/gcc/testsuite/gfortran.dg/assumed_type_17.f90
new file mode 100644
index 000..d6ccd3058ce
--- /dev/null
+++

Re: [PATCH] c++: stream PACK_EXPANSION_EXTRA_ARGS [PR106761]

2022-09-20 Thread Nathan Sidwell via Gcc-patches


On 9/19/22 09:52, Patrick Palka wrote:

It looks like some xtreme-header-* tests are failing after the libstdc++
change r13-2158-g02f6b405f0e9dc ultimately because we're neglecting
to stream PACK_EXPANSION_EXTRA_ARGS, which leads to false equivalences
of different partial instantiations of _TupleConstraints::__constructible.

Tested on x86_64-pc-linux-gnu, does this look OK for trunk?

PR c++/106761

gcc/cp/ChangeLog:

* module.cc (trees_out::type_node) :
Stream PACK_EXPANSION_EXTRA_ARGS.
(trees_in::tree_node) : Likewise.



Looks good, I wonder why I missed that.  (I guess extracting a testcase 
out of the headers was too tricky?)


nathan

---
  gcc/cp/module.cc | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/gcc/cp/module.cc b/gcc/cp/module.cc
index 1a1ff5be574..9a9ef4e3332 100644
--- a/gcc/cp/module.cc
+++ b/gcc/cp/module.cc
@@ -8922,6 +8922,7 @@ trees_out::type_node (tree type)
if (streaming_p ())
u (PACK_EXPANSION_LOCAL_P (type));
tree_node (PACK_EXPANSION_PARAMETER_PACKS (type));
+  tree_node (PACK_EXPANSION_EXTRA_ARGS (type));
break;
  
  case TYPENAME_TYPE:

@@ -9455,12 +9456,14 @@ trees_in::tree_node (bool is_use)
{
  bool local = u ();
  tree param_packs = tree_node ();
+ tree extra_args = tree_node ();
  if (!get_overrun ())
{
  tree expn = cxx_make_type (TYPE_PACK_EXPANSION);
  SET_TYPE_STRUCTURAL_EQUALITY (expn);
  PACK_EXPANSION_PATTERN (expn) = res;
  PACK_EXPANSION_PARAMETER_PACKS (expn) = param_packs;
+ PACK_EXPANSION_EXTRA_ARGS (expn) = extra_args;
  PACK_EXPANSION_LOCAL_P (expn) = local;
  res = expn;
}


--
Nathan Sidwell

Re: Proxy ping [PATCH] Fortran: Fix function attributes [PR100132]

2022-09-20 Thread Mikael Morin


Hello,

On 19/09/2022 at 22:17, Harald Anlauf via Fortran wrote:

Dear all,

the following patch was submitted by Jose but never reviewed:

https://gcc.gnu.org/pipermail/fortran/2021-April/055946.html

Before, we didn't set function attributes properly when
passing polymorphic pointers, which could lead to
mis-optimization.

The patch is technically fine and regtests ok, although it
can be shortened slightly, which makes it more readable,
see attached.

When testing the suggested testcase I found that it was
accepted (and working fine) with NAG, but it was rejected
by both Intel and Cray.  This troubled me, but I think
it is standard conforming (F2018:15.5.2.7), while the
error messages issued by Intel

PR100132.f90(61): error #8300: If a dummy argument is allocatable or a pointer, 
and the dummy or its associated actual argument is polymorphic, both dummy and 
actual must be polymorphic with the same declared type or both must be 
unlimited polymorphic.   [S]
 call set(s)
-^

and a similar one by Cray, suggest that they refer to
F2018:15.5.2.5, which IMHO does not apply here.
(The text in the error message seems very related to
the reasoning in Note 1 of that subsection).

I'd like to hear (read: read) a second opinion on that.


I think you are correct.
If the dummy wasn't INTENT(IN) the actual argument would have to be a 
pointer, and then 15.5.2.5 would apply, but it's not the case here.

With INTENT(IN) the reasons for the constraints from Note 1 don't apply.

I think you can go ahead.

Re: [PATCH] c++: Implement P1467R9 - Extended floating-point types and standard names compiler part except for bfloat16 [PR106652]

2022-09-20 Thread Jakub Jelinek via Gcc-patches

On Tue, Sep 20, 2022 at 11:35:07AM +0800, Hongtao Liu wrote:
> > The question is (mainly for aarch64, arm and x86 backend maintainers) if we
> > shouldn't support it, in the PR there is a partial patch to do so, but
> > the big question is if it should be supported as the __bf16 type those
> > 3 targets use with u6__bf16 mangling and remove those *_invalid_* cases
> > and add conversions to/from at least SFmode but probably also DFmode, TFmode
> > and XFmode on x86 and implement arithmetics on those through conversion to
> > SFmode, performing arithmetics there and conversion back.
> > Conversion from BFmode to SFmode is easy, left shift by 16 and ought to be
> > implemented inline, SFmode -> BFmode conversion is harder,
> > I think it is roughly:
> I'm not sure if there should be any floating point exceptions for
> BFmode operation.
> For x86, there's no floating point exceptions for AVX512_BF16 related
> instructions

As long as __bf16 is just an extension, supporting or not supporting
exceptions on sNaNs is just fine I think, but I'm afraid it is different
for std::bfloat16_t.  If we claim we support it (define that type
in <stdfloat>, predefine __STD_BFLOAT16_TYPE__), then it needs to follow
ISO/IEC/IEEE 60559, and I'm afraid that means also exceptions and the like.
While the IEEE spec doesn't cover the exact bfloat16 format, C++ talks about
a format with these and these number of bits here and there that behaves
like in IEEE otherwise.
Whether we support std::bfloat16_t at all is our choice, if we do support
it, whether we support it with __bf16 underlying type or come up with
something different, it is up to us, and with -ffast-math/-Ofast etc.
we can certainly use hw instructions for it which don't raise exceptions.

At least that is my limited understanding of it...

Jakub
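
The BFmode-to-SFmode widening mentioned in the quoted text ("left shift by 16")
can be illustrated with a short Python sketch. The helper names below are made
up for this illustration and are not GCC functions. The narrowing helper simply
truncates, which is an assumption chosen to keep the sketch small; it is not the
rounding sequence the thread describes as the harder direction.

```python
import struct


def bf16_bits_to_float(bits):
    """Widen a 16-bit bfloat16 pattern to a float by shifting left 16 bits."""
    if not 0 <= bits <= 0xFFFF:
        raise ValueError("expected a 16-bit pattern")
    # bfloat16 has the same sign and exponent layout as IEEE binary32 and
    # keeps only the top 7 mantissa bits, so widening just zero-fills the
    # low 16 bits and is exact.
    return struct.unpack("<f", struct.pack("<I", bits << 16))[0]


def float_to_bf16_bits_truncate(value):
    """Narrow by dropping the low 16 bits (truncation only, no rounding)."""
    (word,) = struct.unpack("<I", struct.pack("<f", value))
    return word >> 16


print(bf16_bits_to_float(0x3F80))             # 1.0
print(hex(float_to_bf16_bits_truncate(1.5)))  # 0x3fc0
```

Any value that fits in bfloat16 round-trips through these two helpers unchanged;
values that need more mantissa bits lose them in the truncating direction.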

[PATCH] Add attribute hot judgement for INLINE_HINT_known_hot hint.

2022-09-20 Thread Cui,Lili via Gcc-patches

Hi Honza,

This patch adds a function-attribute check for the INLINE_HINT_known_hot hint.

Currently we set the INLINE_HINT_known_hot hint only when profile feedback is
available. With this patch, the hint is also set when both the caller and the
callee are declared with __attribute__((hot)).

With this patch applied:
                                  Ratio  Codesize
ADL    Multi-copy: 538.imagick_r  16.7%  1.6%
SPR    Multi-copy: 538.imagick_r  15%    1.7%
ICX    Multi-copy: 538.imagick_r  15.2%  1.7%
CLX    Multi-copy: 538.imagick_r  12.7%  1.7%
Znver3 Multi-copy: 538.imagick_r  10.6%  1.5%

Bootstrap and regtest pending on x86_64-unknown-linux-gnu.
OK for trunk?

Thanks,
Lili.

gcc/ChangeLog

  * ipa-inline-analysis.cc (do_estimate_edge_time): Add function attribute
  judgement for INLINE_HINT_known_hot hint.
---
 gcc/ipa-inline-analysis.cc | 13 +
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/gcc/ipa-inline-analysis.cc b/gcc/ipa-inline-analysis.cc
index 1ca685d1b0e..7bd29c36590 100644
--- a/gcc/ipa-inline-analysis.cc
+++ b/gcc/ipa-inline-analysis.cc
@@ -48,6 +48,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "ipa-utils.h"
 #include "cfgexpand.h"
 #include "gimplify.h"
+#include "attribs.h"
 
 /* Cached node/edge growths.  */
 fast_call_summary *edge_growth_cache = NULL;
@@ -249,15 +250,19 @@ do_estimate_edge_time (struct cgraph_edge *edge, sreal *ret_nonspec_time)
   hints = estimates.hints;
 }
 
-  /* When we have profile feedback, we can quite safely identify hot
- edges and for those we disable size limits.  Don't do that when
- probability that caller will call the callee is low however, since it
+  /* When we have profile feedback or function attribute, we can quite safely
+ identify hot edges and for those we disable size limits.  Don't do that
+ when probability that caller will call the callee is low however, since it
  may hurt optimization of the caller's hot path.  */
-  if (edge->count.ipa ().initialized_p () && edge->maybe_hot_p ()
+  if ((edge->count.ipa ().initialized_p () && edge->maybe_hot_p ()
   && (edge->count.ipa () * 2
  > (edge->caller->inlined_to
 ? edge->caller->inlined_to->count.ipa ()
 : edge->caller->count.ipa (
+  || (lookup_attribute ("hot", DECL_ATTRIBUTES (edge->caller->decl))
+ != NULL
+&& lookup_attribute ("hot", DECL_ATTRIBUTES (edge->callee->decl))
+ != NULL))
 hints |= INLINE_HINT_known_hot;
 
   gcc_checking_assert (size >= 0);
-- 
2.17.1

Re: [PATCH 09/10] fortran: Support clobbering of variable subreferences [PR88364]

2022-09-20 Thread Mikael Morin


On 20/09/2022 at 08:54, Thomas Koenig wrote:


On 19.09.22 22:50, Mikael Morin wrote:

On 19/09/2022 at 21:46, Harald Anlauf wrote:


Assumed size (*) is just a contiguous hunk of memory of possibly
unknown size, which can be zero.  So you couldn't set a clobber
for the a(1) actual argument.

Couldn't you clobber A entirely?  If no element of B is initialized in 
SUB, well, A has undefined values on return from SUB.  That's how 
INTENT(OUT) works.


Yes, I think so - you are passing the starting element of an array
to an assumed-size array via storage association rules.

It has to be an explicit interface, of course, otherwise it is
unclear if an array or an array element is passed.



I have looked for the relevant excerpts from the standard.
From 15.5.2.11 (sequence association):


If the dummy argument is not of type character with default or C character
kind, and the actual argument is an array element designator, the element
sequence consists of that array element and each element that follows it in
array element order.


If the dummy argument is assumed-size, the number of elements in the dummy
argument is exactly the number of elements in the element sequence.


So the dummy size, even if not known to the programmer, is clearly 
defined (to the full array size in your example).
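
As a small worked illustration of the two quoted rules, the element count of an
assumed-size dummy can be computed from the size of the actual array and the
index of the element passed. The function below is a hypothetical helper for
this sketch only; it assumes a contiguous, non-character array and says nothing
about how gfortran represents the dummy internally.

```python
def assumed_size_element_count(array_size, element_index, lower_bound=1):
    """Count the elements from a(element_index) to the end of the array."""
    if not lower_bound <= element_index < lower_bound + array_size:
        raise IndexError("element_index is outside the array")
    # The element sequence is the designated element plus every element
    # that follows it in array element order.
    return array_size - (element_index - lower_bound)


print(assumed_size_element_count(4, 1))  # 4
print(assumed_size_element_count(4, 3))  # 2
```

For the thread's example, an array a(4) passed as a(1) gives a dummy of four
elements, which is the "full array size" referred to above.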

Re: [PATCH] genrecog.cc (print_nonbool_test): Fix type error of SUBREG_BYTE

2022-09-20 Thread Richard Sandiford via Gcc-patches

Jojo R via Gcc-patches  writes:
>   * gcc/genrecog.cc (print_nonbool_test): Fix type error of
>   SUBREG_BYTE

We can't do this here.  The code has done nothing to prove that the
subreg offset is a compile-time constant.

Thanks,
Richard

> ---
>  gcc/genrecog.cc | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/gcc/genrecog.cc b/gcc/genrecog.cc
> index 77f8fb97853..319e437e334 100644
> --- a/gcc/genrecog.cc
> +++ b/gcc/genrecog.cc
> @@ -4619,6 +4619,7 @@ print_nonbool_test (output_state *os, const rtx_test 
> )
>printf ("SUBREG_BYTE (");
>print_test_rtx (os, test);
>printf (")");
> +  printf (".to_constant ()");
>break;
>  
>  case rtx_test::WIDE_INT_FIELD:

Re: [PATCH] frange: flush denormals to zero for -funsafe-math-optimizations.

2022-09-20 Thread Jakub Jelinek via Gcc-patches

On Tue, Sep 20, 2022 at 07:22:03AM +0200, Aldy Hernandez wrote:
> > > Jakub actually suggested something completely different...just set +0
> > > always for !HONOR_SIGNED_ZEROS.
> >
> > Hmm, but the [-1, -0.] with known sign becomes [-1, +0.] with unknown sign?
> 
> Huh.  I guess that's true.  Does that happen often enough in practice

Sure, if you -fno-signed-zeros/-ffast-math and some variable can be zero,
copysign/signbit is undefined.  The option basically asserts you don't care
about it...

Jakub
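
For readers wondering why the sign of a zero matters for a range such as
[-1, -0.], the following sketch shows the IEEE behaviour under default
(non-fast-math) semantics: the two zeros compare equal, yet copysign can still
tell them apart. It only illustrates signed zeros in general and makes no claim
about how frange stores or canonicalizes its endpoints.

```python
import math

neg_zero = -0.0
pos_zero = 0.0

# An ordinary comparison cannot distinguish the two zeros.
print(neg_zero == pos_zero)          # True

# copysign transfers the sign bit, so it does distinguish them.
print(math.copysign(1.0, neg_zero))  # -1.0
print(math.copysign(1.0, pos_zero))  # 1.0
```

Code that never inspects the sign bit behaves the same for either zero, which is
the situation the option is described as asserting.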

[PATCH][pushed] contrib: skip new egrep warning

2022-09-20 Thread Martin Liška

contrib/ChangeLog:

* filter-clang-warnings.py: Skip egrep: warning: egrep is
  obsolescent; using grep -E.
---
 contrib/filter-clang-warnings.py | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/contrib/filter-clang-warnings.py b/contrib/filter-clang-warnings.py
index 942cd30b659..3c68be028a8 100755
--- a/contrib/filter-clang-warnings.py
+++ b/contrib/filter-clang-warnings.py
@@ -39,7 +39,7 @@ def skip_warning(filename, message):
  '-Wignored-attributes', '-Wgnu-zero-variadic-macro-arguments',
  '-Wformat-security', '-Wundefined-internal',
  '-Wunknown-warning-option', '-Wc++20-extensions',
- '-Wbitwise-instead-of-logical'],
+ '-Wbitwise-instead-of-logical', 'egrep is obsolescent'],
 'insn-modes.cc': ['-Wshift-count-overflow'],
 'insn-emit.cc': ['-Wtautological-compare'],
 'insn-attrtab.cc': ['-Wparentheses-equality'],
@@ -57,8 +57,8 @@ def skip_warning(filename, message):
 'lex.cc': ['-Wc++20-attribute-extensions'],
 }
 
-for name, ignores in ignores.items():
-for i in ignores:
+for name, ignore in ignores.items():
+for i in ignore:
 if name in filename and i in message:
 return True
 return False
-- 
2.37.3
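
The second hunk renames the loop variable so that it no longer rebinds the
`ignores` dictionary while iterating over it. A reduced sketch of the resulting
function is shown below. The table is a shortened, hypothetical subset, and the
empty-string key acting as a catch-all for every file is an assumption made for
this sketch, since the diff context does not show the key of the first entry.

```python
def skip_warning(filename, message):
    """Return True when `message` for `filename` matches an ignore entry."""
    # Hypothetical, shortened table; the real script lists many more entries.
    ignores = {
        '': ['-Wbitwise-instead-of-logical', 'egrep is obsolescent'],
        'insn-modes.cc': ['-Wshift-count-overflow'],
    }

    # `ignore` is the per-file list; `ignores` stays bound to the whole dict.
    for name, ignore in ignores.items():
        for i in ignore:
            if name in filename and i in message:
                return True
    return False


print(skip_warning('foo.cc', 'egrep: warning: egrep is obsolescent; using grep -E'))
print(skip_warning('foo.cc', 'warning: shift count >= width [-Wshift-count-overflow]'))
```

With the names kept distinct, the dictionary is still available under its own
name after the loop, which makes later edits to the function less error-prone.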

Re: [PATCH] sched1: Fix -fcompare-debug issue in schedule_region [PR105586]

2022-09-20 Thread Surya Kumari Jangala via Gcc-patches

Hi Jeff, Richard,
Thank you for reviewing the patch!
I have committed the patch to the gcc repo.
May I backport this patch to earlier GCC release branches? It is easy to
backport, and the issue exists in earlier versions too.

Regards,
Surya


On 31/08/22 9:09 pm, Jeff Law via Gcc-patches wrote:
> 
> 
> On 8/23/2022 5:49 AM, Surya Kumari Jangala via Gcc-patches wrote:
>> sched1: Fix -fcompare-debug issue in schedule_region [PR105586]
>>
>> In schedule_region(), a basic block that does not contain any real insns
>> is not scheduled and the dfa state at the entry of the bb is not copied
>> to the fallthru basic block. However a DEBUG insn is treated as a real
>> insn, and if a bb contains non-real insns and a DEBUG insn, its dfa
>> state is copied to the fallthru bb. This was resulting in
>> -fcompare-debug failure as the incoming dfa state of the fallthru block
>> is different with -g. We should always copy the dfa state of a bb to
>> its fallthru bb even if the bb does not contain real insns.
>>
>> 2022-08-22  Surya Kumari Jangala  
>>
>> gcc/
>> PR rtl-optimization/105586
>> * sched-rgn.cc (schedule_region): Always copy dfa state to
>> fallthru block.
>>
>> gcc/testsuite/
>> PR rtl-optimization/105586
>> * gcc.target/powerpc/pr105586.c: New test.
> Interesting.    We may have stumbled over this bug internally a little while 
> ago -- not from a compare-debug standpoint, but from a "why isn't the 
> processor state copied to the fallthru block" point of view.   I had it on my 
> to investigate list, but hadn't gotten around to it yet.
> 
> I think there were requests for ChangeLog updates and a function comment for 
> save_state_for_fallthru_edge.  OK with those updates.
> 
> jeff
>

Re: [PATCH] c++: Implement P1467R9 - Extended floating-point types and standard names compiler part except for bfloat16 [PR106652]

2022-09-20 Thread Hongtao Liu via Gcc-patches

+Phoebe, my Intel colleague who works on the LLVM side.

On Tue, Sep 20, 2022 at 11:35 AM Hongtao Liu  wrote:
>
> On Mon, Sep 12, 2022 at 4:06 PM Jakub Jelinek via Gcc-patches
>  wrote:
> >
> > Hi!
> >
> > The following patch implements the compiler part of C++23
> > P1467R9 - Extended floating-point types and standard names compiler part
> > by introducing _Float{16,32,64,128} as keywords and builtin types
> > like they are implemented for C already since GCC 7.
> > It doesn't introduce _Float{32,64,128}x for C++, those remain C only
> > for now, mainly because
> > https://itanium-cxx-abi.github.io/cxx-abi/abi.html#mangling
> > has mangling for:
> > ::= DF <number> _ # ISO/IEC TS 18661 binary floating point type _FloatN (N bits)
> > but doesn't for _FloatNx.  And it doesn't add anything for bfloat16_t
> > support, see below.
> > Regarding mangling, I think mangling _FloatNx as DF <number> x _ would be
> > possible, but would need to be discussed and voted in.
> > As there is no _FloatNx support for C++, I think it is wrong to announce
> > it through __FLT{32,64,128}X_*__ predefined macros (so the patch disables
> > those for C++; unfortunately g++ 7 to 12 will predefine those and also
> > __FLT{32,64,128}_*__ even when _FloatN support isn't implemented).
> > The patch wants to keep backwards compatibility with how __float128 has
> > been handled in C++ before, both for mangling and behavior in binary
> > operations, overload resolution etc.  So, there are some backend changes
> > where for C __float128 and _Float128 are the same type (float128_type_node
> > and float128t_type_node are the same pointer), but for C++ they are distinct
> > types which mangle differently and _Float128 is treated as extended
> > floating-point type while __float128 is treated as non-standard floating
> > point type.  The various C++23 changes about how floating-point types
> > are changed are actually implemented as written in the spec only if at least
> > one of the types involved is _Float{16,32,64,128} and kept previous behavior
> > otherwise.  For float/double/long double the rules are actually written that
> > they behave the same as before.
> > There is some backwards incompatibility at least on x86 regarding _Float16,
> > because that type was already used by that name and with the DF16_ mangling
> > (but only since GCC 12 and I think it isn't that widely used in the wild
> > yet).  E.g. config/i386/avx512fp16intrin.h shows the issues, where
> > in C or in GCC 12 in C++ one could pass 0.0f to a builtin taking _Float16
> > argument, but with the changes that is not possible anymore, one needs
> > to either use 0.0f16 or (_Float16) 0.0f.
> > We have also a problem with glibc headers, where since glibc 2.27
> > math.h and complex.h aren't compilable with these changes.  One gets
> > errors like:
> > In file included from /usr/include/math.h:43,
> >  from abc.c:1:
> > /usr/include/bits/floatn.h:86:9: error: multiple types in one declaration
> >86 | typedef __float128 _Float128;
> >   | ^~
> > /usr/include/bits/floatn.h:86:20: error: declaration does not declare 
> > anything [-fpermissive]
> >86 | typedef __float128 _Float128;
> >   |^
> > In file included from /usr/include/bits/floatn.h:119:
> > /usr/include/bits/floatn-common.h:214:9: error: multiple types in one 
> > declaration
> >   214 | typedef float _Float32;
> >   | ^
> > /usr/include/bits/floatn-common.h:214:15: error: declaration does not 
> > declare anything [-fpermissive]
> >   214 | typedef float _Float32;
> >   |   ^~~~
> > /usr/include/bits/floatn-common.h:251:9: error: multiple types in one 
> > declaration
> >   251 | typedef double _Float64;
> >   | ^~
> > /usr/include/bits/floatn-common.h:251:16: error: declaration does not 
> > declare anything [-fpermissive]
> >   251 | typedef double _Float64;
> >   |^~~~
> > This is from snippets like:
> > /* The remaining of this file provides support for older compilers.  */
> > # if __HAVE_FLOAT128
> >
> > /* The type _Float128 exists only since GCC 7.0.  */
> > #  if !__GNUC_PREREQ (7, 0) || defined __cplusplus
> > typedef __float128 _Float128;
> > #  endif
> > where it hardcodes that C++ doesn't have _Float{16,32,64,128} support nor
> > {f,F}{16,32,64,128} literal suffixes nor _Complex _Float{16,32,64,128}.
> > The patch fixincludes this for now and hopefully if this is committed, then
> > glibc can change those.  Right now the patch changes those
> > #  if !__GNUC_PREREQ (7, 0) || defined __cplusplus
> > conditions to
> > #  if !__GNUC_PREREQ (7, 0) || (defined __cplusplus && !__GNUC_PREREQ (13, 1) && defined __FLT32X_MANT_DIG__)
> > where it relies on __FLT32X_*__ macros no longer being predefined for C++.
> > Now, I guess for the fixincludes it could also use
> > #  if !__GNUC_PREREQ (7, 0) || (defined __cplusplus && !__GNUC_PREREQ (13, 0))
> > where earlier GCC 13

Re: [PATCH 09/10] fortran: Support clobbering of variable subreferences [PR88364]

2022-09-20 Thread Thomas Koenig via Gcc-patches




On 19.09.22 22:50, Mikael Morin wrote:

On 19/09/2022 at 21:46, Harald Anlauf wrote:

On 18.09.22 at 22:55, Mikael Morin wrote:

On 18/09/2022 at 20:32, Harald Anlauf wrote:


Assumed shape will be on the easy side,
while assumed size likely needs to be excluded for clobbering.


Isn’t it the converse that is true?
Assumed shape can be non-contiguous so have to be excluded, but assumed
size are contiguous, so valid candidates for clobbering. No?


I really was referring here to *dummies*, as in the following example:

program p
  integer :: a(4)
  a = 1
  call sub (a(1), 2)
  print *, a
contains
  subroutine sub (b, k)
    integer, intent(in)  :: k
    integer, intent(out) :: b(*)
!   integer, intent(out) :: b(k)
    if (k > 2) b(k) = k
  end subroutine sub
end program p

Assumed size (*) is just a contiguous hunk of memory of possibly
unknown size, which can be zero.  So you couldn't set a clobber
for the a(1) actual argument.

Couldn't you clobber A entirely?  If no element of B is initialized in 
SUB, well, A has undefined values on return from SUB.  That's how 
INTENT(OUT) works.


Yes, I think so - you are passing the starting element of an array
to an assumed-size array via storage association rules.

It has to be an explicit interface, of course, otherwise it is
unclear if an array or an array element is passed.

Best regards

Thomas

RE: [PATCH] Enhance final_value_replacement_loop to handle bitop with an invariant induction.[PR105735]

2022-09-20 Thread Kong, Lingling via Gcc-patches

Thanks a lot, pushed to trunk.
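
The idea behind replacing the final value of a loop that repeatedly applies a
bitwise operation with a loop-invariant operand can be sketched as follows. This
is only the underlying arithmetic: AND and OR with the same operand are
idempotent after the first iteration, while XOR toggles with each iteration.
The helper names are hypothetical, and which operators and iteration counts the
pushed patch actually handles is defined by the patch itself, not by this
sketch.

```python
import operator

OPS = {"and": operator.and_, "or": operator.or_, "xor": operator.xor}


def run_loop(op_name, init, inv, niter):
    """Apply value = value <op> inv niter times, as the source loop would."""
    value = init
    for _ in range(niter):
        value = OPS[op_name](value, inv)
    return value


def closed_form(op_name, init, inv, niter):
    """Compute the same final value without iterating."""
    if niter == 0:
        return init
    if op_name in ("and", "or"):
        # Idempotent: one application already gives the final value.
        return OPS[op_name](init, inv)
    if op_name == "xor":
        # Two applications cancel, so only the parity of niter matters.
        return init ^ inv if niter % 2 else init
    raise ValueError(op_name)


print(run_loop("and", 0b1101, 0b0110, 64) == closed_form("and", 0b1101, 0b0110, 64))
```

The closed form matches the loop for any non-negative iteration count, so the
loop body is not needed to obtain the value used after the loop.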

> Hi Richard,
> 
> Thanks again for your reviewing.
> 
> > Yes, use else if for the bitwise induction.  Can you also make the new
> > case conditional on 'def'
> > (the compute_overall_effect_of_inner_loop) being chrec_dont_know?  If
> > that call produced something useful it will not be of either of the two 
> > special
> forms.
> > Thus like
> >
> >   if (def != chrec_dont_know)
> > /* Already OK.  */
> > ;
> >  else if ((bitinv_def = ...)
> > ..
> >  else if (tree_fits_uhwi_p (niter)
> >  ... bitwise induction case...)
> > ...
> >
> Yes, I fixed it in new patch. Thanks.
> Ok for master ?
> 
> Thanks,
> Lingling
> 
> > -Original Message-
> > From: Richard Biener 
> > Sent: Wednesday, September 14, 2022 4:16 PM
> > To: Kong, Lingling 
> > Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao 
> > Subject: Re: [PATCH] Enhance final_value_replacement_loop to handle
> > bitop with an invariant induction.[PR105735]
> >
> > On Tue, Sep 13, 2022 at 9:54 AM Kong, Lingling
> > 
> > wrote:
> > >
> > > Hi Richard,
> > >
> > > Thank you so much for reviewing this patch.  I really appreciate
> > > it. I have made some changes in response to the review comments.
> > >
> > > > That's a single-stmt match, you shouldn't use match.pd matching for 
> > > > this.
> > > > Instead just do
> > > >
> > > >   if (is_gimple_assign (stmt)
> > > >   && ((code = gimple_assign_rhs_code (stmt)), true)
> > > >   && (code == BIT_AND_EXPR || code == BIT_IOR_EXPR || code ==
> > > > BIT_XOR_EXPR))
> > >
> > > Yes, I fixed it and dropped modification for match.pd.
> > >
> > > > and pick gimple_assign_rhs{1,2} (stmt) as the operands.  The :c in
> > > > bit_op:c is redundant btw. - while the name suggests "with
> > > > invariant" you don't actually check for that.  But again, given
> > > > canonicalization rules the invariant will be rhs2 so above add
> > > >
> > > > && TREE_CODE (gimple_assign_rhs2 (stmt)) == INTEGER_CST
> > >
> > > For "with invariant", op1 needs to be invariant, and I used
> > > `expr_invariant_in_loop_p (loop, match_op[0])` to check that.
> > > op2 only needs to be a PHI. If op2 is an INTEGER_CST, existing GCC can
> > > already optimize the loop directly and needs no modification.
> > >
> > > > you probably need dg-require-effective-target longlong, but is it
> > > > necessary to use long long for the testcases in the first place?
> > > > The IV seems to be unused, if it should match the variables bit
> > > > size use sizeof
> > > > (type) * 8
> > >
> > > Yes, it is not necessary to use long long for the testcases.  I
> > > changed the type to unsigned int.
> > >
> > > > > +  inv = PHI_ARG_DEF_FROM_EDGE (header_phi, loop_preheader_edge
> > > > > + (loop));  return fold_build2 (code1, type, inv, match_op[0]);
> > > > > + }
> > > >
> > > > The } goes to the next line.
> > >
> > > Sorry, there might be something wrong with my use of the gcc send-email format.
> > >
> > > > > +  tree bitinv_def;
> > > > > +  if ((bitinv_def
> > > >
> > > > please use else if here
> > >
> > > Sorry, if I use else if here, there is no corresponding if above.
> > > I'm not sure whether you mean changing the if of the bitwise
> > > induction expression to else if.
> >
> > Yes, use else if for the bitwise induction.  Can you also make the new
> > case conditional on 'def'
> > (the compute_overall_effect_of_inner_loop) being chrec_dont_know?  If
> > that call produced something useful it will not be of either of the two
> > special forms.
> > Thus like
> >
> >   if (def != chrec_dont_know)
> > /* Already OK.  */
> > ;
> >  else if ((bitinv_def = ...)
> > ..
> >  else if (tree_fits_uhwi_p (niter)
> >  ... bitwise induction case...)
> > ...
> >
> > ?
> >
> > Otherwise looks OK now.
> >
> > Thanks,
> > Richard.
> >
> > > Do you agree with these changes?  Thanks again for taking a look.
> > >
> > > Thanks,
> > > Lingling
> > >
> > > > -----Original Message-----
> > > > From: Richard Biener 
> > > > Sent: Tuesday, August 23, 2022 3:27 PM
> > > > To: Kong, Lingling 
> > > > Cc: Liu, Hongtao ; gcc-patches@gcc.gnu.org
> > > > Subject: Re: [PATCH] Enhance final_value_replacement_loop to
> > > > handle bitop with an invariant induction.[PR105735]
> > > >
> > > > On Thu, Aug 18, 2022 at 8:48 AM Kong, Lingling via Gcc-patches wrote:
> > > > >
> > > > > Hi,
> > > > >
> > > > > This patch is for pr105735/pr101991. It enables the optimization
> > > > > below:
> > > > > {
> > > > > -  long unsigned int bit;
> > > > > -
> > > > > -   [local count: 32534376]:
> > > > > -
> > > > > -   [local count: 1041207449]:
> > > > > -  # tmp_10 = PHI 
> > > > > -  # bit_12 = PHI 
> > > > > -  tmp_7 = bit2_6(D) & tmp_10;
> > > > > -  bit_8 = bit_12 + 1;
> > > > > -  if (bit_8 != 32)
> > > > > -goto ; [96.97%]
> > > > > -  else
> > > > > -goto ; [3.03%]
> > > > > -
> > > > > -   [local count: 1009658865]:
> > > > > -  goto ; [100.00%]
> > > > > -
> > > > > -   [local count: 32534376]:
> > > >
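
The loop in the dump above repeatedly combines a PHI value with a loop-invariant operand (tmp = bit2 & tmp). As a rough illustration of why its final value can be computed without running the loop, here is a minimal sketch in plain C. The names bitop_loop, bitop_final_value and enum bitop are hypothetical and do not appear in the patch; the closed form is the plain mathematical identity for AND/IOR/XOR with an invariant, not a copy of what final_value_replacement_loop actually builds.

```c
#include <stdint.h>

/* Hypothetical names for illustration only; none of this is GCC code.  */
enum bitop { OP_AND, OP_IOR, OP_XOR };

/* Apply one bitwise operation.  */
static uint32_t apply(enum bitop op, uint32_t a, uint32_t b)
{
  switch (op) {
    case OP_AND: return a & b;
    case OP_IOR: return a | b;
    default:     return a ^ b;
  }
}

/* Loop form: tmp = tmp OP inv, executed niter times, with inv
   invariant in the loop (like bit2_6(D) in the dump).  */
uint32_t bitop_loop(enum bitop op, uint32_t tmp, uint32_t inv, unsigned niter)
{
  for (unsigned i = 0; i < niter; i++)
    tmp = apply(op, tmp, inv);
  return tmp;
}

/* Closed form of the value after the loop.  AND and IOR are idempotent,
   so one application is enough once the body has run at least once.
   XOR cancels itself, so only the parity of niter matters.  */
uint32_t bitop_final_value(enum bitop op, uint32_t tmp, uint32_t inv,
                           unsigned niter)
{
  if (niter == 0)
    return tmp;
  if (op == OP_XOR)
    return (niter & 1) ? (tmp ^ inv) : tmp;
  return apply(op, tmp, inv);
}
```

For the 32-iteration AND loop shown in the dump, bitop_final_value (OP_AND, tmp, bit2, 32) reduces to tmp & bit2, which is the kind of single statement the replacement is meant to leave behind.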

Re: [PATCH] Support 64-bit vectorization for single-precision floating rounding operation.

2022-09-20 Thread Uros Bizjak via Gcc-patches

On Tue, Sep 20, 2022 at 4:15 AM liuhongt via Gcc-patches wrote:
>
> Here is the list of operations the patch supports:
> rint/nearbyint/ceil/floor/trunc/lrint/lceil/lfloor/round/lround.
>
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}
> Ok for trunk?
>
> gcc/ChangeLog:
>
> PR target/106910
> * config/i386/mmx.md (nearbyintv2sf2): New expander.
> (rintv2sf2): Ditto.
> (ceilv2sf2): Ditto.
> (lceilv2sfv2si2): Ditto.
> (floorv2sf2): Ditto.
> (lfloorv2sfv2si2): Ditto.
> (btruncv2sf2): Ditto.
> (lrintv2sfv2si2): Ditto.
> (roundv2sf2): Ditto.
> (lroundv2sfv2si2): Ditto.
> (*mmx_roundv2sf2): New define_insn.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/i386/pr106910-1.c: New test.

OK.

Thanks,
Uros.

> ---
>  gcc/config/i386/mmx.md | 154 +
>  gcc/testsuite/gcc.target/i386/pr106910-1.c |  77 +++
>  2 files changed, 231 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr106910-1.c
>
> diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
> index dda4b43f5c1..222a041de58 100644
> --- a/gcc/config/i386/mmx.md
> +++ b/gcc/config/i386/mmx.md
> @@ -1627,6 +1627,160 @@ (define_expand "vec_initv2sfsf"
>DONE;
>  })
>
> +;
> +;;
> +;; Parallel single-precision floating point rounding operations.
> +;;
> +;
> +
> +(define_expand "nearbyintv2sf2"
> +  [(set (match_operand:V2SF 0 "register_operand")
> +   (unspec:V2SF
> + [(match_operand:V2SF 1 "register_operand")
> +  (match_dup 2)]
> + UNSPEC_ROUND))]
> +  "TARGET_SSE4_1 && TARGET_MMX_WITH_SSE"
> +  "operands[2] = GEN_INT (ROUND_MXCSR | ROUND_NO_EXC);")
> +
> +(define_expand "rintv2sf2"
> +  [(set (match_operand:V2SF 0 "register_operand")
> +   (unspec:V2SF
> + [(match_operand:V2SF 1 "register_operand")
> +  (match_dup 2)]
> + UNSPEC_ROUND))]
> +  "TARGET_SSE4_1 && TARGET_MMX_WITH_SSE"
> +  "operands[2] = GEN_INT (ROUND_MXCSR);")
> +
> +(define_expand "ceilv2sf2"
> +  [(set (match_operand:V2SF 0 "register_operand")
> +   (unspec:V2SF
> + [(match_operand:V2SF 1 "register_operand")
> +  (match_dup 2)]
> + UNSPEC_ROUND))]
> +  "TARGET_SSE4_1 && !flag_trapping_math
> +   && TARGET_MMX_WITH_SSE"
> +  "operands[2] = GEN_INT (ROUND_CEIL | ROUND_NO_EXC);")
> +
> +(define_expand "lceilv2sfv2si2"
> +  [(match_operand:V2SI 0 "register_operand")
> +   (match_operand:V2SF 1 "register_operand")]
> + "TARGET_SSE4_1 && !flag_trapping_math
> +  && TARGET_MMX_WITH_SSE"
> +{
> +  rtx tmp = gen_reg_rtx (V2SFmode);
> +  emit_insn (gen_ceilv2sf2 (tmp, operands[1]));
> +  emit_insn (gen_fix_truncv2sfv2si2 (operands[0], tmp));
> +  DONE;
> +})
> +
> +(define_expand "floorv2sf2"
> +  [(set (match_operand:V2SF 0 "register_operand")
> +   (unspec:V2SF
> + [(match_operand:V2SF 1 "vector_operand")
> +  (match_dup 2)]
> + UNSPEC_ROUND))]
> +  "TARGET_SSE4_1 && !flag_trapping_math
> +  && TARGET_MMX_WITH_SSE"
> +  "operands[2] = GEN_INT (ROUND_FLOOR | ROUND_NO_EXC);")
> +
> +(define_expand "lfloorv2sfv2si2"
> +  [(match_operand:V2SI 0 "register_operand")
> +   (match_operand:V2SF 1 "register_operand")]
> + "TARGET_SSE4_1 && !flag_trapping_math
> +  && TARGET_MMX_WITH_SSE"
> +{
> +  rtx tmp = gen_reg_rtx (V2SFmode);
> +  emit_insn (gen_floorv2sf2 (tmp, operands[1]));
> +  emit_insn (gen_fix_truncv2sfv2si2 (operands[0], tmp));
> +  DONE;
> +})
> +
> +(define_expand "btruncv2sf2"
> +  [(set (match_operand:V2SF 0 "register_operand")
> +   (unspec:V2SF
> + [(match_operand:V2SF 1 "register_operand")
> +  (match_dup 2)]
> + UNSPEC_ROUND))]
> +  "TARGET_SSE4_1 && !flag_trapping_math"
> +  "operands[2] = GEN_INT (ROUND_TRUNC | ROUND_NO_EXC);")
> +
> +(define_insn "*mmx_roundv2sf2"
> +  [(set (match_operand:V2SF 0 "register_operand" "=Yr,*x,v")
> +   (unspec:V2SF
> + [(match_operand:V2SF 1 "register_operand" "Yr,x,v")
> +  (match_operand:SI 2 "const_0_to_15_operand")]
> + UNSPEC_ROUND))]
> +  "TARGET_SSE4_1 && TARGET_MMX_WITH_SSE"
> +  "%vroundps\t{%2, %1, %0|%0, %1, %2}"
> +  [(set_attr "isa" "noavx,noavx,avx")
> +   (set_attr "type" "ssecvt")
> +   (set_attr "prefix_data16" "1,1,*")
> +   (set_attr "prefix_extra" "1")
> +   (set_attr "length_immediate" "1")
> +   (set_attr "prefix" "orig,orig,vex")
> +   (set_attr "mode" "V4SF")])
> +
> +(define_insn "lrintv2sfv2si2"
> +  [(set (match_operand:V2SI 0 "register_operand" "=v")
> +   (unspec:V2SI
> + [(match_operand:V2SF 1 "register_operand" "v")]
> + UNSPEC_FIX_NOTRUNC))]
> +  "TARGET_MMX_WITH_SSE"
> +  "%vcvtps2dq\t{%1, %0|%0, %1}"
> +  [(set_attr "type" "ssecvt")
> +   (set (attr "prefix_data16")
> + (if_then_else
> +   (match_test "TARGET_AVX")
> +
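
The expanders quoted above differ mainly in the immediate they place in operands[2], which *mmx_roundv2sf2 then requires to satisfy const_0_to_15_operand. The following minimal sketch in plain C tabulates that mapping. The numeric values of the ROUND_* constants are an assumption here: they are written from memory of GCC's i386.h and the SSE4.1 _MM_FROUND_* encoding and are not part of the quoted patch. The helpers round_imm_for and fits_const_0_to_15 are hypothetical names.

```c
#include <string.h>

/* Assumed values, mirroring the SSE4.1 rounding-control encoding;
   they are not taken from the quoted patch.  */
enum {
  ROUND_FLOOR  = 0x1,  /* round toward negative infinity */
  ROUND_CEIL   = 0x2,  /* round toward positive infinity */
  ROUND_TRUNC  = 0x3,  /* round toward zero */
  ROUND_MXCSR  = 0x4,  /* use the current MXCSR rounding mode */
  ROUND_NO_EXC = 0x8   /* suppress the precision exception */
};

/* Hypothetical helper: return the immediate each quoted expander
   stores in operands[2], or -1 for an unknown pattern name.  */
int round_imm_for(const char *pattern)
{
  if (strcmp(pattern, "nearbyintv2sf2") == 0)
    return ROUND_MXCSR | ROUND_NO_EXC;
  if (strcmp(pattern, "rintv2sf2") == 0)
    return ROUND_MXCSR;
  if (strcmp(pattern, "ceilv2sf2") == 0)
    return ROUND_CEIL | ROUND_NO_EXC;
  if (strcmp(pattern, "floorv2sf2") == 0)
    return ROUND_FLOOR | ROUND_NO_EXC;
  if (strcmp(pattern, "btruncv2sf2") == 0)
    return ROUND_TRUNC | ROUND_NO_EXC;
  return -1;
}

/* Hypothetical stand-in for the const_0_to_15_operand range check.  */
int fits_const_0_to_15(int imm)
{
  return imm >= 0 && imm <= 15;
}
```

Under these assumed values, rintv2sf2 is the only expander in the list that leaves ROUND_NO_EXC clear, which matches the usual distinction between rint (may raise inexact) and nearbyint (must not).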
