Re: Adding a new thread model to GCC

2022-10-02 Thread Bernhard Reutner-Fischer via Gcc-patches
On 2 October 2022 14:54:54 CEST, LIU Hao  wrote:
>On 2022-10-02 04:02, Bernhard Reutner-Fischer wrote:
>> On 1 October 2022 20:34:45 CEST, LIU Hao via Gcc-patches 
>>  wrote:
>>> Greetings.
>> 
>>> The first patch is necessary because somewhere in libgfortran, `pthread_t` 
>>> is referenced. If the thread model is not `posix`, it fails to compile.
>> 
>> One of several shortcomings mentioned already on Sun, 02 Sep 2018 15:40:28 
>> -0700 in
>> https://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg196212.html
>> 
>
>Forgive me but I didn't get your point. Is the 'shortcoming' the fact that 
>`pthread_t` must be preferred to `__gthread_t`?

No, sorry for my brevity.
Using __gthread_t like in your patch is correct.

thanks,

>
>For non-posix thread models, <pthread.h> is not included, so `pthread_t` is 
>not declared. I haven't looked at other code in libgfortran, but changing 
>`pthread_t` to `__gthread_t` does allow libgfortran to build. I don't know how 
>to test it though, as I don't write Fortran myself.
>
>



Re: [PATCH] RISC-V missing __builtin_lceil and __builtin_lfloor

2022-10-02 Thread Kevin Lee
Thank you for the update Palmer. I'll certainly look into the corrupted
patch issue and the floating-point test failure in gfortran.

On Sun, Oct 2, 2022 at 1:42 PM Palmer Dabbelt  wrote:

> On Sat, 17 Sep 2022 14:16:13 PDT (-0700), Kito Cheng wrote:
> > LGTM, thanks, I guess I just missed this before
>
> No worries, I'd just stumbled on it looking through old stuff.
>
> Kevin: Looks like this got corrupted, possibly from copy/paste into
> gmail.  I resurrected it, but there's a floating-point test failure in
> gfortran.  Looks like it predates this, but I'm trying to bisect it to
> at least have a root cause before just ignoring it.  I've got this
> floating around on a branch and hopefully that'll remind me to commit
> it after I sort that out.
>
> >
> > On Sat, 17 Sep 2022 at 23:07, Palmer Dabbelt wrote:
> >
> >> On Mon, 15 Aug 2022 17:44:35 PDT (-0700), kev...@rivosinc.com wrote:
> >> > Hello,
> >> > Currently, __builtin_lceil and __builtin_lfloor don't generate an
> >> > existing instruction fcvt, but rather calls ceil and floor from the
> >> > library. This patch adds the missing iterator and attributes for lceil
> >> and
> >> > lfloor to produce the optimized code.
> >> >  The test cases check the correct generation of the fcvt instruction
> for
> >> > float/double to int/long/long long. Passed the test in riscv-linux.
> >> > Could this patch be committed?
> >>
> >> Reviewed-by: Palmer Dabbelt 
> >> Acked-by: Palmer Dabbelt 
> >>
> >> Not sure if Kito had any comments for this one, but it looks good to me.
> >>
> >> > gcc/ChangeLog:
> >> >Michael Collison  
> >> > * config/riscv/riscv.md (RINT): Add iterator for lceil and lfloor.
> >> > (rint_pattern): Add ceil and floor.
> >> > (rint_rm): Add rup and rdn.
> >> >
> >> > gcc/testsuite/ChangeLog:
> >> > Kevin Lee  
> >> > * gcc.target/riscv/lfloor-lceil.c: New test.
> >> > ---
> >> >  gcc/config/riscv/riscv.md | 13 ++-
> >> >  gcc/testsuite/gcc.target/riscv/lfloor-lceil.c | 79
> +++
> >> >  2 files changed, 88 insertions(+), 4 deletions(-)
> >> >  create mode 100644 gcc/testsuite/gcc.target/riscv/lfloor-lceil.c
> >> >
> >> > diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
> >> > index c6399b1389e..070004fa7fe 100644
> >> > --- a/gcc/config/riscv/riscv.md
> >> > +++ b/gcc/config/riscv/riscv.md
> >> > @@ -43,6 +43,9 @@ (define_c_enum "unspec" [
> >> >UNSPEC_LRINT
> >> >UNSPEC_LROUND
> >> >
> >> > +  UNSPEC_LCEIL
> >> > +  UNSPEC_LFLOOR
> >> > +
> >> >;; Stack tie
> >> >UNSPEC_TIE
> >> >  ])
> >> > @@ -345,10 +348,12 @@ (define_mode_attr UNITMODE [(SF "SF") (DF
> "DF")])
> >> >  ;; the controlling mode.
> >> >  (define_mode_attr HALFMODE [(DF "SI") (DI "SI") (TF "DI")])
> >> >
> >> > -;; Iterator and attributes for floating-point rounding instructions.
> >> > -(define_int_iterator RINT [UNSPEC_LRINT UNSPEC_LROUND])
> >> > -(define_int_attr rint_pattern [(UNSPEC_LRINT "rint") (UNSPEC_LROUND
> >> > "round")])
> >> > -(define_int_attr rint_rm [(UNSPEC_LRINT "dyn") (UNSPEC_LROUND
> "rmm")])
> >> > +;; Iterator and attributes for floating-point rounding instructions.f
> >> > +(define_int_iterator RINT [UNSPEC_LRINT UNSPEC_LROUND UNSPEC_LCEIL
> >> > UNSPEC_LFLOOR])
> >> > +(define_int_attr rint_pattern [(UNSPEC_LRINT "rint") (UNSPEC_LROUND
> >> > "round")
> >> > + (UNSPEC_LCEIL "ceil") (UNSPEC_LFLOOR
> >> > "floor")])
> >> > +(define_int_attr rint_rm [(UNSPEC_LRINT "dyn") (UNSPEC_LROUND "rmm")
> >> > +(UNSPEC_LCEIL "rup") (UNSPEC_LFLOOR "rdn")])
> >> >
> >> >  ;; Iterator and attributes for quiet comparisons.
> >> >  (define_int_iterator QUIET_COMPARISON [UNSPEC_FLT_QUIET
> >> UNSPEC_FLE_QUIET])
> >> > diff --git a/gcc/testsuite/gcc.target/riscv/lfloor-lceil.c
> >> > b/gcc/testsuite/gcc.target/riscv/lfloor-lceil.c
> >> > new file mode 100644
> >> > index 000..4d81c12cefa
> >> > --- /dev/null
> >> > +++ b/gcc/testsuite/gcc.target/riscv/lfloor-lceil.c
> >> > @@ -0,0 +1,79 @@
> >> > +/* { dg-do compile } */
> >> > +/* { dg-options "-march=rv64gc -mabi=lp64d" } */
> >> > +/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" } } */
> >> > +
> >> > +int
> >> > +ceil1(float i)
> >> > +{
> >> > +  return __builtin_lceil(i);
> >> > +}
> >> > +
> >> > +long
> >> > +ceil2(float i)
> >> > +{
> >> > +  return __builtin_lceil(i);
> >> > +}
> >> > +
> >> > +long long
> >> > +ceil3(float i)
> >> > +{
> >> > +  return __builtin_lceil(i);
> >> > +}
> >> > +
> >> > +int
> >> > +ceil4(double i)
> >> > +{
> >> > +  return __builtin_lceil(i);
> >> > +}
> >> > +
> >> > +long
> >> > +ceil5(double i)
> >> > +{
> >> > +  return __builtin_lceil(i);
> >> > +}
> >> > +
> >> > +long long
> >> > +ceil6(double i)
> >> > +{
> >> > +  return __builtin_lceil(i);
> >> > +}
> >> > +
> >> > +int
> >> > +floor1(float i)
> >> > +{
> >> > +  return __builtin_lfloor(i);
> >> > +}
> >> > +
> >> > +long
> >> > +floor2(float i)
> >> > +{
> >> 

Re: [PATCH] RISC-V missing __builtin_lceil and __builtin_lfloor

2022-10-02 Thread Palmer Dabbelt

On Sat, 17 Sep 2022 14:16:13 PDT (-0700), Kito Cheng wrote:

LGTM, thanks, I guess I just missed this before


No worries, I'd just stumbled on it looking through old stuff.

Kevin: Looks like this got corrupted, possibly from copy/paste into 
gmail.  I resurrected it, but there's a floating-point test failure in 
gfortran.  Looks like it predates this, but I'm trying to bisect it to 
at least have a root cause before just ignoring it.  I've got this 
floating around on a branch and hopefully that'll remind me to commit 
it after I sort that out.




On Sat, 17 Sep 2022 at 23:07, Palmer Dabbelt wrote:


On Mon, 15 Aug 2022 17:44:35 PDT (-0700), kev...@rivosinc.com wrote:
> Hello,
> Currently, __builtin_lceil and __builtin_lfloor don't generate an
> existing instruction fcvt, but rather calls ceil and floor from the
> library. This patch adds the missing iterator and attributes for lceil
and
> lfloor to produce the optimized code.
>  The test cases check the correct generation of the fcvt instruction for
> float/double to int/long/long long. Passed the test in riscv-linux.
> Could this patch be committed?

Reviewed-by: Palmer Dabbelt 
Acked-by: Palmer Dabbelt 

Not sure if Kito had any comments for this one, but it looks good to me.

> gcc/ChangeLog:
>Michael Collison  
> * config/riscv/riscv.md (RINT): Add iterator for lceil and lfloor.
> (rint_pattern): Add ceil and floor.
> (rint_rm): Add rup and rdn.
>
> gcc/testsuite/ChangeLog:
> Kevin Lee  
> * gcc.target/riscv/lfloor-lceil.c: New test.
> ---
>  gcc/config/riscv/riscv.md | 13 ++-
>  gcc/testsuite/gcc.target/riscv/lfloor-lceil.c | 79 +++
>  2 files changed, 88 insertions(+), 4 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/lfloor-lceil.c
>
> diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
> index c6399b1389e..070004fa7fe 100644
> --- a/gcc/config/riscv/riscv.md
> +++ b/gcc/config/riscv/riscv.md
> @@ -43,6 +43,9 @@ (define_c_enum "unspec" [
>UNSPEC_LRINT
>UNSPEC_LROUND
>
> +  UNSPEC_LCEIL
> +  UNSPEC_LFLOOR
> +
>;; Stack tie
>UNSPEC_TIE
>  ])
> @@ -345,10 +348,12 @@ (define_mode_attr UNITMODE [(SF "SF") (DF "DF")])
>  ;; the controlling mode.
>  (define_mode_attr HALFMODE [(DF "SI") (DI "SI") (TF "DI")])
>
> -;; Iterator and attributes for floating-point rounding instructions.
> -(define_int_iterator RINT [UNSPEC_LRINT UNSPEC_LROUND])
> -(define_int_attr rint_pattern [(UNSPEC_LRINT "rint") (UNSPEC_LROUND
> "round")])
> -(define_int_attr rint_rm [(UNSPEC_LRINT "dyn") (UNSPEC_LROUND "rmm")])
> +;; Iterator and attributes for floating-point rounding instructions.f
> +(define_int_iterator RINT [UNSPEC_LRINT UNSPEC_LROUND UNSPEC_LCEIL
> UNSPEC_LFLOOR])
> +(define_int_attr rint_pattern [(UNSPEC_LRINT "rint") (UNSPEC_LROUND
> "round")
> + (UNSPEC_LCEIL "ceil") (UNSPEC_LFLOOR
> "floor")])
> +(define_int_attr rint_rm [(UNSPEC_LRINT "dyn") (UNSPEC_LROUND "rmm")
> +(UNSPEC_LCEIL "rup") (UNSPEC_LFLOOR "rdn")])
>
>  ;; Iterator and attributes for quiet comparisons.
>  (define_int_iterator QUIET_COMPARISON [UNSPEC_FLT_QUIET
UNSPEC_FLE_QUIET])
> diff --git a/gcc/testsuite/gcc.target/riscv/lfloor-lceil.c
> b/gcc/testsuite/gcc.target/riscv/lfloor-lceil.c
> new file mode 100644
> index 000..4d81c12cefa
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/lfloor-lceil.c
> @@ -0,0 +1,79 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64gc -mabi=lp64d" } */
> +/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" } } */
> +
> +int
> +ceil1(float i)
> +{
> +  return __builtin_lceil(i);
> +}
> +
> +long
> +ceil2(float i)
> +{
> +  return __builtin_lceil(i);
> +}
> +
> +long long
> +ceil3(float i)
> +{
> +  return __builtin_lceil(i);
> +}
> +
> +int
> +ceil4(double i)
> +{
> +  return __builtin_lceil(i);
> +}
> +
> +long
> +ceil5(double i)
> +{
> +  return __builtin_lceil(i);
> +}
> +
> +long long
> +ceil6(double i)
> +{
> +  return __builtin_lceil(i);
> +}
> +
> +int
> +floor1(float i)
> +{
> +  return __builtin_lfloor(i);
> +}
> +
> +long
> +floor2(float i)
> +{
> +  return __builtin_lfloor(i);
> +}
> +
> +long long
> +floor3(float i)
> +{
> +  return __builtin_lfloor(i);
> +}
> +
> +int
> +floor4(double i)
> +{
> +  return __builtin_lfloor(i);
> +}
> +
> +long
> +floor5(double i)
> +{
> +  return __builtin_lfloor(i);
> +}
> +
> +long long
> +floor6(double i)
> +{
> +  return __builtin_lfloor(i);
> +}
> +
> +/* { dg-final { scan-assembler-times "fcvt.l.s" 6 } } */
> +/* { dg-final { scan-assembler-times "fcvt.l.d" 6 } } */
> +/* { dg-final { scan-assembler-not "call" } } */
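
For readers unfamiliar with the two builtins, the following is a small reference model of what they are expected to compute, matching the rounding modes the patch selects for fcvt ("rup" for ceil, "rdn" for floor). It is only a sketch: the names ref_lfloor, ref_lceil and demo_rounding are made up for illustration and are not part of the patch, and the model is only meant for finite inputs whose result fits in a long.

```c
#include <assert.h>
#include <stdio.h>

/* Reference model of __builtin_lfloor for finite values that fit in a
   long: round toward negative infinity ("rdn" in the patch).  */
long
ref_lfloor (double x)
{
  long t = (long) x;	/* C conversion truncates toward zero.  */
  return (double) t > x ? t - 1 : t;
}

/* Reference model of __builtin_lceil: round toward positive infinity
   ("rup" in the patch).  */
long
ref_lceil (double x)
{
  long t = (long) x;
  return (double) t < x ? t + 1 : t;
}

/* Check and print a few sample values.  */
void
demo_rounding (void)
{
  assert (ref_lceil (2.3) == 3 && ref_lfloor (2.3) == 2);
  assert (ref_lceil (-2.3) == -2 && ref_lfloor (-2.3) == -3);
  printf ("%ld %ld\n", ref_lceil (2.3), ref_lfloor (-2.3));
}
```

The point of the patch is that a single fcvt with the right static rounding mode gives the same result without the library call and the separate conversion.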



Re: [PATCH] Enable shrink wrapping for the RISC-V target.

2022-10-02 Thread Palmer Dabbelt

On Tue, 06 Sep 2022 03:39:02 PDT (-0700), manolis.tsa...@vrull.eu wrote:

This commit implements the target macros (TARGET_SHRINK_WRAP_*) that
enable separate shrink wrapping for function prologues/epilogues in
RISC-V.

Tested against SPEC CPU 2017, this change always has a net-positive
effect on the dynamic instruction count.  See the following table for
the breakdown on how this reduces the number of dynamic instructions
per workload on a like-for-like (i.e., same config file; suppressing
shrink-wrapping with -fno-shrink-wrap):


Does this also pass the regression tests?

(there's also some comments on the code in-line)



                      # dynamic instructions
                  w/o shrink-wrap   w/ shrink-wrap     reduction
500.perlbench_r     1265716786593    1262156218578     3560568015   0.28%
500.perlbench_r      779224795689     765337009025    13887786664   1.78%
500.perlbench_r      724087331471     711307152522    12780178949   1.77%
502.gcc_r            204259864844     194517006339     9742858505   4.77%
502.gcc_r            244047794302     231555834722    12491959580   5.12%
502.gcc_r            230896069400     221877703011     9018366389   3.91%
502.gcc_r            192130616624     183856450605     8274166019   4.31%
502.gcc_r            258875074079     247756203226    11118870853   4.30%
505.mcf_r            662653430325     660678680547     1974749778   0.30%
520.omnetpp_r        985114167068     934191310154    50922856914   5.17%
523.xalancbmk_r      927037633578     921688937650     5348695928   0.58%
525.x264_r           490953958454     490565583447      388375007   0.08%
525.x264_r          1994662294421    1993171932425     1490361996   0.07%
525.x264_r          1897617120450    1896062750609     1554369841   0.08%
531.deepsjeng_r     1695189878907    1669304130411    25885748496   1.53%
541.leela_r         1925941222222    1897900861198    28040361024   1.46%
548.exchange2_r     2073816227944    2073816226729           1215   0.00%
557.xz_r             379572090003     379057409041      514680962   0.14%
557.xz_r             953117469352     952680431430      437037922   0.05%
557.xz_r             536859579650     536456690164      402889486   0.08%
                   18421773405376   18223938521833   197834883543   1.07%  totals
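
As an illustration of the kind of function that tends to benefit, here is a minimal sketch (not the shrink-wrap-1.c test from the patch; sum_prefix and process are hypothetical names). The early return needs no callee-saved registers, while the slow path keeps values live across calls. With separate shrink-wrapping, the saves and restores can be confined to the slow path instead of being executed on every entry.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper; noinline keeps a real call on the slow path.  */
__attribute__ ((noinline)) long
sum_prefix (const int *data, size_t n)
{
  long total = 0;
  for (size_t i = 0; i < n; i++)
    total += data[i];
  return total;
}

long
process (const int *data, size_t n)
{
  /* Fast path: no calls, so no callee-saved register has to be
     saved before this return.  */
  if (data == NULL || n == 0)
    return -1;

  /* Slow path: 'data', 'n' and 'whole' stay live across calls, which
     typically forces them into callee-saved registers.  */
  long whole = sum_prefix (data, n);
  long half = sum_prefix (data, n / 2);
  return whole * 31 + half;
}
```

Whether a given compiler version actually sinks the saves for this exact function depends on register allocation; comparing the output with and without -fno-shrink-wrap is the way to check.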

Signed-off-by: Manolis Tsamis 

gcc/ChangeLog:

* config/riscv/riscv.cc (struct machine_function): Add array to store
register wrapping information.
(riscv_for_each_saved_reg): Skip registers that are wrapped separately.
(riscv_get_separate_components): New function.
(riscv_components_for_bb): Likewise.
(riscv_disqualify_components): Likewise.
(riscv_process_components): Likewise.
(riscv_emit_prologue_components): Likewise.
(riscv_emit_epilogue_components): Likewise.
(riscv_set_handled_components): Likewise.
(TARGET_SHRINK_WRAP_GET_SEPARATE_COMPONENTS): Define.
(TARGET_SHRINK_WRAP_COMPONENTS_FOR_BB): Likewise.
(TARGET_SHRINK_WRAP_DISQUALIFY_COMPONENTS): Likewise.
(TARGET_SHRINK_WRAP_EMIT_PROLOGUE_COMPONENTS): Likewise.
(TARGET_SHRINK_WRAP_EMIT_EPILOGUE_COMPONENTS): Likewise.
(TARGET_SHRINK_WRAP_SET_HANDLED_COMPONENTS): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/shrink-wrap-1.c: New test.

---

 gcc/config/riscv/riscv.cc | 187 +-
 .../gcc.target/riscv/shrink-wrap-1.c  |  25 +++
 2 files changed, 210 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/shrink-wrap-1.c

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 5a0adffb5ce..3b633149a9a 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -25,6 +25,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "config.h"
 #include "system.h"
 #include "coretypes.h"
+#include "backend.h"
 #include "tm.h"
 #include "rtl.h"
 #include "regs.h"
@@ -52,6 +53,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "optabs.h"
 #include "bitmap.h"
 #include "df.h"
+#include "function-abi.h"
 #include "diagnostic.h"
 #include "builtins.h"
 #include "predict.h"
@@ -147,6 +149,11 @@ struct GTY(())  machine_function {

   /* The current frame information, calculated by riscv_compute_frame_info.  */
   struct riscv_frame_info frame;
+
+  /* The components already handled by separate shrink-wrapping, which should
+ not be considered by the prologue and epilogue.  */
+  bool reg_is_wrapped_separately[FIRST_PSEUDO_REGISTER];
+
 };

 /* Information about a single argument.  */
@@ -4209,7 +4216,7 @@ riscv_for_each_saved_reg (HOST_WIDE_INT sp_offset, 
riscv_save_restore_fn fn,
   for (unsigned int regno = GP_REG_FIRST; regno <= GP_REG_LAST; regno++)
 if (BITSET_P (cfun->machine->frame.mask, regno - GP_REG_FIRST))
   {
-   bool handle_reg = TRUE;
+   bool handle_reg = !cfun->machine->reg_is_wrapped_separately[regno];

/* If this is a normal 

[patch, RFC. Fortran] Some clobbering for INTENT(OUT) arrays

2022-10-02 Thread Thomas Koenig via Gcc-patches


Hi,

following Mikael's recent patch series, here is a first idea
of what extending clobbering to arrays would look like.

The attached patch works for a subset of cases, for example

program main
  implicit none
  interface
subroutine foo(a)
  integer, intent(out) :: a(*)
end subroutine foo
  end interface
  integer, dimension(10) :: a
  call foo(a)
end program main

and

program main
  implicit none
  interface
subroutine foo(a)
  integer, intent(out) :: a(:)
end subroutine foo
  end interface
  integer, dimension(10) :: a
  a(1) = 32
  a(2) = 32
  call foo(a)
end program main

but it does not cover cases like an assumed-size array
being handed down to an INTENT(OUT) argument.
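
A rough C analogy of the second Fortran example may help readers who do not write Fortran; fill_squares and caller are hypothetical names. The two stores before the call are dead because the callee overwrites every element without reading it. In C the compiler can only prove that when it sees the callee's body. In Fortran, INTENT(OUT) makes the actual argument undefined on entry, and the clobber is how the front end can pass that knowledge on to the middle end.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callee that overwrites every element without reading.  */
void
fill_squares (int *out, size_t n)
{
  for (size_t i = 0; i < n; i++)
    out[i] = (int) (i * i);
}

int
caller (void)
{
  int a[10];
  a[0] = 32;	/* Overwritten by the callee before anyone reads it.  */
  a[1] = 32;	/* Likewise.  */
  fill_squares (a, 10);
  return a[0] + a[1] + a[9];
}
```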

What happens if the

+ if (!sym->attr.allocatable && !sym->attr.pointer
+ && !POINTER_TYPE_P (TREE_TYPE 
(sym->backend_decl)))



part is taken out is that the whole descriptor can be clobbered in
such a case, which is of course not what is wanted.

I am a bit stuck on how to generate a reference to the first element
of the array (really, just dereferencing the data pointer)
in the most elegant way.  I am currently leaning towards
building a gfc_expr, which should work, but would be less
than elegant.

So, anything more elegant at hand?

Best regards

Thomas

diff --git a/gcc/fortran/trans-expr.cc b/gcc/fortran/trans-expr.cc
index 4f3ae82d39c..bbb00f90a77 100644
--- a/gcc/fortran/trans-expr.cc
+++ b/gcc/fortran/trans-expr.cc
@@ -43,6 +43,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "gimplify.h"
 #include "tm.h"		/* For CHAR_TYPE_SIZE.  */
 
+#include "debug.h"
 
 /* Calculate the number of characters in a string.  */
 
@@ -5981,7 +5982,6 @@ post_call:
 gfc_add_block_to_block (>post, );
 }
 
-
 /* Generate code for a procedure call.  Note can return se->post != NULL.
If se->direct_byref is set then se->expr contains the return parameter.
Return nonzero, if the call has alternate specifiers.
@@ -6099,6 +6099,7 @@ gfc_conv_procedure_call (gfc_se * se, gfc_symbol * sym,
 {
   bool finalized = false;
   tree derived_array = NULL_TREE;
+  tree clobber_array = NULL_TREE;
 
   e = arg->expr;
   fsym = formal ? formal->sym : NULL;
@@ -6896,10 +6897,23 @@ gfc_conv_procedure_call (gfc_se * se, gfc_symbol * sym,
 	 fsym->attr.pointer);
 		}
 	  else
-		/* This is where we introduce a temporary to store the
-		   result of a non-lvalue array expression.  */
-		gfc_conv_array_parameter (, e, nodesc_arg, fsym,
-	  sym->name, NULL);
+		{
+		  /* This is where we introduce a temporary to store the
+		 result of a non-lvalue array expression.  */
+		  gfc_conv_array_parameter (, e, nodesc_arg, fsym,
+	sym->name, NULL);
+		  if (fsym && fsym->attr.intent == INTENT_OUT
+		  && gfc_full_array_ref_p (e->ref, NULL))
+		{
+		  gfc_symbol *sym = e->symtree->n.sym;
+		  if (!sym->attr.allocatable && !sym->attr.pointer
+			  && !POINTER_TYPE_P (TREE_TYPE (sym->backend_decl)))
+			clobber_array
+			  = gfc_build_array_ref (e->symtree->n.sym->backend_decl,
+		 build_int_cst (size_type_node, 0),
+		 NULL_TREE, true, NULL_TREE);
+		}
+		}
 
 	  /* If an ALLOCATABLE dummy argument has INTENT(OUT) and is
 		 allocated on entry, it must be deallocated.
@@ -6952,6 +6966,13 @@ gfc_conv_procedure_call (gfc_se * se, gfc_symbol * sym,
    tmp, build_empty_stmt (input_location));
 		  gfc_add_expr_to_block (>pre, tmp);
 		}
+
+	  if (clobber_array != NULL_TREE)
+		{
+		  tree clobber;
+		  clobber = build_clobber (TREE_TYPE(clobber_array));
+		  gfc_add_modify (, clobber_array, clobber);
+		}
 	}
 	}
   /* Special case for an assumed-rank dummy argument. */


Re: [Patch] libgomp/nvptx: Prepare for reverse-offload callback handling

2022-10-02 Thread Tobias Burnus

On 27.09.22 11:23, Tobias Burnus wrote:

We do support
 #if __PTX_SM__ >= 600  (CUDA >= 8.0, ptx isa >= 5.0)
and we also can configure GCC with
 --with-arch=sm_70 (or sm_80 or ...)
Thus, adding atomics with .sys scope is possible.

See attached patch. This seems to work fine and I hope I got the
assembly right in terms of atomic use. (And I do believe that the
.release/.acquire do not need an additional __sync_synchronize()/"membar.sys".)

Regarding this:

While 'atom.op' (op = and/or/xor/cas/exch/add/inc/dec/min/max)
with scope is a sm_60 feature, the used 'st/ld' with scope qualifier
and .relaxed, .release / .relaxed, .acquire require sm_70.

(Does not really matter as only ..., sm_53 and sm_70, ... are currently
supported but not sm_60, but the #if should obviously be fixed.)

* * *

Looking at the generated code without inline assembler, we have instead of
 st.global.release.sys.u64 [%r27],%r39;
and
 ld.acquire.sys.global.u64 %r62,[%r27];
for the older systems (__PTX_SM__ < 700) the code:
 @ %r69 membar.sys;
 @ %r69 atom.exch.b64 _,[%r27],%r41;
and
 ld.global.u64 %r64,[__gomp_rev_offload_var];
 ld.u64 %r36,[%r64];
 membar.sys;

In my understanding, the membar.sys ensures - similar to
 st.release / ld.acquire
that the memory handling is done in the correct order in scope .sys.
The 'fn' variable is initially 0 and is then only set by the device,
i.e. there is eventually a DMA write device->host, which is atomic
as the whole int64_t is written at once (and not first, e.g., the lower
and then the upper half). The 'st'/'atom.exch' should work fine, despite
having no .sys scope.

Likewise, the membar.sys applies also in the other direction. Or did I
miss something? If so, would an explicit __sync_synchronize() (= membar.sys)
help between the 'st' and the 'ld'?
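
For reference, the ordering being discussed can be written down as a host-side C11 analogy. This is only a sketch of the release/acquire handshake, not the PTX code from the patch: rev_offload_slot, publish and try_consume are hypothetical names, and the example says nothing about whether membar.sys is sufficient on pre-sm_70 devices.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical mailbox: 'fn' is 0 while idle and nonzero when a
   request has been published.  */
struct rev_offload_slot
{
  _Atomic uint64_t fn;
  uint64_t payload;
};

/* Producer: write the payload first, then publish 'fn' with release
   semantics so the payload is visible to whoever observes 'fn'.  */
void
publish (struct rev_offload_slot *slot, uint64_t payload, uint64_t fn)
{
  slot->payload = payload;
  atomic_store_explicit (&slot->fn, fn, memory_order_release);
}

/* Consumer: an acquire load of 'fn' orders the later payload read.
   Returns 1 and stores the payload if a request was pending.  */
int
try_consume (struct rev_offload_slot *slot, uint64_t *payload_out)
{
  uint64_t fn = atomic_load_explicit (&slot->fn, memory_order_acquire);
  if (fn == 0)
    return 0;
  *payload_out = slot->payload;
  atomic_store_explicit (&slot->fn, 0, memory_order_release);
  return 1;
}
```

The single 64-bit store to 'fn' corresponds to the "written at once" argument above: the consumer sees either 0 or the complete value, never half of it.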

Tobias


-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955


[Patch] Fortran: Add OpenMP's assume(s) directives

2022-10-02 Thread Tobias Burnus

This patch adds '!$omp assume' and '!$omp assumes' support.
None of the directives is used after resolution.

When we actually start using it for 'assumes', it has to be stored in .mod
files. The other question is how to handle 'holds()' expressions with 'assumes'.

-fopenmp-simd: I used a longer wording to imply that not only the 'simd' but
all SIMD directives are enabled.

OK for mainline?

Tobias

PS: For 'assume' with holds clause, the same applies as for Jakub's 
commit/patch:
"openmp: Add OpenMP assume, assumes and begin/end assumes support"
https://gcc.gnu.org/r13-3020-gd01bd0b0f3b8f4c33c437ff10f0b949200627f56
Namely, it requires that the following - now half-approved - patch is committed:
"[PATCH]
 c++, c: Implement C++23 P1774R8 - Portable assumptions [PR106654]"
https://gcc.gnu.org/pipermail/gcc-patches/2022-September/601991.html

PPS: I intend to handle, in a separate patch, the new rules for where
OpenMP specification-part directives may be placed (i.e. after USE/INTENT/IMPORT)
for all declarative + informational directives; the latter include the 'assumes'
directive.


Fortran: Add OpenMP's assume(s) directives

gcc/ChangeLog:

	* doc/invoke.texi (-fopenmp-simd): Document that also 'assume'
	is enabled.

libgomp/ChangeLog:

	* libgomp.texi (OpenMP 5.1 Impl. Status): Mark 'assume' as 'Y'.

gcc/fortran/ChangeLog:

	* dump-parse-tree.cc (show_omp_assumes): New.
	(show_omp_clauses, show_namespace): Call it.
	(show_omp_node, show_code_node): Handle OpenMP ASSUME.
	* gfortran.h (enum gfc_statement): Add ST_OMP_ASSUME,
	ST_OMP_END_ASSUME and ST_OMP_ASSUMES.
	(gfc_exec_op): Add EXEC_OMP_ASSUME.
	(gfc_omp_assumptions): New struct.
	(gfc_get_omp_assumptions): New XCNEW #define.
	(gfc_omp_clauses, gfc_namespace): Add assume member.
	(gfc_resolve_omp_assumptions): New prototype.
	* match.h (gfc_match_omp_assume, gfc_match_omp_assumes): New.
	* openmp.cc (omp_code_to_statement): Declare.
	(gfc_free_omp_clauses): Free assume member and its struct data.
	(enum omp_mask2): Add OMP_CLAUSE_ASSUMPTIONS.
	(gfc_omp_absent_contains_clause): New.
	(gfc_match_omp_clauses): Call it; optionally use passed
	omp_clauses argument.
	(gfc_match_omp_assume, gfc_match_omp_assumes): New.
	(gfc_resolve_omp_assumptions): New.
	(resolve_omp_clauses): Call it.
	(gfc_resolve_omp_directive, omp_code_to_statement): Handle
	EXEC_OMP_ASSUME.
	* parse.cc (decode_omp_directive): Parse OpenMP ASSUME(S).
	(next_statement, parse_executable, parse_omp_structured_block):
	Handle ST_OMP_ASSUME.
	(case_omp_decl): Add ST_OMP_ASSUMES.
	(gfc_ascii_statement): Handle Assumes, optional return
	string without '!$OMP '/'!$ACC ' prefix.
	(is_omp_declarative_stmt, is_omp_informational_stmt): New.
	* parse.h (gfc_ascii_statement): Add optional bool arg to prototype.
	(is_omp_declarative_stmt, is_omp_informational_stmt): New prototype.
	* resolve.cc (gfc_resolve_blocks, gfc_resolve_code): Add
	EXEC_OMP_ASSUME.
	(gfc_resolve): Resolve ASSUMES directive.
	* symbol.cc (gfc_free_namespace): Free omp_assumes member.
	* st.cc (gfc_free_statement): Handle EXEC_OMP_ASSUME.
	* trans-openmp.cc (gfc_trans_omp_directive): Likewise.
	* trans.cc (trans_code): Likewise.

gcc/testsuite/ChangeLog:

	* gfortran.dg/gomp/assume-1.f90: New test.
	* gfortran.dg/gomp/assume-2.f90: New test.
	* gfortran.dg/gomp/assumes-1.f90: New test.
	* gfortran.dg/gomp/assumes-2.f90: New test.

 gcc/doc/invoke.texi  |   6 +-
 gcc/fortran/dump-parse-tree.cc   |  42 
 gcc/fortran/gfortran.h   |  22 +-
 gcc/fortran/match.h  |   2 +
 gcc/fortran/openmp.cc| 331 ++-
 gcc/fortran/parse.cc |  53 -
 gcc/fortran/parse.h  |   4 +-
 gcc/fortran/resolve.cc   |   6 +
 gcc/fortran/st.cc|   1 +
 gcc/fortran/symbol.cc|   8 +-
 gcc/fortran/trans-openmp.cc  |   2 +
 gcc/fortran/trans.cc |   1 +
 gcc/testsuite/gfortran.dg/gomp/assume-1.f90  |  24 ++
 gcc/testsuite/gfortran.dg/gomp/assume-2.f90  |  27 +++
 gcc/testsuite/gfortran.dg/gomp/assumes-1.f90 |  84 +++
 gcc/testsuite/gfortran.dg/gomp/assumes-2.f90 |   7 +
 libgomp/libgomp.texi |   2 +-
 17 files changed, 608 insertions(+), 14 deletions(-)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index a5dc6377835..e3701555f12 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -2749,9 +2749,9 @@ have support for @option{-pthread}. 

[committed] tree-cfg: Fix a verification diagnostic typo [PR107121]

2022-10-02 Thread Jakub Jelinek via Gcc-patches
Hi!

Obvious typo in diagnostics.

Committed as obvious to trunk.

2022-10-02  Jakub Jelinek  

PR tree-optimization/107121
* tree-cfg.cc (verify_gimple_call): Fix a typo in diagnostics,
DEFFERED_INIT -> DEFERRED_INIT.

--- gcc/tree-cfg.cc.jj  2022-09-29 09:13:31.321641176 +0200
+++ gcc/tree-cfg.cc 2022-10-02 16:41:09.716365999 +0200
@@ -3510,7 +3510,7 @@ verify_gimple_call (gcall *stmt)
   if (is_constant_size_arg0 && is_constant_size_lhs)
if (maybe_ne (size_from_arg0, size_from_lhs))
  {
-   error ("%<DEFFERED_INIT%> calls should have same "
+   error ("%<DEFERRED_INIT%> calls should have same "
   "constant size for the first argument and LHS");
return true;
  }

Jakub



Re: Adding a new thread model to GCC

2022-10-02 Thread LIU Hao via Gcc-patches

On 2022-10-02 04:02, Bernhard Reutner-Fischer wrote:

On 1 October 2022 20:34:45 CEST, LIU Hao via Gcc-patches 
 wrote:

Greetings.



The first patch is necessary because somewhere in libgfortran, `pthread_t` is 
referenced. If the thread model is not `posix`, it fails to compile.


One of several shortcomings mentioned already on Sun, 02 Sep 2018 15:40:28 
-0700 in
https://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg196212.html



Forgive me but I didn't get your point. Is the 'shortcoming' the fact that `pthread_t` must be 
preferred to `__gthread_t`?


For non-posix thread models, <pthread.h> is not included, so `pthread_t` is not declared. I haven't 
looked at other code in libgfortran, but changing `pthread_t` to `__gthread_t` does allow 
libgfortran to build. I don't know how to test it though, as I don't write Fortran myself.
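
The build problem can be sketched outside of GCC as follows. Everything prefixed with demo_ or DEMO_ is hypothetical and only stands in for the real gthr.h machinery. The one real detail borrowed is that gthr-posix.h defines `__gthread_t` as a typedef of `pthread_t`, so for the posix model the two names mean the same type.

```c
#include <assert.h>
#include <stddef.h>

#ifdef DEMO_THREAD_MODEL_POSIX
# include <pthread.h>
/* Mirrors what gthr-posix.h does for __gthread_t.  */
typedef pthread_t demo_thread_t;
#else
/* Placeholder for whatever handle a non-posix thread model provides.  */
typedef struct demo_native_thread *demo_thread_t;
#endif

/* Library-side structure: it names only the abstract type, so it
   compiles under either configuration.  Writing pthread_t here
   instead would fail whenever <pthread.h> is not included.  */
struct demo_async_unit
{
  demo_thread_t thread;
  int id;
};
```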



--
Best regards,
LIU Hao





[PATCH] c++: Disallow jumps into statement expressions

2022-10-02 Thread Jakub Jelinek via Gcc-patches
On Fri, Sep 30, 2022 at 04:39:25PM -0400, Jason Merrill wrote:
> > --- gcc/cp/decl.cc.jj   2022-09-22 00:14:55.478599363 +0200
> > +++ gcc/cp/decl.cc  2022-09-22 00:24:01.121178256 +0200
> > @@ -223,6 +223,7 @@ struct GTY((for_user)) named_label_entry
> > bool in_transaction_scope;
> > bool in_constexpr_if;
> > bool in_consteval_if;
> > +  bool in_assume;
> 
> I think it would be better to reject jumps into statement-expressions like
> the C front-end.

Ok, here is a self-contained patch that does that.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
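
For context, a small C sketch of the construct in question (GNU C extension; CLAMP_NONNEG and clamp_sum are hypothetical names). The first part is an ordinary, valid use of a statement expression. The comment at the end shows the shape of code that the C front end already rejects and that this patch is meant to reject in C++ as well.

```c
#include <assert.h>

/* GNU statement expression: the value of the block is the value of
   its last expression statement.  */
#define CLAMP_NONNEG(x) ({ int v_ = (x); if (v_ < 0) v_ = 0; v_; })

int
clamp_sum (int a, int b)
{
  return CLAMP_NONNEG (a) + CLAMP_NONNEG (b);
}

/* Not compiled here: a goto whose label lives inside the statement
   expression, i.e. a jump into it.

     int bad (void) { goto inside; return ({ inside: 1; }); }
*/
```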

2022-10-01  Jakub Jelinek  

* cp-tree.h (BCS_STMT_EXPR): New enumerator.
* name-lookup.h (enum scope_kind): Add sk_stmt_expr.
* name-lookup.cc (begin_scope): Handle sk_stmt_expr like sk_block.
* semantics.cc (begin_compound_stmt): For BCS_STMT_EXPR use
sk_stmt_expr.
* parser.cc (cp_parser_statement_expr): Use BCS_STMT_EXPR instead of
BCS_NORMAL.
* decl.cc (struct named_label_entry): Add in_stmt_expr.
(poplevel_named_label_1): Handle sk_stmt_expr.
(check_previous_goto_1): Diagnose entering of statement expression.
(check_goto): Likewise.

* g++.dg/ext/stmtexpr24.C: New test.

--- gcc/cp/cp-tree.h.jj 2022-09-30 18:38:55.351607176 +0200
+++ gcc/cp/cp-tree.h2022-10-01 13:06:20.731720730 +0200
@@ -7599,7 +7599,8 @@ enum {
   BCS_NO_SCOPE = 1,
   BCS_TRY_BLOCK = 2,
   BCS_FN_BODY = 4,
-  BCS_TRANSACTION = 8
+  BCS_TRANSACTION = 8,
+  BCS_STMT_EXPR = 16
 };
 extern tree begin_compound_stmt(unsigned int);
 
--- gcc/cp/name-lookup.h.jj 2022-09-23 09:02:31.103668514 +0200
+++ gcc/cp/name-lookup.h2022-10-01 13:37:50.158404107 +0200
@@ -200,6 +200,7 @@ enum scope_kind {
init-statement.  */
   sk_cond,  /* The scope of the variable declared in the condition
of an if or switch statement.  */
+  sk_stmt_expr, /* GNU statement expression block.  */
   sk_function_parms, /* The scope containing function parameters.  */
   sk_class, /* The scope containing the members of a class.  */
   sk_scoped_enum,/* The scope containing the enumerators of a C++11
--- gcc/cp/name-lookup.cc.jj2022-09-13 09:21:28.123540623 +0200
+++ gcc/cp/name-lookup.cc   2022-10-01 13:37:26.383732959 +0200
@@ -4296,6 +4296,7 @@ begin_scope (scope_kind kind, tree entit
 case sk_scoped_enum:
 case sk_transaction:
 case sk_omp:
+case sk_stmt_expr:
   scope->keep = keep_next_level_flag;
   break;
 
--- gcc/cp/semantics.cc.jj  2022-09-30 18:38:50.337675080 +0200
+++ gcc/cp/semantics.cc 2022-10-01 13:09:34.958970367 +0200
@@ -1761,6 +1761,8 @@ begin_compound_stmt (unsigned int flags)
sk = sk_try;
   else if (flags & BCS_TRANSACTION)
sk = sk_transaction;
+  else if (flags & BCS_STMT_EXPR)
+   sk = sk_stmt_expr;
   r = do_pushlevel (sk);
 }
 
--- gcc/cp/parser.cc.jj 2022-09-30 18:38:55.374606864 +0200
+++ gcc/cp/parser.cc2022-10-01 13:08:27.367927479 +0200
@@ -5272,7 +5272,7 @@ cp_parser_statement_expr (cp_parser *par
   /* Start the statement-expression.  */
   tree expr = begin_stmt_expr ();
   /* Parse the compound-statement.  */
-  cp_parser_compound_statement (parser, expr, BCS_NORMAL, false);
+  cp_parser_compound_statement (parser, expr, BCS_STMT_EXPR, false);
   /* Finish up.  */
   expr = finish_stmt_expr (expr, false);
   /* Consume the ')'.  */
--- gcc/cp/decl.cc.jj   2022-09-27 08:27:47.671428567 +0200
+++ gcc/cp/decl.cc  2022-10-01 13:14:57.990434730 +0200
@@ -223,6 +223,7 @@ struct GTY((for_user)) named_label_entry
   bool in_transaction_scope;
   bool in_constexpr_if;
   bool in_consteval_if;
+  bool in_stmt_expr;
 };
 
 #define named_labels cp_function_chain->x_named_labels
@@ -538,6 +539,9 @@ poplevel_named_label_1 (named_label_entr
case sk_transaction:
  ent->in_transaction_scope = true;
  break;
+   case sk_stmt_expr:
+ ent->in_stmt_expr = true;
+ break;
case sk_block:
  if (level_for_constexpr_if (bl->level_chain))
ent->in_constexpr_if = true;
@@ -3487,7 +3491,7 @@ check_previous_goto_1 (tree decl, cp_bin
   bool complained = false;
   int identified = 0;
   bool saw_eh = false, saw_omp = false, saw_tm = false, saw_cxif = false;
-  bool saw_ceif = false;
+  bool saw_ceif = false, saw_se = false;
 
   if (exited_omp)
 {
@@ -3560,6 +3564,12 @@ check_previous_goto_1 (tree decl, cp_bin
  saw_tm = true;
  break;
 
+   case sk_stmt_expr:
+ if (!saw_se)
+   inf = G_("  enters statement expression");
+ saw_se = true;
+ break;
+
case sk_block:
  if (!saw_cxif && level_for_constexpr_if (b->level_chain))
{
@@ -3650,12 +3660,13 @@ check_goto (tree decl)
 
   if (ent->in_try_scope || ent->in_catch_scope || 

New Swedish PO file for 'gcc' (version 12.2.0)

2022-10-02 Thread Translation Project Robot
Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'gcc' has been submitted
by the Swedish team of translators.  The file is available at:

https://translationproject.org/latest/gcc/sv.po

(This file, 'gcc-12.2.0.sv.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

https://translationproject.org/latest/gcc/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

https://translationproject.org/domain/gcc.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.