Re: [RFC] rs6000: Add builtins for fegetround, feclearexcept and feraiseexcept [PR94193]

2020-06-30 Thread Marc Glisse

On Mon, 29 Jun 2020, Richard Biener via Gcc-patches wrote:


At least without -frounding-math fegetround could be
constant folded to FE_TONEAREST for which we'd need the
actual value of FE_TONEAREST.


That will break existing code which, since -frounding-math doesn't work 
for that, protects all FP operations with volatile read/writes or similar 
asm, and then doesn't specify -frounding-math because it doesn't seem 
necessary. I am not saying that code is right, just that it exists.


In a world where we have implemented fenv_access, this kind of folding of 
fegetround could only happen in "#pragma fenv_access off" regions, which 
seems to imply that it would be the front-end's responsibility (although 
it would need help from the back-end to know the default value to fold 
to).


--
Marc Glisse


Emit a variable defined in gcc

2020-06-30 Thread Harshit Sharma via Gcc-patches
Hello,
I am working on a gcc patch for asan. The patch is almost ready except one
thing. To make sure that the user has applied this patch before using asan
feature, I want to declare an additional variable in gcc which is
referenced by our source code so that if this patch is missing, the user
gets an error compiling the code because the reference to this variable
will not be resolved.

I am still new to gcc development. So, can anyone tell me how can I make
gcc emit this variable?


Thanks,
Harshit


Re: [PATCH] underline null argument in -Wnonnull (PR c++/86568)

2020-06-30 Thread Andreas Schwab
This breaks a lot of tests due to the different error messages.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."


[PATCH] VEC_COND_EXPR: do not expand comparisons feeding it

2020-06-30 Thread Martin Liška

Hi.

The patch is about blocking of vector expansion of comparisons
that are only feeding a VEC_COND_EXPR statements.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
The problematic mips64 test-case looks good now.

Ready to be installed?
Thanks,
Martin

gcc/ChangeLog:

* tree-vect-generic.c (expand_vector_comparison): Do not expand
comparison that only feed first argument of a VEC_COND_EXPR statement.
---
 gcc/tree-vect-generic.c | 24 
 1 file changed, 24 insertions(+)

diff --git a/gcc/tree-vect-generic.c b/gcc/tree-vect-generic.c
index a4b56195903..4606decd0f2 100644
--- a/gcc/tree-vect-generic.c
+++ b/gcc/tree-vect-generic.c
@@ -379,6 +379,30 @@ static tree
 expand_vector_comparison (gimple_stmt_iterator *gsi, tree type, tree op0,
   tree op1, enum tree_code code)
 {
+  tree lhs = gimple_assign_lhs (gsi_stmt (*gsi));
+  use_operand_p use_p;
+  imm_use_iterator iterator;
+  bool vec_cond_expr_only = true;
+  bool has_use = false;
+
+  /* As seen in PR95830, we should not expand comparisons that are only
+ feeding a VEC_COND_EXPR statement.  */
+  FOR_EACH_IMM_USE_FAST (use_p, iterator, lhs)
+{
+  has_use = true;
+  gassign *use = dyn_cast (USE_STMT (use_p));
+  if (use == NULL
+ || gimple_assign_rhs_code (use) != VEC_COND_EXPR
+ || gimple_assign_rhs1 (use) != lhs)
+   {
+ vec_cond_expr_only = false;
+ break;
+   }
+}
+
+  if (has_use && vec_cond_expr_only)
+return NULL_TREE;
+
   tree t;
   if (!expand_vec_cmp_expr_p (TREE_TYPE (op0), type, code)
   && !expand_vec_cond_expr_p (type, TREE_TYPE (op0), code))
--
2.27.0



Re: [PATCH PR95855]A missing ifcvt optimization to generate fcsel

2020-06-30 Thread Richard Biener via Gcc-patches
On Tue, Jun 30, 2020 at 4:29 AM yangyang (ET)  wrote:
>
> Hi,
>
> > > Hi,
> > >
> > > This is a simple fix for pr95855.
> > >
> > > With this fix, pass_split_paths can recognize the if-conversion
> > opportunity of the testcase and doesn't duplicate the corresponding block.
> > >
> > > Added one testcase for this. Bootstrap and tested on both aarch64 and
> > x86 Linux platform, no new regression witnessed.
> > >
> > > Ok for trunk?
> >
> > Can you try using the num_stmts_in_pred[12] counts instead of using
> > empty_block_p?
>
> It' ok to using num_stmts_in_pred[12] to judge whether the pred[12] is
> empty since bb's immediate dominator can't meet the constraints
> "single_pred_p (pred[12]) && single_pred (pred[12]) == pred[21]".
>
> >
> > Your matching doesn't allow for FP constants like
> >
> >  dmax[0] = d1[i] < 1.0 ? 1.0 : d1[i];
> >
> > since FP constants are not shared.  You likely want to use operand_equal_p 
> > to
> > do the PHI argument comparison.
>
> That's right, after using operand_equal_p instead of == to do the PHI argument
> Comparison, the mentioned case can be covered as well.
>
> >
> > Thanks,
> > Richard.
>
> Thanks for your suggestions. We have revised our patch based on your 
> suggestions.
>
> Bootstrap and tested on both aarch64 and x86 Linux platform. Does the v1 
> patch looks better?

Yes.  This variant is OK.

Thanks,
Richard.

> Yang Yang
>
> +2020-06-30  Yang Yang  
> +
> +   PR tree-optimization/95855
> +   * gimple-ssa-split-paths.c (is_feasible_trace): Add extra
> +   checks to recognize a missed if-conversion opportunity when
> +   judging whether to duplicate a block.
> +
>
> +2020-06-30 Yang Yang  
> +
> +   PR tree-optimization/95855
> +   * gcc.dg/tree-ssa/split-paths-12.c: New testcase.
> +


Re: [PATCH 2/2] Tune memcpy and memset for Zen cores.

2020-06-30 Thread Martin Liška

PING^3

On 6/12/20 10:31 AM, Martin Liška wrote:

PING^2

On 6/3/20 11:14 AM, Martin Liška wrote:

On 6/1/20 1:40 PM, Martin Liška wrote:

Adding Honza as Uros recommended him for a review.


@Honza asked me personally to remind him this patch ;)

Martin






Re: [stage1][PATCH] Make TOPN counter dynamically allocated.

2020-06-30 Thread Martin Liška

PING^2

On 6/17/20 8:38 AM, Martin Liška wrote:

PING^1

On 6/3/20 8:28 AM, Martin Liška wrote:

On 6/2/20 3:19 PM, Martin Liška wrote:

I'm suggesting to pre-allocate 16 gcov_kvp in the gcov run-time library.
Please take a look at the attached patch. I also added a test-case that
stresses that. I've just finished LTO PGO bootstrap of the GCC.

Ready for master?


There's V2 of the patch:

- guard __atomic_fetch_add in GCOV_SUPPORTS_ATOMIC

Martin






Re: [PATCH] Split load permutation

2020-06-30 Thread Richard Biener
On Tue, 30 Jun 2020, Richard Sandiford wrote:

> Sorry for the slow reponse…

No problem.

> Richard Biener  writes:
> > This splits SLP load nodes with load permutation into a SLP
> > load node (with load "permutation" removing gaps) and a
> > lane permutation node.  The first and foremost goal of this
> > is to be able to have a single SLP node for each load group
> > so we can start making decisions about how to vectorize
> > them factoring in all loads of that group.  The second
> > goal is to eventually be able to optimize permutations by
> > pushing them down where they can be combined from multiple
> > children to the output.  We do have one very special case
> > handled by vect_attempt_slp_rearrange_stmts doing it all
> > the way down for reductions that are associative.
> 
> Sounds great!
> 
> > For example for
> >
> >   l1 = a[0]; l2 = a[1];
> >   b[0] = l1; b[1] = l2;
> >   c[0] = l2; c[1] = l1;
> >
> > wecan avoid generating loads twice.  For
> >
> >   l1 = a[0]; l2 = a[1]; l3 = a[2];
> >   b[0] = l1; b[1] = l2;
> >  c[0] = l2; c[1] = l3;
> >
> > we will have a SLP load node with three lanes plus
> > two lane permutation nodes selecting two lanes each.  In
> > a loop context this will cause a VF of two and three
> > loads per vectorized loop iteration (plus two permutes)
> > while previously we used a VF of one with two loads
> > and no permutation per loop iteration.  In the new
> > scheme the number of loads is less but we pay with
> > permutes and a higher VF.
> >
> > There is (bad) interaction with determining of the vectorization
> > factor which for BB vectorization causes missed vectorizations
> > because the "load parts of a dataref group directly" optimization
> > is not (yet) reflected in the SLP graph.
> >
> > There is (bad) interaction with CTOR vectorization since we now
> > get confused about where to insert vectorized stmts when combining
> > two previously distinct SLP graphs.
> >
> > My immediate focus is on the SSA verification FAILs but the
> > other part points at a missing piece in this - a "pass"
> > that "optimizes" the SLP graph with respect to permutations
> > and loads, ultimatively for example deciding between using
> > interleaving schemes, scalar loads, "SLP" + permutation,
> > load-lanes, etc.;  This is also the thing that blocks
> > SLP only (apart from teaching the few pieces that cannot do SLP
> > to do SLP).
> >
> > I'm posting this mostly to make people think how it fits
> > their future development and architecture features.
> 
> Yeah, the interleaving scheme is something we'd very much like for SVE2,
> where for some operations that produce multiple vectors, it would be better
> to organise them on an even/odd division instead of a low/high division.
> There are instructions both to produce and to consume values in odd/even
> form, so in many cases no explicit permutation would be needed.
> 
> I've also seen cases for downward loops where we reverse vectors after
> loading and reverse them again before storing, even though the loop
> could easily have operated on the reversed vectors directly.
>
> Another thing we'd like for SVE in general is to allow vectors *with*
> gaps throughout the SLP graph, since with predication it's simple to
> ignore any inactive element where necessary.  This is often much cheaper
> than packing the active elements together to remove gaps.  For example:
> 
>   a[0] += 1;
>   a[2] += 1;
>   a[3] += 1;
> 
> should just be a predicated load, add and store, with no shuffling.
> 
> Even on targets without predication, it might be better to leave
> the gaps in for part of the graph, e.g. if two loads feed a single
> permute.  So it would be good if the node representation allowed
> arbitrary permutations, including with “dead” elements at any
> position.  (I realise that isn't news and that keeping the gap
> removal with the load was just “for now” :-))

Hmm, OK.  So I'm still experimenting with how to incrementally push
these kind of changes to master.  Unfortunately it resists any
"serious" change because we've painted us into a corner with respect
to load and store handling ;)  Well, it just means the change will
need to be bigger than anticipated.

As for inactive lanes, for SVE this means a mask register for each
operation, correct?  At the moment the SLP graph is a representation
of the scalar GIMPLE IL and we cannot really change that until we
commit to a vectorization factor and more.  So an inactive lane
is simply not there and a "load" is simply another way of building
up a vector from scalar stmt results much like those "built from scalars"
SLP nodes but we of course make them special because we have those
DR groups that are used.

So with the patch as posted the "gaps" are represented in the
load permutation of the "loads" which is where you could create
mask registers from and simply push them to SLP parents (up to
the point you get to a parent whose childs have differing masks...).

I think we're going to 

Re: [PATCH PR95961] vect: ICE: in exact_div, at poly-int.h:2182

2020-06-30 Thread Richard Sandiford
"Yangfei (Felix)"  writes:
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-ctor-1.c 
> b/gcc/testsuite/gcc.dg/vect/vect-ctor-1.c
> index e050db1a2e4..ea39fcac0e0 100644
> --- a/gcc/testsuite/gcc.dg/vect/vect-ctor-1.c
> +++ b/gcc/testsuite/gcc.dg/vect/vect-ctor-1.c
> @@ -1,6 +1,7 @@
>  /* { dg-do compile } */
>  /* { dg-additional-options "-O3" } */
>  /* { dg-additional-options "-mavx2" { target { i?86-*-* x86_64-*-* } } } */
> +/* { dg-additional-options "-march=armv8.2-a+sve -fno-vect-cost-model" { 
> target aarch64*-*-* } } */
>  
>  typedef struct {
>  unsigned short mprr_2[5][16][16];

This test is useful for Advanced SIMD too, so I think we should continue
to test it with whatever options the person running the testsuite chose.
Instead we could duplicate the test in gcc.target/aarch64/sve with
appropriate options.

> diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
> index eb8288e7a85..b30a7d8a3bb 100644
> --- a/gcc/tree-vect-data-refs.c
> +++ b/gcc/tree-vect-data-refs.c
> @@ -1823,8 +1823,11 @@ vect_enhance_data_refs_alignment (loop_vec_info 
> loop_vinfo)
>   {
> poly_uint64 nscalars = (STMT_SLP_TYPE (stmt_info)
> ? vf * DR_GROUP_SIZE (stmt_info) : 
> vf);
> -   possible_npeel_number
> - = vect_get_num_vectors (nscalars, vectype);
> +   if (maybe_lt (nscalars, TYPE_VECTOR_SUBPARTS (vectype)))
> + possible_npeel_number = 0;
> +   else
> + possible_npeel_number
> +   = vect_get_num_vectors (nscalars, vectype);
>  
> /* NPEEL_TMP is 0 when there is no misalignment, but also
>allow peeling NELEMENTS.  */

OK, so this is coming from:

  int s[16][2];
  …
  … =s[j][1];

and an SLP node containing 16 instances of “s[j][1]”.  The DR_GROUP_SIZE
is 2 because that's the inner dimension of “s”.

I don't think maybe_lt is right here though.  The same problem could in
principle happen for cases in which NSCALARS > TYPE_VECTOR_SUBPARTS,
e.g. for different inner dimensions of “s”.

I think the testcase shows that using DR_GROUP_SIZE in this calculation
is flawed.  I'm not sure whether we can really do better given the current
representation though.  This is one case where having a separate dr_vec_info
per SLP node would help.

Maybe one option (for now) would be to use:

  if (multiple_p (nscalars, TREE_VECTOR_SUBPARTS (vectype)))
possible_npeel_number = vect_get_num_vectors (nscalars, vectype);
  else
/* This isn't a simple stream of contiguous vector accesses.  It's hard
   to predict from the available information how many vector accesses
   we'll actually need per iteration, so be conservative and assume
   one.  */
possible_npeel_number = 1;

BTW, I'm not sure whether the current choice of STMT_SLP_TYPE (stmt_info)
instead of PURE_SLP_STMT (stmt_info) is optimal or not.  It means that for
hybrid SLP we base the peeling on the SLP stmt rather than the non-SLP stmt.
I guess hybrid SLP is going away soon though, so let's not worry about
that. :-)

Maybe Richard has a better suggestion.

Thanks,
Richard


Re: [PATCH] c-family: Avoid ICEs on calls to internal functions [PR95963]

2020-06-30 Thread Richard Biener
On Tue, 30 Jun 2020, Jakub Jelinek wrote:

> Hi!
> 
> The following testcase ICEs since recent Martin's -Wnonnull changes,
> we see a CALL_EXPR and ICE because CALL_EXPR_FN is NULL, which is
> valid for internal function calls.  Internal function calls don't have a
> function type, and will never have format_arg attribute on them nor will
> serve as the i18n routines -Wformat cares about.
> 
> Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
> trunk?

OK.

Richard.

> 2020-06-30  Jakub Jelinek  
> 
>   PR c++/95963
>   * c-common.c (check_function_arguments_recurse): Don't crash on
>   calls to internal functions.
> 
>   * g++.dg/cpp1z/launder9.C: New test.
> 
> --- gcc/c-family/c-common.c.jj2020-06-29 14:51:54.821086398 +0200
> +++ gcc/c-family/c-common.c   2020-06-29 21:06:02.624773345 +0200
> @@ -5815,7 +5815,7 @@ check_function_arguments_recurse (void (
>return;
>  }
>  
> -  if (TREE_CODE (param) == CALL_EXPR)
> +  if (TREE_CODE (param) == CALL_EXPR && CALL_EXPR_FN (param))
>  {
>tree type = TREE_TYPE (TREE_TYPE (CALL_EXPR_FN (param)));
>tree attrs;
> --- gcc/testsuite/g++.dg/cpp1z/launder9.C.jj  2020-06-29 21:07:02.191932465 
> +0200
> +++ gcc/testsuite/g++.dg/cpp1z/launder9.C 2020-06-29 21:08:19.137846260 
> +0200
> @@ -0,0 +1,11 @@
> +// PR c++/95963
> +// { dg-do compile }
> +// { dg-options "-Wnonnull" }
> +
> +struct A { virtual void foo (); };
> +
> +void
> +bar (A *p)
> +{
> +  __builtin_launder (p)->foo ();
> +}
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)


Re: [PATCH] VEC_COND_EXPR: do not expand comparisons feeding it

2020-06-30 Thread Richard Biener via Gcc-patches
On Tue, Jun 30, 2020 at 11:44 AM Martin Liška  wrote:
>
> Hi.
>
> The patch is about blocking of vector expansion of comparisons
> that are only feeding a VEC_COND_EXPR statements.
>
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> The problematic mips64 test-case looks good now.
>
> Ready to be installed?

So why does

static tree
expand_vector_comparison (gimple_stmt_iterator *gsi, tree type, tree op0,
  tree op1, enum tree_code code)
{
  tree t;
  if (!expand_vec_cmp_expr_p (TREE_TYPE (op0), type, code)
  && !expand_vec_cond_expr_p (type, TREE_TYPE (op0), code))


not return true
{

but

/* Expand a vector condition to scalars, by using many conditions
   on the vector's elements.  */
static void
expand_vector_condition (gimple_stmt_iterator *gsi, auto_bitmap *dce_ssa_names)
{
...
  if (expand_vec_cond_expr_p (type, TREE_TYPE (a1), code))
{
  gcc_assert (TREE_CODE (a) == SSA_NAME || TREE_CODE (a) == VECTOR_CST);
  return;

does?  What's special about the problematical mips testcase?  Can't you produce
the same "bad" result when the comparison is used in a non-VEC_COND?

There's a PR reference missing in the ChangeLog.

Richard.

> Thanks,
> Martin
>
> gcc/ChangeLog:
>
> * tree-vect-generic.c (expand_vector_comparison): Do not expand
> comparison that only feed first argument of a VEC_COND_EXPR statement.
> ---
>   gcc/tree-vect-generic.c | 24 
>   1 file changed, 24 insertions(+)
>
> diff --git a/gcc/tree-vect-generic.c b/gcc/tree-vect-generic.c
> index a4b56195903..4606decd0f2 100644
> --- a/gcc/tree-vect-generic.c
> +++ b/gcc/tree-vect-generic.c
> @@ -379,6 +379,30 @@ static tree
>   expand_vector_comparison (gimple_stmt_iterator *gsi, tree type, tree op0,
> tree op1, enum tree_code code)
>   {
> +  tree lhs = gimple_assign_lhs (gsi_stmt (*gsi));
> +  use_operand_p use_p;
> +  imm_use_iterator iterator;
> +  bool vec_cond_expr_only = true;
> +  bool has_use = false;
> +
> +  /* As seen in PR95830, we should not expand comparisons that are only
> + feeding a VEC_COND_EXPR statement.  */
> +  FOR_EACH_IMM_USE_FAST (use_p, iterator, lhs)
> +{
> +  has_use = true;
> +  gassign *use = dyn_cast (USE_STMT (use_p));
> +  if (use == NULL
> + || gimple_assign_rhs_code (use) != VEC_COND_EXPR
> + || gimple_assign_rhs1 (use) != lhs)
> +   {
> + vec_cond_expr_only = false;
> + break;
> +   }
> +}
> +
> +  if (has_use && vec_cond_expr_only)
> +return NULL_TREE;
> +
> tree t;
> if (!expand_vec_cmp_expr_p (TREE_TYPE (op0), type, code)
> && !expand_vec_cond_expr_p (type, TREE_TYPE (op0), code))
> --
> 2.27.0
>


[patch, fortran, committed] Fix PR 95366, wrong code and ABI breakage

2020-06-30 Thread Thomas Koenig

Hello world,

fortunately, upon inspection the fix for this PR turned out to be
obvious (and simple).  I have committed it as such.  Unless there
is unexpected fallout, I intend to backport it to gcc-10 over
the weekend so it can still be included into 10.2.

Regression-tested on x86_64 and on a big-endian POWER, to make
sure the hash values were the same.

Regards

Thomas

Use CHARACTER(kind) string for calculating the type hash.

This regression came about because of a change in the way
types are displayed in error messages.  The character
representation is also used to calculate the hashes for
our types, so this patch restores the old behavior if
we are indeed calculating a hash.

The test case also checks for the specific hash value because
changing that would be an ABI change, which we should not
be doing unintentionally.

gcc/fortran/ChangeLog:


2020-06-30  Thomas Koenig  

PR fortran/95355
* gfortran.h (gfc_typename): Add optional argument for_hash.
* misc.c (gfc_typename): When for_hash is true, just return
CHARACTER(kind).
* class.c (gfc_intrinsic_hash_value): Call gfc_typename with
for_hash = true.
diff --git a/gcc/fortran/class.c b/gcc/fortran/class.c
index 2b760efe8d7..08705c7e95d 100644
--- a/gcc/fortran/class.c
+++ b/gcc/fortran/class.c
@@ -564,7 +564,7 @@ unsigned int
 gfc_intrinsic_hash_value (gfc_typespec *ts)
 {
   unsigned int hash = 0;
-  const char *c = gfc_typename (ts);
+  const char *c = gfc_typename (ts, true);
   int i, len;
 
   len = strlen (c);
diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h
index 836e0b3063d..24c5101c4cb 100644
--- a/gcc/fortran/gfortran.h
+++ b/gcc/fortran/gfortran.h
@@ -2931,7 +2931,7 @@ void gfc_clear_ts (gfc_typespec *);
 FILE *gfc_open_file (const char *);
 const char *gfc_basic_typename (bt);
 const char *gfc_dummy_typename (gfc_typespec *);
-const char *gfc_typename (gfc_typespec *);
+const char *gfc_typename (gfc_typespec *, bool for_hash = false);
 const char *gfc_typename (gfc_expr *);
 const char *gfc_op2string (gfc_intrinsic_op);
 const char *gfc_code2string (const mstring *, int);
diff --git a/gcc/fortran/misc.c b/gcc/fortran/misc.c
index 46c6277c2b9..65bcfa6162f 100644
--- a/gcc/fortran/misc.c
+++ b/gcc/fortran/misc.c
@@ -122,7 +122,7 @@ gfc_basic_typename (bt type)
the argument list of a single statement.  */
 
 const char *
-gfc_typename (gfc_typespec *ts)
+gfc_typename (gfc_typespec *ts, bool for_hash)
 {
   static char buffer1[GFC_MAX_SYMBOL_LEN + 7];  /* 7 for "TYPE()" + '\0'.  */
   static char buffer2[GFC_MAX_SYMBOL_LEN + 7];
@@ -149,6 +149,12 @@ gfc_typename (gfc_typespec *ts)
   sprintf (buffer, "LOGICAL(%d)", ts->kind);
   break;
 case BT_CHARACTER:
+  if (for_hash)
+	{
+	  sprintf (buffer, "CHARACTER(%d)", ts->kind);
+	  break;
+	}
+
   if (ts->u.cl && ts->u.cl->length)
 	length = gfc_mpz_get_hwi (ts->u.cl->length->value.integer);
   if (ts->kind == gfc_default_character_kind)
diff --git a/gcc/testsuite/gfortran.dg/select_type_49.f90 b/gcc/testsuite/gfortran.dg/select_type_49.f90
new file mode 100644
index 000..31203cd18fc
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/select_type_49.f90
@@ -0,0 +1,43 @@
+! { dg-do run }
+! { dg-additional-options "-fdump-tree-original" }
+! PR 95366 - this did not work due the wrong hashes
+! being generated for CHARACTER variables.
+MODULE mod1
+  implicit none
+  integer :: tst(3)
+CONTAINS
+  subroutine showpoly(poly)
+CLASS(*), INTENT(IN) :: poly(:)
+SELECT TYPE (poly)
+TYPE IS(INTEGER)
+   tst(1) = tst(1) + 1
+TYPE IS(character(*))
+   tst(2) = tst(2) + 1
+class default
+   tst(3) = tst(3) + 1
+end select
+  end subroutine showpoly
+END MODULE mod1
+MODULE mod2
+  implicit none
+CONTAINS
+subroutine polytest2()
+   use mod1
+   integer :: a(1)
+   character(len=42) :: c(1)
+   call showpoly(a)
+   if (any(tst /= [1,0,0])) stop 1
+   call showpoly(c)
+   if (any(tst /= [1,1,0])) stop 2
+end subroutine polytest2
+END MODULE mod2
+PROGRAM testpoly
+  use mod2
+  CALL polytest2()
+END PROGRAM testpoly
+! The value of the hashes are also checked.  If you get
+! a failure here, be aware that changing that value is
+! an ABI change.
+
+! { dg-final { scan-tree-dump-times "== 17759" 1 "original" } }  
+! { dg-final { scan-tree-dump-times "== 85893463" 1 "original" } }


Re: [PATCH] VEC_COND_EXPR: do not expand comparisons feeding it

2020-06-30 Thread Martin Liška

On 6/30/20 12:38 PM, Richard Biener wrote:

On Tue, Jun 30, 2020 at 11:44 AM Martin Liška  wrote:


Hi.

The patch is about blocking of vector expansion of comparisons
that are only feeding a VEC_COND_EXPR statements.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
The problematic mips64 test-case looks good now.

Ready to be installed?


So why does

static tree
expand_vector_comparison (gimple_stmt_iterator *gsi, tree type, tree op0,
   tree op1, enum tree_code code)
{
   tree t;
   if (!expand_vec_cmp_expr_p (TREE_TYPE (op0), type, code)
   && !expand_vec_cond_expr_p (type, TREE_TYPE (op0), code))


not return true


Apparently because it's called for type:
(gdb) p debug_tree(gimple_expr_type(stmt))
 
unit-size 
align:32 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 0x777212a0 
precision:32 min  max >
BLK
size  constant 64>
unit-size  constant 8>
align:64 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 
0x77721348 nunits:2>

while ...


 {

but

/* Expand a vector condition to scalars, by using many conditions
on the vector's elements.  */
static void
expand_vector_condition (gimple_stmt_iterator *gsi, auto_bitmap *dce_ssa_names)
{
...
   if (expand_vec_cond_expr_p (type, TREE_TYPE (a1), code))
 {
   gcc_assert (TREE_CODE (a) == SSA_NAME || TREE_CODE (a) == VECTOR_CST);
   return;

does?


this has

 
unit-size 
align:32 warn_if_not_align:0 symtab:0 alias-set 1 canonical-type 
0x7763e2a0 precision:32
pointer_to_this >
V2SF
size  constant 64>
unit-size  constant 8>
align:64 warn_if_not_align:0 symtab:0 alias-set 1 canonical-type 
0x776fb3f0 nunits:2
pointer_to_this >


What's special about the problematical mips testcase?


That we end up with something like:

  _34 = {_29, _37};
  vect_iftmp.12_35 = VEC_COND_EXPR <_34, vect__1.7_28, vect__2.10_31>;

which can't be expanded to .VCOND, but we expand to series of BIT_FIELD_REFs:

  _40 = BIT_FIELD_REF ;
  _22 = BIT_FIELD_REF <_34, 32, 0>;
  _10 = _22 != 0 ? _26 : _40;
  _16 = BIT_FIELD_REF ;
  _18 = BIT_FIELD_REF <_34, 32, 32>;
  _41 = _18 == 0 ? _16 : _30;



Can't you produce
the same "bad" result when the comparison is used in a non-VEC_COND?


Theoretically yes.

Anyway, question is how much do we care about the fallout as it happens for 
MIPS and
it's only a worse code stuff?

Martin



There's a PR reference missing in the ChangeLog.

Richard.


Thanks,
Martin

gcc/ChangeLog:

 * tree-vect-generic.c (expand_vector_comparison): Do not expand
 comparison that only feed first argument of a VEC_COND_EXPR statement.
---
   gcc/tree-vect-generic.c | 24 
   1 file changed, 24 insertions(+)

diff --git a/gcc/tree-vect-generic.c b/gcc/tree-vect-generic.c
index a4b56195903..4606decd0f2 100644
--- a/gcc/tree-vect-generic.c
+++ b/gcc/tree-vect-generic.c
@@ -379,6 +379,30 @@ static tree
   expand_vector_comparison (gimple_stmt_iterator *gsi, tree type, tree op0,
 tree op1, enum tree_code code)
   {
+  tree lhs = gimple_assign_lhs (gsi_stmt (*gsi));
+  use_operand_p use_p;
+  imm_use_iterator iterator;
+  bool vec_cond_expr_only = true;
+  bool has_use = false;
+
+  /* As seen in PR95830, we should not expand comparisons that are only
+ feeding a VEC_COND_EXPR statement.  */
+  FOR_EACH_IMM_USE_FAST (use_p, iterator, lhs)
+{
+  has_use = true;
+  gassign *use = dyn_cast (USE_STMT (use_p));
+  if (use == NULL
+ || gimple_assign_rhs_code (use) != VEC_COND_EXPR
+ || gimple_assign_rhs1 (use) != lhs)
+   {
+ vec_cond_expr_only = false;
+ break;
+   }
+}
+
+  if (has_use && vec_cond_expr_only)
+return NULL_TREE;
+
 tree t;
 if (!expand_vec_cmp_expr_p (TREE_TYPE (op0), type, code)
 && !expand_vec_cond_expr_p (type, TREE_TYPE (op0), code))
--
2.27.0





RE: PSA: Default C++ dialect is now C++17

2020-06-30 Thread Tamar Christina
> >>
> >
> > Just a small note that 510.parest_r SPEC 2017 benchmark can't be built
> > now with default changed to -std=c++17. The spec config needs to be
> adjusted.
> >
> > Martin
> 
> And there one another failure in 520.omnetpp_r caused by run-time error:
>  Error during startup: Register_Function() or cMathFunction: attempt to
> register function "SPEC_HYPOT" with wrong number of arguments 2, should
> be 3.
> 
> which is about call of std::__hypot3 which has newly 3 args since c++ 17:
> https://en.cppreference.com/w/cpp/numeric/math/hypot
> 

Thanks!, I was scratching my head where this error came from..

Regards,
Tamar

> Martin



Re: [PATCH] reassoc: Propagate PHI_LOOP_BIAS along single uses

2020-06-30 Thread Richard Biener via Gcc-patches
On Mon, Jun 29, 2020 at 4:58 PM Ilya Leoshkevich  wrote:
>
> On Thu, 2020-06-25 at 14:34 +0200, Richard Biener wrote:
> > On Wed, Jun 24, 2020 at 1:31 AM Ilya Leoshkevich via Gcc-patches
> >  wrote:
> > > Bootstrapped and regtested x86_64-redhat-linux, ppc64le-redhat-
> > > linux and
> > > s390x-redhat-linux.  I also ran SPEC 2006 and 2017 on these
> > > platforms,
> > > and the only measurable regression was 3% in 520.omnetpp_r on ppc,
> > > which
> > > went away after inserting a single nop at the beginning of
> > > cDynamicExpression::evaluate.
> > >
> > > OK for master?
> >
> > As you might know this is incredibly fragile so I'd prefer if you
> > submit
> > and push the change to disable PHI biasing in the early pass instance
> > separately.  At first glance that change looks reasonable (but we'll
> > watch for fallout).
>
> Will do.
>
> >
> > Comments on the other changes inline
> >
> > > ---
> > >
> > > PR tree-optimization/49749 introduced code that shortens dependency
> > > chains containing loop accumulators by placing them last on operand
> > > lists of associative operations.
> > >
> > > 456.hmmer benchmark on s390 could benefit from this, however, the
> > > code
> > > that needs it modifies loop accumulator before using it, and since
> > > only
> > > so-called loop-carried phis are are treated as loop accumulators,
> > > the
> > > code in the present form doesn't really help.   According to Bill
> > > Schmidt - the original author - such a conservative approach was
> > > chosen
> > > so as to avoid unnecessarily swapping operands, which might cause
> > > unpredictable effects.  However, giving special treatment to forms
> > > of
> > > loop accumulators is acceptable.
> > >
> > > The definition of loop-carried phi is: it's a single-use phi, which
> > > is
> > > used in the same innermost loop it's defined in, at least one
> > > argument
> > > of which is defined in the same innermost loop as the phi itself.
> > > Given this, it seems natural to treat single uses of such phis as
> > > phis
> > > themselves.
> > >
> > > gcc/ChangeLog:
> > >
> > > 2020-05-06  Ilya Leoshkevich  
> > >
> > > * passes.def (pass_reassoc): Rename parameter to early_p.
> > > * tree-ssa-reassoc.c
> > > (reassoc_bias_loop_carried_phi_ranks_p):
> > > New variable.
> > > (phi_rank): Don't bias loop-carried phi ranks
> > > before vectorization pass.
> > > (loop_carried_phi): Remove (superseded by
> > > operand_rank::biased_p).
> > > (propagate_rank): Propagate bias along single uses.
> > > (get_rank): Pass stmt to propagate_rank.
> > > (execute_reassoc): Add bias_loop_carried_phi_ranks_p
> > > parameter.
> > > (pass_reassoc::pass_reassoc): Add
> > > bias_loop_carried_phi_ranks_p
> > > initializer.
> > > (pass_reassoc::set_param): Set
> > > bias_loop_carried_phi_ranks_p
> > > value.
> > > (pass_reassoc::execute): Pass bias_loop_carried_phi_ranks_p
> > > to
> > > execute_reassoc.
> > > (pass_reassoc::bias_loop_carried_phi_ranks_p): New member.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > > 2020-05-06  Ilya Leoshkevich  
> > >
> > > * gcc.target/s390/reassoc-1.c: New test.
> > > * gcc.target/s390/reassoc-2.c: New test.
> > > * gcc.target/s390/reassoc-3.c: New test.
> > > * gcc.target/s390/reassoc.h: New test.
> > > ---
> > >  gcc/passes.def|  4 +-
> > >  gcc/testsuite/gcc.target/s390/reassoc-1.c |  6 ++
> > >  gcc/testsuite/gcc.target/s390/reassoc-2.c |  7 ++
> > >  gcc/testsuite/gcc.target/s390/reassoc-3.c |  8 ++
> > >  gcc/testsuite/gcc.target/s390/reassoc.h   | 22 +
> > >  gcc/tree-ssa-reassoc.c| 97 ++-
> > > 
> > >  6 files changed, 105 insertions(+), 39 deletions(-)
> > >  create mode 100644 gcc/testsuite/gcc.target/s390/reassoc-1.c
> > >  create mode 100644 gcc/testsuite/gcc.target/s390/reassoc-2.c
> > >  create mode 100644 gcc/testsuite/gcc.target/s390/reassoc-3.c
> > >  create mode 100644 gcc/testsuite/gcc.target/s390/reassoc.h
> > >
> > > diff --git a/gcc/passes.def b/gcc/passes.def
> > > index 2b1e09fdda3..6864f583f20 100644
> > > --- a/gcc/passes.def
> > > +++ b/gcc/passes.def
> > > @@ -235,7 +235,7 @@ along with GCC; see the file COPYING3.  If not
> > > see
> > >  program and isolate those paths.  */
> > >NEXT_PASS (pass_isolate_erroneous_paths);
> > >NEXT_PASS (pass_dse);
> > > -  NEXT_PASS (pass_reassoc, true /* insert_powi_p */);
> > > +  NEXT_PASS (pass_reassoc, true /* early_p */);
> > >NEXT_PASS (pass_dce);
> > >NEXT_PASS (pass_forwprop);
> > >NEXT_PASS (pass_phiopt, false /* early_p */);
> > > @@ -312,7 +312,7 @@ along with GCC; see the file COPYING3.  If not
> > > see
> > >NEXT_PASS (pass_lower_vector_ssa);
> > >NEXT_PASS (pass_lower_switch);
> > >NEXT_PASS (pass_cse_reciprocals);
> > > -

Re: [PATCH] Split load permutation

2020-06-30 Thread Richard Sandiford
Sorry for the slow reponse…

Richard Biener  writes:
> This splits SLP load nodes with load permutation into a SLP
> load node (with load "permutation" removing gaps) and a
> lane permutation node.  The first and foremost goal of this
> is to be able to have a single SLP node for each load group
> so we can start making decisions about how to vectorize
> them factoring in all loads of that group.  The second
> goal is to eventually be able to optimize permutations by
> pushing them down where they can be combined from multiple
> children to the output.  We do have one very special case
> handled by vect_attempt_slp_rearrange_stmts doing it all
> the way down for reductions that are associative.

Sounds great!

> For example for
>
>   l1 = a[0]; l2 = a[1];
>   b[0] = l1; b[1] = l2;
>   c[0] = l2; c[1] = l1;
>
> wecan avoid generating loads twice.  For
>
>   l1 = a[0]; l2 = a[1]; l3 = a[2];
>   b[0] = l1; b[1] = l2;
>  c[0] = l2; c[1] = l3;
>
> we will have a SLP load node with three lanes plus
> two lane permutation nodes selecting two lanes each.  In
> a loop context this will cause a VF of two and three
> loads per vectorized loop iteration (plus two permutes)
> while previously we used a VF of one with two loads
> and no permutation per loop iteration.  In the new
> scheme the number of loads is less but we pay with
> permutes and a higher VF.
>
> There is (bad) interaction with determining of the vectorization
> factor which for BB vectorization causes missed vectorizations
> because the "load parts of a dataref group directly" optimization
> is not (yet) reflected in the SLP graph.
>
> There is (bad) interaction with CTOR vectorization since we now
> get confused about where to insert vectorized stmts when combining
> two previously distinct SLP graphs.
>
> My immediate focus is on the SSA verification FAILs but the
> other part points at a missing piece in this - a "pass"
> that "optimizes" the SLP graph with respect to permutations
> and loads, ultimatively for example deciding between using
> interleaving schemes, scalar loads, "SLP" + permutation,
> load-lanes, etc.;  This is also the thing that blocks
> SLP only (apart from teaching the few pieces that cannot do SLP
> to do SLP).
>
> I'm posting this mostly to make people think how it fits
> their future development and architecture features.

Yeah, the interleaving scheme is something we'd very much like for SVE2,
where for some operations that produce multiple vectors, it would be better
to organise them on an even/odd division instead of a low/high division.
There are instructions both to produce and to consume values in odd/even
form, so in many cases no explicit permutation would be needed.

I've also seen cases for downward loops where we reverse vectors after
loading and reverse them again before storing, even though the loop
could easily have operated on the reversed vectors directly.

Another thing we'd like for SVE in general is to allow vectors *with*
gaps throughout the SLP graph, since with predication it's simple to
ignore any inactive element where necessary.  This is often much cheaper
than packing the active elements together to remove gaps.  For example:

  a[0] += 1;
  a[2] += 1;
  a[3] += 1;

should just be a predicated load, add and store, with no shuffling.

Even on targets without predication, it might be better to leave
the gaps in for part of the graph, e.g. if two loads feed a single
permute.  So it would be good if the node representation allowed
arbitrary permutations, including with “dead” elements at any
position.  (I realise that isn't news and that keeping the gap
removal with the load was just “for now” :-))

I guess this to some extent feeds into a long-standing TODO to allow
“don't care” elements in things like vec_perm_indices.

Thanks,
Richard


RE: [PATCH PR95855]A missing ifcvt optimization to generate fcsel

2020-06-30 Thread yangyang (ET)
> On Tue, Jun 30, 2020 at 4:29 AM yangyang (ET) 
> wrote:
> >
> > Hi,
> >
> > > > Hi,
> > > >
> > > > This is a simple fix for pr95855.
> > > >
> > > > With this fix, pass_split_paths can recognize the
> > > > if-conversion
> > > opportunity of the testcase and doesn't duplicate the corresponding block.
> > > >
> > > > Added one testcase for this. Bootstrap and tested on both
> > > > aarch64 and
> > > x86 Linux platform, no new regression witnessed.
> > > >
> > > > Ok for trunk?
> > >
> > > Can you try using the num_stmts_in_pred[12] counts instead of using
> > > empty_block_p?
> >
> > It' ok to using num_stmts_in_pred[12] to judge whether the pred[12] is
> > empty since bb's immediate dominator can't meet the constraints
> > "single_pred_p (pred[12]) && single_pred (pred[12]) == pred[21]".
> >
> > >
> > > Your matching doesn't allow for FP constants like
> > >
> > >  dmax[0] = d1[i] < 1.0 ? 1.0 : d1[i];
> > >
> > > since FP constants are not shared.  You likely want to use
> > > operand_equal_p to do the PHI argument comparison.
> >
> > That's right, after using operand_equal_p instead of == to do the PHI
> > argument Comparison, the mentioned case can be covered as well.
> >
> > >
> > > Thanks,
> > > Richard.
> >
> > Thanks for your suggestions. We have revised our patch based on your
> suggestions.
> >
> > Bootstrap and tested on both aarch64 and x86 Linux platform. Does the v1
> patch looks better?
> 
> Yes.  This variant is OK.
> 
> Thanks,
> Richard.

Thanks for reviewing this. Could you please help install it?

Yang Yang


[PATCH] c-family: Avoid ICEs on calls to internal functions [PR95963]

2020-06-30 Thread Jakub Jelinek via Gcc-patches
Hi!

The following testcase ICEs since recent Martin's -Wnonnull changes,
we see a CALL_EXPR and ICE because CALL_EXPR_FN is NULL, which is
valid for internal function calls.  Internal function calls don't have a
function type, and will never have format_arg attribute on them nor will
serve as the i18n routines -Wformat cares about.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
trunk?

2020-06-30  Jakub Jelinek  

PR c++/95963
* c-common.c (check_function_arguments_recurse): Don't crash on
calls to internal functions.

* g++.dg/cpp1z/launder9.C: New test.

--- gcc/c-family/c-common.c.jj  2020-06-29 14:51:54.821086398 +0200
+++ gcc/c-family/c-common.c 2020-06-29 21:06:02.624773345 +0200
@@ -5815,7 +5815,7 @@ check_function_arguments_recurse (void (
   return;
 }
 
-  if (TREE_CODE (param) == CALL_EXPR)
+  if (TREE_CODE (param) == CALL_EXPR && CALL_EXPR_FN (param))
 {
   tree type = TREE_TYPE (TREE_TYPE (CALL_EXPR_FN (param)));
   tree attrs;
--- gcc/testsuite/g++.dg/cpp1z/launder9.C.jj2020-06-29 21:07:02.191932465 
+0200
+++ gcc/testsuite/g++.dg/cpp1z/launder9.C   2020-06-29 21:08:19.137846260 
+0200
@@ -0,0 +1,11 @@
+// PR c++/95963
+// { dg-do compile }
+// { dg-options "-Wnonnull" }
+
+struct A { virtual void foo (); };
+
+void
+bar (A *p)
+{
+  __builtin_launder (p)->foo ();
+}

Jakub



Re: [PATCH PR95961] vect: ICE: in exact_div, at poly-int.h:2182

2020-06-30 Thread Richard Biener via Gcc-patches
On Tue, Jun 30, 2020 at 2:18 PM Richard Sandiford
 wrote:
>
> "Yangfei (Felix)"  writes:
> > diff --git a/gcc/testsuite/gcc.dg/vect/vect-ctor-1.c 
> > b/gcc/testsuite/gcc.dg/vect/vect-ctor-1.c
> > index e050db1a2e4..ea39fcac0e0 100644
> > --- a/gcc/testsuite/gcc.dg/vect/vect-ctor-1.c
> > +++ b/gcc/testsuite/gcc.dg/vect/vect-ctor-1.c
> > @@ -1,6 +1,7 @@
> >  /* { dg-do compile } */
> >  /* { dg-additional-options "-O3" } */
> >  /* { dg-additional-options "-mavx2" { target { i?86-*-* x86_64-*-* } } } */
> > +/* { dg-additional-options "-march=armv8.2-a+sve -fno-vect-cost-model" { 
> > target aarch64*-*-* } } */
> >
> >  typedef struct {
> >  unsigned short mprr_2[5][16][16];
>
> This test is useful for Advanced SIMD too, so I think we should continue
> to test it with whatever options the person running the testsuite chose.
> Instead we could duplicate the test in gcc.target/aarch64/sve with
> appropriate options.
>
> > diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
> > index eb8288e7a85..b30a7d8a3bb 100644
> > --- a/gcc/tree-vect-data-refs.c
> > +++ b/gcc/tree-vect-data-refs.c
> > @@ -1823,8 +1823,11 @@ vect_enhance_data_refs_alignment (loop_vec_info 
> > loop_vinfo)
> >   {
> > poly_uint64 nscalars = (STMT_SLP_TYPE (stmt_info)
> > ? vf * DR_GROUP_SIZE (stmt_info) : 
> > vf);
> > -   possible_npeel_number
> > - = vect_get_num_vectors (nscalars, vectype);
> > +   if (maybe_lt (nscalars, TYPE_VECTOR_SUBPARTS (vectype)))
> > + possible_npeel_number = 0;
> > +   else
> > + possible_npeel_number
> > +   = vect_get_num_vectors (nscalars, vectype);
> >
> > /* NPEEL_TMP is 0 when there is no misalignment, but also
> >allow peeling NELEMENTS.  */
>
> OK, so this is coming from:
>
>   int s[16][2];
>   …
>   … =s[j][1];
>
> and an SLP node containing 16 instances of “s[j][1]”.  The DR_GROUP_SIZE
> is 2 because that's the inner dimension of “s”.
>
> I don't think maybe_lt is right here though.  The same problem could in
> principle happen for cases in which NSCALARS > TYPE_VECTOR_SUBPARTS,
> e.g. for different inner dimensions of “s”.
>
> I think the testcase shows that using DR_GROUP_SIZE in this calculation
> is flawed.  I'm not sure whether we can really do better given the current
> representation though.  This is one case where having a separate dr_vec_info
> per SLP node would help.

I guess what the code likes to know is what we now have in SLP_TREE_LANES
(or formerly group_size) but that's not necessarily connected to DR_GROUP_SIZE.
Given we only see a stmt_info here and there's no 1:1 mapping of SLP node
to stmt_info (and the reverse mapping doesn't even exist) I do not have
any good idea either.

Honestly I don't really see what this code tries to do ... doesn't it
simply want to set possible_npeel_number to TYPE_VECTOR_SUBPARTS (vectype)?!

>
> Maybe one option (for now) would be to use:
>
>   if (multiple_p (nscalars, TREE_VECTOR_SUBPARTS (vectype)))
> possible_npeel_number = vect_get_num_vectors (nscalars, vectype);
>   else
> /* This isn't a simple stream of contiguous vector accesses.  It's hard
>to predict from the available information how many vector accesses
>we'll actually need per iteration, so be conservative and assume
>one.  */
> possible_npeel_number = 1;
>
> BTW, I'm not sure whether the current choice of STMT_SLP_TYPE (stmt_info)
> instead of PURE_SLP_STMT (stmt_info) is optimal or not.  It means that for
> hybrid SLP we base the peeling on the SLP stmt rather than the non-SLP stmt.
> I guess hybrid SLP is going away soon though, so let's not worry about
> that. :-)
>
> Maybe Richard has a better suggestion.
>
> Thanks,
> Richard


Re: [PATCH] Fortran : False positive for optional arguments PR95446

2020-06-30 Thread Mark Eggleston

ping!

On 24/06/2020 09:00, Mark Eggleston wrote:
Please find attached a fix for PR95446.  Patch originally posted to 
the PR by Steve Kargl.


OK to commit to master and backport?

Commit message:

Fortran  : False positive for optional arguments PR95446

Check that there is non-optional argument of the same rank in the
list of actual arguments.  If there is the warning is not required.

2020-06-24  Steven G. Kargl  

gcc/fortran/

    PR fortran/95446
    * resolve.c (resolve_elemental_actual): Add code to check for
    non-optional argument of the same rank.  Revise warning message
    to refer to the Fortran 2018 standard.

2020-06-24  Mark Eggleston 

gcc/testsuite/

    PR fortran/95446
    * gfortran.dg/elemental_optional_args_6.f90: Remove check
    for warnings that were erroneously output.
    * gfortran.dg/pr95446.f90: New test.



--
https://www.codethink.co.uk/privacy.html



RE: [PATCH] arm: Warn if IRQ handler is not compiled with -mgeneral-regs-only [PR target/94743]

2020-06-30 Thread Kyrylo Tkachov


> -Original Message-
> From: Christophe Lyon 
> Sent: 30 June 2020 14:32
> To: Kyrylo Tkachov 
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH] arm: Warn if IRQ handler is not compiled with -
> mgeneral-regs-only [PR target/94743]
> 
> On Tue, 30 Jun 2020 at 15:16, Kyrylo Tkachov 
> wrote:
> >
> > Hi Christophe,
> >
> > Sorry for the delay.
> >
> > > -Original Message-
> > > From: Gcc-patches  On Behalf Of
> > > Christophe Lyon via Gcc-patches
> > > Sent: 29 April 2020 16:19
> > > To: gcc-patches@gcc.gnu.org
> > > Subject: [PATCH] arm: Warn if IRQ handler is not compiled with -
> mgeneral-
> > > regs-only [PR target/94743]
> > >
> > > The interrupt attribute does not guarantee that the FP registers are
> > > saved, which can result in problems difficult to debug.
> > >
> > > Saving the FP registers and status registers can be a large penalty,
> > > so it's probably not desirable to do that all the time.
> > >
> > > If the handler calls other functions, we'd likely need to save all of
> > > them, for lack of knowledge of which registers they actually use.
> > >
> > > This is even more obscure for the end-user when the compiler inserts
> > > calls to helper functions such as memcpy (some multilibs do use FP
> > > registers to speed it up).
> > >
> > > In the PR, we discussed adding routines in libgcc to save the FP
> > > context and saving only locally-clobbered FP registers, but I think
> > > this is too intrusive for stage 4.
> > >
> > > In the mean time, emit a warning to suggest re-compiling with
> > > -mgeneral-regs-only. Note that this can lead to errors if the code
> > > uses floating-point and -mfloat-abi=hard, eg:
> > > argument of type 'double' not permitted with -mgeneral-regs-only
> > >
> > > This can be troublesome for the user, but at least this would make
> > > them aware of the latent issue.
> > >
> > > The patch adds two testcases:
> > > - pr94734-1.c checks that a warning is emitted. One function can make
> > >   implicit calls to runtime floating-point routines, the other one
> > >   doesn't. We can improve the diagnostic later not to warn in the
> > >   second case.
> > >
> > > - pr94734-2.c checks that no warning is emitted when using
> > >   -mgeneral-regs-only.
> > >
> > > 2020-04-29  Christophe Lyon  
> > >
> > >   PR target/94743
> > >   gcc/
> > >   * config/arm/arm.c (arm_handle_isr_attribute): Warn if
> > >   -mgeneral-regs-only is not used.
> > >
> > >   gcc/testsuite/
> > >   * gcc.target/arm/pr94743-1.c: New test.
> > >   * gcc.target/arm/pr94743-2.c: New test.
> > > ---
> > >  gcc/config/arm/arm.c |  5 +
> > >  gcc/testsuite/gcc.target/arm/pr94743-1.c | 20 
> > >  gcc/testsuite/gcc.target/arm/pr94743-2.c | 17 +
> > >  3 files changed, 42 insertions(+)
> > >  create mode 100644 gcc/testsuite/gcc.target/arm/pr94743-1.c
> > >  create mode 100644 gcc/testsuite/gcc.target/arm/pr94743-2.c
> > >
> > > diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> > > index 6a6e804..34aad1d 100644
> > > --- a/gcc/config/arm/arm.c
> > > +++ b/gcc/config/arm/arm.c
> > > @@ -7176,6 +7176,11 @@ arm_handle_isr_attribute (tree *node, tree
> name,
> > > tree args, int flags,
> > >  name);
> > > *no_add_attrs = true;
> > >   }
> > > +  else if (TARGET_VFP_BASE)
> > > + {
> > > +   warning (OPT_Wattributes, "FP registers might be clobbered
> > > despite %qE attribute: compile with -mgeneral-regs-only",
> > > +name);
> > > + }
> >
> > Let's use %< and %> to quote -mgeneral-regs-only properly.
> > Ok with that change.
> 
> Hi Kyrill,
> 
> Thanks for the review, however I have posted v2 here:
> https://gcc.gnu.org/pipermail/gcc-patches/2020-May/545747.html
> (you approved v1)
> 
> It includes a few more testcases and updates to existing ones.
> 
> Is v2 OK with the quotation marks fixed?
> 

Oops, sorry. Yes that looks ok too (with the quotation fixed).
Kyrill

> Thanks
> 
> Christophe
> 
> > Thanks,
> > Kyrill
> >
> > >/* FIXME: the argument if any is checked for type attributes;
> > >should it be checked for decl ones?  */
> > >  }
> > > diff --git a/gcc/testsuite/gcc.target/arm/pr94743-1.c
> > > b/gcc/testsuite/gcc.target/arm/pr94743-1.c
> > > new file mode 100644
> > > index 000..67700c6
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.target/arm/pr94743-1.c
> > > @@ -0,0 +1,20 @@
> > > +/* PR target/94743 */
> > > +/* { dg-do compile } */
> > > +
> > > +typedef struct {
> > > +  double fpdata[32];
> > > +} dummy_t;
> > > +
> > > +dummy_t global_d;
> > > +dummy_t global_d1;
> > > +
> > > +__attribute__ ((interrupt("IRQ"))) void IRQ_HDLR_Test(void)
> > > +{ /* { dg-warning { FP registers might be clobbered despite 'interrupt'
> > > attribute: compile with -mgeneral-regs-only} "" { target *-*-* } . } */
> > > +  global_d.fpdata[3] += global_d.fpdata[3] * global_d1.fpdata[3];
> > > +}
> > > +
> > > +
> > > 

Re: [wwwdocs PATCH] Fix typo

2020-06-30 Thread Richard Sandiford
Hu Jiangping  writes:
> Hi,
>
> this patch fix a typo in contribute.html.
>
> Best Regards.
> Hujp

Thanks for the patch, pushed to gcc-wwwdocs.

Richard

>
> ---
>  htdocs/contribute.html | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/htdocs/contribute.html b/htdocs/contribute.html
> index 80a4470e..a913565b 100644
> --- a/htdocs/contribute.html
> +++ b/htdocs/contribute.html
> @@ -133,7 +133,7 @@ other changes applied.
>  Documentation changes do not require a new bootstrap (a working
>  bootstrap is necessary to get the build environment correct), but you
>  must perform make info and make dvi and correct
> -and errors.  You should investigate complaints about overfull or
> +any errors.  You should investigate complaints about overfull or
>  underfull hboxes from make dvi, as these can be the only
>  indication of serious markup problems, but do not feel obliged to
>  eliminate them all.


Re: [PATCH v3] Add -fuse-ld= to specify an arbitrary executable as the linker

2020-06-30 Thread Fāng-ruì Sòng via Gcc-patches
There is some concern about clang's -fuse-ld=path
http://lists.llvm.org/pipermail/cfe-dev/2020-June/065710.html and use
of COMPILER_PATH vs PATH.
Shall we introduce another option like -fld-path=path (PATH is used,
COMPILER_PATH is not used)?

On Tue, Jun 30, 2020 at 6:04 AM Martin Liška  wrote:
>
> PING^1
>
> On 5/29/20 3:10 AM, Fangrui Song wrote:
> > On 2020-05-25, Martin Liška wrote:
> >> On 5/22/20 6:42 AM, Fangrui Song wrote:
> >>> but I can't fix this one because joining two lines will break the 
> >>> 80-column rule.
> >>
> >> What about this:
> >>
> >> diff --git a/gcc/collect2.c b/gcc/collect2.c
> >> index cc57a20e08b..e5b54b080f7 100644
> >> --- a/gcc/collect2.c
> >> +++ b/gcc/collect2.c
> >> @@ -1138,8 +1138,8 @@ main (int argc, char **argv)
> >>   /* Search the ordinary system bin directories
> >>  for `ld' (if native linking) or `TARGET-ld' (if cross).  */
> >>   if (ld_file_name == 0)
> >> -ld_file_name =
> >> -  find_a_file (, full_ld_suffixes[selected_linker], X_OK);
> >> +ld_file_name
> >> +  = find_a_file (, full_ld_suffixes[selected_linker], X_OK);
> >> }
> >> #ifdef REAL_NM_FILE_NAME
> >>
> >> Apart from that, the patch is fine.
> >>
> >> Martin
> >
> > Adding people who may be able to approve and commit on my behalf.
> >
> > This formatting issue seems small enough. Hopefully a maintainer can do
> > it for me.
>


-- 
宋方睿


Re: [PATCH 1/7 v7] ifn/optabs: Support vector load/store with length

2020-06-30 Thread Richard Sandiford
"Kewen.Lin"  writes:
> Hi Richard,
>
> Thanks for the comments!
>
> on 2020/6/29 下午6:07, Richard Sandiford wrote:
>> Thanks for the update.  I agree with the summary of the IRC discussion
>> except for…
>> 
>> "Kewen.Lin"  writes:
>>> Hi Richard S./Richi/Jim/Segher,
>>>
>>> Thanks a lot for your comments to make this patch more solid.
>>>
>>> Based on our discussion, for the vector load/store with length
>>> optab, the length unit would be measured in lanes by default.
>>> For the targets which support length measured in bytes like Power,
>>> they should only define VnQI modes to wrap the other same size
>>> vector modes.  If the length is larger than total lane/byte count
>>> of the given mode, it's taken to load all lanes/bytes implicitly.
>> 
>> …this last bit.  IMO the behaviour of the optab should be undefined
>> when the supplied length is greater than the number of lanes.
>> 
>> I think that also makes things better for the lxvl implementation,
>> which ignores the upper 56 bits of the length.  It sounds like the
>> above semantics would instead require Power to saturate the value
>> at 255 before shifting it.
>> 
>
> Good catch, I just realized that this part is inconsistent to what I
> implemented in patch 5/7, where the function vect_gen_len still does
> the min operation between the given length and length_limit.
>
> This patch is updated accordingly to state the behavior to be undefined.
> The others aren't required to change.
>
> Could you have a further look? Thanks in advance!
>
> v6/v7: Updated optab descriptions.
>
> v5:
>   - Updated lenload/lenstore optab to len_load/len_store and the docs.
>   - Rename expand_mask_{load,store}_optab_fn to 
> expand_partial_{load,store}_optab_fn
>   - Added/updated macros for expand_mask_{load,store}_optab_fn
> and expand_len_{load,store}_optab_fn
>
> v4: Update len_load_direct/len_store_direct to align with direct optab.
>
> v3: Get rid of length mode hook.

Thanks, mostly looks good, just some comments about the documentation…

>
> BR,
> Kewen
> -
> gcc/ChangeLog:
>
> 2020-MM-DD  Kewen Lin  
>
>   * doc/md.texi (len_load_@var{m}): Document.
>   (len_store_@var{m}): Likewise.
>   * internal-fn.c (len_load_direct): New macro.
>   (len_store_direct): Likewise.
>   (expand_len_load_optab_fn): Likewise.
>   (expand_len_store_optab_fn): Likewise.
>   (direct_len_load_optab_supported_p): Likewise.
>   (direct_len_store_optab_supported_p): Likewise.
>   (expand_mask_load_optab_fn): New macro.  Original renamed to ...
>   (expand_partial_load_optab_fn): ... here.  Add handlings for
>   len_load_optab.
>   (expand_mask_store_optab_fn): New macro.  Original renamed to ...
>   (expand_partial_store_optab_fn): ... here. Add handlings for
>   len_store_optab.
>   (internal_load_fn_p): Handle IFN_LEN_LOAD.
>   (internal_store_fn_p): Handle IFN_LEN_STORE.
>   (internal_fn_stored_value_index): Handle IFN_LEN_STORE.
>   * internal-fn.def (LEN_LOAD): New internal function.
>   (LEN_STORE): Likewise.
>   * optabs.def (len_load_optab, len_store_optab): New optab.
>
> diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
> index 2c67c818da5..c8d7bcc7f62 100644
> --- a/gcc/doc/md.texi
> +++ b/gcc/doc/md.texi
> @@ -5167,6 +5167,33 @@ mode @var{n}.
>  
>  This pattern is not allowed to @code{FAIL}.
>  
> +@cindex @code{len_load_@var{m}} instruction pattern
> +@item @samp{len_load_@var{m}}
> +Load the number of units specified by operand 2 from memory operand 1

s/units/vector elements/

> +into register operand 0, setting the other bytes of operand 0 to

s/bytes/elements/

Maybe s/register operand 0/vector register operand 0/ would be clearer,
now that we're explicitly measuring elements rather than bytes.

> +undefined values.  Operands 0 and 1 have mode @var{m}.  Operand 2 has

and maybe here “…@var{m}, which must be a vector mode”.

> +whichever integer mode the target prefers.  If operand 2 exceeds the
> +maximum units of mode @var{m}, the behavior is undefined.  For targets

Maybe s/maximum units of/number of elements in/

> +which support length measured in bytes, they should only define VnQI
> +mode to wrap the other vector modes with the same size.  Meanwhile,

How about:

  If the target prefers the length to be measured in bytes
  rather than elements, it should only implement this pattern
  for vectors of @code{QI} elements.

> +it's required that the byte count should be a multiple of the element
> +size (wrapped vector).

This last sentence doesn't apply now that the length is measured in
elements (lanes).

> +
> +This pattern is not allowed to @code{FAIL}.
> +
> +@cindex @code{len_store_@var{m}} instruction pattern
> +@item @samp{len_store_@var{m}}
> +Store the number of units specified by operand 2 from nonmemory operand 1
> +into memory operand 0, leaving the other bytes of operand 0 unchanged.
> +Operands 0 and 1 have mode @var{m}.  Operand 2 has whichever integer
> +mode the 

Re: [PATCH PR95961] vect: ICE: in exact_div, at poly-int.h:2182

2020-06-30 Thread Richard Biener
On Tue, 30 Jun 2020, Richard Sandiford wrote:

> Richard Biener  writes:
> > On Tue, Jun 30, 2020 at 2:18 PM Richard Sandiford
> >  wrote:
> >>
> >> "Yangfei (Felix)"  writes:
> >> > diff --git a/gcc/testsuite/gcc.dg/vect/vect-ctor-1.c 
> >> > b/gcc/testsuite/gcc.dg/vect/vect-ctor-1.c
> >> > index e050db1a2e4..ea39fcac0e0 100644
> >> > --- a/gcc/testsuite/gcc.dg/vect/vect-ctor-1.c
> >> > +++ b/gcc/testsuite/gcc.dg/vect/vect-ctor-1.c
> >> > @@ -1,6 +1,7 @@
> >> >  /* { dg-do compile } */
> >> >  /* { dg-additional-options "-O3" } */
> >> >  /* { dg-additional-options "-mavx2" { target { i?86-*-* x86_64-*-* } } 
> >> > } */
> >> > +/* { dg-additional-options "-march=armv8.2-a+sve -fno-vect-cost-model" 
> >> > { target aarch64*-*-* } } */
> >> >
> >> >  typedef struct {
> >> >  unsigned short mprr_2[5][16][16];
> >>
> >> This test is useful for Advanced SIMD too, so I think we should continue
> >> to test it with whatever options the person running the testsuite chose.
> >> Instead we could duplicate the test in gcc.target/aarch64/sve with
> >> appropriate options.
> >>
> >> > diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
> >> > index eb8288e7a85..b30a7d8a3bb 100644
> >> > --- a/gcc/tree-vect-data-refs.c
> >> > +++ b/gcc/tree-vect-data-refs.c
> >> > @@ -1823,8 +1823,11 @@ vect_enhance_data_refs_alignment (loop_vec_info 
> >> > loop_vinfo)
> >> >   {
> >> > poly_uint64 nscalars = (STMT_SLP_TYPE (stmt_info)
> >> > ? vf * DR_GROUP_SIZE (stmt_info) 
> >> > : vf);
> >> > -   possible_npeel_number
> >> > - = vect_get_num_vectors (nscalars, vectype);
> >> > +   if (maybe_lt (nscalars, TYPE_VECTOR_SUBPARTS (vectype)))
> >> > + possible_npeel_number = 0;
> >> > +   else
> >> > + possible_npeel_number
> >> > +   = vect_get_num_vectors (nscalars, vectype);
> >> >
> >> > /* NPEEL_TMP is 0 when there is no misalignment, but also
> >> >allow peeling NELEMENTS.  */
> >>
> >> OK, so this is coming from:
> >>
> >>   int s[16][2];
> >>   …
> >>   … =s[j][1];
> >>
> >> and an SLP node containing 16 instances of “s[j][1]”.  The DR_GROUP_SIZE
> >> is 2 because that's the inner dimension of “s”.
> >>
> >> I don't think maybe_lt is right here though.  The same problem could in
> >> principle happen for cases in which NSCALARS > TYPE_VECTOR_SUBPARTS,
> >> e.g. for different inner dimensions of “s”.
> >>
> >> I think the testcase shows that using DR_GROUP_SIZE in this calculation
> >> is flawed.  I'm not sure whether we can really do better given the current
> >> representation though.  This is one case where having a separate 
> >> dr_vec_info
> >> per SLP node would help.
> >
> > I guess what the code likes to know is what we now have in SLP_TREE_LANES
> > (or formerly group_size) but that's not necessarily connected to 
> > DR_GROUP_SIZE.
> > Given we only see a stmt_info here and there's no 1:1 mapping of SLP node
> > to stmt_info (and the reverse mapping doesn't even exist) I do not have
> > any good idea either.
> >
> > Honestly I don't really see what this code tries to do ... doesn't it
> > simply want to set possible_npeel_number to TYPE_VECTOR_SUBPARTS (vectype)?!
> 
> I think it's trying to set possible_npeel_number to the number of
> vector stmts per iteration (i.e. ncopies for non-SLP stuff):
> 
>   /* For multiple types, it is possible that the bigger type 
> access
>  will have more than one peeling option.  E.g., a loop with 
> two
>  types: one of size (vector size / 4), and the other one of
>  size (vector size / 8).  Vectorization factor will 8.  If 
> both
>  accesses are misaligned by 3, the first one needs one scalar
>  iteration to be aligned, and the second one needs 5.  But the
>  first one will be aligned also by peeling 5 scalar
>  iterations, and in that case both accesses will be aligned.
>  Hence, except for the immediate peeling amount, we also want
>  to try to add full vector size, while we don't exceed
>  vectorization factor.
>  We do this automatically for cost model, since we calculate
>  cost for every peeling option.  */
> 
> So for this example, possible_npeel_number==2 for the first access
> (letting us try two peeling values for it) and possible_npeel_number==1
> for the second.

Yeah, but npeel is _scalar_ lanes, since we always peel scalar iterations.
So it seems odd to somehow put in the number of vectors...  so to me
it would have made sense if it did

  possible_npeel_number = lower_bound (nscalars);

or whateveris necessary to make the polys happy.  Thus simply elide
the vect_get_num_vectors call?  But it's been very long since I've
dived into the alignment 

RE: [PATCH] [arm] Don't generate invalid LDRD insns

2020-06-30 Thread Kyrylo Tkachov
Hi Alex,

> -Original Message-
> From: Alex Coplan 
> Sent: 26 June 2020 10:38
> To: Alex Coplan ; Kyrylo Tkachov
> ; gcc-patches@gcc.gnu.org
> Cc: Richard Earnshaw ; nd ;
> Ramana Radhakrishnan 
> Subject: RE: [PATCH] [arm] Don't generate invalid LDRD insns
> 
> > -Original Message-
> > From: Gcc-patches  On Behalf Of Alex
> > Coplan
> > Sent: 18 May 2020 16:37
> > To: Kyrylo Tkachov ; gcc-patches@gcc.gnu.org
> > Cc: Richard Earnshaw ; nd ;
> Ramana
> > Radhakrishnan 
> > Subject: RE: [PATCH] [arm] Don't generate invalid LDRD insns
> >
> > > -Original Message-
> > > From: Kyrylo Tkachov 
> > > Sent: 15 May 2020 11:57
> > > To: Alex Coplan ; gcc-patches@gcc.gnu.org
> > > Cc: nd ; ni...@redhat.com; Richard Earnshaw
> > > ; Ramana Radhakrishnan
> > 
> > > Subject: RE: [PATCH] [arm] Don't generate invalid LDRD insns
> > >
> > > Hi Alex,
> > >
> > > > -Original Message-
> > > > From: Alex Coplan 
> > > > Sent: 15 May 2020 11:36
> > > > To: gcc-patches@gcc.gnu.org
> > > > Cc: nd ; ni...@redhat.com; Richard Earnshaw
> > > > ; Ramana Radhakrishnan
> > > > ; Kyrylo Tkachov
> > > > 
> > > > Subject: [PATCH] [arm] Don't generate invalid LDRD insns
> > > >
> > > > Hello,
> > > >
> > > > This patch fixes a bug in the arm backend where GCC generates invalid
> > LDRD
> > > > instructions. The LDRD instruction requires the first transfer
> > register to be
> > > > even, but GCC attempts to use odd registers here. For example, with
> > the
> > > > following C code:
> > > >
> > > > struct c {
> > > >   double a;
> > > > } __attribute((aligned)) __attribute((packed));
> > > > struct c d;
> > > > struct c f(struct c);
> > > > void e() { f(d); }
> > > >
> > > > The struct d is passed in registers r1 and r2 to the function f, and
> > GCC
> > > > attempts to do this with a LDRD instruction when compiling with -
> > > > march=armv7-a
> > > > on a soft float toolchain.
> > > >
> > > > The fix is analogous to the corresponding one for STRD in the same
> > function:
> > > >
> >
> https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=52057dc4ac5295caebf83147f6
> > > > 88d769c93cbc8d
> > > >
> > > > Testing:
> > > >  - New unit tests which pass after applying the patch.
> > > >  - Tested on an x64 -> arm-none-eabi cross.
> > > >  - Bootstrapped and regtested on arm-none-linux-gnueabihf (in both
> > thumb
> > > > and arm
> > > >modes).
> > > >
> > > > OK for master?
> > >
> > > Ok.
> > > Please apply for write-after-approval commit access to the repo by
> > filling the
> > > form at:
> > > https://sourceware.org/cgi-bin/pdw/ps_form.cgi
> > > listing me as your sponsor.
> > > You can then push the patch yourself.
> >
> > OK, I've pushed the patch to master.
> >
> > Thanks,
> > Alex
> 
> This patch applies cleanly on gcc-{8,9,10} branches. I've run bootstrap
> and regression tests (both thumb and arm) with the patch applied to the
> branches and those came back clean.
> 
> OK to backport to the branches?

Ok.
Thanks,
Kyrill

> 
> Thanks,
> Alex
> 
> >
> > >
> > > Thanks,
> > > Kyrill
> > >
> > > >
> > > > Thanks,
> > > > Alex
> > > >
> > > > ---
> > > >
> > > > gcc/ChangeLog:
> > > >
> > > > 2020-05-15  Alex Coplan  
> > > > * config/arm/arm.c (output_move_double): Fix codegen when
> > loading
> > > > into
> > > > a register pair with an odd base register.
> > > >
> > > > gcc/testsuite/ChangeLog:
> > > >
> > > > 2020-05-15  Alex Coplan  
> > > > * gcc.c-torture/compile/packed-aligned-1.c: New test.
> > > > * gcc.c-torture/execute/packed-aligned.c: New test.
> > >
> 
> 



Re: [PATCH v3] Add -fuse-ld= to specify an arbitrary executable as the linker

2020-06-30 Thread Martin Liška

PING^1

On 5/29/20 3:10 AM, Fangrui Song wrote:

On 2020-05-25, Martin Liška wrote:

On 5/22/20 6:42 AM, Fangrui Song wrote:

but I can't fix this one because joining two lines will break the 80-column 
rule.


What about this:

diff --git a/gcc/collect2.c b/gcc/collect2.c
index cc57a20e08b..e5b54b080f7 100644
--- a/gcc/collect2.c
+++ b/gcc/collect2.c
@@ -1138,8 +1138,8 @@ main (int argc, char **argv)
  /* Search the ordinary system bin directories
 for `ld' (if native linking) or `TARGET-ld' (if cross).  */
  if (ld_file_name == 0)
-    ld_file_name =
-  find_a_file (, full_ld_suffixes[selected_linker], X_OK);
+    ld_file_name
+  = find_a_file (, full_ld_suffixes[selected_linker], X_OK);
    }
#ifdef REAL_NM_FILE_NAME

Apart from that, the patch is fine.

Martin


Adding people who may be able to approve and commit on my behalf.

This formatting issue seems small enough. Hopefully a maintainer can do
it for me.




Re: [PATCH] analyzer/pr94028.C and non-null warning

2020-06-30 Thread David Malcolm via Gcc-patches
On Tue, 2020-06-30 at 09:51 -0400, David Edelsohn wrote:
> The changes to the non-null warning now produce an additional warning
> for analyzer/pr94028.C on one of the "leak" lines.  This causes new
> failures on trunk.

Hi David

Do you have the output to hand?  What is the full text of the new
diagnostic?

Some high level questions:
  * Ought GCC to warn for that code?  (or not)
  * Is it behavior we ought to have a test for?

Note that AFAIK there are *two* kinds of non-null warnings that GCC can
emit: the kind emitted by Martin's code, versus those emitted by
-fanalyzer (the latter imply the much more expensive analysis performed
by -fanalyzer, and are controlled by the various -Wanalyzer-*null*
options; see 
https://gcc.gnu.org/onlinedocs/gcc-10.1.0/gcc/Static-Analyzer-Options.html
)


> Because non-null is not the purpose of the analyzer test, I propose
> pruning the output to resolve the new failures.

Looking back through bugzilla, it seems that the main purpose of adding
that test was to ensure that -fanalyzer doesn't ICE on that code.

At some point I hope to properly support C++ in -fanalyzer, at which
point some kind of null warning may be warranted on that code.  Sadly
I'm not yet at that point.  FWIW I'm currently working on a big rewrite
of how state is tracked within the analyzer, as I've identified at
least two major flaws in the current implementation, which my rewrite
addresses.  I'm deferring on C++ support until that rewrite is done.

>   Alternatively, I
> could explicitly test for the additional non-null warning.
> 
> Do you have any preferences?

This test is controlled by analyzer.exp and thus, for example, is
disabled if the analyzer was disabled at configure time.

If this is coming from Martin's non-analyzer code, a third possibility
would be to use -Wno-something to disable that warning, so that the
analyzer test can focus on the -fanalyzer test, as it were (and if this
is a behavior that ought to be checked for in Martin's warning, then
copy the pertinent fragment of the testcase to one of the g++.dg cases,
I suppose).  I think I prefer that approach.

How does this sound?

> I propose appending
> 
> // { dg-prune-output "where non-null expected" }
> 
> to the file to prune the warning output.
> 
> Thanks, David

Thanks for looking into this; hope this is constructive.
Dave



Re: [PATCH PR95700 v2] Redefine NULL to nullptr

2020-06-30 Thread Richard Sandiford
Ilya Leoshkevich  writes:
> v1: https://gcc.gnu.org/pipermail/gcc-patches/2020-June/548822.html
>
> This is the implementation of Richard's suggestion.  I've decided not
> to remove the initial #define NULL 0, since system.h mixes inclusion of
> system and helper headers (e.g. libiberty), which might use NULL - if
> not today, then in the future.  If system headers do not provide NULL,
> removing #define NULL 0 might lead to build errors.  Ideally the order
> should be:
>
>   system headers
>   #undef NULL
>   #define NULL nullptr
>   helper headers
>
> but I fear that rearranging includes might break some weird system.
>
> This patch helps with alpine bootstrap, I'm currently testing it on
> x86_64-redhat-linux.
>
> ---
>
> Bootstrap with musl libc fails with numerous "missing sentinel in
> function call" errors.  This is because musl defines NULL as 0L for C++,
> but gcc requires sentinel value to be a pointer or __null.
>
> Jonathan Wakely says:
>
> To be really safe during stage 1, GCC should not use NULL as a
> pointer sentinel in C++ code anyway.
>
> The bootstrap compiler could define it to 0 or 0u, neither of which
> is guaranteed to be OK to pass as a varargs sentinel where a null
> pointer is expected.  Any of (void*)0 or (void*)NULL or nullptr
> would be safe.
>
> While it is possible to fix this by replacing NULL sentinels with
> nullptrs, such approach would generate backporting conflicts, therefore
> simply redefine NULL to nullptr at the end of system.h, where it would
> not confuse system headers.
>
> gcc/ChangeLog:
>
> 2020-06-30  Ilya Leoshkevich  
>
>   PR bootstrap/95700
>   * system.h (NULL): Redefine to nullptr.

LGTM, thanks.  OK if noone objects in the next 24hrs.

Richard

> ---
>  gcc/system.h | 10 ++
>  1 file changed, 10 insertions(+)
>
> diff --git a/gcc/system.h b/gcc/system.h
> index 544f7ba427f..1eec77e730e 100644
> --- a/gcc/system.h
> +++ b/gcc/system.h
> @@ -1244,4 +1244,14 @@ void gcc_stablesort (void *, size_t, size_t,
> of the number.  */
>  #define PRsa(n) "%" #n PRIu64 "%c"
>  
> +/* System headers may define NULL to be an integer (e.g. 0L), which cannot be
> +   used safely in certain contexts (e.g. as sentinels).  Redefine NULL to
> +   nullptr in order to make it safer.  Note that this might confuse system
> +   headers, however, by convention they must not be included after this 
> point.
> +*/
> +#ifdef __cplusplus
> +#undef NULL
> +#define NULL nullptr
> +#endif
> +
>  #endif /* ! GCC_SYSTEM_H */


Re: [PATCH] Fortran : False positive for optional arguments PR95446

2020-06-30 Thread Thomas Koenig

Hi Mark,

this is OK.

Regarding backport: This is not a regression, and the patch itself
is a bit large to bee entirely trivial.  So, I'd prefer it you left
that for master.

Thanks for taking this up!

Regards

Thomas


Re: [PATCH] arm: Warn if IRQ handler is not compiled with -mgeneral-regs-only [PR target/94743]

2020-06-30 Thread Christophe Lyon via Gcc-patches
On Tue, 30 Jun 2020 at 15:16, Kyrylo Tkachov  wrote:
>
> Hi Christophe,
>
> Sorry for the delay.
>
> > -Original Message-
> > From: Gcc-patches  On Behalf Of
> > Christophe Lyon via Gcc-patches
> > Sent: 29 April 2020 16:19
> > To: gcc-patches@gcc.gnu.org
> > Subject: [PATCH] arm: Warn if IRQ handler is not compiled with -mgeneral-
> > regs-only [PR target/94743]
> >
> > The interrupt attribute does not guarantee that the FP registers are
> > saved, which can result in problems difficult to debug.
> >
> > Saving the FP registers and status registers can be a large penalty,
> > so it's probably not desirable to do that all the time.
> >
> > If the handler calls other functions, we'd likely need to save all of
> > them, for lack of knowledge of which registers they actually use.
> >
> > This is even more obscure for the end-user when the compiler inserts
> > calls to helper functions such as memcpy (some multilibs do use FP
> > registers to speed it up).
> >
> > In the PR, we discussed adding routines in libgcc to save the FP
> > context and saving only locally-clobbered FP registers, but I think
> > this is too intrusive for stage 4.
> >
> > In the mean time, emit a warning to suggest re-compiling with
> > -mgeneral-regs-only. Note that this can lead to errors if the code
> > uses floating-point and -mfloat-abi=hard, eg:
> > argument of type 'double' not permitted with -mgeneral-regs-only
> >
> > This can be troublesome for the user, but at least this would make
> > them aware of the latent issue.
> >
> > The patch adds two testcases:
> > - pr94734-1.c checks that a warning is emitted. One function can make
> >   implicit calls to runtime floating-point routines, the other one
> >   doesn't. We can improve the diagnostic later not to warn in the
> >   second case.
> >
> > - pr94734-2.c checks that no warning is emitted when using
> >   -mgeneral-regs-only.
> >
> > 2020-04-29  Christophe Lyon  
> >
> >   PR target/94743
> >   gcc/
> >   * config/arm/arm.c (arm_handle_isr_attribute): Warn if
> >   -mgeneral-regs-only is not used.
> >
> >   gcc/testsuite/
> >   * gcc.target/arm/pr94743-1.c: New test.
> >   * gcc.target/arm/pr94743-2.c: New test.
> > ---
> >  gcc/config/arm/arm.c |  5 +
> >  gcc/testsuite/gcc.target/arm/pr94743-1.c | 20 
> >  gcc/testsuite/gcc.target/arm/pr94743-2.c | 17 +
> >  3 files changed, 42 insertions(+)
> >  create mode 100644 gcc/testsuite/gcc.target/arm/pr94743-1.c
> >  create mode 100644 gcc/testsuite/gcc.target/arm/pr94743-2.c
> >
> > diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> > index 6a6e804..34aad1d 100644
> > --- a/gcc/config/arm/arm.c
> > +++ b/gcc/config/arm/arm.c
> > @@ -7176,6 +7176,11 @@ arm_handle_isr_attribute (tree *node, tree name,
> > tree args, int flags,
> >  name);
> > *no_add_attrs = true;
> >   }
> > +  else if (TARGET_VFP_BASE)
> > + {
> > +   warning (OPT_Wattributes, "FP registers might be clobbered
> > despite %qE attribute: compile with -mgeneral-regs-only",
> > +name);
> > + }
>
> Let's use %< and %> to quote -mgeneral-regs-only properly.
> Ok with that change.

Hi Kyrill,

Thanks for the review, however I have posted v2 here:
https://gcc.gnu.org/pipermail/gcc-patches/2020-May/545747.html
(you approved v1)

It includes a few more testcases and updates to existing ones.

Is v2 OK with the quotation marks fixed?

Thanks

Christophe

> Thanks,
> Kyrill
>
> >/* FIXME: the argument if any is checked for type attributes;
> >should it be checked for decl ones?  */
> >  }
> > diff --git a/gcc/testsuite/gcc.target/arm/pr94743-1.c
> > b/gcc/testsuite/gcc.target/arm/pr94743-1.c
> > new file mode 100644
> > index 000..67700c6
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/arm/pr94743-1.c
> > @@ -0,0 +1,20 @@
> > +/* PR target/94743 */
> > +/* { dg-do compile } */
> > +
> > +typedef struct {
> > +  double fpdata[32];
> > +} dummy_t;
> > +
> > +dummy_t global_d;
> > +dummy_t global_d1;
> > +
> > +__attribute__ ((interrupt("IRQ"))) void IRQ_HDLR_Test(void)
> > +{ /* { dg-warning { FP registers might be clobbered despite 'interrupt'
> > attribute: compile with -mgeneral-regs-only} "" { target *-*-* } . } */
> > +  global_d.fpdata[3] += global_d.fpdata[3] * global_d1.fpdata[3];
> > +}
> > +
> > +
> > +__attribute__ ((interrupt("IRQ"))) void IRQ_HDLR_Test2(void)
> > +{ /* { dg-warning { FP registers might be clobbered despite 'interrupt'
> > attribute: compile with -mgeneral-regs-only} "" { target *-*-* } . } */
> > +  global_d.fpdata[3] = 1.0;
> > +}
> > diff --git a/gcc/testsuite/gcc.target/arm/pr94743-2.c
> > b/gcc/testsuite/gcc.target/arm/pr94743-2.c
> > new file mode 100644
> > index 000..745fd36
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/arm/pr94743-2.c
> > @@ -0,0 +1,17 @@
> > +/* PR target/94743 */
> > +/* { dg-do compile } */
> > +/* { dg-options 

Re: [PATCH] Fortran : Bogus error with additional blanks in type(*) PR95829

2020-06-30 Thread Thomas Koenig

Hi Mark,


Fixed the test case, the new patch is attached.

OK for commit to master and backports?


OK.

Thanks a lot!

Regards

Thomas


Re: [patch, fortran, committed] Fix PR 95366, wrong code and ABI breakage

2020-06-30 Thread Thomas Koenig

Am 30.06.20 um 13:11 schrieb Thomas Koenig:

 PR fortran/95355


That should have been PR 955366.

Regards

Thomas


[pushed] coroutines: Fix a diagnostic trailing space warning.

2020-06-30 Thread Iain Sandoe
Hi

A recently add diagnostic has a trailing space which triggers a warning.
Fixed thus.

tested on x86_64-darwin, applied to master, will also apply to 10.2
thanks
Iain

gcc/cp/ChangeLog:

* coroutines.cc (morph_fn_to_coro): Remove trailing
space in a diagnostic.
---
 gcc/cp/coroutines.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/cp/coroutines.cc b/gcc/cp/coroutines.cc
index ba4ac682f11..bec7f2f2027 100644
--- a/gcc/cp/coroutines.cc
+++ b/gcc/cp/coroutines.cc
@@ -4119,7 +4119,7 @@ morph_fn_to_coro (tree orig, tree *resumer, tree 
*destroyer)
 else if (!grooaf && TYPE_NOTHROW_P (TREE_TYPE (func)))
   warning_at (fn_start, 0, "%qE is marked % or % but"
  " no usable %"
- " is provided by %qT ", nwname, promise_type);
+ " is provided by %qT", nwname, promise_type);
 }
   else /* No operator new in the promise.  */
 {
-- 
2.24.1




Re: [PATCH 2/9] [OpenACC] GOMP_MAP_ATTACH handling in find_group_last

2020-06-30 Thread Thomas Schwinge
Hi Julian!

On 2020-06-16T15:38:32-0700, Julian Brown  wrote:
> Later patches in this series assume that GOMP_MAP_ATTACH will be grouped
> together with a preceding GOMP_MAP_TO_PSET or other "to" data movement
> clause, except in cases where an explicit "attach" clause is used.
> This patch arranges for that to be so.
>
> OK?

OK for master branch and releases/gcc-10 branch.  However, still a few
questions, which can be addressed separately:

> --- a/libgomp/oacc-mem.c
> +++ b/libgomp/oacc-mem.c
> @@ -985,9 +985,15 @@ find_group_last (int pos, size_t mapnum, size_t *sizes, 
> unsigned short *kinds)
>switch (kind0)
>  {
>  case GOMP_MAP_TO_PSET:
> -  while (pos + 1 < mapnum && (kinds[pos + 1] & 0xff) == GOMP_MAP_POINTER)
> +  if (pos + 1 < mapnum
> +   && (kinds[pos + 1] & 0xff) == GOMP_MAP_ATTACH)
> + return pos + 1;
> +
> +  while (pos + 1 < mapnum
> +  && (kinds[pos + 1] & 0xff) == GOMP_MAP_POINTER)
>   pos++;
> -  /* We expect at least one GOMP_MAP_POINTER after a GOMP_MAP_TO_PSET.  
> */
> +  /* We expect at least one GOMP_MAP_POINTER (if not a single
> +  GOMP_MAP_ATTACH) after a GOMP_MAP_TO_PSET.  */
>assert (pos > first_pos);
>break;

So a 'GOMP_MAP_TO_PSET' can now be followed by either a single
'GOMP_MAP_ATTACH', or by potentially several 'GOMP_MAP_POINTER's, but not
both.  If the former ('GOMP_MAP_ATTACH'), then any additional following
'GOMP_MAP_POINTER's are not anymore handled together with the
'GOMP_MAP_TO_PSET' -- which defeats the description in
'include/gomp-constants.h', which explicitly details how
'GOMP_MAP_TO_PSET' is used to specially handle 'GOMP_MAP_POINTER's
following it.  So, please update the 'enum gomp_map_kind' definition to
describe what's (now) actually going on.

(Maybe that'll then make obsolete the source code comments you're adding
here?  ..., if also updating the 'GOMP_MAP_ATTACH' description to detail
how it may follow 'GOMP_MAP_TO' etc.  I think such rationale --
describing the valid combinations -- is better put there instead of into
'find_group_last'.)

In the compiler, are we making sure that after 'GOMP_MAP_TO_PSET' we're
not trying to emit both a 'GOMP_MAP_ATTACH' as well as
'GOMP_MAP_POINTER's?

> @@ -1002,6 +1008,9 @@ find_group_last (int pos, size_t mapnum, size_t *sizes, 
> unsigned short *kinds)
>gomp_fatal ("unexpected mapping");
>break;
>
> +case GOMP_MAP_ATTACH:
> +  return pos;
> +
>  default:
>/* GOMP_MAP_ALWAYS_POINTER can only appear directly after some other
>mapping.  */
> @@ -1012,9 +1021,16 @@ find_group_last (int pos, size_t mapnum, size_t 
> *sizes, unsigned short *kinds)
>   return pos + 1;
>   }
>
> +  /* We can have a single GOMP_MAP_ATTACH mapping after a to/from
> +  mapping.  */
> +  if (pos + 1 < mapnum
> +   && (kinds[pos + 1] & 0xff) == GOMP_MAP_ATTACH)
> + return pos + 1;
> +
>/* We can have zero or more GOMP_MAP_POINTER mappings after a to/from
>(etc.) mapping.  */
> -  while (pos + 1 < mapnum && (kinds[pos + 1] & 0xff) == GOMP_MAP_POINTER)
> +  while (pos + 1 < mapnum
> +  && (kinds[pos + 1] & 0xff) == GOMP_MAP_POINTER)
>   pos++;
>  }

Similar (regarding documenting in 'enum gomp_map_kind' etc.).


Grüße
 Thomas
-
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander 
Walter


Re: [PATCH] Fortran : Bogus error with additional blanks in type(*) PR95829

2020-06-30 Thread Mark Eggleston

On 24/06/2020 10:13, Manfred Schwarb wrote:

Am 24.06.20 um 10:12 schrieb Mark Eggleston:

Please find a fix for PR95829.  Original patch by Steve Kargl posted to PR.

Commit to master and backport?

Commit message:

Fortran  : Bogus error with additional blanks in type(*) PR95829

Checking for "* ) " instead of "*)" clears the bogus error.

2020-06-24  Steven G. Kargl  

gcc/fortran/

     PR fortran/95829
     * decl.c (gfc_match_decl_type_spec): Compare with "* ) " instead
     of "*)".

2020-06-24  Mark Eggleston 

gcc/testsuite/

     PR fortran/95829
     * gfortran.dg/pr95829.f90: New test.


@@ -0,0 +1,14 @@
+! {dg-do compile }
+!
+! Declaration of b used to be a bogus failure.

{ dg-do compile }

with surrounding spaces, otherwise this dg-directive will not be recognized.


Fixed the test case, the new patch is attached.

OK for commit to master and backports?

Updated commit message:

Fortran  : Bogus error with additional blanks in type(*) PR95829

Checking for "* ) " instead of "*)" clears the bogus error.

2020-06-30  Steven G. Kargl  

gcc/fortran/

    PR fortran/95829
    * decl.c (gfc_match_decl_type_spec): Compare with "* ) " instead
    of "*)".

2020-06-30  Mark Eggleston 

gcc/testsuite/

    PR fortran/95829
    * gfortran.dg/pr95829.f90: New test.

--
https://www.codethink.co.uk/privacy.html

>From 0dd0b086f64d295a4373ce4cf495f85de2d95b1f Mon Sep 17 00:00:00 2001
From: Mark Eggleston 
Date: Tue, 23 Jun 2020 11:01:28 +0100
Subject: [PATCH] Fortran  : Bogus error with additional blanks in type(*)
 PR95829

Checking for "* ) " instead of "*)" clears the bogus error.

2020-06-24  Steven G. Kargl  

gcc/fortran/

	PR fortran/95829
	* decl.c (gfc_match_decl_type_spec): Compare with "* ) " instead
	of "*)".

2020-06-24  Mark Eggleston  

gcc/testsuite/

	PR fortran/95829
	* gfortran.dg/pr95829.f90: New test.
---
 gcc/fortran/decl.c|  2 +-
 gcc/testsuite/gfortran.dg/pr95829.f90 | 14 ++
 2 files changed, 15 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gfortran.dg/pr95829.f90

diff --git a/gcc/fortran/decl.c b/gcc/fortran/decl.c
index ac1f63f66e0..3bc5881e0ce 100644
--- a/gcc/fortran/decl.c
+++ b/gcc/fortran/decl.c
@@ -4128,7 +4128,7 @@ gfc_match_decl_type_spec (gfc_typespec *ts, int implicit_flag)
   gfc_gobble_whitespace ();
   if (gfc_peek_ascii_char () == '*')
 	{
-	  if ((m = gfc_match ("*)")) != MATCH_YES)
+	  if ((m = gfc_match ("* ) ")) != MATCH_YES)
 	return m;
 	  if (gfc_comp_struct (gfc_current_state ()))
 	{
diff --git a/gcc/testsuite/gfortran.dg/pr95829.f90 b/gcc/testsuite/gfortran.dg/pr95829.f90
new file mode 100644
index 000..081d647c7c0
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr95829.f90
@@ -0,0 +1,14 @@
+! { dg-do compile }
+!
+! Declaration of b used to be a bogus failure.
+
+subroutine s (a, b, c, d, e, f, g)
+  type(*) :: a
+  type(* ) :: b
+  type( *) :: c
+  type( * ) :: d
+  type(*  ) :: e
+  type(  *) :: f
+  type(  *  ) :: g
+end
+
-- 
2.11.0



Re: Fortran : ICE in generic_correspondence PR95584

2020-06-30 Thread Thomas Koenig

Hi Mark,



OK to commit and backport?


OK, fixing such a NULL dererefence could also count as obvious
and simple (if it does the right thing :-)

Regarding backport:  I probably would not do it myself; it's
not a regression, and ICE does not have such a large impact.
However, if you want to do it, feel free.

Thanks for taking this on!

Regards

Thomas



Re: Fortran : Fortran translation issues PR52279

2020-06-30 Thread Mark Eggleston

ping!

On 25/06/2020 07:28, Mark Eggleston wrote:
Patch to fix PR52279.  Marked strings that are indirectly used in 
error and warning messages so that they are marked for translation.


OK to commit?

Commit message:

Fortran  : Fortran translation issues PR52279

Mark strings for translation by enclosing in G_() and _().

2020-06-24  Mark Eggleston 

gcc/fortran/

    PR fortran/52279
    * arith.c (reduce_binary_aa): Mark for translation the string
    parameter to gfc_check_conformance with G_().
    * check.c (gfc_invalid_boz): Mark hint for translation using
    _().  (gfc_check_achar): Mark for translation the message
    parameter to gfc_invalid_boz using G_(). (gfc_check_char):
    Mark for translation the message parameter to gfc_invalid_boz
    using G_().  (gfc_check_complex): Mark for translation the
    message parameter to gfc_invalid_boz using G_().
    (gfc_check_float): Mark for translation the message
    parameter to gfc_invalid_boz using G_(). (check_rest): Mark
    for translation the string parameter to gfc_check_conformance
    with _().  (gfc_check_minloc_maxloc): Mark for translation
    the string parameter to gfc_check_conformance with _().
    (gfc_check_findloc): Mark for translation the string parameter
    to gfc_check_conformance with _(). (check_reduction): Mark
    for translation the string parameter to gfc_check_conformance
    with _().  (gfc_check_pack): Mark for translation the string
    parameter to gfc_check_conformance with _().
    * decl.c (match_old_style_init): Mark for translation the
    message parameter to gfc_invalid_boz using G_().
    * decl.c (gfc_check_assign): Mark for translation the string
    parameter to gfc_check_conformance with _().
    * intrinsic.c (check_specific): Mark for translation the string
    parameter to gfc_check_conformance with _().
    (gfc_check_intrinsic_standard): Mark symstd_msg strings for
    translation using G_(). No need to mark symstd_msg for
    translation in call to gfc_warning or when setting symstd.
    * io.c (check_open_constraints):  Mark strings for translation
    using G_() in all calls to warn_or_error. (match_io_element):
    Mark for translation the message parameter to gfc_invalid_boz
    using G_().
    * primary.c (match_boz_constant): Mark for translation the
    message parameter to gfc_invalid_boz using G_().
    * resolve.c (resolve_elemental_actual):  Mark for translation
    the string parameter to gfc_check_conformance with _().
    (resolve_operator):  Mark for translation the string parameter
    to gfc_check_conformance with _().  Mark translation strings
    assigned to msg using G_() for use in a call to cfg_warning.


--
https://www.codethink.co.uk/privacy.html



Re: [PATCH PR95855]A missing ifcvt optimization to generate fcsel

2020-06-30 Thread Richard Biener via Gcc-patches
On Tue, Jun 30, 2020 at 1:31 PM yangyang (ET)  wrote:
>
> > On Tue, Jun 30, 2020 at 4:29 AM yangyang (ET) 
> > wrote:
> > >
> > > Hi,
> > >
> > > > > Hi,
> > > > >
> > > > > This is a simple fix for pr95855.
> > > > >
> > > > > With this fix, pass_split_paths can recognize the
> > > > > if-conversion
> > > > opportunity of the testcase and doesn't duplicate the corresponding 
> > > > block.
> > > > >
> > > > > Added one testcase for this. Bootstrap and tested on both
> > > > > aarch64 and
> > > > x86 Linux platform, no new regression witnessed.
> > > > >
> > > > > Ok for trunk?
> > > >
> > > > Can you try using the num_stmts_in_pred[12] counts instead of using
> > > > empty_block_p?
> > >
> > > It' ok to using num_stmts_in_pred[12] to judge whether the pred[12] is
> > > empty since bb's immediate dominator can't meet the constraints
> > > "single_pred_p (pred[12]) && single_pred (pred[12]) == pred[21]".
> > >
> > > >
> > > > Your matching doesn't allow for FP constants like
> > > >
> > > >  dmax[0] = d1[i] < 1.0 ? 1.0 : d1[i];
> > > >
> > > > since FP constants are not shared.  You likely want to use
> > > > operand_equal_p to do the PHI argument comparison.
> > >
> > > That's right, after using operand_equal_p instead of == to do the PHI
> > > argument Comparison, the mentioned case can be covered as well.
> > >
> > > >
> > > > Thanks,
> > > > Richard.
> > >
> > > Thanks for your suggestions. We have revised our patch based on your
> > suggestions.
> > >
> > > Bootstrap and tested on both aarch64 and x86 Linux platform. Does the v1
> > patch looks better?
> >
> > Yes.  This variant is OK.
> >
> > Thanks,
> > Richard.
>
> Thanks for reviewing this. Could you please help install it?

Installed.  Please double-check your ChangeLog with
gcc-verify, it had a typo in the testcase filename.

Richard.

> Yang Yang


Re: [PATCH PR95961] vect: ICE: in exact_div, at poly-int.h:2182

2020-06-30 Thread Richard Sandiford
Richard Biener  writes:
> On Tue, 30 Jun 2020, Richard Sandiford wrote:
>
>> Richard Biener  writes:
>> > On Tue, Jun 30, 2020 at 2:18 PM Richard Sandiford
>> >  wrote:
>> >>
>> >> "Yangfei (Felix)"  writes:
>> >> > diff --git a/gcc/testsuite/gcc.dg/vect/vect-ctor-1.c 
>> >> > b/gcc/testsuite/gcc.dg/vect/vect-ctor-1.c
>> >> > index e050db1a2e4..ea39fcac0e0 100644
>> >> > --- a/gcc/testsuite/gcc.dg/vect/vect-ctor-1.c
>> >> > +++ b/gcc/testsuite/gcc.dg/vect/vect-ctor-1.c
>> >> > @@ -1,6 +1,7 @@
>> >> >  /* { dg-do compile } */
>> >> >  /* { dg-additional-options "-O3" } */
>> >> >  /* { dg-additional-options "-mavx2" { target { i?86-*-* x86_64-*-* } } 
>> >> > } */
>> >> > +/* { dg-additional-options "-march=armv8.2-a+sve -fno-vect-cost-model" 
>> >> > { target aarch64*-*-* } } */
>> >> >
>> >> >  typedef struct {
>> >> >  unsigned short mprr_2[5][16][16];
>> >>
>> >> This test is useful for Advanced SIMD too, so I think we should continue
>> >> to test it with whatever options the person running the testsuite chose.
>> >> Instead we could duplicate the test in gcc.target/aarch64/sve with
>> >> appropriate options.
>> >>
>> >> > diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
>> >> > index eb8288e7a85..b30a7d8a3bb 100644
>> >> > --- a/gcc/tree-vect-data-refs.c
>> >> > +++ b/gcc/tree-vect-data-refs.c
>> >> > @@ -1823,8 +1823,11 @@ vect_enhance_data_refs_alignment (loop_vec_info 
>> >> > loop_vinfo)
>> >> >   {
>> >> > poly_uint64 nscalars = (STMT_SLP_TYPE (stmt_info)
>> >> > ? vf * DR_GROUP_SIZE 
>> >> > (stmt_info) : vf);
>> >> > -   possible_npeel_number
>> >> > - = vect_get_num_vectors (nscalars, vectype);
>> >> > +   if (maybe_lt (nscalars, TYPE_VECTOR_SUBPARTS (vectype)))
>> >> > + possible_npeel_number = 0;
>> >> > +   else
>> >> > + possible_npeel_number
>> >> > +   = vect_get_num_vectors (nscalars, vectype);
>> >> >
>> >> > /* NPEEL_TMP is 0 when there is no misalignment, but 
>> >> > also
>> >> >allow peeling NELEMENTS.  */
>> >>
>> >> OK, so this is coming from:
>> >>
>> >>   int s[16][2];
>> >>   …
>> >>   … =s[j][1];
>> >>
>> >> and an SLP node containing 16 instances of “s[j][1]”.  The DR_GROUP_SIZE
>> >> is 2 because that's the inner dimension of “s”.
>> >>
>> >> I don't think maybe_lt is right here though.  The same problem could in
>> >> principle happen for cases in which NSCALARS > TYPE_VECTOR_SUBPARTS,
>> >> e.g. for different inner dimensions of “s”.
>> >>
>> >> I think the testcase shows that using DR_GROUP_SIZE in this calculation
>> >> is flawed.  I'm not sure whether we can really do better given the current
>> >> representation though.  This is one case where having a separate 
>> >> dr_vec_info
>> >> per SLP node would help.
>> >
>> > I guess what the code likes to know is what we now have in SLP_TREE_LANES
>> > (or formerly group_size) but that's not necessarily connected to 
>> > DR_GROUP_SIZE.
>> > Given we only see a stmt_info here and there's no 1:1 mapping of SLP node
>> > to stmt_info (and the reverse mapping doesn't even exist) I do not have
>> > any good idea either.
>> >
>> > Honestly I don't really see what this code tries to do ... doesn't it
>> > simply want to set possible_npeel_number to TYPE_VECTOR_SUBPARTS 
>> > (vectype)?!
>> 
>> I think it's trying to set possible_npeel_number to the number of
>> vector stmts per iteration (i.e. ncopies for non-SLP stuff):
>> 
>>   /* For multiple types, it is possible that the bigger type 
>> access
>>  will have more than one peeling option.  E.g., a loop with 
>> two
>>  types: one of size (vector size / 4), and the other one of
>>  size (vector size / 8).  Vectorization factor will 8.  If 
>> both
>>  accesses are misaligned by 3, the first one needs one scalar
>>  iteration to be aligned, and the second one needs 5.  But 
>> the
>>  first one will be aligned also by peeling 5 scalar
>>  iterations, and in that case both accesses will be aligned.
>>  Hence, except for the immediate peeling amount, we also want
>>  to try to add full vector size, while we don't exceed
>>  vectorization factor.
>>  We do this automatically for cost model, since we calculate
>>  cost for every peeling option.  */
>> 
>> So for this example, possible_npeel_number==2 for the first access
>> (letting us try two peeling values for it) and possible_npeel_number==1
>> for the second.
>
> Yeah, but npeel is _scalar_ lanes, since we always peel scalar iterations.
> So it seems odd to somehow put in the number of vectors...  so to me
> it would have made sense if it did
>
>   possible_npeel_number = lower_bound (nscalars);
>
> or 

Re: [PATCH] analyzer/pr94028.C and non-null warning

2020-06-30 Thread David Edelsohn via Gcc-patches
The unexpected warning is

gcc/testsuite/g++.dg/analyzer/pr94028.C:28:21: warning: use of
possibly-NULL '' where non-null expected [CWE-690]
[-Wanalyzer-possible-null-argument]

This is the same location as one of the existing "leak" warnings.

How would you like pr94028.C to be adjusted in the testsuite to
account for this?

Thanks, David

On Tue, Jun 30, 2020 at 10:18 AM David Malcolm  wrote:
>
> On Tue, 2020-06-30 at 09:51 -0400, David Edelsohn wrote:
> > The changes to the non-null warning now produce an additional warning
> > for analyzer/pr94028.C on one of the "leak" lines.  This causes new
> > failures on trunk.
>
> Hi David
>
> Do you have the output to hand?  What is the full text of the new
> diagnostic?
>
> Some high level questions:
>   * Ought GCC to warn for that code?  (or not)
>   * Is it behavior we ought to have a test for?
>
> Note that AFAIK there are *two* kinds of non-null warnings that GCC can
> emit: the kind emitted by Martin's code, versus those emitted by
> -fanalyzer (the latter imply the much more expensive analysis performed
> by -fanalyzer, and are controlled by the various -Wanalyzer-*null*
> options; see
> https://gcc.gnu.org/onlinedocs/gcc-10.1.0/gcc/Static-Analyzer-Options.html
> )
>
>
> > Because non-null is not the purpose of the analyzer test, I propose
> > pruning the output to resolve the new failures.
>
> Looking back through bugzilla, it seems that the main purpose of adding
> that test was to ensure that -fanalyzer doesn't ICE on that code.
>
> At some point I hope to properly support C++ in -fanalyzer, at which
> point some kind of null warning may be warranted on that code.  Sadly
> I'm not yet at that point.  FWIW I'm currently working on a big rewrite
> of how state is tracked within the analyzer, as I've identified at
> least two major flaws in the current implementation, which my rewrite
> addresses.  I'm deferring on C++ support until that rewrite is done.
>
> >   Alternatively, I
> > could explicitly test for the additional non-null warning.
> >
> > Do you have any preferences?
>
> This test is controlled by analyzer.exp and thus, for example, is
> disabled if the analyzer was disabled at configure time.
>
> If this is coming from Martin's non-analyzer code, a third possibility
> would be to use -Wno-something to disable that warning, so that the
> analyzer test can focus on the -fanalyzer test, as it were (and if this
> is a behavior that ought to be checked for in Martin's warning, then
> copy the pertinent fragment of the testcase to one of the g++.dg cases,
> I suppose).  I think I prefer that approach.
>
> How does this sound?
>
> > I propose appending
> >
> > // { dg-prune-output "where non-null expected" }
> >
> > to the file to prune the warning output.
> >
> > Thanks, David
>
> Thanks for looking into this; hope this is constructive.
> Dave
>


Re: [PATCH] Fortran : ICE in generic_correspondence PR95584

2020-06-30 Thread Mark Eggleston

I forgot to use [PATCH] in the subject line. ping!

On 25/06/2020 07:33, Mark Eggleston wrote:

Please find attached fix for PR95584.  Original patch by Steve Kargl.

OK to commit and backport?

Commit message:

Fortran  : ICE in generic_correspondence PR95584

Output an error for ambiguous interfaces in generic interface
instead of ICE.

2020-06-25  Steven G. Kargl  

gcc/fortran/

    PR fortran/95584
    * interface.c (generic_correspondence): Only use the pointer
    to a symbol if exists.

2020-06-25  Mark Eggleston 

gcc/testsuite/

    PR fortran/95584
    * gfortran.dg/pr95584.f90: New test.


--
https://www.codethink.co.uk/privacy.html



Re: [PATCH] VEC_COND_EXPR: do not expand comparisons feeding it

2020-06-30 Thread Richard Biener via Gcc-patches
On Tue, Jun 30, 2020 at 2:16 PM Martin Liška  wrote:
>
> On 6/30/20 12:38 PM, Richard Biener wrote:
> > On Tue, Jun 30, 2020 at 11:44 AM Martin Liška  wrote:
> >>
> >> Hi.
> >>
> >> The patch is about blocking of vector expansion of comparisons
> >> that are only feeding a VEC_COND_EXPR statements.
> >>
> >> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> >> The problematic mips64 test-case looks good now.
> >>
> >> Ready to be installed?
> >
> > So why does
> >
> > static tree
> > expand_vector_comparison (gimple_stmt_iterator *gsi, tree type, tree op0,
> >tree op1, enum tree_code code)
> > {
> >tree t;
> >if (!expand_vec_cmp_expr_p (TREE_TYPE (op0), type, code)
> >&& !expand_vec_cond_expr_p (type, TREE_TYPE (op0), code))
> > 
> >
> > not return true
>
> Apparently because it's called for type:
> (gdb) p debug_tree(gimple_expr_type(stmt))
> type   size 
>  unit-size 
>  align:32 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 
> 0x777212a0 precision:32 min  max 
> >
>  BLK
>  size  bitsizetype> constant 64>
>  unit-size  sizetype> constant 8>
>  align:64 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 
> 0x77721348 nunits:2>
>
> while ...
>
> >  {
> >
> > but
> >
> > /* Expand a vector condition to scalars, by using many conditions
> > on the vector's elements.  */
> > static void
> > expand_vector_condition (gimple_stmt_iterator *gsi, auto_bitmap 
> > *dce_ssa_names)
> > {
> > ...
> >if (expand_vec_cond_expr_p (type, TREE_TYPE (a1), code))
> >  {
> >gcc_assert (TREE_CODE (a) == SSA_NAME || TREE_CODE (a) == 
> > VECTOR_CST);
> >return;
> >
> > does?
>
> this has
>
> type   size 
>  unit-size 
>  align:32 warn_if_not_align:0 symtab:0 alias-set 1 canonical-type 
> 0x7763e2a0 precision:32
>  pointer_to_this >
>  V2SF
>  size  bitsizetype> constant 64>
>  unit-size  sizetype> constant 8>
>  align:64 warn_if_not_align:0 symtab:0 alias-set 1 canonical-type 
> 0x776fb3f0 nunits:2
>  pointer_to_this >
>
> > What's special about the problematical mips testcase?
>
> That we end up with something like:
>
>_34 = {_29, _37};
>vect_iftmp.12_35 = VEC_COND_EXPR <_34, vect__1.7_28, vect__2.10_31>;
>
> which can't be expanded to .VCOND, but we expand to series of BIT_FIELD_REFs:
>
>_40 = BIT_FIELD_REF ;
>_22 = BIT_FIELD_REF <_34, 32, 0>;
>_10 = _22 != 0 ? _26 : _40;
>_16 = BIT_FIELD_REF ;
>_18 = BIT_FIELD_REF <_34, 32, 32>;
>_41 = _18 == 0 ? _16 : _30;
>
>
> > Can't you produce
> > the same "bad" result when the comparison is used in a non-VEC_COND?
>
> Theoretically yes.
>
> Anyway, question is how much do we care about the fallout as it happens for 
> MIPS and
> it's only a worse code stuff?

We certainly care about the fact that vector lowering will lower the
condition first
and only then come along a VEC_COND_EXPR and lowers that as well.  It's
classical fallout of separating the compare from the condition.  I see
vector lowering works on BBs in "random" order and I've always thought that it
should work more like complex lowering using a lattice to be able to optimize
things better.

Now to simply restore previous behavior for this particular case we should
probably make expand_vector_comparison walk immediate uses as you do
but then call expand_vector_condition for each VEC_COND_EXPR use,
making that return whether it "consumed" the comparison.  If all uses
consumed the comparison we should DCE it, if not, we should lower it?

Richard.

> Martin
>
> >
> > There's a PR reference missing in the ChangeLog.
> >
> > Richard.
> >
> >> Thanks,
> >> Martin
> >>
> >> gcc/ChangeLog:
> >>
> >>  * tree-vect-generic.c (expand_vector_comparison): Do not expand
> >>  comparison that only feed first argument of a VEC_COND_EXPR 
> >> statement.
> >> ---
> >>gcc/tree-vect-generic.c | 24 
> >>1 file changed, 24 insertions(+)
> >>
> >> diff --git a/gcc/tree-vect-generic.c b/gcc/tree-vect-generic.c
> >> index a4b56195903..4606decd0f2 100644
> >> --- a/gcc/tree-vect-generic.c
> >> +++ b/gcc/tree-vect-generic.c
> >> @@ -379,6 +379,30 @@ static tree
> >>expand_vector_comparison (gimple_stmt_iterator *gsi, tree type, tree 
> >> op0,
> >>  tree op1, enum tree_code code)
> >>{
> >> +  tree lhs = gimple_assign_lhs (gsi_stmt (*gsi));
> >> +  use_operand_p use_p;
> >> +  imm_use_iterator iterator;
> >> +  bool vec_cond_expr_only = true;
> >> +  bool has_use = false;
> >> +
> >> +  /* As seen in PR95830, we should not expand comparisons that are only
> >> + feeding a VEC_COND_EXPR statement.  */
> >> +  FOR_EACH_IMM_USE_FAST (use_p, iterator, lhs)
> >> +{
> >> +  has_use = true;
> >> +  gassign *use = dyn_cast (USE_STMT (use_p));
> >> +  if (use == NULL
> >> + || 

Re: [PATCH][RFC] Do not stream all zeros for gcda files.

2020-06-30 Thread Nathan Sidwell

On 6/30/20 8:53 AM, Martin Liška wrote:

Hey.

The bug reported confirmed that using the direct merging provides a solid
speed and so I don't insist on this patch. Using a generic compression 
algorithm

would be a better solution anyway.


Thanks for the reminder.  I won't insist on the generic compression 
scheme, because it's more work.  And you've got this one now.  While we 
do (or I tried at least) insisting the gcov format was an internal 
implementation detail, that does tend to get ignored.  It seems possible 
we could add zlib (or other) compression later in a compatible manner.


nathan

--
Nathan Sidwell


[PATCH] analyzer/pr94028.C and non-null warning

2020-06-30 Thread David Edelsohn via Gcc-patches
The changes to the non-null warning now produce an additional warning
for analyzer/pr94028.C on one of the "leak" lines.  This causes new
failures on trunk.

Because non-null is not the purpose of the analyzer test, I propose
pruning the output to resolve the new failures.  Alternatively, I
could explicitly test for the additional non-null warning.

Do you have any preferences?

I propose appending

// { dg-prune-output "where non-null expected" }

to the file to prune the warning output.

Thanks, David


Re: [PATCH 3/9] [OpenACC] Adjust dynamic reference count semantics

2020-06-30 Thread Thomas Schwinge
Hi Julian!

On 2020-06-16T15:38:33-0700, Julian Brown  wrote:
> This is a new version of the patch last sent here:
>
> https://gcc.gnu.org/pipermail/gcc-patches/2020-May/546332.html
>
> Minus the bits that Thomas has committed already (thanks!), and with
> adjustments to allow for GOMP_MAP_ATTACH being grouped together with a
> preceding clause.
>
> OK?

Please also update the "virtual refcount" comment in
'libgomp.oacc-c-c++-common/structured-dynamic-lifetimes-4.c'.

Your patch now makes the 'libgomp.oacc-fortran/mdc-refcount-1-1-1.f90',
'libgomp.oacc-fortran/mdc-refcount-1-2-1.f90',
'libgomp.oacc-fortran/mdc-refcount-1-2-2.f90',
'libgomp.oacc-fortran/mdc-refcount-1-3-1.f90' test cases PASS (did you
not see that?), so we have to remove all XFAILing, 'print'/'dg-output'
etc. from these, and it changes the error reporting in
'libgomp.oacc-fortran/mdc-refcount-1-4-1.f90', so we have to adjust that.
See attached patch "into Adjust dynamic reference count semantics".

Basically OK for master branch and releases/gcc-10 branch.  However,
still a few questions, which can be addressed first, or separately, as
appropriate:

> --- a/libgomp/oacc-mem.c
> +++ b/libgomp/oacc-mem.c

> @@ -1048,13 +1052,111 @@ goacc_enter_data_internal (struct gomp_device_descr 
> *acc_dev, size_t mapnum,
>  {
>for (size_t i = 0; i < mapnum; i++)
>  {
> -  int group_last = find_group_last (i, mapnum, sizes, kinds);
> +  splay_tree_key n;
> +  size_t group_last = find_group_last (i, mapnum, sizes, kinds);
> +  bool struct_p = false;
> +  size_t size, groupnum = (group_last - i) + 1;
> +
> +  switch (kinds[i] & 0xff)
> + {
> + case GOMP_MAP_STRUCT:
> +   {
> + int last = i + sizes[i];

(If you'd like to, see my comment about 'last' in
<87k10o72dd.fsf@euler.schwinge.homeip.net">http://mid.mail-archive.com/87k10o72dd.fsf@euler.schwinge.homeip.net>.)

> + size = (uintptr_t) hostaddrs[last] + sizes[last]
> +- (uintptr_t) hostaddrs[i];
> + struct_p = true;
> +   }
> +   break;
> +
> + case GOMP_MAP_ATTACH:
> +   size = sizeof (void *);
> +   break;
> +
> + default:
> +   size = sizes[i];
> + }
> +
> +  n = lookup_host (acc_dev, hostaddrs[i], size);
> +
> +  if (n && struct_p)
> + {
> +   if (n->refcount != REFCOUNT_INFINITY)
> + n->refcount += groupnum - 1;
> +   n->dynamic_refcount += groupnum - 1;
> +   gomp_mutex_unlock (_dev->lock);
> + }

As that had already confused me before,
<87k10o72dd.fsf@euler.schwinge.homeip.net">http://mid.mail-archive.com/87k10o72dd.fsf@euler.schwinge.homeip.net>,
please add a minimal comment here, something like: "Increment refcount
not by one but by number of items in 'GOMP_MAP_STRUCT'".

> +  else if (n && groupnum == 1)
> + {
> +   void *h = hostaddrs[i];
> +   size_t s = sizes[i];
> +
> +   /* A standalone attach clause.  */
> +   if ((kinds[i] & 0xff) == GOMP_MAP_ATTACH)
> + gomp_attach_pointer (acc_dev, aq, _dev->mem_map, n,
> +  (uintptr_t) h, s, NULL);
> +   else if (h + s > (void *) n->host_end)
> + {
> +   gomp_mutex_unlock (_dev->lock);
> +   gomp_fatal ("[%p,+%d] not mapped", (void *)h, (int)s);
> + }
> +
> +   assert (n->refcount != REFCOUNT_LINK);
> +   if (n->refcount != REFCOUNT_INFINITY)
> + n->refcount++;
> +   n->dynamic_refcount++;
>
> -  gomp_map_vars_async (acc_dev, aq,
> -(group_last - i) + 1,
> -[i], NULL,
> -[i], [i], true,
> -GOMP_MAP_VARS_OPENACC_ENTER_DATA);
> +   gomp_mutex_unlock (_dev->lock);
> + }
> +  else if (n && groupnum > 1)
> + {
> +   assert (n->refcount != REFCOUNT_INFINITY
> +   && n->refcount != REFCOUNT_LINK);
> +
> +   for (size_t j = i + 1; j <= group_last; j++)
> + if ((kinds[j] & 0xff) == GOMP_MAP_ATTACH)
> +   {
> + splay_tree_key m
> +   = lookup_host (acc_dev, hostaddrs[j], sizeof (void *));
> + gomp_attach_pointer (acc_dev, aq, _dev->mem_map, m,
> +  (uintptr_t) hostaddrs[j], sizes[j], NULL);
> +   }

Per the earlier '[OpenACC] GOMP_MAP_ATTACH handling in find_group_last',
we should never have more than one 'GOMP_MAP_ATTACH' following something
else (right?), but it's still OK to leave this in this generic form --
unless you want to add some 'assert'ing here.

> +
> +   bool processed = false;
> +
> +   struct target_mem_desc *tgt = n->tgt;
> +   for (size_t j = 0; j < tgt->list_count; j++)
> + if (tgt->list[j].key == n)
> +   {
> + for (size_t k = 0; k < groupnum; k++)
> +   if (j + k < tgt->list_count && tgt->list[j + k].key)
> + {
> +   tgt->list[j + k].key->refcount++;
> +  

Re: [PATCH][GCC][Arm] PR target/95646: Do not clobber callee saved registers with CMSE

2020-06-30 Thread Andre Vieira (lists)



On 29/06/2020 11:15, Christophe Lyon wrote:

On Mon, 29 Jun 2020 at 10:56, Andre Vieira (lists)
 wrote:


On 23/06/2020 21:52, Christophe Lyon wrote:

On Tue, 23 Jun 2020 at 15:28, Andre Vieira (lists)
 wrote:

On 23/06/2020 13:10, Kyrylo Tkachov wrote:

-Original Message-
From: Andre Vieira (lists) 
Sent: 22 June 2020 09:52
To: gcc-patches@gcc.gnu.org
Cc: Kyrylo Tkachov 
Subject: [PATCH][GCC][Arm] PR target/95646: Do not clobber callee saved
registers with CMSE

Hi,

As reported in bugzilla when the -mcmse option is used while compiling
for size (-Os) with a thumb-1 target the generated code will clear the
registers r7-r10. These however are callee saved and should be preserved
accross ABI boundaries. The reason this happens is because these
registers are made "fixed" when optimising for size with Thumb-1 in a
way to make sure they are not used, as pushing and popping hi-registers
requires extra moves to and from LO_REGS.

To fix this, this patch uses 'callee_saved_reg_p', which accounts for
this optimisation, instead of 'call_used_or_fixed_reg_p'. Be aware of
'callee_saved_reg_p''s definition, as it does still take call used
registers into account, which aren't callee_saved in my opinion, so it
is a rather misnoemer, works in our advantage here though as it does
exactly what we need.

Regression tested on arm-none-eabi.

Is this OK for trunk? (Will eventually backport to previous versions if
stable.)

Ok.
Thanks,
Kyrill

As I was getting ready to push this I noticed I didn't add any skip-ifs
to prevent this failing with specific target options. So here's a new
version with those.

Still OK?


Hi,

This is not sufficient to skip arm-linux-gnueabi* configs built with
non-default cpu/fpu.

For instance, with arm-linux-gnueabihf --with-cpu=cortex-a9
--with-fpu=neon-fp16 --with-float=hard
I see:
FAIL: gcc.target/arm/pr95646.c (test for excess errors)
Excess errors:
cc1: error: ARMv8-M Security Extensions incompatible with selected FPU
cc1: error: target CPU does not support ARM mode

and the testcase is compiled with -mcpu=cortex-m23 -mcmse -Os

Resending as I don't think my earlier one made it to the lists (sorry if
you are receiving this double!)

I'm not following this, before I go off and try to reproduce it, what do
you mean by 'the testcase is compiled with -mcpu=cortex-m23 -mcmse -Os'?
These are the options you are seeing in the log file? Surely they should
override the default options? Only thing I can think of is this might
need an extra -mfloat-abi=soft to make sure it overrides the default
float-abi.  Could you give that a try?

No it doesn't make a difference alone.

I also had to add:
-mfpu=auto (that clears the above warning)
-mthumb otherwise we now get cc1: error: target CPU does not support ARM mode

Looks like some effective-target machinery is needed
So I had a look at this,  I was pretty sure that -mfloat-abi=soft 
overwrote -mfpu=<>, which in large it does, as in no FP instructions 
will be generated but the error you see only checks for the right number 
of FP registers. Which doesn't check whether 'TARGET_HARD_FLOAT' is set 
or not. I'll fix this too and use the check-effective-target for 
armv8-m.base for this test as it is indeed a better approach than my bag 
of skip-ifs. I'm testing it locally to make sure my changes don't break 
anything.


Cheers,
Andre


Christophe



Cheers,
Andre

Christophe


Cheers,
Andre

Cheers,
Andre

gcc/ChangeLog:
2020-06-22  Andre Vieira  

PR target/95646
* config/arm/arm.c: (cmse_nonsecure_entry_clear_before_return):
Use 'callee_saved_reg_p' instead of
'calL_used_or_fixed_reg_p'.

gcc/testsuite/ChangeLog:
2020-06-22  Andre Vieira  

PR target/95646
* gcc.target/arm/pr95646.c: New test.


Re: [PATCH PR95961] vect: ICE: in exact_div, at poly-int.h:2182

2020-06-30 Thread Richard Sandiford
Richard Biener  writes:
> On Tue, Jun 30, 2020 at 2:18 PM Richard Sandiford
>  wrote:
>>
>> "Yangfei (Felix)"  writes:
>> > diff --git a/gcc/testsuite/gcc.dg/vect/vect-ctor-1.c 
>> > b/gcc/testsuite/gcc.dg/vect/vect-ctor-1.c
>> > index e050db1a2e4..ea39fcac0e0 100644
>> > --- a/gcc/testsuite/gcc.dg/vect/vect-ctor-1.c
>> > +++ b/gcc/testsuite/gcc.dg/vect/vect-ctor-1.c
>> > @@ -1,6 +1,7 @@
>> >  /* { dg-do compile } */
>> >  /* { dg-additional-options "-O3" } */
>> >  /* { dg-additional-options "-mavx2" { target { i?86-*-* x86_64-*-* } } } 
>> > */
>> > +/* { dg-additional-options "-march=armv8.2-a+sve -fno-vect-cost-model" { 
>> > target aarch64*-*-* } } */
>> >
>> >  typedef struct {
>> >  unsigned short mprr_2[5][16][16];
>>
>> This test is useful for Advanced SIMD too, so I think we should continue
>> to test it with whatever options the person running the testsuite chose.
>> Instead we could duplicate the test in gcc.target/aarch64/sve with
>> appropriate options.
>>
>> > diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
>> > index eb8288e7a85..b30a7d8a3bb 100644
>> > --- a/gcc/tree-vect-data-refs.c
>> > +++ b/gcc/tree-vect-data-refs.c
>> > @@ -1823,8 +1823,11 @@ vect_enhance_data_refs_alignment (loop_vec_info 
>> > loop_vinfo)
>> >   {
>> > poly_uint64 nscalars = (STMT_SLP_TYPE (stmt_info)
>> > ? vf * DR_GROUP_SIZE (stmt_info) : 
>> > vf);
>> > -   possible_npeel_number
>> > - = vect_get_num_vectors (nscalars, vectype);
>> > +   if (maybe_lt (nscalars, TYPE_VECTOR_SUBPARTS (vectype)))
>> > + possible_npeel_number = 0;
>> > +   else
>> > + possible_npeel_number
>> > +   = vect_get_num_vectors (nscalars, vectype);
>> >
>> > /* NPEEL_TMP is 0 when there is no misalignment, but also
>> >allow peeling NELEMENTS.  */
>>
>> OK, so this is coming from:
>>
>>   int s[16][2];
>>   …
>>   … =s[j][1];
>>
>> and an SLP node containing 16 instances of “s[j][1]”.  The DR_GROUP_SIZE
>> is 2 because that's the inner dimension of “s”.
>>
>> I don't think maybe_lt is right here though.  The same problem could in
>> principle happen for cases in which NSCALARS > TYPE_VECTOR_SUBPARTS,
>> e.g. for different inner dimensions of “s”.
>>
>> I think the testcase shows that using DR_GROUP_SIZE in this calculation
>> is flawed.  I'm not sure whether we can really do better given the current
>> representation though.  This is one case where having a separate dr_vec_info
>> per SLP node would help.
>
> I guess what the code likes to know is what we now have in SLP_TREE_LANES
> (or formerly group_size) but that's not necessarily connected to 
> DR_GROUP_SIZE.
> Given we only see a stmt_info here and there's no 1:1 mapping of SLP node
> to stmt_info (and the reverse mapping doesn't even exist) I do not have
> any good idea either.
>
> Honestly I don't really see what this code tries to do ... doesn't it
> simply want to set possible_npeel_number to TYPE_VECTOR_SUBPARTS (vectype)?!

I think it's trying to set possible_npeel_number to the number of
vector stmts per iteration (i.e. ncopies for non-SLP stuff):

  /* For multiple types, it is possible that the bigger type access
 will have more than one peeling option.  E.g., a loop with two
 types: one of size (vector size / 4), and the other one of
 size (vector size / 8).  Vectorization factor will 8.  If both
 accesses are misaligned by 3, the first one needs one scalar
 iteration to be aligned, and the second one needs 5.  But the
 first one will be aligned also by peeling 5 scalar
 iterations, and in that case both accesses will be aligned.
 Hence, except for the immediate peeling amount, we also want
 to try to add full vector size, while we don't exceed
 vectorization factor.
 We do this automatically for cost model, since we calculate
 cost for every peeling option.  */

So for this example, possible_npeel_number==2 for the first access
(letting us try two peeling values for it) and possible_npeel_number==1
for the second.

Thanks,
Richard


Re: Fortran : Fortran translation issues PR52279

2020-06-30 Thread Thomas Koenig

Hi Mark,

that one is OK for master.  Thanks!

Regards

Thomas



[PATCH PR95700 v2] Redefine NULL to nullptr

2020-06-30 Thread Ilya Leoshkevich via Gcc-patches
v1: https://gcc.gnu.org/pipermail/gcc-patches/2020-June/548822.html

This is the implementation of Richard's suggestion.  I've decided not
to remove the initial #define NULL 0, since system.h mixes inclusion of
system and helper headers (e.g. libiberty), which might use NULL - if
not today, then in the future.  If system headers do not provide NULL,
removing #define NULL 0 might lead to build errors.  Ideally the order
should be:

  system headers
  #undef NULL
  #define NULL nullptr
  helper headers

but I fear that rearranging includes might break some weird system.

This patch helps with alpine bootstrap, I'm currently testing it on
x86_64-redhat-linux.

---

Bootstrap with musl libc fails with numerous "missing sentinel in
function call" errors.  This is because musl defines NULL as 0L for C++,
but gcc requires sentinel value to be a pointer or __null.

Jonathan Wakely says:

To be really safe during stage 1, GCC should not use NULL as a
pointer sentinel in C++ code anyway.

The bootstrap compiler could define it to 0 or 0u, neither of which
is guaranteed to be OK to pass as a varargs sentinel where a null
pointer is expected.  Any of (void*)0 or (void*)NULL or nullptr
would be safe.

While it is possible to fix this by replacing NULL sentinels with
nullptrs, such approach would generate backporting conflicts, therefore
simply redefine NULL to nullptr at the end of system.h, where it would
not confuse system headers.

gcc/ChangeLog:

2020-06-30  Ilya Leoshkevich  

PR bootstrap/95700
* system.h (NULL): Redefine to nullptr.
---
 gcc/system.h | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/gcc/system.h b/gcc/system.h
index 544f7ba427f..1eec77e730e 100644
--- a/gcc/system.h
+++ b/gcc/system.h
@@ -1244,4 +1244,14 @@ void gcc_stablesort (void *, size_t, size_t,
of the number.  */
 #define PRsa(n) "%" #n PRIu64 "%c"
 
+/* System headers may define NULL to be an integer (e.g. 0L), which cannot be
+   used safely in certain contexts (e.g. as sentinels).  Redefine NULL to
+   nullptr in order to make it safer.  Note that this might confuse system
+   headers, however, by convention they must not be included after this point.
+*/
+#ifdef __cplusplus
+#undef NULL
+#define NULL nullptr
+#endif
+
 #endif /* ! GCC_SYSTEM_H */
-- 
2.25.4



RE: [PATCH] arm: Warn if IRQ handler is not compiled with -mgeneral-regs-only [PR target/94743]

2020-06-30 Thread Kyrylo Tkachov
Hi Christophe,

Sorry for the delay.

> -Original Message-
> From: Gcc-patches  On Behalf Of
> Christophe Lyon via Gcc-patches
> Sent: 29 April 2020 16:19
> To: gcc-patches@gcc.gnu.org
> Subject: [PATCH] arm: Warn if IRQ handler is not compiled with -mgeneral-
> regs-only [PR target/94743]
> 
> The interrupt attribute does not guarantee that the FP registers are
> saved, which can result in problems difficult to debug.
> 
> Saving the FP registers and status registers can be a large penalty,
> so it's probably not desirable to do that all the time.
> 
> If the handler calls other functions, we'd likely need to save all of
> them, for lack of knowledge of which registers they actually use.
> 
> This is even more obscure for the end-user when the compiler inserts
> calls to helper functions such as memcpy (some multilibs do use FP
> registers to speed it up).
> 
> In the PR, we discussed adding routines in libgcc to save the FP
> context and saving only locally-clobbered FP registers, but I think
> this is too intrusive for stage 4.
> 
> In the mean time, emit a warning to suggest re-compiling with
> -mgeneral-regs-only. Note that this can lead to errors if the code
> uses floating-point and -mfloat-abi=hard, eg:
> argument of type 'double' not permitted with -mgeneral-regs-only
> 
> This can be troublesome for the user, but at least this would make
> them aware of the latent issue.
> 
> The patch adds two testcases:
> - pr94734-1.c checks that a warning is emitted. One function can make
>   implicit calls to runtime floating-point routines, the other one
>   doesn't. We can improve the diagnostic later not to warn in the
>   second case.
> 
> - pr94734-2.c checks that no warning is emitted when using
>   -mgeneral-regs-only.
> 
> 2020-04-29  Christophe Lyon  
> 
>   PR target/94743
>   gcc/
>   * config/arm/arm.c (arm_handle_isr_attribute): Warn if
>   -mgeneral-regs-only is not used.
> 
>   gcc/testsuite/
>   * gcc.target/arm/pr94743-1.c: New test.
>   * gcc.target/arm/pr94743-2.c: New test.
> ---
>  gcc/config/arm/arm.c |  5 +
>  gcc/testsuite/gcc.target/arm/pr94743-1.c | 20 
>  gcc/testsuite/gcc.target/arm/pr94743-2.c | 17 +
>  3 files changed, 42 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/arm/pr94743-1.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/pr94743-2.c
> 
> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> index 6a6e804..34aad1d 100644
> --- a/gcc/config/arm/arm.c
> +++ b/gcc/config/arm/arm.c
> @@ -7176,6 +7176,11 @@ arm_handle_isr_attribute (tree *node, tree name,
> tree args, int flags,
>  name);
> *no_add_attrs = true;
>   }
> +  else if (TARGET_VFP_BASE)
> + {
> +   warning (OPT_Wattributes, "FP registers might be clobbered
> despite %qE attribute: compile with -mgeneral-regs-only",
> +name);
> + }

Let's use %< and %> to quote -mgeneral-regs-only properly.
Ok with that change.
Thanks,
Kyrill

>/* FIXME: the argument if any is checked for type attributes;
>should it be checked for decl ones?  */
>  }
> diff --git a/gcc/testsuite/gcc.target/arm/pr94743-1.c
> b/gcc/testsuite/gcc.target/arm/pr94743-1.c
> new file mode 100644
> index 000..67700c6
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/pr94743-1.c
> @@ -0,0 +1,20 @@
> +/* PR target/94743 */
> +/* { dg-do compile } */
> +
> +typedef struct {
> +  double fpdata[32];
> +} dummy_t;
> +
> +dummy_t global_d;
> +dummy_t global_d1;
> +
> +__attribute__ ((interrupt("IRQ"))) void IRQ_HDLR_Test(void)
> +{ /* { dg-warning { FP registers might be clobbered despite 'interrupt'
> attribute: compile with -mgeneral-regs-only} "" { target *-*-* } . } */
> +  global_d.fpdata[3] += global_d.fpdata[3] * global_d1.fpdata[3];
> +}
> +
> +
> +__attribute__ ((interrupt("IRQ"))) void IRQ_HDLR_Test2(void)
> +{ /* { dg-warning { FP registers might be clobbered despite 'interrupt'
> attribute: compile with -mgeneral-regs-only} "" { target *-*-* } . } */
> +  global_d.fpdata[3] = 1.0;
> +}
> diff --git a/gcc/testsuite/gcc.target/arm/pr94743-2.c
> b/gcc/testsuite/gcc.target/arm/pr94743-2.c
> new file mode 100644
> index 000..745fd36
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/pr94743-2.c
> @@ -0,0 +1,17 @@
> +/* PR target/94743 */
> +/* { dg-do compile } */
> +/* { dg-options "-mgeneral-regs-only" } */
> +
> +typedef struct {
> +  /* Do not use floating-point types, which are not be compatible with
> + -mgeneral-regs-only under -mfloat-abi=hard */
> +  int fpdata[32];
> +} dummy_t;
> +
> +dummy_t global_d;
> +dummy_t global_d1;
> +
> +__attribute__ ((interrupt("IRQ"))) void IRQ_HDLR_Test(void)
> +{
> +  global_d.fpdata[3] += global_d.fpdata[3] * global_d1.fpdata[3];
> +}
> --
> 2.7.4



Re: [PATCH][RFC] Do not stream all zeros for gcda files.

2020-06-30 Thread Martin Liška

Hey.

The bug reported confirmed that using the direct merging provides a solid
speed and so I don't insist on this patch. Using a generic compression algorithm
would be a better solution anyway.

Martin


Re: [PATCH] nvptx: Add support for subword compare-and-swap

2020-06-30 Thread Tom de Vries
On 6/23/20 6:44 PM, Thomas Schwinge wrote:
> Per the PTX 3.1 manual that I looked into, I see for CAS it supports:
> '.b32', '.b64'.  We've got:
> 
> $ build-gcc-offload-nvptx-none/gcc/xgcc 
> -Bbuild-gcc-offload-nvptx-none/gcc -dM -E -x c /dev/null | grep -i 
> compare.and.swap
> #define __GCC_HAVE_SYNC_COMPARE_AND_SWAP_4 1
> #define __GCC_HAVE_SYNC_COMPARE_AND_SWAP_8 1
> 
> ..., so that would match the PTX 3.1 manual.  GCC also seems to know
> about 16-byte ("'.b128'", which doesn't exist in PTX).  A quick run of
> your testcase with 'GENERATE_TEST(__int128)' seems to just work, but I
> haven't verified why/how.

In this case, we hit the critical section strategy and use
expand_omp_atomic_mutex.

Thanks,
- Tom




Re: [PATCH] Split load permutation

2020-06-30 Thread Richard Sandiford
Richard Biener  writes:
>> Another thing we'd like for SVE in general is to allow vectors *with*
>> gaps throughout the SLP graph, since with predication it's simple to
>> ignore any inactive element where necessary.  This is often much cheaper
>> than packing the active elements together to remove gaps.  For example:
>> 
>>   a[0] += 1;
>>   a[2] += 1;
>>   a[3] += 1;
>> 
>> should just be a predicated load, add and store, with no shuffling.
>> 
>> Even on targets without predication, it might be better to leave
>> the gaps in for part of the graph, e.g. if two loads feed a single
>> permute.  So it would be good if the node representation allowed
>> arbitrary permutations, including with “dead” elements at any
>> position.  (I realise that isn't news and that keeping the gap
>> removal with the load was just “for now” :-))
>
> Hmm, OK.  So I'm still experimenting with how to incrementally push
> these kind of changes to master.  Unfortunately it resists any
> "serious" change because we've painted us into a corner with respect
> to load and store handling ;)  Well, it just means the change will
> need to be bigger than anticipated.
>
> As for inactive lanes, for SVE this means a mask register for each
> operation, correct?

Certainly for potentially trapping ops and reductions, yeah.
For others it really doesn't matter (and those wouldn't require
a target that supports predication).

So I don't think having gaps necessarily means we have a mask.
Being able to attach masks to nodes (perhaps independently of gaps)
would be useful though.

> At the moment the SLP graph is a representation
> of the scalar GIMPLE IL and we cannot really change that until we
> commit to a vectorization factor and more.  So an inactive lane
> is simply not there and a "load" is simply another way of building
> up a vector from scalar stmt results much like those "built from scalars"
> SLP nodes but we of course make them special because we have those
> DR groups that are used.
>
> So with the patch as posted the "gaps" are represented in the
> load permutation of the "loads" which is where you could create
> mask registers from and simply push them to SLP parents (up to
> the point you get to a parent whose childs have differing masks...).

OK.  But I'd argue that's true of permutations too.  At the point
that the SLP graph just represents scalar gimple IL, we can simply
permute SLP_TREE_SCALAR_STMTS and not have any explicit permute
operations in the graph.

Permutations and gaps only come into play once we add more
vector-related information or restrictions.

> I think we're going to have a set of post-processing steps on the
> initial SLP graph for both "optimizing" where (and if) to do permutations
> and whether to elide gap removal in favor of masks (and we'd then
> annotate the SLP nodes with the active mask).

So we'd start out without any permutations and gaps, and then add
them as post-processing step based on what the target can handle?
If so, sounds good to me FWIW.

> All of this requires pushing back some of the early decisions
> (I've mostly identified vectorization factor determining) but also
> do load/store classification early.  For example if we end up
> with strided loads or stores such operations can fuse with any
> permutation at no cost.
>
> At the moment I'm continuing to poke the code for its least
> resistance for introducing at least parts of the machinery.
> I'm targeting post-processing for merging of identical
> permutes across operations like it appears for
>
>   a[0] = b[1] + c[1];
>   a[1] = b[0] + c[0];
>
>> I guess this to some extent feeds into a long-standing TODO to allow
>> “don't care” elements in things like vec_perm_indices.
>
> Hmm, so when there's no masking and we don't want to remove gaps
> we'd replace the permutation removing the gaps with an operation
> that turns the inactive lanes to zero (bitwise and) or all ones
> (bitwise or).  Tracking the number of "vector" lanes compared
> to the number of "real scalar" lanes in a SLP node might be enough
> to handle this more or less transparently.

I was also thinking of cases like:

  a[0] += b[0];
  a[1] += c[1];
  a[2] += b[2];
  a[3] += c[3];

(for all targets).

If we can somehow be smart about the handling of “c” and load a vector
at c[0] rather than c[1], the rhs of the addition could be a blend of:

  b[0]  b[2] 
   c[1]  c[3]

Or, even if we can't:

  b[0]  b[2] 
  c[1]  c[3] 

is a natural permute on some targets (e.g. Arm ones).

This of course requires the usual proof that the other elements of b[]
and c[] can be safely loaded.

Thanks,
Richard


Re: [PATCH PR95961] vect: ICE: in exact_div, at poly-int.h:2182

2020-06-30 Thread Richard Biener
On June 30, 2020 4:23:03 PM GMT+02:00, Richard Sandiford 
 wrote:
>Richard Biener  writes:
>> On Tue, 30 Jun 2020, Richard Sandiford wrote:
>>
>>> Richard Biener  writes:
>>> > On Tue, Jun 30, 2020 at 2:18 PM Richard Sandiford
>>> >  wrote:
>>> >>
>>> >> "Yangfei (Felix)"  writes:
>>> >> > diff --git a/gcc/testsuite/gcc.dg/vect/vect-ctor-1.c
>b/gcc/testsuite/gcc.dg/vect/vect-ctor-1.c
>>> >> > index e050db1a2e4..ea39fcac0e0 100644
>>> >> > --- a/gcc/testsuite/gcc.dg/vect/vect-ctor-1.c
>>> >> > +++ b/gcc/testsuite/gcc.dg/vect/vect-ctor-1.c
>>> >> > @@ -1,6 +1,7 @@
>>> >> >  /* { dg-do compile } */
>>> >> >  /* { dg-additional-options "-O3" } */
>>> >> >  /* { dg-additional-options "-mavx2" { target { i?86-*-*
>x86_64-*-* } } } */
>>> >> > +/* { dg-additional-options "-march=armv8.2-a+sve
>-fno-vect-cost-model" { target aarch64*-*-* } } */
>>> >> >
>>> >> >  typedef struct {
>>> >> >  unsigned short mprr_2[5][16][16];
>>> >>
>>> >> This test is useful for Advanced SIMD too, so I think we should
>continue
>>> >> to test it with whatever options the person running the testsuite
>chose.
>>> >> Instead we could duplicate the test in gcc.target/aarch64/sve
>with
>>> >> appropriate options.
>>> >>
>>> >> > diff --git a/gcc/tree-vect-data-refs.c
>b/gcc/tree-vect-data-refs.c
>>> >> > index eb8288e7a85..b30a7d8a3bb 100644
>>> >> > --- a/gcc/tree-vect-data-refs.c
>>> >> > +++ b/gcc/tree-vect-data-refs.c
>>> >> > @@ -1823,8 +1823,11 @@ vect_enhance_data_refs_alignment
>(loop_vec_info loop_vinfo)
>>> >> >   {
>>> >> > poly_uint64 nscalars = (STMT_SLP_TYPE
>(stmt_info)
>>> >> > ? vf * DR_GROUP_SIZE
>(stmt_info) : vf);
>>> >> > -   possible_npeel_number
>>> >> > - = vect_get_num_vectors (nscalars, vectype);
>>> >> > +   if (maybe_lt (nscalars, TYPE_VECTOR_SUBPARTS
>(vectype)))
>>> >> > + possible_npeel_number = 0;
>>> >> > +   else
>>> >> > + possible_npeel_number
>>> >> > +   = vect_get_num_vectors (nscalars, vectype);
>>> >> >
>>> >> > /* NPEEL_TMP is 0 when there is no
>misalignment, but also
>>> >> >allow peeling NELEMENTS.  */
>>> >>
>>> >> OK, so this is coming from:
>>> >>
>>> >>   int s[16][2];
>>> >>   …
>>> >>   … =s[j][1];
>>> >>
>>> >> and an SLP node containing 16 instances of “s[j][1]”.  The
>DR_GROUP_SIZE
>>> >> is 2 because that's the inner dimension of “s”.
>>> >>
>>> >> I don't think maybe_lt is right here though.  The same problem
>could in
>>> >> principle happen for cases in which NSCALARS >
>TYPE_VECTOR_SUBPARTS,
>>> >> e.g. for different inner dimensions of “s”.
>>> >>
>>> >> I think the testcase shows that using DR_GROUP_SIZE in this
>calculation
>>> >> is flawed.  I'm not sure whether we can really do better given
>the current
>>> >> representation though.  This is one case where having a separate
>dr_vec_info
>>> >> per SLP node would help.
>>> >
>>> > I guess what the code likes to know is what we now have in
>SLP_TREE_LANES
>>> > (or formerly group_size) but that's not necessarily connected to
>DR_GROUP_SIZE.
>>> > Given we only see a stmt_info here and there's no 1:1 mapping of
>SLP node
>>> > to stmt_info (and the reverse mapping doesn't even exist) I do not
>have
>>> > any good idea either.
>>> >
>>> > Honestly I don't really see what this code tries to do ... doesn't
>it
>>> > simply want to set possible_npeel_number to TYPE_VECTOR_SUBPARTS
>(vectype)?!
>>> 
>>> I think it's trying to set possible_npeel_number to the number of
>>> vector stmts per iteration (i.e. ncopies for non-SLP stuff):
>>> 
>>>   /* For multiple types, it is possible that the bigger
>type access
>>>  will have more than one peeling option.  E.g., a
>loop with two
>>>  types: one of size (vector size / 4), and the other
>one of
>>>  size (vector size / 8).  Vectorization factor will
>8.  If both
>>>  accesses are misaligned by 3, the first one needs
>one scalar
>>>  iteration to be aligned, and the second one needs
>5.  But the
>>>  first one will be aligned also by peeling 5 scalar
>>>  iterations, and in that case both accesses will be
>aligned.
>>>  Hence, except for the immediate peeling amount, we
>also want
>>>  to try to add full vector size, while we don't
>exceed
>>>  vectorization factor.
>>>  We do this automatically for cost model, since we
>calculate
>>>  cost for every peeling option.  */
>>> 
>>> So for this example, possible_npeel_number==2 for the first access
>>> (letting us try two peeling values for it) and
>possible_npeel_number==1
>>> for the second.
>>
>> Yeah, but npeel is _scalar_ lanes, since we always peel scalar
>iterations.
>> So it seems odd to somehow put in the number of vectors...  

[PATCH] analyzer: Fix -Wanalyzer-possible-null-argument warning

2020-06-30 Thread Jonathan Wakely via Gcc-patches
gcc/testsuite/ChangeLog:

* g++.dg/analyzer/pr94028.C: Make operator new non-throwing so
that the compiler doesn't implicitly mark it as returning
non-null.

Fixes these:

FAIL: g++.dg/analyzer/pr94028.C  -std=c++98 (test for excess errors)
FAIL: g++.dg/analyzer/pr94028.C  -std=c++14 (test for excess errors)
FAIL: g++.dg/analyzer/pr94028.C  -std=c++17 (test for excess errors)
FAIL: g++.dg/analyzer/pr94028.C  -std=c++2a (test for excess errors)

OK for master?


commit 34e9c12533c6313d37f56900876e41f69e8474bc
Author: Jonathan Wakely 
Date:   Tue Jun 30 17:40:08 2020 +0100

analyzer: Fix -Wanalyzer-possible-null-argument warning

gcc/testsuite/ChangeLog:

* g++.dg/analyzer/pr94028.C: Make operator new non-throwing so
that the compiler doesn't implicitly mark it as returning
non-null.

diff --git a/gcc/testsuite/g++.dg/analyzer/pr94028.C 
b/gcc/testsuite/g++.dg/analyzer/pr94028.C
index 0a222d1b991..c0c35d65829 100644
--- a/gcc/testsuite/g++.dg/analyzer/pr94028.C
+++ b/gcc/testsuite/g++.dg/analyzer/pr94028.C
@@ -12,7 +12,7 @@ enum e {} i;
 
 struct j
 {
-  void *operator new (__SIZE_TYPE__ b)
+  void *operator new (__SIZE_TYPE__ b) throw()
   {
 return calloc (b, sizeof (int)); // { dg-warning "leak" }
   }


Re: [PATCH] analyzer/pr94028.C and non-null warning

2020-06-30 Thread David Edelsohn via Gcc-patches
Also, cpp1y/lambda-generic-69078-1.C elicits a similar, new warning:

FAIL: g++.dg/cpp1y/lambda-generic-69078-1.C  -std=gnu++14 (test for
excess errors)
Excess errors:
g++.dg/cpp1y/lambda-generic-69078-1.C:23:13: warning: 'this' pointer
null [-Wnonnull]

Thanks, David

On Tue, Jun 30, 2020 at 12:23 PM David Malcolm  wrote:
>
> On Tue, 2020-06-30 at 10:12 -0600, Martin Sebor wrote:
> > On 6/30/20 8:47 AM, David Edelsohn via Gcc-patches wrote:
> > > The unexpected warning is
> > >
> > > gcc/testsuite/g++.dg/analyzer/pr94028.C:28:21: warning: use of
> > > possibly-NULL '' where non-null expected [CWE-690]
> > > [-Wanalyzer-possible-null-argument]
> > >
> > > This is the same location as one of the existing "leak" warnings.
> >
> > I'm surprised that the Wnonull change triggers
> > a -Wanalyzer-possible-null-argument warning.  There is no
> > -Wnonnull for the test case with or without -fanalyzer (and
> > with or without optimization) so that suggests the two are
> > independent of one another.
> >
> > I'm also not sure I see a justification for a nonnull warning
> > in the test.  The note printed after the warning says:
> >
> >  |  | (5) possible return of NULL to ‘f’
> > from ‘j::operator new’
> >  |  | (6) argument 1 (‘’) from
> > (4)
> > could be NULL where non-null expected
> >  |
> > /src/gcc/master/gcc/testsuite/g++.dg/analyzer/pr94028.C:19:3: note:
> > argument 1 of ‘j::j(B*, int)’ must be non-null
> > 19 |   j (B *, int)
> >|   ^
> >
> > but the first argument in j::j(B*, int) is not marked nonnull,
> > and it's not the this pointer (treating those as implicitly
> > nonnull was one of the changes introduced by my patch).
>
> Are you sure it's not the "this" pointer?  Bear in mind that the C++
> "support" in the analyzer is practically non-existent; I haven't yet
> implemented any special-casing about how arguments are numbered, so it
> could well be the "this" pointer.
>
> If it is a complaint about the "this" pointer, that could explain why
> this started happening when your patch went in.
>
> Dave
>
> >
> > FWIW, I don't remember if I saw it when going through the test
> > results for my patch but if I did, it's possible that I assumed
> > it was unrelated because of the -Wanalyzer- option.  Sorry if
> > this was the wrong call.
> >
> > Martin
>
> > > How would you like pr94028.C to be adjusted in the testsuite to
> > > account for this?
> > >
> > > Thanks, David
> > >
> > > On Tue, Jun 30, 2020 at 10:18 AM David Malcolm  > > > wrote:
> > > > On Tue, 2020-06-30 at 09:51 -0400, David Edelsohn wrote:
> > > > > The changes to the non-null warning now produce an additional
> > > > > warning
> > > > > for analyzer/pr94028.C on one of the "leak" lines.  This causes
> > > > > new
> > > > > failures on trunk.
> > > >
> > > > Hi David
> > > >
> > > > Do you have the output to hand?  What is the full text of the new
> > > > diagnostic?
> > > >
> > > > Some high level questions:
> > > >* Ought GCC to warn for that code?  (or not)
> > > >* Is it behavior we ought to have a test for?
> > > >
> > > > Note that AFAIK there are *two* kinds of non-null warnings that
> > > > GCC can
> > > > emit: the kind emitted by Martin's code, versus those emitted by
> > > > -fanalyzer (the latter imply the much more expensive analysis
> > > > performed
> > > > by -fanalyzer, and are controlled by the various -Wanalyzer-
> > > > *null*
> > > > options; see
> > > > https://gcc.gnu.org/onlinedocs/gcc-10.1.0/gcc/Static-Analyzer-Options.html
> > > > )
> > > >
> > > >
> > > > > Because non-null is not the purpose of the analyzer test, I
> > > > > propose
> > > > > pruning the output to resolve the new failures.
> > > >
> > > > Looking back through bugzilla, it seems that the main purpose of
> > > > adding
> > > > that test was to ensure that -fanalyzer doesn't ICE on that code.
> > > >
> > > > At some point I hope to properly support C++ in -fanalyzer, at
> > > > which
> > > > point some kind of null warning may be warranted on that
> > > > code.  Sadly
> > > > I'm not yet at that point.  FWIW I'm currently working on a big
> > > > rewrite
> > > > of how state is tracked within the analyzer, as I've identified
> > > > at
> > > > least two major flaws in the current implementation, which my
> > > > rewrite
> > > > addresses.  I'm deferring on C++ support until that rewrite is
> > > > done.
> > > >
> > > > >Alternatively, I
> > > > > could explicitly test for the additional non-null warning.
> > > > >
> > > > > Do you have any preferences?
> > > >
> > > > This test is controlled by analyzer.exp and thus, for example, is
> > > > disabled if the analyzer was disabled at configure time.
> > > >
> > > > If this is coming from Martin's non-analyzer code, a third
> > > > possibility
> > > > would be to use -Wno-something to disable that warning, so that
> > > > the
> > > > analyzer test can focus on the -fanalyzer test, as it were (and
> > > > if this
> > > > 

[PATCH] Fix unnecessary register spill that depends on function ordering

2020-06-30 Thread Omar Tahir
Hi,

The variables first_moveable_pseudo and last_moveable_pseudo aren't reset
after compiling a function, which means they leak into the first scheduler
pass of the following function. In some cases, this can cause an extra spill
during register allocation of the second function.

This patch fixes this by setting

first_moveable_pseudo = last_moveable_pseudo

at the beginning of the first scheduler pass.

Because the spill occurs in the middle of the IRA pass it is highly
target-sensitive, so I have provided a test case that works on aarch64. There
doesn't seem to be a target-independent way to trigger this bug, since the
RTL at the IRA stage is already quite target-specific.

Interestingly, the aarch64 test case here doesn't actually perform a complete
register spill - the value is stored on the stack but never retrieved.
Perhaps this points to another, deeper bug?

Bootstrapped and regression tested on aarch64-none-linux-gnu.

I don't have write privileges, so if it's fine could someone push for me?

Thanks,
Omar



gcc/ChangeLog:

2020-06-30: Omar Tahir 

* sched-rgn.c: Include ira-int.h, ira.h, regs.h.
(rest_of_handle_sched): Clear moveable pseudo range.

gcc/testsuite/ChangeLog:

2020-06-30: Omar Tahir 

* gcc.target/aarch64/nospill.c: New test.



diff --git a/gcc/sched-rgn.c b/gcc/sched-rgn.c
index 7f5dfdb..51329fc 100644
--- a/gcc/sched-rgn.c
+++ b/gcc/sched-rgn.c
@@ -65,6 +65,9 @@ along with GCC; see the file COPYING3.  If not see
#include "dbgcnt.h"
#include "pretty-print.h"
#include "print-rtl.h"
+#include "ira.h"
+#include "regs.h"
+#include "ira-int.h"

/* Disable warnings about quoting issues in the pp_xxx calls below
that (intentionally) don't follow GCC diagnostic conventions.  */
@@ -3719,6 +3722,7 @@ static unsigned int
rest_of_handle_sched (void)
{
#ifdef INSN_SCHEDULING
+  first_moveable_pseudo = last_moveable_pseudo;
   if (flag_selective_scheduling
   && ! maybe_skip_selective_scheduling ())
 run_selective_scheduling ();
diff --git a/gcc/testsuite/gcc.target/aarch64/nospill.c 
b/gcc/testsuite/gcc.target/aarch64/nospill.c
new file mode 100644
index 000..60399d8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/nospill.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+
+/* The pseudo for P is marked as moveable in the IRA pass. */
+float
+func_0 (float a, float b, float c)
+{
+  float p = c / a;
+
+  if (b > 1)
+{
+  b /= p;
+  if (c > 2)
+a /= 3;
+}
+
+  return b / c * a;
+}
+
+/* If first_moveable_pseudo and last_moveable_pseudo are not reset correctly,
+   they will carry over and spill the pseudo for Q. */
+float
+func_1 (float a, float b, float c)
+{
+  float q = a + b;
+
+  c *= a / (b + b);
+  if (a > 0)
+c *= q;
+
+  return a * b * c;
+}
+
+/* We have plenty of spare registers, so check nothing has been spilled. */
+/* { dg-final { scan-assembler-not "str" } } */


moveable_pseudo.patch
Description: moveable_pseudo.patch


Re: [PATCH] analyzer/pr94028.C and non-null warning

2020-06-30 Thread Martin Sebor via Gcc-patches

On 6/30/20 10:47 AM, David Edelsohn wrote:

Also, cpp1y/lambda-generic-69078-1.C elicits a similar, new warning:

FAIL: g++.dg/cpp1y/lambda-generic-69078-1.C  -std=gnu++14 (test for
excess errors)
Excess errors:
g++.dg/cpp1y/lambda-generic-69078-1.C:23:13: warning: 'this' pointer
null [-Wnonnull]


I think this (mainly the ICE) is the subject of pr95984.  AFAICT,
the warning is working correctly but the IL the C++ front end
emits doesn't look right: AFAICS, it creates a function object
for the lambda and calls its operator() with a null this pointer:

; Function static decltype (((const main()::...)>*)0)->operator()...>(static_cast(main::._anon_0::_FUN::) ...)) 
main()_FUN(auto:1 ...) [with auto:1 = {int}; 
decltype (((const main()::*)0)->operator()...>(static_cast(main::._anon_0::_FUN::) ...)) = 
void] (null)

;; enabled by -tree-original


<::operator() (0B, D.2440) >;
return;

Martin



Thanks, David

On Tue, Jun 30, 2020 at 12:23 PM David Malcolm  wrote:


On Tue, 2020-06-30 at 10:12 -0600, Martin Sebor wrote:

On 6/30/20 8:47 AM, David Edelsohn via Gcc-patches wrote:

The unexpected warning is

gcc/testsuite/g++.dg/analyzer/pr94028.C:28:21: warning: use of
possibly-NULL '' where non-null expected [CWE-690]
[-Wanalyzer-possible-null-argument]

This is the same location as one of the existing "leak" warnings.


I'm surprised that the Wnonull change triggers
a -Wanalyzer-possible-null-argument warning.  There is no
-Wnonnull for the test case with or without -fanalyzer (and
with or without optimization) so that suggests the two are
independent of one another.

I'm also not sure I see a justification for a nonnull warning
in the test.  The note printed after the warning says:

  |  | (5) possible return of NULL to ‘f’
from ‘j::operator new’
  |  | (6) argument 1 (‘’) from
(4)
could be NULL where non-null expected
  |
/src/gcc/master/gcc/testsuite/g++.dg/analyzer/pr94028.C:19:3: note:
argument 1 of ‘j::j(B*, int)’ must be non-null
 19 |   j (B *, int)
|   ^

but the first argument in j::j(B*, int) is not marked nonnull,
and it's not the this pointer (treating those as implicitly
nonnull was one of the changes introduced by my patch).


Are you sure it's not the "this" pointer?  Bear in mind that the C++
"support" in the analyzer is practically non-existent; I haven't yet
implemented any special-casing about how arguments are numbered, so it
could well be the "this" pointer.

If it is a complaint about the "this" pointer, that could explain why
this started happening when your patch went in.

Dave



FWIW, I don't remember if I saw it when going through the test
results for my patch but if I did, it's possible that I assumed
it was unrelated because of the -Wanalyzer- option.  Sorry if
this was the wrong call.

Martin



How would you like pr94028.C to be adjusted in the testsuite to
account for this?

Thanks, David

On Tue, Jun 30, 2020 at 10:18 AM David Malcolm 
wrote:
On Tue, 2020-06-30 at 09:51 -0400, David Edelsohn wrote:

The changes to the non-null warning now produce an additional
warning
for analyzer/pr94028.C on one of the "leak" lines.  This causes
new
failures on trunk.


Hi David

Do you have the output to hand?  What is the full text of the new
diagnostic?

Some high level questions:
* Ought GCC to warn for that code?  (or not)
* Is it behavior we ought to have a test for?

Note that AFAIK there are *two* kinds of non-null warnings that
GCC can
emit: the kind emitted by Martin's code, versus those emitted by
-fanalyzer (the latter imply the much more expensive analysis
performed
by -fanalyzer, and are controlled by the various -Wanalyzer-
*null*
options; see
https://gcc.gnu.org/onlinedocs/gcc-10.1.0/gcc/Static-Analyzer-Options.html
)



Because non-null is not the purpose of the analyzer test, I
propose
pruning the output to resolve the new failures.


Looking back through bugzilla, it seems that the main purpose of
adding
that test was to ensure that -fanalyzer doesn't ICE on that code.

At some point I hope to properly support C++ in -fanalyzer, at
which
point some kind of null warning may be warranted on that
code.  Sadly
I'm not yet at that point.  FWIW I'm currently working on a big
rewrite
of how state is tracked within the analyzer, as I've identified
at
least two major flaws in the current implementation, which my
rewrite
addresses.  I'm deferring on C++ support until that rewrite is
done.


Alternatively, I
could explicitly test for the additional non-null warning.

Do you have any preferences?


This test is controlled by analyzer.exp and thus, for example, is
disabled if the analyzer was disabled at configure time.

If this is coming from Martin's non-analyzer code, a third
possibility
would be to use -Wno-something to disable that warning, so that
the
analyzer test can focus on the -fanalyzer test, as it were (and
if this
is a behavior that ought to be checked for in Martin's warning,
then

Re: [PATCH] analyzer/pr94028.C and non-null warning

2020-06-30 Thread David Edelsohn via Gcc-patches
Why did you mark PR96008 as a duplicate?  The ICE is a duplicate, but
the wrong IL is a C++ FE bug.

Thanks, David

On Tue, Jun 30, 2020 at 1:45 PM Martin Sebor  wrote:
>
> On 6/30/20 10:47 AM, David Edelsohn wrote:
> > Also, cpp1y/lambda-generic-69078-1.C elicits a similar, new warning:
> >
> > FAIL: g++.dg/cpp1y/lambda-generic-69078-1.C  -std=gnu++14 (test for
> > excess errors)
> > Excess errors:
> > g++.dg/cpp1y/lambda-generic-69078-1.C:23:13: warning: 'this' pointer
> > null [-Wnonnull]
>
> I think this (mainly the ICE) is the subject of pr95984.  AFAICT,
> the warning is working correctly but the IL the C++ front end
> emits doesn't look right: AFAICS, it creates a function object
> for the lambda and calls its operator() with a null this pointer:
>
> ; Function static decltype (((const main():: ...)>*)0)->operator() ...>(static_cast(main::._anon_0::_FUN::) ...))
> main()_FUN(auto:1 ...) [with auto:1 = {int};
> decltype (((const main()::*)0)->operator() ...>(static_cast(main::._anon_0::_FUN::) ...)) =
> void] (null)
> ;; enabled by -tree-original
>
>
> ;
> return;
>
> Martin
>
> >
> > Thanks, David
> >
> > On Tue, Jun 30, 2020 at 12:23 PM David Malcolm  wrote:
> >>
> >> On Tue, 2020-06-30 at 10:12 -0600, Martin Sebor wrote:
> >>> On 6/30/20 8:47 AM, David Edelsohn via Gcc-patches wrote:
>  The unexpected warning is
> 
>  gcc/testsuite/g++.dg/analyzer/pr94028.C:28:21: warning: use of
>  possibly-NULL '' where non-null expected [CWE-690]
>  [-Wanalyzer-possible-null-argument]
> 
>  This is the same location as one of the existing "leak" warnings.
> >>>
> >>> I'm surprised that the Wnonull change triggers
> >>> a -Wanalyzer-possible-null-argument warning.  There is no
> >>> -Wnonnull for the test case with or without -fanalyzer (and
> >>> with or without optimization) so that suggests the two are
> >>> independent of one another.
> >>>
> >>> I'm also not sure I see a justification for a nonnull warning
> >>> in the test.  The note printed after the warning says:
> >>>
> >>>   |  | (5) possible return of NULL to ‘f’
> >>> from ‘j::operator new’
> >>>   |  | (6) argument 1 (‘’) from
> >>> (4)
> >>> could be NULL where non-null expected
> >>>   |
> >>> /src/gcc/master/gcc/testsuite/g++.dg/analyzer/pr94028.C:19:3: note:
> >>> argument 1 of ‘j::j(B*, int)’ must be non-null
> >>>  19 |   j (B *, int)
> >>> |   ^
> >>>
> >>> but the first argument in j::j(B*, int) is not marked nonnull,
> >>> and it's not the this pointer (treating those as implicitly
> >>> nonnull was one of the changes introduced by my patch).
> >>
> >> Are you sure it's not the "this" pointer?  Bear in mind that the C++
> >> "support" in the analyzer is practically non-existent; I haven't yet
> >> implemented any special-casing about how arguments are numbered, so it
> >> could well be the "this" pointer.
> >>
> >> If it is a complaint about the "this" pointer, that could explain why
> >> this started happening when your patch went in.
> >>
> >> Dave
> >>
> >>>
> >>> FWIW, I don't remember if I saw it when going through the test
> >>> results for my patch but if I did, it's possible that I assumed
> >>> it was unrelated because of the -Wanalyzer- option.  Sorry if
> >>> this was the wrong call.
> >>>
> >>> Martin
> >>
>  How would you like pr94028.C to be adjusted in the testsuite to
>  account for this?
> 
>  Thanks, David
> 
>  On Tue, Jun 30, 2020 at 10:18 AM David Malcolm  > wrote:
> > On Tue, 2020-06-30 at 09:51 -0400, David Edelsohn wrote:
> >> The changes to the non-null warning now produce an additional
> >> warning
> >> for analyzer/pr94028.C on one of the "leak" lines.  This causes
> >> new
> >> failures on trunk.
> >
> > Hi David
> >
> > Do you have the output to hand?  What is the full text of the new
> > diagnostic?
> >
> > Some high level questions:
> > * Ought GCC to warn for that code?  (or not)
> > * Is it behavior we ought to have a test for?
> >
> > Note that AFAIK there are *two* kinds of non-null warnings that
> > GCC can
> > emit: the kind emitted by Martin's code, versus those emitted by
> > -fanalyzer (the latter imply the much more expensive analysis
> > performed
> > by -fanalyzer, and are controlled by the various -Wanalyzer-
> > *null*
> > options; see
> > https://gcc.gnu.org/onlinedocs/gcc-10.1.0/gcc/Static-Analyzer-Options.html
> > )
> >
> >
> >> Because non-null is not the purpose of the analyzer test, I
> >> propose
> >> pruning the output to resolve the new failures.
> >
> > Looking back through bugzilla, it seems that the main purpose of
> > adding
> > that test was to ensure that -fanalyzer doesn't ICE on that code.
> >
> > At some point I hope to properly support C++ in 

[PATCH] gcc-changelog: support older GitPython releases.

2020-06-30 Thread Martin Liška

Installed to master.

contrib/ChangeLog:

* gcc-changelog/git_repository.py: Support older releases of
GitPython when renamed_file was named renamed.
---
 contrib/gcc-changelog/git_repository.py | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/contrib/gcc-changelog/git_repository.py 
b/contrib/gcc-changelog/git_repository.py
index 4f0d21af039..90edc3ce3d8 100755
--- a/contrib/gcc-changelog/git_repository.py
+++ b/contrib/gcc-changelog/git_repository.py
@@ -39,11 +39,15 @@ def parse_git_revisions(repo_path, revisions, strict=False):
 
 modified_files = []

 for file in diff:
+if hasattr(file, 'renamed_file'):
+is_renamed = file.renamed_file
+else:
+is_renamed = file.renamed
 if file.new_file:
 t = 'A'
 elif file.deleted_file:
 t = 'D'
-elif file.renamed_file:
+elif is_renamed:
 # Consider that renamed files are two operations:
 # the deletion of the original name
 # and the addition of the new one.
--
2.27.0



c++: Tweak function cloning names

2020-06-30 Thread Nathan Sidwell
On the modules branch I need to expose an intermediate step of the 
function cloning, but before that it'd be nice to rationalize the names 
somewhat, now that we also use that API for copying the equality 
operator.  Jason's recent patch caused me some pain by altering the same 
code.  I can only blame myself for not pushing some bits sooner. Anyway, 
this patch makes the newly renamed copy_fndecl_with_name static, and 
adds a wrapper copy_operator_fn, that takes an operator code.  The cdtor 
cloning functions are renamed to explicitly note they expect cdtors.  A 
followup patch will move some of the logic from copy_fndecl_with_name to 
build_cdtor_clones.


gcc/cp/
* cp-tree.h (copy_fndecl_with_name): Rename to ...
(copy_operatorn_fn): ... this.  Change arg type.
(clone_function_decl): Rename to ...
(clone_cdtor): ... this.
* class.c (copy_fndecl_with_name): Make static.
(copy_operator_fn): New wrapper.
(build_clones): Rename to ...
(build_cdtor_clones): ... this.
(clone_function_decl): Rename to ...
(clone_cdtor): ... this.  Adjust build_clones calls.
(clone_constructors_and_destructors): Adjust 
clone_function_decl

calls.
* method.c (implicitly_declare_fn): Adjust 
copy_fndecl_with_name

call.
(lazily_declare_fn): Adjust clone_function_decl call.
* pt.c (tsubst_function_decl): Likewise.
(instantiate_template_1): Likewise.

libcc1/
* libcp1plugin.cc (plugin_build_decl): Adjust 
clone_function_decl

call.

--
Nathan Sidwell
diff --git a/gcc/cp/class.c b/gcc/cp/class.c
index 94a95854e25..b0cc027e0de 100644
--- a/gcc/cp/class.c
+++ b/gcc/cp/class.c
@@ -4696,7 +4696,7 @@ check_methods (tree t)
 }
 }
 
-tree
+static tree
 copy_fndecl_with_name (tree fn, tree name)
 {
   /* Copy the function.  */
@@ -4804,6 +4804,14 @@ copy_fndecl_with_name (tree fn, tree name)
   return clone;
 }
 
+/* FN is an operator function, create a variant for CODE.  */
+
+tree
+copy_operator_fn (tree fn, tree_code code)
+{
+  return copy_fndecl_with_name (fn, ovl_op_identifier (code));
+}
+
 /* FN is a constructor or destructor.  Clone the declaration to create
a specialized in-charge or not-in-charge version, as indicated by
NAME.  */
@@ -4847,8 +4855,8 @@ build_clone (tree fn, tree name)
 /* Build the clones of FN, return the number of clones built.  These
will be inserted onto DECL_CHAIN of FN.  */
 
-unsigned
-build_clones (tree fn)
+static unsigned
+build_cdtor_clones (tree fn)
 {
   unsigned count = 0;
 
@@ -4891,14 +4899,14 @@ build_clones (tree fn)
CLASSTYPE_MEMBER_VEC.  */
 
 void
-clone_function_decl (tree fn, bool update_methods)
+clone_cdtor (tree fn, bool update_methods)
 {
   /* Avoid inappropriate cloning.  */
   if (DECL_CHAIN (fn)
   && DECL_CLONED_FUNCTION_P (DECL_CHAIN (fn)))
 return;
 
-  unsigned count = build_clones (fn);
+  unsigned count = build_cdtor_clones (fn);
 
   /* Note that this is an abstract function that is never emitted.  */
   DECL_ABSTRACT_P (fn) = true;
@@ -4998,10 +5006,10 @@ clone_constructors_and_destructors (tree t)
   /* While constructors can be via a using declaration, at this point
  we no longer need to know that.  */
   for (ovl_iterator iter (CLASSTYPE_CONSTRUCTORS (t)); iter; ++iter)
-clone_function_decl (*iter, /*update_methods=*/true);
+clone_cdtor (*iter, /*update_methods=*/true);
 
   if (tree dtor = CLASSTYPE_DESTRUCTOR (t))
-clone_function_decl (dtor, /*update_methods=*/true);
+clone_cdtor (dtor, /*update_methods=*/true);
 }
 
 /* Deduce noexcept for a destructor DTOR.  */
diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 78e8ca4150a..9b31eaf4a07 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -6450,8 +6450,8 @@ extern void check_abi_tags			(tree);
 extern tree missing_abi_tags			(tree);
 extern void fixup_type_variants			(tree);
 extern void fixup_attribute_variants		(tree);
-extern tree copy_fndecl_with_name		(tree, tree);
-extern void clone_function_decl			(tree, bool);
+extern void clone_cdtor(tree, bool);
+extern tree copy_operator_fn			(tree, tree_code code);
 extern void adjust_clone_args			(tree);
 extern void deduce_noexcept_on_destructor   (tree);
 extern bool uniquely_derived_from_p (tree, tree);
diff --git a/gcc/cp/method.c b/gcc/cp/method.c
index 2a98907bfa1..8fd7052d205 100644
--- a/gcc/cp/method.c
+++ b/gcc/cp/method.c
@@ -2688,7 +2688,7 @@ implicitly_declare_fn (special_function_kind kind, tree type,
  comparison operator function. --end note]  */
   if (kind == sfk_comparison)
 {
-  fn = copy_fndecl_with_name (pattern_fn, ovl_op_identifier (EQ_EXPR));
+  fn = copy_operator_fn (pattern_fn, EQ_EXPR);
   DECL_ARTIFICIAL (fn) = 1;
   TREE_TYPE (fn) = change_return_type (boolean_type_node, TREE_TYPE (fn));
   return fn;
@@ 

Re: drop -aux{dir,base}, revamp -dump{dir,base}

2020-06-30 Thread Thomas Schwinge
Hi!

Many thanks, Alexandra and Tobias for working this out together!


On 2020-06-23T06:50:26-0300, Alexandre Oliva  wrote:
> On Jun  9, 2020, Thomas Schwinge  wrote:
>
>> Previously, for '-foffload=nvptx-none -foffload=-fdump-rtl-mach
>> -save-temps -o ./nvptx-merged-loop.exe', GCC produced the expected
>> 'nvptx-merged-loop.o.307r.mach'.
>
> I believe the patch I've just installed fixes the UNRESOLVED results
> caused by not finding dump files.

Yes, confirmed, thanks!

A few small issues remain, for those I'll respond elsewhere in this
thread.


>> Consider 'libgomp.oacc-c-c++-common/pr85381-2.c':
>
>> /* { dg-additional-options "-save-temps" } */
>
>> /* { dg-final { scan-assembler-times "bar.sync" 2 } } */
>
>> This expects to scan the PTX offloading compilation assembler code (not
>> host code!), expecting that nvptx offloading code assembly is produced
>> after the host code, and thus overwrites the latter file.  (Yes, that's
>> certainly ugly/fragile...)
>
> I'm afraid this will need further adjusting in the testsuite, as we'll
> store the nvptx asm saved aux output in a separate file.
> scan-assembler-times will no longer work for this purpose, we'll need
> something that knows how to find the offloaded asm.

So, that's (now?) easy enough to repair.  I've pushed "[testsuite]
Replace fragile 'scan-assembler' with 'scan-offload-rtl' in
'libgomp.oacc-c-c++-common/pr85381*.c'", see attached.


Grüße
 Thomas


-
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander 
Walter
>From 8a8efad09811b5c08cfd9d469c6f1b6ba0c848f1 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Tue, 30 Jun 2020 05:24:17 +0200
Subject: [PATCH] [testsuite] Replace fragile 'scan-assembler' with
 'scan-offload-rtl' in 'libgomp.oacc-c-c++-common/pr85381*.c'

These test cases use directives similar to:

/* { dg-additional-options "-save-temps" } */

/* { dg-final { scan-assembler-times "bar.sync" 2 } } */

This expects to scan the PTX offloading compilation assembler code (not host
code!), expecting that nvptx offloading code assembly is produced after the
host code, and thus overwrites the latter file.  (Yes, that's certainly
ugly/fragile...)

..., and this broke with recent commit 1dedc12d186a110854537e1279b4e6c29f2df35a
"revamp dump and aux output names" plus fix-up commit commit
efc16503ca10bc0e934e0bace5777500e4dc757a "handle dumpbase in offloading, adjust
testsuite" (short summary: file names changed), so let's finally make that
robust.

	libgomp/
	* testsuite/libgomp.oacc-c-c++-common/pr85381-2.c: Replace fragile
	'scan-assembler' with 'scan-offload-rtl'.
	* testsuite/libgomp.oacc-c-c++-common/pr85381-3.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/pr85381-4.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/pr85381-5.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/pr85381.c: Likewise.
---
 libgomp/testsuite/libgomp.oacc-c-c++-common/pr85381-2.c | 4 ++--
 libgomp/testsuite/libgomp.oacc-c-c++-common/pr85381-3.c | 4 ++--
 libgomp/testsuite/libgomp.oacc-c-c++-common/pr85381-4.c | 4 ++--
 libgomp/testsuite/libgomp.oacc-c-c++-common/pr85381-5.c | 4 ++--
 libgomp/testsuite/libgomp.oacc-c-c++-common/pr85381.c   | 4 ++--
 5 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/pr85381-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr85381-2.c
index 6570c64afff5..84b9c01443e5 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/pr85381-2.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr85381-2.c
@@ -1,6 +1,6 @@
-/* { dg-additional-options "-save-temps" } */
 /* { dg-do run { target openacc_nvidia_accel_selected } }
{ dg-skip-if "" { *-*-* } { "*" } { "-O2" } } */
+/* { dg-additional-options "-foffload=-fdump-rtl-mach" } */
 
 int
 main (void)
@@ -33,4 +33,4 @@ main (void)
 
so the loop is not recognized as empty loop (which we detect by seeing if
joining immediately follows forked).  */
-/* { dg-final { scan-assembler-times "bar.sync" 2 } } */
+/* { dg-final { scan-offload-rtl-dump-times "nvptx_barsync" 2 "mach" } } */
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/pr85381-3.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr85381-3.c
index c5d1c5add68e..cddbf2719067 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/pr85381-3.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr85381-3.c
@@ -1,6 +1,6 @@
-/* { dg-additional-options "-save-temps -w" } */
 /* { dg-do run { target openacc_nvidia_accel_selected } }
{ dg-skip-if "" { *-*-* } { "*" } { "-O2" } } */
+/* { dg-additional-options "-foffload=-fdump-rtl-mach" } */
 
 int a;
 #pragma acc declare create(a)
@@ -32,4 +32,4 @@ main (void)
   return 0;
 }
 
-/* { dg-final { scan-assembler-not "bar.sync" } } */
+/* { dg-final { scan-offload-rtl-dump-not "nvptx_barsync" "mach" } } */
diff --git 

[PATCH v2, rs6000] Add support to enable vmsumudm behind vec_msum builtin.

2020-06-30 Thread will schmidt via Gcc-patches
Hi,

  Add support for the vmsumudm instruction and tie it into the vec_msum
  built-in to support the variants of that built-in using vector
 _int128 parameters.

  vector _uint128_t vec_msum (vector unsigned long long,
  vector unsigned long long,
  vector _uint128_t);
  vector _int128_t vec_msum (vector signed long long,
 vector signed long long,
 vector _int128_t);

[v2]
  Corrected the define_insn and test requirements to be limited to P9.
  Improve description to clarify the vmsum does
  a widening multiply and horizontal addition.

Fresh regtests currently running on assorted powerpc targets.

OK for trunk?

Thanks,
-Will


[gcc]

2020-06-18  Will Schmidt  

* config/rs6000/altivec.h (vec_vmsumudm): New define.
* config/rs6000/altivec.md (UNSPEC_VMSUMUDM): New unspec.
(altivec_vmsumudm): New define_insn.
* config/rs6000/rs6000-builtin.def (altivec_vmsumudm): New
BU_ALTIVEC_3 entry. (vmsumudm): New BU_ALTIVEC_OVERLOAD_3
entry.
* config/rs6000/rs6000-call.c (altivec_overloaded_builtins):
Add entries for ALTIVEC_BUILTIN_VMSUMUDM variants of vec_msum.

[testsuite]

2020-06-18  Will Schmidt  

* gcc.target/powerpc/builtins-msum-runnable.c: New test.
* gcc.target/powerpc/vsx-builtin-msum.c: New test.


diff --git a/gcc/config/rs6000/altivec.h b/gcc/config/rs6000/altivec.h
index bb1524f4a679..0d199393556d 100644
--- a/gcc/config/rs6000/altivec.h
+++ b/gcc/config/rs6000/altivec.h
@@ -159,10 +159,11 @@
 #define vec_vmsumuhm __builtin_vec_vmsumuhm
 #define vec_vmsummbm __builtin_vec_vmsummbm
 #define vec_vmsumubm __builtin_vec_vmsumubm
 #define vec_vmsumshs __builtin_vec_vmsumshs
 #define vec_vmsumuhs __builtin_vec_vmsumuhs
+#define vec_vmsumudm __builtin_vec_vmsumudm
 #define vec_vmulesb __builtin_vec_vmulesb
 #define vec_vmulesh __builtin_vec_vmulesh
 #define vec_vmuleuh __builtin_vec_vmuleuh
 #define vec_vmuleub __builtin_vec_vmuleub
 #define vec_vmulosh __builtin_vec_vmulosh
diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index 2ce9227c765a..84fee916d041 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -19,10 +19,11 @@
 ;; .
 
 (define_c_enum "unspec"
   [UNSPEC_VCMPBFP
UNSPEC_VMSUMU
+   UNSPEC_VMSUMUDM
UNSPEC_VMSUMM
UNSPEC_VMSUMSHM
UNSPEC_VMSUMUHS
UNSPEC_VMSUMSHS
UNSPEC_VMHADDSHS
@@ -970,10 +971,20 @@
 UNSPEC_VMSUMU))]
   "TARGET_ALTIVEC"
   "vmsumum %0,%1,%2,%3"
   [(set_attr "type" "veccomplex")])
 
+(define_insn "altivec_vmsumudm"
+  [(set (match_operand:V1TI 0 "register_operand" "=v")
+   (unspec:V1TI [(match_operand:V2DI 1 "register_operand" "v")
+ (match_operand:V2DI 2 "register_operand" "v")
+ (match_operand:V1TI 3 "register_operand" "v")]
+UNSPEC_VMSUMUDM))]
+  "TARGET_P9_VECTOR"
+  "vmsumudm %0,%1,%2,%3"
+  [(set_attr "type" "veccomplex")])
+
 (define_insn "altivec_vmsummm"
   [(set (match_operand:V4SI 0 "register_operand" "=v")
 (unspec:V4SI [(match_operand:VIshort 1 "register_operand" "v")
  (match_operand:VIshort 2 "register_operand" "v")
   (match_operand:V4SI 3 "register_operand" "v")]
diff --git a/gcc/config/rs6000/rs6000-builtin.def 
b/gcc/config/rs6000/rs6000-builtin.def
index 363656ec05cc..ee0d787cfa22 100644
--- a/gcc/config/rs6000/rs6000-builtin.def
+++ b/gcc/config/rs6000/rs6000-builtin.def
@@ -1140,10 +1140,11 @@ BU_ALTIVEC_3 (VMHADDSHS,  "vmhaddshs",  SAT,
altivec_vmhaddshs)
 BU_ALTIVEC_3 (VMHRADDSHS, "vmhraddshs", SAT,   
altivec_vmhraddshs)
 BU_ALTIVEC_3 (VMLADDUHM,  "vmladduhm",  CONST, fmav8hi4)
 BU_ALTIVEC_3 (VMSUMUBM,   "vmsumubm",   CONST, 
altivec_vmsumubm)
 BU_ALTIVEC_3 (VMSUMMBM,   "vmsummbm",   CONST, 
altivec_vmsummbm)
 BU_ALTIVEC_3 (VMSUMUHM,   "vmsumuhm",   CONST, 
altivec_vmsumuhm)
+BU_ALTIVEC_3 (VMSUMUDM,   "vmsumudm",   CONST, 
altivec_vmsumudm)
 BU_ALTIVEC_3 (VMSUMSHM,   "vmsumshm",   CONST, 
altivec_vmsumshm)
 BU_ALTIVEC_3 (VMSUMUHS,   "vmsumuhs",   SAT,   
altivec_vmsumuhs)
 BU_ALTIVEC_3 (VMSUMSHS,   "vmsumshs",   SAT,   
altivec_vmsumshs)
 BU_ALTIVEC_3 (VNMSUBFP,   "vnmsubfp",   FP,nfmsv4sf4)
 BU_ALTIVEC_3 (VPERM_1TI,  "vperm_1ti",  CONST, 
altivec_vperm_v1ti)
@@ -1497,10 +1498,11 @@ BU_ALTIVEC_OVERLOAD_3 (SEL,"sel")
 BU_ALTIVEC_OVERLOAD_3 (VMSUMMBM,   "vmsummbm")
 BU_ALTIVEC_OVERLOAD_3 (VMSUMSHM,   "vmsumshm")
 BU_ALTIVEC_OVERLOAD_3 (VMSUMSHS,   "vmsumshs")
 BU_ALTIVEC_OVERLOAD_3 (VMSUMUBM,   "vmsumubm")
 BU_ALTIVEC_OVERLOAD_3 (VMSUMUHM,   "vmsumuhm")
+BU_ALTIVEC_OVERLOAD_3 (VMSUMUDM,   "vmsumudm")
 BU_ALTIVEC_OVERLOAD_3 

Re: drop -aux{dir,base}, revamp -dump{dir,base}

2020-06-30 Thread Thomas Schwinge
Hi Alexandre!

On 2020-06-22T11:32:46-0300, Alexandre Oliva  wrote:
> Here's a consolidated patch, [...]

Another small issue here:

> --- /dev/null
> +++ b/gcc/testsuite/lib/scanoffload.exp

> +# Utility for scanning offloading dump output, used by libgomp.exp.
> +
> +# Format an offload dump suffix given the offload target name in
> +# OFFTGT and any suffix, probably empty, in SUFFIX.
> +proc scoff-format { offtgt suffix } {
> +return ".x$offtgt.mkoffload$suffix"
> +}
> +
> +# Wrapper for scan procs.
> +# Argument 0 is the index of the argument to replace when calling
> +# argument 1 with the remaining arguments.  Use end-1 or end or so.
> +proc scoff { args } {
> +set idx [lindex $args 0]
> +set prc [lindex $args 1]
> +set args [lreplace $args 0 1]
> +
> +global offload_target
> +if [info exists offload_target] {
> + set target $offload_target
> + if { "$target" != "disable" } {
> + eval $prc [lreplace $args $idx $idx "[scoff-format $target [lindex 
> $args $idx]]"]
> + }
> +} else {
> + global offload_targets
> + foreach target [split $offload_targets ","] {
> + eval $prc [lreplace $args $idx $idx "[scoff-format $target [lindex 
> $args $idx]]"]
> + }
> +}
> +}

> --- a/gcc/testsuite/lib/scanoffloadrtl.exp
> +++ b/gcc/testsuite/lib/scanoffloadrtl.exp

> @@ -36,12 +37,12 @@ proc scan-offload-rtl-dump { args } {
>   return
>  }
>  if { [llength $args] >= 3 } {
> - scan-dump "offload-rtl" [lindex $args 0] \
> -   "\[0-9\]\[0-9\]\[0-9]r.[lindex $args 1]" ".o" \
> -   [lindex $args 2]
> + scoff end-1 scan-dump "offload-rtl" [lindex $args 0] \
> + "\[0-9\]\[0-9\]\[0-9]r.[lindex $args 1]" "" \
> + [lindex $args 2]
>  } else {
> - scan-dump "offload-rtl" [lindex $args 0] \
> -   "\[0-9\]\[0-9\]\[0-9]r.[lindex $args 1]" ".o"
> + scoff end scan-dump "offload-rtl" [lindex $args 0] \
> + "\[0-9\]\[0-9\]\[0-9]r.[lindex $args 1]" ""
>  }
>  }

> [...]

For example, if there are two 'offload_targets' configured, and you do a:

{ dg-final { scan-offload-tree-dump-times "(?n)^main\\\._omp_fn\\\.0 " 1 
"optimized" } }

..., you'll get that reported as:

PASS: libgomp.c++/scan-offload-1.C scan-offload-tree-dump-times optimized 
"(?n)^main\\._omp_fn\\.0 " 1
PASS: libgomp.c++/scan-offload-1.C scan-offload-tree-dump-times optimized 
"(?n)^main\\._omp_fn\\.0 " 1

Can we easily get the respective offload target into that, or even full
'scoff-format' if that's easier?  I surely could hack up something, but
can you see an elegant way?


Grüße
 Thomas
-
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander 
Walter


Re: [PATCH 07/13] OpenACC 2.6 deep copy: libgomp parts

2020-06-30 Thread Thomas Schwinge
Hi Julian!

On 2019-12-17T22:03:47-0800, Julian Brown  wrote:
> This part contains the libgomp runtime support for the GOMP_MAP_ATTACH and
> GOMP_MAP_DETACH mapping kinds (etc.), as introduced by the front-end
> patches following in this series.

> --- a/libgomp/target.c
> +++ b/libgomp/target.c
> @@ -540,6 +540,7 @@ gomp_map_vars_existing (struct gomp_device_descr *devicep,
>tgt_var->key = oldn;
>tgt_var->copy_from = GOMP_MAP_COPY_FROM_P (kind);
>tgt_var->always_copy_from = GOMP_MAP_ALWAYS_FROM_P (kind);
> +  tgt_var->do_detach = kind == GOMP_MAP_ATTACH;
>tgt_var->offset = newn->host_start - oldn->host_start;
>tgt_var->length = newn->host_end - newn->host_start;
>

For 'kind == GOMP_MAP_ATTACH', this function 'gomp_map_vars_existing' is
actually unreachable.

> @@ -978,8 +979,15 @@ gomp_map_vars_internal (struct gomp_device_descr 
> *devicep,
> has_firstprivate = true;
> continue;
>   }
> +  else if ((kind & typemask) == GOMP_MAP_ATTACH)
> + {
> +   tgt->list[i].key = NULL;
> +   has_firstprivate = true;
> +   continue;
> + }

Given this, the following condition also is always-false:

>cur_node.host_start = (uintptr_t) hostaddrs[i];
> -  if (!GOMP_MAP_POINTER_P (kind & typemask))
> +  if (!GOMP_MAP_POINTER_P (kind & typemask)
> +   && (kind & typemask) != GOMP_MAP_ATTACH)
>   cur_node.host_end = cur_node.host_start + sizes[i];
>else
>   cur_node.host_end = cur_node.host_start + sizeof (void *);

Thus pushed "Mark up unreachable OpenACC 'attach' code path" to master
branch in commit aff43ac0aed5185884724adbdfd4dbbabd87637c, and
releases/gcc-10 branch in commit
4b185ee144d0c53ea7f08d4edaa8b578739498be, see attached.


Grüße
 Thomas


-
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander 
Walter
>From aff43ac0aed5185884724adbdfd4dbbabd87637c Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Fri, 26 Jun 2020 10:19:14 +0200
Subject: [PATCH] Mark up unreachable OpenACC 'attach' code path

... introduced in commit 8e7e71ff247fb116dc381c5ef0c09acc0d2b374f (r279625)
"OpenACC 2.6 deep copy: libgomp parts".

	libgomp/
	* target.c (gomp_map_vars_existing): Assert 'kind !=
	GOMP_MAP_ATTACH'.
	(gomp_map_vars_internal): Clean up.
---
 libgomp/target.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/libgomp/target.c b/libgomp/target.c
index 36425477dcb0..d4a4a408b400 100644
--- a/libgomp/target.c
+++ b/libgomp/target.c
@@ -357,10 +357,12 @@ gomp_map_vars_existing (struct gomp_device_descr *devicep,
 			splay_tree_key newn, struct target_var_desc *tgt_var,
 			unsigned char kind, struct gomp_coalesce_buf *cbuf)
 {
+  assert (kind != GOMP_MAP_ATTACH);
+
   tgt_var->key = oldn;
   tgt_var->copy_from = GOMP_MAP_COPY_FROM_P (kind);
   tgt_var->always_copy_from = GOMP_MAP_ALWAYS_FROM_P (kind);
-  tgt_var->do_detach = kind == GOMP_MAP_ATTACH;
+  tgt_var->do_detach = false;
   tgt_var->offset = newn->host_start - oldn->host_start;
   tgt_var->length = newn->host_end - newn->host_start;
 
@@ -815,8 +817,7 @@ gomp_map_vars_internal (struct gomp_device_descr *devicep,
 	  continue;
 	}
   cur_node.host_start = (uintptr_t) hostaddrs[i];
-  if (!GOMP_MAP_POINTER_P (kind & typemask)
-	  && (kind & typemask) != GOMP_MAP_ATTACH)
+  if (!GOMP_MAP_POINTER_P (kind & typemask))
 	cur_node.host_end = cur_node.host_start + sizes[i];
   else
 	cur_node.host_end = cur_node.host_start + sizeof (void *);
-- 
2.27.0

>From 4b185ee144d0c53ea7f08d4edaa8b578739498be Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Fri, 26 Jun 2020 10:19:14 +0200
Subject: [PATCH] Mark up unreachable OpenACC 'attach' code path

... introduced in commit 8e7e71ff247fb116dc381c5ef0c09acc0d2b374f (r279625)
"OpenACC 2.6 deep copy: libgomp parts".

	libgomp/
	* target.c (gomp_map_vars_existing): Assert 'kind !=
	GOMP_MAP_ATTACH'.
	(gomp_map_vars_internal): Clean up.

(cherry picked from commit aff43ac0aed5185884724adbdfd4dbbabd87637c)
---
 libgomp/target.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/libgomp/target.c b/libgomp/target.c
index 36425477dcb0..d4a4a408b400 100644
--- a/libgomp/target.c
+++ b/libgomp/target.c
@@ -357,10 +357,12 @@ gomp_map_vars_existing (struct gomp_device_descr *devicep,
 			splay_tree_key newn, struct target_var_desc *tgt_var,
 			unsigned char kind, struct gomp_coalesce_buf *cbuf)
 {
+  assert (kind != GOMP_MAP_ATTACH);
+
   tgt_var->key = oldn;
   tgt_var->copy_from = GOMP_MAP_COPY_FROM_P (kind);
   tgt_var->always_copy_from = GOMP_MAP_ALWAYS_FROM_P (kind);
-  tgt_var->do_detach = kind == GOMP_MAP_ATTACH;
+  tgt_var->do_detach = false;
   tgt_var->offset = newn->host_start - oldn->host_start;
   tgt_var->length = newn->host_end - newn->host_start;
 
@@ -815,8 +817,7 @@ gomp_map_vars_internal (struct gomp_device_descr *devicep,
 	  continue;
 	}
 

[PATCH v2] tree-ssa-threadbackward.c (profitable_jump_thread_path): Do not allow __builtin_constant_p () before IPA.

2020-06-30 Thread Ilya Leoshkevich via Gcc-patches
v1: https://gcc.gnu.org/pipermail/gcc-patches/2020-June/547236.html

This is the implementation of Jakub's suggestion: allow
__builtin_constant_p () after IPA, but fold it into 0.  Smoke test
passed on s390x-redhat-linux, full regtest and bootstrap are running on
x86_64-redhat-linux.

---

Linux Kernel (specifically, drivers/leds/trigger/ledtrig-cpu.c) build
with GCC 10 fails on s390 with "impossible constraint".

The problem is that jump threading makes __builtin_constant_p () lie
when it splits a path containing a non-constant expression in a way
that on each of the resulting paths this expression is constant.

Fix by disallowing __builtin_constant_p () on threading paths before
IPA and fold it into 0 after IPA.

gcc/ChangeLog:

2020-06-30  Ilya Leoshkevich  

* tree-ssa-threadbackward.c (thread_jumps::m_allow_bcp_p): New
member.
(thread_jumps::profitable_jump_thread_path): Do not allow
__builtin_constant_p () on threading paths unless m_allow_bcp_p
is set.
(thread_jumps::find_jump_threads_backwards): Set m_allow_bcp_p.
(pass_thread_jumps::execute): Allow __builtin_constant_p () on
threading paths after IPA.
(pass_early_thread_jumps::execute): Do not allow
__builtin_constant_p () on threading paths before IPA.
* tree-ssa-threadupdate.c (duplicate_thread_path): Fold
__builtin_constant_p () on threading paths into 0.

gcc/testsuite/ChangeLog:

2020-06-30  Ilya Leoshkevich  

* gcc.target/s390/builtin-constant-p-threading.c: New test.
---
 .../s390/builtin-constant-p-threading.c   | 46 +++
 gcc/tree-ssa-threadbackward.c | 22 ++---
 gcc/tree-ssa-threadupdate.c   | 18 
 3 files changed, 80 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/builtin-constant-p-threading.c

diff --git a/gcc/testsuite/gcc.target/s390/builtin-constant-p-threading.c 
b/gcc/testsuite/gcc.target/s390/builtin-constant-p-threading.c
new file mode 100644
index 000..5f0acdce0b0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/builtin-constant-p-threading.c
@@ -0,0 +1,46 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=z196 -mzarch" } */
+
+typedef struct
+{
+  int counter;
+} atomic_t;
+
+static inline __attribute__ ((__gnu_inline__)) int
+__atomic_add (int val, int *ptr)
+{
+  int old;
+  asm volatile("laa %[old],%[val],%[ptr]\n"
+  : [old] "=d" (old), [ptr] "+Q"(*ptr)
+  : [val] "d" (val)
+  : "cc", "memory");
+  return old;
+}
+
+static inline __attribute__ ((__gnu_inline__)) void
+__atomic_add_const (int val, int *ptr)
+{
+  asm volatile("asi %[ptr],%[val]\n"
+  : [ptr] "+Q" (*ptr)
+  : [val] "i" (val)
+  : "cc", "memory");
+}
+
+static inline __attribute__ ((__gnu_inline__)) void
+atomic_add (int i, atomic_t *v)
+{
+  if (__builtin_constant_p (i) && (i > -129) && (i < 128))
+{
+  __atomic_add_const (i, >counter);
+  return;
+}
+  __atomic_add (i, >counter);
+}
+
+static atomic_t num_active_cpus = { (0) };
+
+void
+ledtrig_cpu (_Bool is_active)
+{
+  atomic_add (is_active ? 1 : -1, _active_cpus);
+}
diff --git a/gcc/tree-ssa-threadbackward.c b/gcc/tree-ssa-threadbackward.c
index 327628f1662..446cb1b2e21 100644
--- a/gcc/tree-ssa-threadbackward.c
+++ b/gcc/tree-ssa-threadbackward.c
@@ -40,7 +40,9 @@ along with GCC; see the file COPYING3.  If not see
 class thread_jumps
 {
  public:
-  void find_jump_threads_backwards (basic_block bb, bool speed_p);
+   void find_jump_threads_backwards (basic_block bb, bool speed_p,
+bool allow_bcp_p);
+
  private:
   edge profitable_jump_thread_path (basic_block bbi, tree name, tree arg,
bool *creates_irreducible_loop);
@@ -65,6 +67,8 @@ class thread_jumps
   /* Indicate that we could increase code size to improve the
  code path.  */
   bool m_speed_p;
+  /* Whether to allow __builtin_constant_p () on threading paths.  */
+  bool m_allow_bcp_p;
 };
 
 /* Simple helper to get the last statement from BB, which is assumed
@@ -260,7 +264,9 @@ thread_jumps::profitable_jump_thread_path (basic_block bbi, 
tree name,
   gsi_next_nondebug ())
{
  gimple *stmt = gsi_stmt (gsi);
- if (gimple_call_internal_p (stmt, IFN_UNIQUE))
+ if (gimple_call_internal_p (stmt, IFN_UNIQUE)
+ || (!m_allow_bcp_p
+ && gimple_call_builtin_p (stmt, BUILT_IN_CONSTANT_P)))
{
  m_path.pop ();
  return NULL;
@@ -737,10 +743,13 @@ thread_jumps::fsm_find_control_statement_thread_paths 
(tree name)
It is assumed that BB ends with a control statement and that by
finding a path where NAME is a constant, we can thread the path.
SPEED_P indicates that we could increase code size to improve the
-   code path.  */
+   

Re: drop -aux{dir,base}, revamp -dump{dir,base}

2020-06-30 Thread Thomas Schwinge
Hi!

On 2020-06-18T14:06:10+0200, Tobias Burnus  wrote:
> On 6/18/20 12:39 PM, Alexandre Oliva wrote:
>>> and intelmic.
>> How does intelmic get into the picture?
>
> No idea – I just know that it counts as offloading platform,
> ENABLE_OFFLOAD is set for it (but not for has).
> I just wanted to mention it GCC supports it, in principle.
> (What I have heard is that it does not exist in silicon but
> only as simulator, which can be used for testing with GCC,
> but I did not follow this at all.)
>
>>I looked for a mkoffload
>> program for it in the GCC source tree and couldn't find one.

See 'gcc/config/i386/intelmic-mkoffload.c'.  ;-)

>> This
>> suggests that the addition of -dumpbase to the mkoffload interface might
>> require changes elsewhere.
>
> Maybe.

:-) Yes, I quickly tested, and found that similar changes are required
there, too.

> I think Thomas is the only one who runs GCC with
> emulated intelmic once in a while – thus, he surely knows a bit more about it.
> But also Jakub should have an idea.

Can you easily adjust that file as you did for the GCN and nvptx
'mkoffload's?  Due to other workload, resolving it myself would be very
low priority for me, I'm afraid.  (But I can easily test anything anybody
else comes up with.)


Grüße
 Thomas
-
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander 
Walter


Re: [PATCH] analyzer/pr94028.C and non-null warning

2020-06-30 Thread David Malcolm via Gcc-patches
On Tue, 2020-06-30 at 10:12 -0600, Martin Sebor wrote:
> On 6/30/20 8:47 AM, David Edelsohn via Gcc-patches wrote:
> > The unexpected warning is
> > 
> > gcc/testsuite/g++.dg/analyzer/pr94028.C:28:21: warning: use of
> > possibly-NULL '' where non-null expected [CWE-690]
> > [-Wanalyzer-possible-null-argument]
> > 
> > This is the same location as one of the existing "leak" warnings.
> 
> I'm surprised that the Wnonull change triggers
> a -Wanalyzer-possible-null-argument warning.  There is no
> -Wnonnull for the test case with or without -fanalyzer (and
> with or without optimization) so that suggests the two are
> independent of one another.
> 
> I'm also not sure I see a justification for a nonnull warning
> in the test.  The note printed after the warning says:
> 
>  |  | (5) possible return of NULL to ‘f’ 
> from ‘j::operator new’
>  |  | (6) argument 1 (‘’) from
> (4) 
> could be NULL where non-null expected
>  |
> /src/gcc/master/gcc/testsuite/g++.dg/analyzer/pr94028.C:19:3: note: 
> argument 1 of ‘j::j(B*, int)’ must be non-null
> 19 |   j (B *, int)
>|   ^
> 
> but the first argument in j::j(B*, int) is not marked nonnull,
> and it's not the this pointer (treating those as implicitly
> nonnull was one of the changes introduced by my patch).

Are you sure it's not the "this" pointer?  Bear in mind that the C++
"support" in the analyzer is practically non-existent; I haven't yet
implemented any special-casing about how arguments are numbered, so it
could well be the "this" pointer.

If it is a complaint about the "this" pointer, that could explain why
this started happening when your patch went in.

Dave

> 
> FWIW, I don't remember if I saw it when going through the test
> results for my patch but if I did, it's possible that I assumed
> it was unrelated because of the -Wanalyzer- option.  Sorry if
> this was the wrong call.
> 
> Martin

> > How would you like pr94028.C to be adjusted in the testsuite to
> > account for this?
> > 
> > Thanks, David
> > 
> > On Tue, Jun 30, 2020 at 10:18 AM David Malcolm  > > wrote:
> > > On Tue, 2020-06-30 at 09:51 -0400, David Edelsohn wrote:
> > > > The changes to the non-null warning now produce an additional
> > > > warning
> > > > for analyzer/pr94028.C on one of the "leak" lines.  This causes
> > > > new
> > > > failures on trunk.
> > > 
> > > Hi David
> > > 
> > > Do you have the output to hand?  What is the full text of the new
> > > diagnostic?
> > > 
> > > Some high level questions:
> > >* Ought GCC to warn for that code?  (or not)
> > >* Is it behavior we ought to have a test for?
> > > 
> > > Note that AFAIK there are *two* kinds of non-null warnings that
> > > GCC can
> > > emit: the kind emitted by Martin's code, versus those emitted by
> > > -fanalyzer (the latter imply the much more expensive analysis
> > > performed
> > > by -fanalyzer, and are controlled by the various -Wanalyzer-
> > > *null*
> > > options; see
> > > https://gcc.gnu.org/onlinedocs/gcc-10.1.0/gcc/Static-Analyzer-Options.html
> > > )
> > > 
> > > 
> > > > Because non-null is not the purpose of the analyzer test, I
> > > > propose
> > > > pruning the output to resolve the new failures.
> > > 
> > > Looking back through bugzilla, it seems that the main purpose of
> > > adding
> > > that test was to ensure that -fanalyzer doesn't ICE on that code.
> > > 
> > > At some point I hope to properly support C++ in -fanalyzer, at
> > > which
> > > point some kind of null warning may be warranted on that
> > > code.  Sadly
> > > I'm not yet at that point.  FWIW I'm currently working on a big
> > > rewrite
> > > of how state is tracked within the analyzer, as I've identified
> > > at
> > > least two major flaws in the current implementation, which my
> > > rewrite
> > > addresses.  I'm deferring on C++ support until that rewrite is
> > > done.
> > > 
> > > >Alternatively, I
> > > > could explicitly test for the additional non-null warning.
> > > > 
> > > > Do you have any preferences?
> > > 
> > > This test is controlled by analyzer.exp and thus, for example, is
> > > disabled if the analyzer was disabled at configure time.
> > > 
> > > If this is coming from Martin's non-analyzer code, a third
> > > possibility
> > > would be to use -Wno-something to disable that warning, so that
> > > the
> > > analyzer test can focus on the -fanalyzer test, as it were (and
> > > if this
> > > is a behavior that ought to be checked for in Martin's warning,
> > > then
> > > copy the pertinent fragment of the testcase to one of the g++.dg
> > > cases,
> > > I suppose).  I think I prefer that approach.
> > > 
> > > How does this sound?
> > > 
> > > > I propose appending
> > > > 
> > > > // { dg-prune-output "where non-null expected" }
> > > > 
> > > > to the file to prune the warning output.
> > > > 
> > > > Thanks, David
> > > 
> > > Thanks for looking into this; hope this is constructive.
> > > Dave
> > > 



Re: drop -aux{dir,base}, revamp -dump{dir,base}

2020-06-30 Thread Thomas Schwinge
Hi!

On 2020-06-22T11:32:46-0300, Alexandre Oliva  wrote:
> Here's a consolidated patch, [...]

Again, many thanks for working through this, with Tobias' help.

> --- /dev/null
> +++ b/gcc/testsuite/lib/scanoffload.exp

> +# Utility for scanning offloading dump output, used by libgomp.exp.

;-) Yeah, I was about to say that having this file in
'gcc/testsuite/lib/' seems to be a bit of an abstraction violation, given
that...

> +# Format an offload dump suffix given the offload target name in
> +# OFFTGT and any suffix, probably empty, in SUFFIX.
> +proc scoff-format { offtgt suffix } {
> +return ".x$offtgt.mkoffload$suffix"
> +}
> +
> +# Wrapper for scan procs.
> +# Argument 0 is the index of the argument to replace when calling
> +# argument 1 with the remaining arguments.  Use end-1 or end or so.
> +proc scoff { args } {
> +set idx [lindex $args 0]
> +set prc [lindex $args 1]
> +set args [lreplace $args 0 1]
> +
> +global offload_target
> +if [info exists offload_target] {
> + set target $offload_target
> + if { "$target" != "disable" } {
> + eval $prc [lreplace $args $idx $idx "[scoff-format $target [lindex 
> $args $idx]]"]
> + }
> +} else {
> + global offload_targets
> + foreach target [split $offload_targets ","] {
> + eval $prc [lreplace $args $idx $idx "[scoff-format $target [lindex 
> $args $idx]]"]
> + }
> +}
> +}

... only libgomp testing defines the 'offload_target', 'offload_targets'
variables.  But anyway, it works, and we've got worse things in GCC.  ;-)

Good idea to indirect this via 'scoff', 'scoff-format'!  This is very
explicit (good) instead of just using some wildcard file name matching --
which would especially be problematic if you've got offload code
generation configured for several offload targets.

> --- a/libgomp/testsuite/lib/libgomp-dg.exp
> +++ b/libgomp/testsuite/lib/libgomp-dg.exp
> @@ -1,12 +1,4 @@
>  proc libgomp-dg-test { prog do_what extra_tool_flags } {
> -# Force the dumpbase for test.c to test.o, such that scan-offload-*-dump
> -# will work.
> -foreach opt $extra_tool_flags {
> - if { [regexp ^-foffload=-fdump- $opt] } {
> - lappend extra_tool_flags "-save-temps"
> - }
> -}
> -
>  return [gcc-dg-test-1 libgomp_target_compile $prog $do_what 
> $extra_tool_flags]
>  }

Happy to see that hack go away.


I've pushed "[testsuite] Adjust 'scoff' for HSA offloading" to master
branch in commit 01dd58659faae3d6292c458cd9cf96dfd4cf7198, see attached.
"HSA offloading is doing things differently, doesn't use 'mkoffload'", so
we have to skip it in the loop over 'offload_targets'.


Grüße
 Thomas


-
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander 
Walter
>From 01dd58659faae3d6292c458cd9cf96dfd4cf7198 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Tue, 30 Jun 2020 07:23:03 +0200
Subject: [PATCH] [testsuite] Adjust 'scoff' for HSA offloading

Fix-up for commit efc16503ca10bc0e934e0bace5777500e4dc757a "handle dumpbase in
offloading, adjust testsuite".

	gcc/testsuite/
	* lib/scanoffload.exp (scoff) : Skip HSA.
---
 gcc/testsuite/lib/scanoffload.exp | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/gcc/testsuite/lib/scanoffload.exp b/gcc/testsuite/lib/scanoffload.exp
index ec0d7a605eb3..1a26e44113e2 100644
--- a/gcc/testsuite/lib/scanoffload.exp
+++ b/gcc/testsuite/lib/scanoffload.exp
@@ -39,6 +39,9 @@ proc scoff { args } {
 } else {
 	global offload_targets
 	foreach target [split $offload_targets ","] {
+	# HSA offloading is doing things differently, doesn't use 'mkoffload'.
+	if { "$target" == "hsa" } continue
+
 	eval $prc [lreplace $args $idx $idx "[scoff-format $target [lindex $args $idx]]"]
 	}
 }
-- 
2.27.0



Re: [PATCH] nvptx: Add support for subword compare-and-swap

2020-06-30 Thread Kwok Cheung Yeung

On 23/06/2020 5:51 pm, Jakub Jelinek wrote:

On Tue, Jun 23, 2020 at 06:44:26PM +0200, Thomas Schwinge wrote:

On 2020-06-15T21:28:12+0100, Kwok Cheung Yeung  wrote:

This patch adds support on nvptx for __sync_val_compare_and_swap operations on
1- and 2-byte values.


Is this a thorough review that these are the only functions missing, or
did you just implement what you found missing for some test case you've
been looking into?  Other architectures' similar libgcc files seem to be
defining more of such related functions.
It seems a bit unfortunate to have such a thing outlined in a separate
function, given we're talking about performance-critical code here?  Even
more so for GCN, where there's no JIT compiler that can inline it later,
as it's the case for nvptx?


I think this should really be handled by the backend inline, like many other
targets do it when they only support 32-bit+ and not 8/16-bit atomics.
See e.g. sparc backend.


I can see both approaches being used - e.g. Sparc, Alpha, and RS6000 expand the 
subword atomics into loops in the backend. However, there are various 
linux-atomic.c files in the subdirectories of libgcc/config/ (e.g. for Arm, 
PA-RISC, m68k etc.), which use a system-call to do the 32-bit compare-and-swap, 
then express smaller operations in terms of that. TilePro, AMDGCN and RISC-V 
have atomic.c files that do the subword compare-and-swap in terms of a larger 
native compare-and-swap.


I don't know - is it worth doing this in the backend on this architecture? One 
thing to keep in mind is that nvptx is a virtual instruction set, so the JIT 
compiler would likely inline the libgcc library calls anyway.


Thanks

Kwok


Re: [PATCH] analyzer/pr94028.C and non-null warning

2020-06-30 Thread Martin Sebor via Gcc-patches

On 6/30/20 10:22 AM, David Malcolm wrote:

On Tue, 2020-06-30 at 10:12 -0600, Martin Sebor wrote:

On 6/30/20 8:47 AM, David Edelsohn via Gcc-patches wrote:

The unexpected warning is

gcc/testsuite/g++.dg/analyzer/pr94028.C:28:21: warning: use of
possibly-NULL '' where non-null expected [CWE-690]
[-Wanalyzer-possible-null-argument]

This is the same location as one of the existing "leak" warnings.


I'm surprised that the Wnonull change triggers
a -Wanalyzer-possible-null-argument warning.  There is no
-Wnonnull for the test case with or without -fanalyzer (and
with or without optimization) so that suggests the two are
independent of one another.

I'm also not sure I see a justification for a nonnull warning
in the test.  The note printed after the warning says:

  |  | (5) possible return of NULL to ‘f’
from ‘j::operator new’
  |  | (6) argument 1 (‘’) from
(4)
could be NULL where non-null expected
  |
/src/gcc/master/gcc/testsuite/g++.dg/analyzer/pr94028.C:19:3: note:
argument 1 of ‘j::j(B*, int)’ must be non-null
 19 |   j (B *, int)
|   ^

but the first argument in j::j(B*, int) is not marked nonnull,
and it's not the this pointer (treating those as implicitly
nonnull was one of the changes introduced by my patch).


Are you sure it's not the "this" pointer?  Bear in mind that the C++
"support" in the analyzer is practically non-existent; I haven't yet
implemented any special-casing about how arguments are numbered, so it
could well be the "this" pointer.

If it is a complaint about the "this" pointer, that could explain why
this started happening when your patch went in.


Hmm, I guess it is the this pointer it's complaining about after
all.  The note says it's argument 1 but it's really talking about
the implicit this pointer to the j::j (B*, int) constructor.

The simpler test case below emits the corresponding -Wnonnull
warning:

struct S {
  S ();
  void* operator new (__SIZE_TYPE__) { return 0; }
};

void g (void*);

S* f ()
{
  return new S ();
}

$ gcc -O -S -Wall t.C
t.C: In static member function ‘static void* S::operator new(long 
unsigned int)’:
t.C:3:47: warning: ‘operator new’ must not return NULL unless it is 
declared ‘throw()’ (or ‘-fcheck-new’ is in effect)

3 |   void* operator new (__SIZE_TYPE__) { return 0; }
  |   ^
In function ‘S* f()’:
cc1plus: warning: ‘this’ pointer null [-Wnonnull]
t.C:2:3: note: in a call to non-static member function ‘S::S()’
2 |   S ();
  |   ^

My patch changed the warning from:

cc1plus: warning: argument 1 null where non-null expected [-Wnonnull]

because referring to this as argument 1 was confusing (it even
confused me in this instance!)

Martin



Dave



FWIW, I don't remember if I saw it when going through the test
results for my patch but if I did, it's possible that I assumed
it was unrelated because of the -Wanalyzer- option.  Sorry if
this was the wrong call.

Martin



How would you like pr94028.C to be adjusted in the testsuite to
account for this?

Thanks, David

On Tue, Jun 30, 2020 at 10:18 AM David Malcolm 
wrote:
On Tue, 2020-06-30 at 09:51 -0400, David Edelsohn wrote:

The changes to the non-null warning now produce an additional
warning
for analyzer/pr94028.C on one of the "leak" lines.  This causes
new
failures on trunk.


Hi David

Do you have the output to hand?  What is the full text of the new
diagnostic?

Some high level questions:
* Ought GCC to warn for that code?  (or not)
* Is it behavior we ought to have a test for?

Note that AFAIK there are *two* kinds of non-null warnings that
GCC can
emit: the kind emitted by Martin's code, versus those emitted by
-fanalyzer (the latter imply the much more expensive analysis
performed
by -fanalyzer, and are controlled by the various -Wanalyzer-
*null*
options; see
https://gcc.gnu.org/onlinedocs/gcc-10.1.0/gcc/Static-Analyzer-Options.html
)



Because non-null is not the purpose of the analyzer test, I
propose
pruning the output to resolve the new failures.


Looking back through bugzilla, it seems that the main purpose of
adding
that test was to ensure that -fanalyzer doesn't ICE on that code.

At some point I hope to properly support C++ in -fanalyzer, at
which
point some kind of null warning may be warranted on that
code.  Sadly
I'm not yet at that point.  FWIW I'm currently working on a big
rewrite
of how state is tracked within the analyzer, as I've identified
at
least two major flaws in the current implementation, which my
rewrite
addresses.  I'm deferring on C++ support until that rewrite is
done.


Alternatively, I
could explicitly test for the additional non-null warning.

Do you have any preferences?


This test is controlled by analyzer.exp and thus, for example, is
disabled if the analyzer was disabled at configure time.

If this is coming from Martin's non-analyzer code, a third
possibility
would be to use -Wno-something to disable 

Re: [PATCH] analyzer/pr94028.C and non-null warning

2020-06-30 Thread Martin Sebor via Gcc-patches

On 6/30/20 8:47 AM, David Edelsohn via Gcc-patches wrote:

The unexpected warning is

gcc/testsuite/g++.dg/analyzer/pr94028.C:28:21: warning: use of
possibly-NULL '' where non-null expected [CWE-690]
[-Wanalyzer-possible-null-argument]

This is the same location as one of the existing "leak" warnings.


I'm surprised that the Wnonull change triggers
a -Wanalyzer-possible-null-argument warning.  There is no
-Wnonnull for the test case with or without -fanalyzer (and
with or without optimization) so that suggests the two are
independent of one another.

I'm also not sure I see a justification for a nonnull warning
in the test.  The note printed after the warning says:

|  | (5) possible return of NULL to ‘f’ 
from ‘j::operator new’
|  | (6) argument 1 (‘’) from (4) 
could be NULL where non-null expected

|
/src/gcc/master/gcc/testsuite/g++.dg/analyzer/pr94028.C:19:3: note: 
argument 1 of ‘j::j(B*, int)’ must be non-null

   19 |   j (B *, int)
  |   ^

but the first argument in j::j(B*, int) is not marked nonnull,
and it's not the this pointer (treating those as implicitly
nonnull was one of the changes introduced by my patch).

FWIW, I don't remember if I saw it when going through the test
results for my patch but if I did, it's possible that I assumed
it was unrelated because of the -Wanalyzer- option.  Sorry if
this was the wrong call.

Martin



How would you like pr94028.C to be adjusted in the testsuite to
account for this?

Thanks, David

On Tue, Jun 30, 2020 at 10:18 AM David Malcolm  wrote:


On Tue, 2020-06-30 at 09:51 -0400, David Edelsohn wrote:

The changes to the non-null warning now produce an additional warning
for analyzer/pr94028.C on one of the "leak" lines.  This causes new
failures on trunk.


Hi David

Do you have the output to hand?  What is the full text of the new
diagnostic?

Some high level questions:
   * Ought GCC to warn for that code?  (or not)
   * Is it behavior we ought to have a test for?

Note that AFAIK there are *two* kinds of non-null warnings that GCC can
emit: the kind emitted by Martin's code, versus those emitted by
-fanalyzer (the latter imply the much more expensive analysis performed
by -fanalyzer, and are controlled by the various -Wanalyzer-*null*
options; see
https://gcc.gnu.org/onlinedocs/gcc-10.1.0/gcc/Static-Analyzer-Options.html
)



Because non-null is not the purpose of the analyzer test, I propose
pruning the output to resolve the new failures.


Looking back through bugzilla, it seems that the main purpose of adding
that test was to ensure that -fanalyzer doesn't ICE on that code.

At some point I hope to properly support C++ in -fanalyzer, at which
point some kind of null warning may be warranted on that code.  Sadly
I'm not yet at that point.  FWIW I'm currently working on a big rewrite
of how state is tracked within the analyzer, as I've identified at
least two major flaws in the current implementation, which my rewrite
addresses.  I'm deferring on C++ support until that rewrite is done.


   Alternatively, I
could explicitly test for the additional non-null warning.

Do you have any preferences?


This test is controlled by analyzer.exp and thus, for example, is
disabled if the analyzer was disabled at configure time.

If this is coming from Martin's non-analyzer code, a third possibility
would be to use -Wno-something to disable that warning, so that the
analyzer test can focus on the -fanalyzer test, as it were (and if this
is a behavior that ought to be checked for in Martin's warning, then
copy the pertinent fragment of the testcase to one of the g++.dg cases,
I suppose).  I think I prefer that approach.

How does this sound?


I propose appending

// { dg-prune-output "where non-null expected" }

to the file to prune the warning output.

Thanks, David


Thanks for looking into this; hope this is constructive.
Dave





Re: [PATCH] analyzer/pr94028.C and non-null warning

2020-06-30 Thread Martin Sebor via Gcc-patches

On 6/30/20 8:47 AM, David Edelsohn via Gcc-patches wrote:

The unexpected warning is

gcc/testsuite/g++.dg/analyzer/pr94028.C:28:21: warning: use of
possibly-NULL '' where non-null expected [CWE-690]
[-Wanalyzer-possible-null-argument]

This is the same location as one of the existing "leak" warnings.


I'm surprised that the Wnonull change triggers
a -Wanalyzer-possible-null-argument warning.  There is no
-Wnonnull for the test case with or without -fanalyzer (and
with or without optimization) so that suggests the two are
independent of one another.

I'm also not sure I see a justification for a nonnull warning
in the test.  The note printed after the warning says:

|  | (5) possible return of NULL to ‘f’ 
from ‘j::operator new’
|  | (6) argument 1 (‘’) from (4) 
could be NULL where non-null expected

|
/src/gcc/master/gcc/testsuite/g++.dg/analyzer/pr94028.C:19:3: note: 
argument 1 of ‘j::j(B*, int)’ must be non-null

   19 |   j (B *, int)
  |   ^

but the first argument in j::j(B*, int) is not marked nonnull,
and it's not the this pointer (treating those as implicitly
nonnull was one of the changes introduced by my patch).

FWIW, I don't remember if I saw it when going through the test
results for my patch but if I did, it's possible that I assumed
it was unrelated because of the -Wanalyzer- option.  Sorry if
this was the wrong call.

Martin



How would you like pr94028.C to be adjusted in the testsuite to
account for this?

Thanks, David

On Tue, Jun 30, 2020 at 10:18 AM David Malcolm  wrote:


On Tue, 2020-06-30 at 09:51 -0400, David Edelsohn wrote:

The changes to the non-null warning now produce an additional warning
for analyzer/pr94028.C on one of the "leak" lines.  This causes new
failures on trunk.


Hi David

Do you have the output to hand?  What is the full text of the new
diagnostic?

Some high level questions:
   * Ought GCC to warn for that code?  (or not)
   * Is it behavior we ought to have a test for?

Note that AFAIK there are *two* kinds of non-null warnings that GCC can
emit: the kind emitted by Martin's code, versus those emitted by
-fanalyzer (the latter imply the much more expensive analysis performed
by -fanalyzer, and are controlled by the various -Wanalyzer-*null*
options; see
https://gcc.gnu.org/onlinedocs/gcc-10.1.0/gcc/Static-Analyzer-Options.html
)



Because non-null is not the purpose of the analyzer test, I propose
pruning the output to resolve the new failures.


Looking back through bugzilla, it seems that the main purpose of adding
that test was to ensure that -fanalyzer doesn't ICE on that code.

At some point I hope to properly support C++ in -fanalyzer, at which
point some kind of null warning may be warranted on that code.  Sadly
I'm not yet at that point.  FWIW I'm currently working on a big rewrite
of how state is tracked within the analyzer, as I've identified at
least two major flaws in the current implementation, which my rewrite
addresses.  I'm deferring on C++ support until that rewrite is done.


   Alternatively, I
could explicitly test for the additional non-null warning.

Do you have any preferences?


This test is controlled by analyzer.exp and thus, for example, is
disabled if the analyzer was disabled at configure time.

If this is coming from Martin's non-analyzer code, a third possibility
would be to use -Wno-something to disable that warning, so that the
analyzer test can focus on the -fanalyzer test, as it were (and if this
is a behavior that ought to be checked for in Martin's warning, then
copy the pertinent fragment of the testcase to one of the g++.dg cases,
I suppose).  I think I prefer that approach.

How does this sound?


I propose appending

// { dg-prune-output "where non-null expected" }

to the file to prune the warning output.

Thanks, David


Thanks for looking into this; hope this is constructive.
Dave





Re: [PATCH] analyzer/pr94028.C and non-null warning

2020-06-30 Thread Martin Sebor via Gcc-patches

On 6/30/20 11:59 AM, David Edelsohn wrote:

Why did you mark PR96008 as a duplicate?  The ICE is a duplicate, but
the wrong IL is a C++ FE bug.


They're both caused by the same problem: the -Wnonnull warning
is triggered by the C++ FE bug (assuming it is one) and the ICE
by the C++ pretty printer calling itself recursively because of
it.  Somehow the variadic lambda is triggering this in both tests.

Martin



Thanks, David

On Tue, Jun 30, 2020 at 1:45 PM Martin Sebor  wrote:


On 6/30/20 10:47 AM, David Edelsohn wrote:

Also, cpp1y/lambda-generic-69078-1.C elicits a similar, new warning:

FAIL: g++.dg/cpp1y/lambda-generic-69078-1.C  -std=gnu++14 (test for
excess errors)
Excess errors:
g++.dg/cpp1y/lambda-generic-69078-1.C:23:13: warning: 'this' pointer
null [-Wnonnull]


I think this (mainly the ICE) is the subject of pr95984.  AFAICT,
the warning is working correctly but the IL the C++ front end
emits doesn't look right: AFAICS, it creates a function object
for the lambda and calls its operator() with a null this pointer:

; Function static decltype (((const main()::*)0)->operator()(static_cast(main::._anon_0::_FUN::) ...))
main()_FUN(auto:1 ...) [with auto:1 = {int};
decltype (((const main()::*)0)->operator()(static_cast(main::._anon_0::_FUN::) ...)) =
void] (null)
;; enabled by -tree-original


<::operator() (0B, D.2440) >;
return;

Martin



Thanks, David

On Tue, Jun 30, 2020 at 12:23 PM David Malcolm  wrote:


On Tue, 2020-06-30 at 10:12 -0600, Martin Sebor wrote:

On 6/30/20 8:47 AM, David Edelsohn via Gcc-patches wrote:

The unexpected warning is

gcc/testsuite/g++.dg/analyzer/pr94028.C:28:21: warning: use of
possibly-NULL '' where non-null expected [CWE-690]
[-Wanalyzer-possible-null-argument]

This is the same location as one of the existing "leak" warnings.


I'm surprised that the Wnonull change triggers
a -Wanalyzer-possible-null-argument warning.  There is no
-Wnonnull for the test case with or without -fanalyzer (and
with or without optimization) so that suggests the two are
independent of one another.

I'm also not sure I see a justification for a nonnull warning
in the test.  The note printed after the warning says:

   |  | (5) possible return of NULL to ‘f’
from ‘j::operator new’
   |  | (6) argument 1 (‘’) from
(4)
could be NULL where non-null expected
   |
/src/gcc/master/gcc/testsuite/g++.dg/analyzer/pr94028.C:19:3: note:
argument 1 of ‘j::j(B*, int)’ must be non-null
  19 |   j (B *, int)
 |   ^

but the first argument in j::j(B*, int) is not marked nonnull,
and it's not the this pointer (treating those as implicitly
nonnull was one of the changes introduced by my patch).


Are you sure it's not the "this" pointer?  Bear in mind that the C++
"support" in the analyzer is practically non-existent; I haven't yet
implemented any special-casing about how arguments are numbered, so it
could well be the "this" pointer.

If it is a complaint about the "this" pointer, that could explain why
this started happening when your patch went in.

Dave



FWIW, I don't remember if I saw it when going through the test
results for my patch but if I did, it's possible that I assumed
it was unrelated because of the -Wanalyzer- option.  Sorry if
this was the wrong call.

Martin



How would you like pr94028.C to be adjusted in the testsuite to
account for this?

Thanks, David

On Tue, Jun 30, 2020 at 10:18 AM David Malcolm 
wrote:
On Tue, 2020-06-30 at 09:51 -0400, David Edelsohn wrote:

The changes to the non-null warning now produce an additional
warning
for analyzer/pr94028.C on one of the "leak" lines.  This causes
new
failures on trunk.


Hi David

Do you have the output to hand?  What is the full text of the new
diagnostic?

Some high level questions:
 * Ought GCC to warn for that code?  (or not)
 * Is it behavior we ought to have a test for?

Note that AFAIK there are *two* kinds of non-null warnings that
GCC can
emit: the kind emitted by Martin's code, versus those emitted by
-fanalyzer (the latter imply the much more expensive analysis
performed
by -fanalyzer, and are controlled by the various -Wanalyzer-
*null*
options; see
https://gcc.gnu.org/onlinedocs/gcc-10.1.0/gcc/Static-Analyzer-Options.html
)



Because non-null is not the purpose of the analyzer test, I
propose
pruning the output to resolve the new failures.


Looking back through bugzilla, it seems that the main purpose of
adding
that test was to ensure that -fanalyzer doesn't ICE on that code.

At some point I hope to properly support C++ in -fanalyzer, at
which
point some kind of null warning may be warranted on that
code.  Sadly
I'm not yet at that point.  FWIW I'm currently working on a big
rewrite
of how state is tracked within the analyzer, as I've identified
at
least two major flaws in the current implementation, which my
rewrite
addresses.  I'm deferring on C++ support until that rewrite is
done.


 Alternatively, I
could explicitly 

Re: [PATCH] Fix unnecessary register spill that depends on function ordering

2020-06-30 Thread Richard Sandiford
Hi,

Thanks for the patch.

Omar Tahir  writes:
> Hi,
>
> The variables first_moveable_pseudo and last_moveable_pseudo aren't reset
> after compiling a function, which means they leak into the first scheduler
> pass of the following function. In some cases, this can cause an extra spill
> during register allocation of the second function.
>
> This patch fixes this by setting
>
> first_moveable_pseudo = last_moveable_pseudo
>
> at the beginning of the first scheduler pass.
>
> Because the spill occurs in the middle of the IRA pass it is highly
> target-sensitive, so I have provided a test case that works on aarch64. There
> doesn't seem to be a target-independent way to trigger this bug, since the
> RTL at the IRA stage is already quite target-specific.
>
> Interestingly, the aarch64 test case here doesn't actually perform a complete
> register spill - the value is stored on the stack but never retrieved.
> Perhaps this points to another, deeper bug?
>
> Bootstrapped and regression tested on aarch64-none-linux-gnu.
>
> I don't have write privileges, so if it's fine could someone push for me?
>
> Thanks,
> Omar
>
> 
>
> gcc/ChangeLog:
>
> 2020-06-30: Omar Tahir 
>
> * sched-rgn.c: Include ira-int.h, ira.h, regs.h.
> (rest_of_handle_sched): Clear moveable pseudo range.
>
> gcc/testsuite/ChangeLog:
>
> 2020-06-30: Omar Tahir 
>
> * gcc.target/aarch64/nospill.c: New test.
>
> 
>
> diff --git a/gcc/sched-rgn.c b/gcc/sched-rgn.c
> index 7f5dfdb..51329fc 100644
> --- a/gcc/sched-rgn.c
> +++ b/gcc/sched-rgn.c
> @@ -65,6 +65,9 @@ along with GCC; see the file COPYING3.  If not see
> #include "dbgcnt.h"
> #include "pretty-print.h"
> #include "print-rtl.h"
> +#include "ira.h"
> +#include "regs.h"
> +#include "ira-int.h"
>
> /* Disable warnings about quoting issues in the pp_xxx calls below
> that (intentionally) don't follow GCC diagnostic conventions.  */
> @@ -3719,6 +3722,7 @@ static unsigned int
> rest_of_handle_sched (void)
> {
> #ifdef INSN_SCHEDULING
> +  first_moveable_pseudo = last_moveable_pseudo;
>if (flag_selective_scheduling
>&& ! maybe_skip_selective_scheduling ())
>  run_selective_scheduling ();

I think instead we should zero out both variables at the end of IRA.
There are other places besides the scheduler that call into the IRA code,
so tackling the problem there seems more general.

> diff --git a/gcc/testsuite/gcc.target/aarch64/nospill.c 
> b/gcc/testsuite/gcc.target/aarch64/nospill.c
> new file mode 100644
> index 000..60399d8
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/nospill.c
> @@ -0,0 +1,35 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O3" } */
> +
> +/* The pseudo for P is marked as moveable in the IRA pass. */
> +float
> +func_0 (float a, float b, float c)
> +{
> +  float p = c / a;
> +
> +  if (b > 1)
> +{
> +  b /= p;
> +  if (c > 2)
> +a /= 3;
> +}
> +
> +  return b / c * a;
> +}
> +
> +/* If first_moveable_pseudo and last_moveable_pseudo are not reset correctly,
> +   they will carry over and spill the pseudo for Q. */
> +float
> +func_1 (float a, float b, float c)
> +{
> +  float q = a + b;
> +
> +  c *= a / (b + b);
> +  if (a > 0)
> +c *= q;
> +
> +  return a * b * c;
> +}
> +
> +/* We have plenty of spare registers, so check nothing has been spilled. */
> +/* { dg-final { scan-assembler-not "str" } } */

The testcase looks good, but it's probably better to make that “\tstr\t”.
The problem with plain “str” is that assembly output can often include
pathnames and version strings, and it's possible that one of those could
contain “str”.

Thanks,
Richard


[PATCH, committed] [8/9/10/11 Regression] ICE with allocatable coarray, class and associate in resolve_assoc_var, at fortran/resolve.c:8750

2020-06-30 Thread Harald Anlauf
Committed as obvious to master.

Will backport as seems fit.

Thanks,
Harald


PR fortran/88379 - ICE with allocatable coarray, class and associate

Catch NULL pointer dereference for ASSOCIATE on allocatable coarray variable.

gcc/fortran/
PR fortran/88379
* resolve.c (resolve_assoc_var): Avoid NULL pointer dereference.diff --git a/gcc/fortran/resolve.c b/gcc/fortran/resolve.c
index f3e8ffc204c..4a2abd00f4a 100644
--- a/gcc/fortran/resolve.c
+++ b/gcc/fortran/resolve.c
@@ -9045,7 +9045,7 @@ resolve_assoc_var (gfc_symbol* sym, bool resolve_target)
 	  as = NULL;
 	  sym->ts = *ts;
 	  sym->ts.type = BT_CLASS;
-	  attr = CLASS_DATA (sym)->attr;
+	  attr = CLASS_DATA (sym) ? CLASS_DATA (sym)->attr : sym->attr;
 	  attr.class_ok = 0;
 	  attr.associate_var = 1;
 	  attr.dimension = attr.codimension = 0;
diff --git a/gcc/testsuite/gfortran.dg/pr88379.f90 b/gcc/testsuite/gfortran.dg/pr88379.f90
new file mode 100644
index 000..48a23af50c5
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr88379.f90
@@ -0,0 +1,11 @@
+! { dg-do compile }
+! { dg-options "-fcoarray=single" }
+! PR fortran/88379 - ICE with allocatable coarray, class and associate
+
+program p
+  type t
+  end type t
+  class(t), allocatable :: x[:]
+  associate (y => x)
+  end associate
+end


Re: Deque rotate on current node

2020-06-30 Thread François Dumont via Gcc-patches

Hi

Any feedback regarding this patch ?

François


On 17/01/20 6:27 pm, François Dumont wrote:

On 1/17/20 11:01 AM, Jonathan Wakely wrote:

On 20/05/19 07:39 +0200, François Dumont wrote:

Hi

  std::deque is supposed to be like a double-queue that you can 
attack from front and back. But currrently its implementation makes 
it behave differently when starting from a fresh deque. If push_back 
then all goes well, it is copy/move to the current internal 'node'. 
If push_front then a new 'node' is created and on next pop_back the 
initial node will be deallocated without having ever be used.


  This patch changes this behavior. As long as deque is empty we can 
play with its state to make it push_back- or push_front-ready. It 
will also benefit to the usage of deque in the std::stack.


  More generally it could really improve scenario in which deque is 
used as queue of elements exchanged between different threads. As 
you need to make sure that consumers are faster than producers then 
your deque should remain empty most of the time and so this proposed 
patch should avoid nodes to be allocated/deallocated all the time.


This benefits the case where you always push at the same end. But with
your patch if I do push_front then push_back,
we still allocate a second node even though the first one is nearly
empty.

Would it be better to point into the middle of the node, not right at
the end or right at the beginning?


That's purely empirical but I tend to think that when you need 
push_back only you go with std::vector or std::deque, when you need 
push_front only you go with std::deque and when you need both you go 
with std::list. At least std::stack and std::queue are following this 
axiom per default.


The tradeoff you are describing is not worst than current situation 
where you will eventually deallocate a node that you haven't used at all.


Are you proposing to just init in the middle or also to always reset 
when empty in the middle ? If we also reset in the middle those doing 
always push_back or push_front will live with half a node unreachable.


We could just init in the middle however. It would be a different 
patch as this one is focus on just recycling the last empty node.


I consider adding a flag keeping the info that the deque is 
push_back|push_front|both-oriented updated based on its usage but I 
doubt it would worth it. Moreover it would introduce an abi breakage.





    * include/bits/deque.tcc (deque<>::_M_push_back_aux):
    Rotate on current node if deque is empty.
    (deue<>::_M_push_front_aux): Likewise.

Tested under Linux x86_64, ok to commit ?

François



diff --git a/libstdc++-v3/include/bits/deque.tcc 
b/libstdc++-v3/include/bits/deque.tcc

index 2053afe1d69..245e3e712d8 100644
--- a/libstdc++-v3/include/bits/deque.tcc
+++ b/libstdc++-v3/include/bits/deque.tcc
@@ -507,6 +507,19 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
  _M_push_back_aux(const value_type& __t)
#endif
  {
+    if (empty())
+  {
+    // Move iterators to point to the current node begin.
+    this->_M_impl._M_start._M_cur = 
this->_M_impl._M_start._M_first;
+    this->_M_impl._M_finish._M_cur = 
this->_M_impl._M_finish._M_first;

+#if __cplusplus >= 201103L
+    emplace_back(std::forward<_Args>(__args)...);
+#else
+    push_back(__t);
+#endif
+    return;
+  }
+
if (size() == max_size())
  __throw_length_error(
  __N("cannot create std::deque larger than max_size()"));
@@ -546,6 +559,19 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
  _M_push_front_aux(const value_type& __t)
#endif
  {
+    if (empty())
+  {
+    // Move iterators to point to the current node end.
+    this->_M_impl._M_finish._M_cur = 
this->_M_impl._M_finish._M_last - 1;
+    this->_M_impl._M_start._M_cur = 
this->_M_impl._M_start._M_last - 1;

+#if __cplusplus >= 201103L
+    emplace_front(std::forward<_Args>(__args)...);
+#else
+    push_front(__t);
+#endif
+    return;
+  }
+
if (size() == max_size())
  __throw_length_error(
  __N("cannot create std::deque larger than max_size()"));






diff --git a/libstdc++-v3/include/bits/deque.tcc b/libstdc++-v3/include/bits/deque.tcc
index 7d1ec86456a..e0fb7f07bc4 100644
--- a/libstdc++-v3/include/bits/deque.tcc
+++ b/libstdc++-v3/include/bits/deque.tcc
@@ -486,6 +486,19 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   _M_push_back_aux(const value_type& __t)
 #endif
   {
+	if (empty())
+	  {
+	// Move iterators to point to the current node begin.
+	this->_M_impl._M_start._M_cur = this->_M_impl._M_start._M_first;
+	this->_M_impl._M_finish._M_cur = this->_M_impl._M_finish._M_first;
+#if __cplusplus >= 201103L
+	emplace_back(std::forward<_Args>(__args)...);
+#else
+	push_back(__t);
+#endif
+	return;
+	  }
+
 	if (size() == max_size())
 	  __throw_length_error(
 	  __N("cannot create std::deque larger than max_size()"));
@@ -525,6 +538,19 @@ 

Re: [PATCH v2, rs6000] Add support to enable vmsumudm behind vec_msum builtin.

2020-06-30 Thread will schmidt via Gcc-patches
On Tue, 2020-06-30 at 12:57 -0500, will schmidt via Gcc-patches wrote:
> Hi,
> 
>   Add support for the vmsumudm instruction and tie it into the
> vec_msum
>   built-in to support the variants of that built-in using vector
>  _int128 parameters.
> 
>   vector _uint128_t vec_msum (vector unsigned long long,
>   vector unsigned long long,
>   vector _uint128_t);
>   vector _int128_t vec_msum (vector signed long long,
>  vector signed long long,
>  vector _int128_t);
> 
> [v2]
>   Corrected the define_insn and test requirements to be limited to
> P9.
>   Improve description to clarify the vmsum does
>   a widening multiply and horizontal addition.
> 
> Fresh regtests currently running on assorted powerpc targets.
> 
> OK for trunk?
> 
> Thanks,
> -Will
> 
> 
> [gcc]
> 
> 2020-06-18  Will Schmidt  
> 
> * config/rs6000/altivec.h (vec_vmsumudm): New define.
> * config/rs6000/altivec.md (UNSPEC_VMSUMUDM): New unspec.
> (altivec_vmsumudm): New define_insn.
> * config/rs6000/rs6000-builtin.def (altivec_vmsumudm): New
> BU_ALTIVEC_3 entry. (vmsumudm): New BU_ALTIVEC_OVERLOAD_3
> entry.
> * config/rs6000/rs6000-call.c (altivec_overloaded_builtins):
> Add entries for ALTIVEC_BUILTIN_VMSUMUDM variants of
> vec_msum.
> 
> [testsuite]
> 
> 2020-06-18  Will Schmidt  
> 
> * gcc.target/powerpc/builtins-msum-runnable.c: New test.
> * gcc.target/powerpc/vsx-builtin-msum.c: New test.
> 
> 


> 
> diff --git a/gcc/testsuite/gcc.target/powerpc/vsx-builtin-msum.c
> b/gcc/testsuite/gcc.target/powerpc/vsx-builtin-msum.c
> new file mode 100644
> index ..1974864de00c
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/vsx-builtin-msum.c
> @@ -0,0 +1,25 @@
> +/* Verify that overloaded built-ins for vec_msum with __int128
> +   inputs generate the proper code.  */
> +
> +/* { dg-do compile } */
> +/* { dg-require-effective-target powerpc_p9vector_ok } */
> +/* { dg-options "-mdejagnu-cpu=power9  -O3" } */

Per my latest regtest, this fails on a BE -m32 due to "__int128" is not
suported.  I'll be adding some form of require-target int128 to clear
that up.

thanks
-Will



> +
> +#include 
> +
> +vector signed __int128
> +test_msum_si (vector signed long long vsll_1, vector signed long
> long vsll_2,
> +vector signed __int128 vsi128)
> +{
> +  return vec_msum (vsll_1, vsll_2, vsi128);
> +}
> +
> +vector unsigned __int128
> +test_msum_ui (vector unsigned long long vull_1, vector unsigned long
> long vull_2,
> +vector unsigned __int128 vui128)
> +{
> +  return vec_msum (vull_1, vull_2, vui128);
> +}
> +
> +/* { dg_final { scan_assembler_times "vmsumudm" 2 } } */
> +
> 



Re: *ping* [patch, fortran] PR 27318, warn if interfaces do not match

2020-06-30 Thread Thomas Koenig

OK,

so here is an updated version, which includes the updated test cases.

Anything else?  OK for trunk?

(And will I "see" anybody at the Fortran Conference ? )

Best regards

Thomas

Test global identifiers against what is specified interfaces.

Apart from calling gfc_compare_interfaces to check interfaces against
global identifiers, this also sets and check a few sym->error flags
to avoid duplicate error messages.  I thought about issuing errors
on mismatched interfaces, but when the procedure is not invoked,
a warning should be enough to alert the user.

gcc/fortran/ChangeLog:

PR fortran/27318
* frontend-passes.c (check_against_globals): New function.
(gfc_check_externals): Split; also invoke check_against_globals
via gfc_traverse_ns.
(gfc_check_externals0): Recursive part formerly in
gfc_check_externals.
* resolve.c (resolve_global_procedure): Set sym->error on
interface mismatch.
* symbol.c (ambiguous_symbol): Check for, and set sym->error.
diff --git a/gcc/fortran/frontend-passes.c b/gcc/fortran/frontend-passes.c
index d5d71b5fda4..69f9ca64c97 100644
--- a/gcc/fortran/frontend-passes.c
+++ b/gcc/fortran/frontend-passes.c
@@ -5493,26 +5493,75 @@ check_externals_expr (gfc_expr **ep, int *walk_subtrees ATTRIBUTE_UNUSED,
   return check_externals_procedure (sym, loc, actual);
 }
 
-/* Called routine.  */
+/* Function to check if any interface clashes with a global
+   identifier, to be invoked via gfc_traverse_ns.  */
 
-void
-gfc_check_externals (gfc_namespace *ns)
+static void
+check_against_globals (gfc_symbol *sym)
 {
+  gfc_gsymbol *gsym;
+  gfc_symbol *def_sym = NULL;
+  const char *sym_name;
+  char buf  [200];
 
-  gfc_clear_error ();
+  if (sym->attr.if_source != IFSRC_IFBODY || sym->attr.flavor != FL_PROCEDURE
+  || sym->attr.generic || sym->error)
+return;
 
-  /* Turn errors into warnings if the user indicated this.  */
+  if (sym->binding_label)
+sym_name = sym->binding_label;
+  else
+sym_name = sym->name;
 
-  if (!pedantic && flag_allow_argument_mismatch)
-gfc_errors_to_warnings (true);
+  gsym = gfc_find_gsymbol (gfc_gsym_root, sym_name);
+  if (gsym && gsym->ns)
+gfc_find_symbol (sym->name, gsym->ns, 0, _sym);
+
+  if (!def_sym || def_sym->error || def_sym->attr.generic)
+return;
+
+  buf[0] = 0;
+  gfc_compare_interfaces (sym, def_sym, sym->name, 0, 1, buf, sizeof(buf),
+			  NULL, NULL, NULL);
+  if (buf[0] != 0)
+{
+  gfc_warning (0, "%s between %L and %L", buf, _sym->declared_at,
+		   >declared_at);
+  sym->error = 1;
+  def_sym->error = 1;
+}
+
+}
+
+/* Do the code-walkling part for gfc_check_externals.  */
 
+static void
+gfc_check_externals0 (gfc_namespace *ns)
+{
   gfc_code_walker (>code, check_externals_code, check_externals_expr, NULL);
 
   for (ns = ns->contained; ns; ns = ns->sibling)
 {
   if (ns->code == NULL || ns->code->op != EXEC_BLOCK)
-	gfc_check_externals (ns);
+	gfc_check_externals0 (ns);
 }
 
+}
+
+/* Called routine.  */
+
+void gfc_check_externals (gfc_namespace *ns)
+{
+  gfc_clear_error ();
+
+  /* Turn errors into warnings if the user indicated this.  */
+
+  if (!pedantic && flag_allow_argument_mismatch)
+gfc_errors_to_warnings (true);
+
+  gfc_check_externals0 (ns);
+  gfc_traverse_ns (ns, check_against_globals);
+
   gfc_errors_to_warnings (false);
 }
+
diff --git a/gcc/fortran/resolve.c b/gcc/fortran/resolve.c
index f3e8ffc204c..570a41b3ac8 100644
--- a/gcc/fortran/resolve.c
+++ b/gcc/fortran/resolve.c
@@ -2618,6 +2618,7 @@ resolve_global_procedure (gfc_symbol *sym, locus *where, int sub)
 
 	  gfc_error ("Interface mismatch in global procedure %qs at %L: %s",
 		 sym->name, >declared_at, reason);
+	  sym->error = 1;
 	  gfc_errors_to_warnings (false);
 	  goto done;
 	}
diff --git a/gcc/fortran/symbol.c b/gcc/fortran/symbol.c
index 96e4cee3040..abd3b5ccfd0 100644
--- a/gcc/fortran/symbol.c
+++ b/gcc/fortran/symbol.c
@@ -3145,18 +3145,24 @@ gfc_new_symbol (const char *name, gfc_namespace *ns)
 }
 
 
-/* Generate an error if a symbol is ambiguous.  */
+/* Generate an error if a symbol is ambiguous, and set the error flag
+   on it.  */
 
 static void
 ambiguous_symbol (const char *name, gfc_symtree *st)
 {
 
+  if (st->n.sym->error)
+return;
+
   if (st->n.sym->module)
 gfc_error ("Name %qs at %C is an ambiguous reference to %qs "
 	   "from module %qs", name, st->n.sym->name, st->n.sym->module);
   else
 gfc_error ("Name %qs at %C is an ambiguous reference to %qs "
 	   "from current program unit", name, st->n.sym->name);
+
+  st->n.sym->error = 1;
 }
 
 
diff --git a/gcc/testsuite/gfortran.dg/error_recovery_1.f90 b/gcc/testsuite/gfortran.dg/error_recovery_1.f90
index 7f19ab51e50..9e2540c0787 100644
--- a/gcc/testsuite/gfortran.dg/error_recovery_1.f90
+++ b/gcc/testsuite/gfortran.dg/error_recovery_1.f90
@@ -2,14 +2,14 @@
 ! PR fortran/24549 (and duplicate PR fortran/27487)
 module 

Re: [PATCH 5/7 v6] vect: Support vector load/store with length in vectorizer

2020-06-30 Thread Richard Sandiford
"Kewen.Lin"  writes:
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 06a04e3d7dd..284c15705ea 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -13389,6 +13389,13 @@ by the copy loop headers pass.
>  @item vect-epilogues-nomask
>  Enable loop epilogue vectorization using smaller vector size.
>  
> +@item vect-with-length-scope

In principle there's nothing length-specific about this option.
We could do the same for masks or for any future loop control
mechanism.  So how about vect-partial-vector-usage instead?

> +Control the scope of vector memory access with length exploitation.  0 means 
> we
> +don't expliot any vector memory access with length, 1 means we only exploit
> +vector memory access with length for those loops whose iteration number are
> +less than VF, such as very small loop or epilogue, 2 means we want to exploit
> +vector memory access with length for any loops if possible.

Maybe:

  Controls when the loop vectorizer considers using partial vector loads
  and stores as an alternative to falling back to scalar code.  0 stops
  the vectorizer from ever using partial vector loads and stores.  1 allows
  partial vector loads and stores if vectorization removes the need for the
  code to iterate.  2 allows partial vector loads and stores in all loops.
  The parameter only has an effect on targets that support partial
  vector loads and stores.
  
> diff --git a/gcc/optabs-query.c b/gcc/optabs-query.c
> index 215d68e4225..9c351759204 100644
> --- a/gcc/optabs-query.c
> +++ b/gcc/optabs-query.c
> @@ -606,6 +606,60 @@ can_vec_mask_load_store_p (machine_mode mode,
>return false;
>  }
>  
> +/* Return true if target supports vector load/store with length for vector
> +   mode MODE.  There are two flavors for vector load/store with length, one
> +   is to measure length with bytes, the other is to measure length with 
> lanes.
> +   As len_{load,store} optabs point out, for the flavor with bytes, we use
> +   VnQI to wrap the other supportable same size vector modes.  Here the
> +   pointer FACTOR is to indicate that it is using VnQI to wrap if its value
> +   more than 1 and how many bytes for one element of wrapped vector mode.  */
> +
> +bool
> +can_vec_len_load_store_p (machine_mode mode, bool is_load, unsigned int 
> *factor)
> +{
> +  optab op = is_load ? len_load_optab : len_store_optab;
> +  gcc_assert (VECTOR_MODE_P (mode));
> +
> +  /* Check if length in lanes supported for this mode directly.  */
> +  if (direct_optab_handler (op, mode))
> +{
> +  *factor = 1;
> +  return true;
> +}
> +
> +  /* Check if length in bytes supported for VnQI with the same vector size.  
> */
> +  scalar_mode emode = QImode;
> +  poly_uint64 esize = GET_MODE_SIZE (emode);

This is always equal to 1, so…

> +  poly_uint64 vsize = GET_MODE_SIZE (mode);
> +  poly_uint64 nunits;
> +
> +  /* To get how many nunits it would have if the element is QImode.  */
> +  if (multiple_p (vsize, esize, ))
> +{

…we can just set nunits to GET_MODE_SIZE (mode).

> +  machine_mode vmode;
> +  /* Check whether the related VnQI vector mode exists, as well as
> +  optab supported.  */
> +  if (related_vector_mode (mode, emode, nunits).exists ()
> +   && direct_optab_handler (op, vmode))
> + {
> +   unsigned int mul;
> +   scalar_mode orig_emode = GET_MODE_INNER (mode);
> +   poly_uint64 orig_esize = GET_MODE_SIZE (orig_emode);
> +
> +   if (constant_multiple_p (orig_esize, esize, ))
> + *factor = mul;
> +   else
> + gcc_unreachable ();

This is just:

  *factor = GET_MODE_UNIT_SIZE (mode);

However, I think it would be better to return the vector mode that the
load or store should use, instead of this factor.  That way we can reuse
it when generating the load and store statements.

So maybe call the function get_len_load_store_mode and return an
opt_machine_mode.

> +
> +   return true;
> + }
> +}
> +  else
> +gcc_unreachable ();
> +
> +  return false;
> +}
> +
>  /* Return true if there is a compare_and_swap pattern.  */
>  
>  bool
> […]
> diff --git a/gcc/params.opt b/gcc/params.opt
> index 9b564bb046c..daa6e8a2beb 100644
> --- a/gcc/params.opt
> +++ b/gcc/params.opt
> @@ -968,4 +968,8 @@ Bound on number of runtime checks inserted by the 
> vectorizer's loop versioning f
>  Common Joined UInteger Var(param_vect_max_version_for_alignment_checks) 
> Init(6) Param Optimization
>  Bound on number of runtime checks inserted by the vectorizer's loop 
> versioning for alignment check.
>  
> +-param=vect-with-length-scope=
> +Common Joined UInteger Var(param_vect_with_length_scope) Init(0) 
> IntegerRange(0, 2) Param Optimization
> +Control the vector with length exploitation scope.

Think this should be a bit more descriptive, at least saying what the
three values are (but in a more abbreviated form than the .texi above).

I think the default should be 2, with targets actively turning it down

Re: [PATCH] RISC-V: Preserve arch version info during normalizing arch string

2020-06-30 Thread Jim Wilson
On Mon, Jun 29, 2020 at 7:00 PM Kito Cheng  wrote:
> - Arch version should preserved if user explicitly specified the version.
>   e.g.
> After normalize, -march=rv32if3d should be -march=rv32i_f3p0d
> instead of-march=rv32ifd.

This looks good to me.

> +explicit_version_p(false)

I'd suggest a space before the open paren.

This isn't a criticism of your patch, but I noticed while looking at
this that we accept gXpY and expand to iXpY_mXpY_ etc.  However, imafd
are all numbered independently.  It doesn't make much sense to specify
a version with g and assume it is correct for all of imafd.  Similarly
with the implied extensions, e.g. dXpY is expanded to fXpY_dXpY,
though in this case it is likely that f and d numbers will remain
compatible.  And q too.  Still it looks a bit funny to be making that
assumption and there probably will be other examples where the
versions of the implied extensions won't match the specified
extension.  Actually I see that f implies Zicsr, and those two have
different version numbers, we just haven't implemented Zicsr yet.  We
are correctly implementing the rules as specified, the rules just
don't make sense here.  We probably need to file an issue against the
riscv-isa-manual project for a clarification of how to handle this
stuff.

Jim


Re: [PATCH 2/7] PowerPC tests: Add PLI/PADDI tests.

2020-06-30 Thread Segher Boessenkool
Hi!

On Tue, Jun 30, 2020 at 01:58:50AM -0400, Michael Meissner wrote:
> On Mon, Jun 29, 2020 at 01:42:56PM -0500, Segher Boessenkool wrote:
> > On Mon, Jun 29, 2020 at 02:23:22PM -0400, Michael Meissner wrote:
> > > +/* { dg-options "-O2 -mdejagnu-cpu=power10" } */
> > > +
> > > +/* Test that PLI (PADDI) is generated to load a large constant.  */
> > > +unsigned long long
> > > +large (void)
> > > +{
> > > +  return 0x12345678ULL;
> > > +}
> > > +
> > > +/* { dg-final { scan-assembler {\mpli\M} } } */
> > 
> > I have no idea why 64-bit mode (or 64-bit addressing) is needed here.
> > *Is* it needed?
> 
> Yes it is needed.  Otherwise two separate load immediates would be needed to
> load each part of the DI constant that is held in 2 registers.

But that will work just fine, because one of those insns will be a pli,
exactly as tested for!

> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.target/powerpc/prefix-no-update.c
> > > @@ -0,0 +1,51 @@
> > > +/* { dg-do compile } */
> > > +/* { dg-require-effective-target powerpc_prefixed_addr } */
> > > +/* { dg-require-effective-target lp64 } */
> > > +/* { dg-options "-O2 -mdejagnu-cpu=power10" } */
> > 
> > For this testcase, I have no idea at all why you want lp64?
> 
> Becuase to show the bug you need a stack frame larger than 64K.

I don't see it?  (Also, why would that require lp64?)


Segher


Re: [PATCH v2, rs6000] Add support to enable vmsumudm behind vec_msum builtin.

2020-06-30 Thread Segher Boessenkool
Hi!

On Tue, Jun 30, 2020 at 04:39:03PM -0500, will schmidt wrote:
> On Tue, 2020-06-30 at 12:57 -0500, will schmidt via Gcc-patches wrote:
> >   vector _uint128_t vec_msum (vector unsigned long long,
> >   vector unsigned long long,
> >   vector _uint128_t);
> >   vector _int128_t vec_msum (vector signed long long,
> >  vector signed long long,
> >  vector _int128_t);

> > +/* Verify that overloaded built-ins for vec_msum with __int128
> > +   inputs generate the proper code.  */
> > +
> > +/* { dg-do compile } */
> > +/* { dg-require-effective-target powerpc_p9vector_ok } */
> > +/* { dg-options "-mdejagnu-cpu=power9  -O3" } */
> 
> Per my latest regtest, this fails on a BE -m32 due to "__int128" is not
> suported.  I'll be adding some form of require-target int128 to clear
> that up.

Yes, that is a generic problem, not specific to rs6000.  It would be
very nice if that was fixed some day!


Segher


[PATCH] rs6000: Add execution tests for mma builtins.

2020-06-30 Thread Aaron Sawdey via Gcc-patches
This patch adds execution tests that use the MMA builtins,
checks for the right answer, and checks that __builtin_cpu_supports
and __builtin_cpu_is return sane answers given that the code
executed correctly.

Tested against P10 sim, should not execute anywhere else due to 
requiring power10_hw. Actually the power10_hw test I think requires
current glibc to pick up the change that lets
__builtin_cpu_is("power10") work. OK for trunk?

Thanks,
   Aaron

2020-06-30  Rajalakshmi Srinivasaraghavan  
Aaron Sawdey  

gcc/testsuite/
* gcc.target/powerpc/mma-single-test.c: New file.
* gcc.target/powerpc/mma-double-test.c: New file.
---
 .../gcc.target/powerpc/mma-double-test.c  | 211 +
 .../gcc.target/powerpc/mma-single-test.c  | 220 ++
 2 files changed, 431 insertions(+)
 create mode 100755 gcc/testsuite/gcc.target/powerpc/mma-double-test.c
 create mode 100755 gcc/testsuite/gcc.target/powerpc/mma-single-test.c

diff --git a/gcc/testsuite/gcc.target/powerpc/mma-double-test.c 
b/gcc/testsuite/gcc.target/powerpc/mma-double-test.c
new file mode 100755
index 000..e3807fa2eab
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/mma-double-test.c
@@ -0,0 +1,211 @@
+/* { dg-do run } */
+/* { dg-require-effective-target power10_hw } */
+/* { dg-options "-Wno-psabi -mdejagnu-cpu=power10 -O2" } */
+
+#include 
+#include 
+#include 
+
+typedef unsigned char vec_t __attribute__ ((vector_size (16)));
+typedef double v4sf_t __attribute__ ((vector_size (16)));
+#define SAVE_ACC(ACC, ldc, J)  \
+ __builtin_mma_disassemble_acc (result, ACC); \
+ rowC = (v4sf_t *) [0*ldc+J]; \
+  rowC[0] += result[3] ; \
+  rowC = (v4sf_t *) [1*ldc+J]; \
+  rowC[0] += result[2] ; \
+  rowC = (v4sf_t *) [2*ldc+J]; \
+  rowC[0] += result[1] ; \
+  rowC = (v4sf_t *) [3*ldc+J]; \
+ rowC[0] += result[0] ;
+
+void
+MMA (int m, int n, int k, double *A, double *B, double *C)
+{
+  __vector_quad acc0, acc1, acc2, acc3, acc4, acc5, acc6, acc7;
+  v4sf_t result[4];
+  v4sf_t *rowC;
+  for (int l = 0; l < n; l += 4)
+{
+  double *CO;
+  double *AO;
+  AO = A;
+  CO = C;
+  C += m * 4;
+  for (int j = 0; j < m; j += 16)
+   {
+ double *BO = B;
+ __builtin_mma_xxsetaccz ();
+ __builtin_mma_xxsetaccz ();
+ __builtin_mma_xxsetaccz ();
+ __builtin_mma_xxsetaccz ();
+ __builtin_mma_xxsetaccz ();
+ __builtin_mma_xxsetaccz ();
+ __builtin_mma_xxsetaccz ();
+ __builtin_mma_xxsetaccz ();
+ unsigned long i;
+
+ for (i = 0; i < k; i++)
+   {
+ vec_t *rowA = (vec_t *) & AO[i * 16];
+ __vector_pair rowB;
+ vec_t *rb = (vec_t *) & BO[i * 4];
+ __builtin_mma_assemble_pair (, rb[1], rb[0]);
+ __builtin_mma_xvf64gerpp (, rowB, rowA[0]);
+ __builtin_mma_xvf64gerpp (, rowB, rowA[1]);
+ __builtin_mma_xvf64gerpp (, rowB, rowA[2]);
+ __builtin_mma_xvf64gerpp (, rowB, rowA[3]);
+ __builtin_mma_xvf64gerpp (, rowB, rowA[4]);
+ __builtin_mma_xvf64gerpp (, rowB, rowA[5]);
+ __builtin_mma_xvf64gerpp (, rowB, rowA[6]);
+ __builtin_mma_xvf64gerpp (, rowB, rowA[7]);
+   }
+ SAVE_ACC (, m, 0);
+ SAVE_ACC (, m, 4);
+ SAVE_ACC (, m, 2);
+ SAVE_ACC (, m, 6);
+ SAVE_ACC (, m, 8);
+ SAVE_ACC (, m, 12);
+ SAVE_ACC (, m, 10);
+ SAVE_ACC (, m, 14);
+ AO += k * 16;
+ BO += k * 4;
+ CO += 16;
+   }
+  B += k * 4;
+}
+}
+
+void
+init (double *matrix, int row, int column)
+{
+  for (int j = 0; j < column; j++)
+{
+  for (int i = 0; i < row; i++)
+   {
+ matrix[j * row + i] = (i * 16 + 2 + j) / 0.123;
+   }
+}
+}
+
+void
+init0 (double *matrix, double *matrix1, int row, int column)
+{
+  for (int j = 0; j < column; j++)
+for (int i = 0; i < row; i++)
+  matrix[j * row + i] = matrix1[j * row + i] = 0;
+}
+
+
+void
+print (const char *name, const double *matrix, int row, int column)
+{
+  printf ("Matrix %s has %d rows and %d columns:\n", name, row, column);
+  for (int i = 0; i < row; i++)
+{
+  for (int j = 0; j < column; j++)
+   {
+ printf ("%f ", matrix[j * row + i]);
+   }
+  printf ("\n");
+}
+  printf ("\n");
+}
+
+int
+main (int argc, char *argv[])
+{
+  int rowsA, colsB, common;
+  int i, j, k;
+  int ret = 0;
+
+  for (int t = 16; t <= 128; t += 16)
+{
+  for (int t1 = 4; t1 <= 16; t1 += 4)
+   {
+ rowsA = t;
+ colsB = t1;
+ common = 1;
+ /* printf ("Running test for rows = %d,cols = %d\n", t, t1); */
+ double A[rowsA * common];
+ double B[common * colsB];
+ double C[rowsA * colsB];
+ double D[rowsA * colsB];
+
+
+ init (A, rowsA, 

Re: [PATCH] RISC-V: Handle multi-letter extension for multilib-generator

2020-06-30 Thread Jim Wilson
On Mon, Jun 29, 2020 at 7:35 PM Kito Cheng  wrote:
> * config/riscv/multilib-generator (arch_canonicalize): Handle
> multi-letter extension.
> Using underline as separator between different extensions.

This looks good to me.

> +  # Multi-letter extension must in lexicographic order.

Suggest "must in" -> "must be in"

> @@ -77,7 +91,7 @@ for cfg in sys.argv[1:]:
>abis[abi] = 1
>extra = list(filter(None, extra.split(',')))
>ext = list(filter(None, ext.split(',')))
> -  alts = sum([[x] + [x + y for y in ext] for x in [arch] + extra], [])
> +  alts = sum([[x] + [x + "_" + y for y in ext] for x in [arch] + extra], [])
># TODO: We should expand g to imadzifencei once we support newer spec.
>alts = alts + [x.replace('imafd', 'g') for x in alts if 'imafd' in x]
>alts = list(map(arch_canonicalize, alts))

Maybe add a line
  arch = arch_canonicalize (arch)
after parsing arch?

I notice with your example I get
MULTILIB_OPTIONS = march=rv32izfh/march=rv32ic_zfh mabi=ilp32
which has inconsistent handling of _ before long extensions.  This is
harmless, but this could be fixed by canonicalizing arch too.  And
that might help make the script work better with minor user input
errors.

Jim


Re: [PATCH v2, rs6000] Add support to enable vmsumudm behind vec_msum builtin.

2020-06-30 Thread Segher Boessenkool
Hi!

On Tue, Jun 30, 2020 at 12:57:45PM -0500, will schmidt wrote:
>   Add support for the vmsumudm instruction and tie it into the vec_msum
>   built-in to support the variants of that built-in using vector
>  _int128 parameters.

> 2020-06-18  Will Schmidt  
> 
> * config/rs6000/altivec.h (vec_vmsumudm): New define.
> * config/rs6000/altivec.md (UNSPEC_VMSUMUDM): New unspec.
> (altivec_vmsumudm): New define_insn.
> * config/rs6000/rs6000-builtin.def (altivec_vmsumudm): New
> BU_ALTIVEC_3 entry. (vmsumudm): New BU_ALTIVEC_OVERLOAD_3
> entry.

No line break before (vmsumudm) please.

> * config/rs6000/rs6000-call.c (altivec_overloaded_builtins):
> Add entries for ALTIVEC_BUILTIN_VMSUMUDM variants of vec_msum.

Tha patch is okay for trunk with that (and some int128 selector in the
testcases that need one).  Thanks!


Segher


Re: [PATCH] adjust expected -Wnonnull messages

2020-06-30 Thread Jonathan Wakely via Gcc-patches

On 30/06/20 17:25 -0600, Martin Sebor via Libstdc++ wrote:

In testing a recent change to -Wnonnull, including the wording
of the warning, I overlooked failures in a number of tests,
including a few in libstdc++.  Attached is a patch that makes
adjustments to let them pass again.


OK for master, thanks.



Sorry about the unexpected fallout!

Martin



libstdc++-v3/ChangeLog:

* testsuite/21_strings/basic_string_view/cons/char/nonnull.cc: Adjust 
text of expected warning.
* testsuite/21_strings/basic_string_view/cons/wchar_t/nonnull.cc: Same.
* 
testsuite/21_strings/basic_string_view/operations/compare/char/nonnull.cc: Same.
* 
testsuite/21_strings/basic_string_view/operations/find/char/nonnull.cc: Same.
* 
testsuite/21_strings/basic_string_view/operations/rfind/char/nonnull.cc: Same.

diff --git 
a/libstdc++-v3/testsuite/21_strings/basic_string_view/cons/char/nonnull.cc 
b/libstdc++-v3/testsuite/21_strings/basic_string_view/cons/char/nonnull.cc
index dc93cd67a1f..81c6fc5d29a 100644
--- a/libstdc++-v3/testsuite/21_strings/basic_string_view/cons/char/nonnull.cc
+++ b/libstdc++-v3/testsuite/21_strings/basic_string_view/cons/char/nonnull.cc
@@ -23,7 +23,7 @@
void
test01()
{
-  std::string_view s((const char*)nullptr); // { dg-warning "null arg" }
-  std::string_view t((char*)nullptr);  // { dg-warning "null arg" }
-  std::string_view u(nullptr); // { dg-warning "null arg" }
+  std::string_view s((const char*)nullptr); // { dg-warning "\\\[-Wnonnull" }
+  std::string_view t((char*)nullptr);  // { dg-warning "\\\[-Wnonnull" }
+  std::string_view u(nullptr); // { dg-warning "\\\[-Wnonnull" }
}
diff --git 
a/libstdc++-v3/testsuite/21_strings/basic_string_view/cons/wchar_t/nonnull.cc 
b/libstdc++-v3/testsuite/21_strings/basic_string_view/cons/wchar_t/nonnull.cc
index ded89c2ce45..ce95884e5de 100644
--- 
a/libstdc++-v3/testsuite/21_strings/basic_string_view/cons/wchar_t/nonnull.cc
+++ 
b/libstdc++-v3/testsuite/21_strings/basic_string_view/cons/wchar_t/nonnull.cc
@@ -23,7 +23,7 @@
void
test01()
{
-  std::wstring_view s((const wchar_t*)nullptr);// { dg-warning "null 
arg" }
-  std::wstring_view t((wchar_t*)nullptr);  // { dg-warning "null arg" }
-  std::wstring_view u(nullptr);// { dg-warning "null 
arg" }
+  std::wstring_view s((const wchar_t*)nullptr);// { dg-warning 
"\\\[-Wnonnull" }
+  std::wstring_view t((wchar_t*)nullptr);  // { dg-warning "\\\[-Wnonnull" 
}
+  std::wstring_view u(nullptr);// { dg-warning 
"\\\[-Wnonnull" }
}
diff --git 
a/libstdc++-v3/testsuite/21_strings/basic_string_view/operations/compare/char/nonnull.cc
 
b/libstdc++-v3/testsuite/21_strings/basic_string_view/operations/compare/char/nonnull.cc
index 0deba716ed3..7060ffb0109 100644
--- 
a/libstdc++-v3/testsuite/21_strings/basic_string_view/operations/compare/char/nonnull.cc
+++ 
b/libstdc++-v3/testsuite/21_strings/basic_string_view/operations/compare/char/nonnull.cc
@@ -24,6 +24,6 @@ int
test01()
{
  std::string_view s = "abcd";
-  return s.compare((const char*)nullptr);  // { dg-warning "null arg" }
-  return s.compare(0, 2, (const char*)nullptr);// { dg-warning "null 
arg" }
+  return s.compare((const char*)nullptr);  // { dg-warning "\\\[-Wnonnull" 
}
+  return s.compare(0, 2, (const char*)nullptr);// { dg-warning 
"\\\[-Wnonnull" }
}
diff --git 
a/libstdc++-v3/testsuite/21_strings/basic_string_view/operations/find/char/nonnull.cc
 
b/libstdc++-v3/testsuite/21_strings/basic_string_view/operations/find/char/nonnull.cc
index d097866b704..97c7629d388 100644
--- 
a/libstdc++-v3/testsuite/21_strings/basic_string_view/operations/find/char/nonnull.cc
+++ 
b/libstdc++-v3/testsuite/21_strings/basic_string_view/operations/find/char/nonnull.cc
@@ -24,10 +24,10 @@ int
test01()
{
  std::string_view s = "abcd";
-  return s.find((const char*)nullptr); // { dg-warning "null arg" }
-  return s.find((const char*)nullptr, 1);  // { dg-warning "null arg" }
-  return s.find_first_of((const char*)nullptr);// { dg-warning "null 
arg" }
-  return s.find_first_of((const char*)nullptr, 1); // { dg-warning "null arg" }
-  return s.find_first_not_of((const char*)nullptr); // { dg-warning "null arg" 
}
-  return s.find_first_not_of((const char*)nullptr, 1); // { dg-warning "null 
arg" }
+  return s.find((const char*)nullptr); // { dg-warning "\\\[-Wnonnull" 
}
+  return s.find((const char*)nullptr, 1);  // { dg-warning "\\\[-Wnonnull" 
}
+  return s.find_first_of((const char*)nullptr);// { dg-warning 
"\\\[-Wnonnull" }
+  return s.find_first_of((const char*)nullptr, 1); // { dg-warning 
"\\\[-Wnonnull" }
+  return s.find_first_not_of((const char*)nullptr); // { dg-warning 
"\\\[-Wnonnull" }
+  return s.find_first_not_of((const char*)nullptr, 1); // { dg-warning 
"\\\[-Wnonnull" }
}
diff --git 

[PATCH] correct memcmp expansion of constant representations containing embedded nuls (PR 95189)

2020-06-30 Thread Martin Sebor via Gcc-patches

An enhancement to GCC 10 to improve the expansion of strncmp
calls with strings with embedded nuls introduced a regression
in similar calls to memcmp.  A review of the changes that led
up to the regression exposed a number of questionable choices
that likely conspired to cause the bug.

For example, the name of the function with both the strncmp
enhancement as well as the memcmp bug is
inline_expand_builtin_string_cmp().  It's easy to assume that
the function handles calls to strcmp and strncmp but not also
memcmp.

Another similar example is the name of the second c_getstr()
argument -- strlen -- that doesn't actually store the length
of the retrieved string but rather its size in bytes
(including any embedded nuls, but excluding those appended
implicitly to zero out the remainder of an array the string
is stored in, up to the array's size).

Yet another example of a dubious choice is string_constant()
returning the empty string (i.e., STRING_CST with size 1) for
zero initializers of constants of any type (as opposed to one
of the same size as the constant object).

Besides fixing the memcmp bug the attached patch (hopefully)
also rectifies some of the otherwise more or less benign
mistakes that precipitated it, mostly by clarifying comments
and changing misleading names of functions, their arguments,
or local variables.

A happy consequence of the fixes is that they improve codegen
for calls to memcpy with constants whose representation includes
embedded nuls.

Tested on x86_64-linux.

Martin

Correct handling of constant representations containing embedded nuls.

Resolves:
PR middle-end/95189 - memcmp being wrongly stripped like strcm
PR middle-end/95886 - suboptimal memcpy with embedded zero bytes

gcc/ChangeLog:

	PR middle-end/95189
	PR middle-end/95886
	* builtins.c (inline_expand_builtin_string_cmp): Rename...
	(inline_expand_builtin_bytecmp): ...to this.
	(builtin_memcpy_read_str): Don't expect data to be nul-terminated.
	(expand_builtin_memory_copy_args): Handle object representations
	with embedded nul bytes.
	(expand_builtin_memcmp): Same.
	(expand_builtin_strcmp): Adjust call to naming change.
	(expand_builtin_strncmp): Same.
	* expr.c (string_constant): Create empty strings with nonzero size.
	* fold-const.c (c_getstr): Rename locals and update comments.
	* tree.c (build_string): Accept null pointer argument.
	(build_string_literal): Same.
	* tree.h (build_string): Provide a default.
	(build_string_literal): Same.

gcc/testsuite/ChangeLog:

	PR middle-end/95189
	PR middle-end/95886
	* gcc.dg/memcmp-pr95189.c: New test.
	* gcc/testsuite/gcc.dg/strncmp-3.c: New test.
	* gcc.target/i386/memcpy-pr95886.c: New test.

diff --git a/gcc/builtins.c b/gcc/builtins.c
index 4754602e0ec..2279b6bfdf7 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -119,7 +119,7 @@ static rtx expand_builtin_next_arg (void);
 static rtx expand_builtin_va_start (tree);
 static rtx expand_builtin_va_end (tree);
 static rtx expand_builtin_va_copy (tree);
-static rtx inline_expand_builtin_string_cmp (tree, rtx);
+static rtx inline_expand_builtin_bytecmp (tree, rtx);
 static rtx expand_builtin_strcmp (tree, rtx);
 static rtx expand_builtin_strncmp (tree, rtx, machine_mode);
 static rtx builtin_memcpy_read_str (void *, HOST_WIDE_INT, scalar_int_mode);
@@ -3227,20 +3227,18 @@ expand_builtin_strnlen (tree exp, rtx target, machine_mode target_mode)
 }
 
 /* Callback routine for store_by_pieces.  Read GET_MODE_BITSIZE (MODE)
-   bytes from constant string DATA + OFFSET and return it as target
-   constant.  */
+   bytes from bytes at DATA + OFFSET and return it reinterpreted as
+   a target constant.  */
 
 static rtx
 builtin_memcpy_read_str (void *data, HOST_WIDE_INT offset,
 			 scalar_int_mode mode)
 {
-  const char *str = (const char *) data;
+  /* The REPresentation pointed to by DATA need not be a nul-terminated
+ string but the caller guarantees it's large enough for MODE.  */
+  const char *rep = (const char *) data;
 
-  gcc_assert (offset >= 0
-	  && ((unsigned HOST_WIDE_INT) offset + GET_MODE_SIZE (mode)
-		  <= strlen (str) + 1));
-
-  return c_readstr (str + offset, mode);
+  return c_readstr (rep + offset, mode, /*nul_terminated=*/false);
 }
 
 /* LEN specify length of the block of memcpy/memset operation.
@@ -4411,7 +4409,6 @@ expand_builtin_memory_copy_args (tree dest, tree src, tree len,
  rtx target, tree exp, memop_ret retmode,
  bool might_overlap)
 {
-  const char *src_str;
   unsigned int src_align = get_pointer_alignment (src);
   unsigned int dest_align = get_pointer_alignment (dest);
   rtx dest_mem, src_mem, dest_addr, len_rtx;
@@ -4443,24 +4440,29 @@ expand_builtin_memory_copy_args (tree dest, tree src, tree len,
   len_rtx = expand_normal (len);
   determine_block_size (len, len_rtx, _size, _size,
 			_max_size);
-  src_str = c_getstr (src);
-
-  /* If SRC is a string constant and block move would be done by
- pieces, we can avoid loading the string from memory and only
- 

RE: [PATCH PR95855]A missing ifcvt optimization to generate fcsel

2020-06-30 Thread yangyang (ET)
> On Tue, Jun 30, 2020 at 1:31 PM yangyang (ET) 
> wrote:
> >
> > > On Tue, Jun 30, 2020 at 4:29 AM yangyang (ET)
> > > 
> > > wrote:
> > > >
> > > > Hi,
> > > >
> > > > > > Hi,
> > > > > >
> > > > > > This is a simple fix for pr95855.
> > > > > >
> > > > > > With this fix, pass_split_paths can recognize the
> > > > > > if-conversion
> > > > > opportunity of the testcase and doesn't duplicate the corresponding
> block.
> > > > > >
> > > > > > Added one testcase for this. Bootstrap and tested on both
> > > > > > aarch64 and
> > > > > x86 Linux platform, no new regression witnessed.
> > > > > >
> > > > > > Ok for trunk?
> > > > >
> > > > > Can you try using the num_stmts_in_pred[12] counts instead of
> > > > > using empty_block_p?
> > > >
> > > > It' ok to using num_stmts_in_pred[12] to judge whether the
> > > > pred[12] is empty since bb's immediate dominator can't meet the
> > > > constraints "single_pred_p (pred[12]) && single_pred (pred[12]) ==
> pred[21]".
> > > >
> > > > >
> > > > > Your matching doesn't allow for FP constants like
> > > > >
> > > > >  dmax[0] = d1[i] < 1.0 ? 1.0 : d1[i];
> > > > >
> > > > > since FP constants are not shared.  You likely want to use
> > > > > operand_equal_p to do the PHI argument comparison.
> > > >
> > > > That's right, after using operand_equal_p instead of == to do the
> > > > PHI argument Comparison, the mentioned case can be covered as well.
> > > >
> > > > >
> > > > > Thanks,
> > > > > Richard.
> > > >
> > > > Thanks for your suggestions. We have revised our patch based on
> > > > your
> > > suggestions.
> > > >
> > > > Bootstrap and tested on both aarch64 and x86 Linux platform. Does
> > > > the v1
> > > patch looks better?
> > >
> > > Yes.  This variant is OK.
> > >
> > > Thanks,
> > > Richard.
> >
> > Thanks for reviewing this. Could you please help install it?
> 
> Installed.  Please double-check your ChangeLog with gcc-verify, it had a typo 
> in
> the testcase filename.
> 
> Richard.

Thanks and Sorry for the typo, we will double-check it next time.

Yang Yang


Re: [PATCH] RISC-V: Preserve arch version info during normalizing arch string

2020-06-30 Thread Kito Cheng
Committed with coding style fixing.

On Wed, Jul 1, 2020 at 11:16 AM Kito Cheng  wrote:
>
> Hi Jim:
>
> > This isn't a criticism of your patch, but I noticed while looking at
> > this that we accept gXpY and expand to iXpY_mXpY_ etc.  However, imafd
> > are all numbered independently.  It doesn't make much sense to specify
> > a version with g and assume it is correct for all of imafd.  Similarly
> > with the implied extensions, e.g. dXpY is expanded to fXpY_dXpY,
> > though in this case it is likely that f and d numbers will remain
> > compatible.  And q too.  Still it looks a bit funny to be making that
> > assumption and there probably will be other examples where the
> > versions of the implied extensions won't match the specified
> > extension.  Actually I see that f implies Zicsr, and those two have
> > different version numbers, we just haven't implemented Zicsr yet.  We
> > are correctly implementing the rules as specified, the rules just
> > don't make sense here.  We probably need to file an issue against the
> > riscv-isa-manual project for a clarification of how to handle this
> > stuff.
>
> I agree the version of G is kind of problematic for GCC implementation,
> That reminds me there was a long discussion[1] last year,
> The conclusion is version of G is too confusing, it might just don't
> accept any version for G.
> I thought it could deprecate the version for G in GCC 11 and then drop
> that in GCC 12?
> For riscv-isa-manual, we could ask them to add a note about the G
> don't accept version?
> What do you think about this?
>
> And the -misa-spec is supported on the binutils side, so I guess it's
> time to start to improve
> those things on GCC now.
>
> [1] 
> https://groups.google.com/a/groups.riscv.org/g/sw-dev/c/aZhMG7NIVTk/m/hX2BRWsMEQAJ
>
> Thanks :)
>
> >
> > Jim


[committed] Fix PA bootstrap

2020-06-30 Thread Jeff Law via Gcc-patches
The change to C++-17 broke PA bootstrap as pa.c uses the "register" keyword. 
This patch drops the keyword and should hopefully restore bootstrapping.  It'll
take a day or so to verify.

Committed to the trunk,

Jeff
commit a580aca1b6dad32e621d4c8c8ceaab6d0c52f6c1
Author: Jeff Law 
Date:   Tue Jun 30 23:03:33 2020 -0600

Fix bootstrap failure on PA

gcc/
* config/pa/pa.c (pa_emit_move_sequence): Drop register keyword.
(pa_output_ascii): Likewise.

diff --git a/gcc/config/pa/pa.c b/gcc/config/pa/pa.c
index 711234e597a..07d32877154 100644
--- a/gcc/config/pa/pa.c
+++ b/gcc/config/pa/pa.c
@@ -1631,9 +1631,9 @@ pa_cannot_force_const_mem (machine_mode mode 
ATTRIBUTE_UNUSED, rtx x)
 int
 pa_emit_move_sequence (rtx *operands, machine_mode mode, rtx scratch_reg)
 {
-  register rtx operand0 = operands[0];
-  register rtx operand1 = operands[1];
-  register rtx tem;
+  rtx operand0 = operands[0];
+  rtx operand1 = operands[1];
+  rtx tem;
 
   /* We can only handle indexed addresses in the destination operand
  of floating point stores.  Thus, we need to break out indexed
@@ -3381,7 +3381,7 @@ pa_output_ascii (FILE *file, const char *p, int size)
   int io = 0;
   for (io = 0, co = 0; io < MIN (4, size - i); io++)
{
- register unsigned int c = (unsigned char) p[i + io];
+ unsigned int c = (unsigned char) p[i + io];
 
  if (c == '\"' || c == '\\')
partial_output[co++] = '\\';


Re: [PATCH] RISC-V: Preserve arch version info during normalizing arch string

2020-06-30 Thread Kito Cheng
Hi Jim:

> This isn't a criticism of your patch, but I noticed while looking at
> this that we accept gXpY and expand to iXpY_mXpY_ etc.  However, imafd
> are all numbered independently.  It doesn't make much sense to specify
> a version with g and assume it is correct for all of imafd.  Similarly
> with the implied extensions, e.g. dXpY is expanded to fXpY_dXpY,
> though in this case it is likely that f and d numbers will remain
> compatible.  And q too.  Still it looks a bit funny to be making that
> assumption and there probably will be other examples where the
> versions of the implied extensions won't match the specified
> extension.  Actually I see that f implies Zicsr, and those two have
> different version numbers, we just haven't implemented Zicsr yet.  We
> are correctly implementing the rules as specified, the rules just
> don't make sense here.  We probably need to file an issue against the
> riscv-isa-manual project for a clarification of how to handle this
> stuff.

I agree the version of G is kind of problematic for GCC implementation,
That reminds me there was a long discussion[1] last year,
The conclusion is version of G is too confusing, it might just don't
accept any version for G.
I thought it could deprecate the version for G in GCC 11 and then drop
that in GCC 12?
For riscv-isa-manual, we could ask them to add a note about the G
don't accept version?
What do you think about this?

And the -misa-spec is supported on the binutils side, so I guess it's
time to start to improve
those things on GCC now.

[1] 
https://groups.google.com/a/groups.riscv.org/g/sw-dev/c/aZhMG7NIVTk/m/hX2BRWsMEQAJ

Thanks :)

>
> Jim