Re: [PATCH][ARM] Enable code hoisting with -Os (PR80155)

2019-09-14 Thread Prathamesh Kulkarni
On Wed, 11 Sep 2019 at 11:50, Wilco Dijkstra  wrote:
>
> While code hoisting generally improves codesize, it can affect performance
> negatively.  Benchmarking shows it doesn't help SPEC and negatively affects
> embedded benchmarks, so only enable code hoisting with -Os on Arm.
>
> Bootstrap OK, OK for commit?
Hi Wilco,
My only concern with the patch is that the issue isn't specific to
code-hoisting.
For this particular case (reproducible with pr77445-2.c), disabling
jump threading
doesn't cause the register spill with hoisting enabled.
Likewise disabling forwprop3 and forwprop4 prevents the spill.
The last time I tried, setting setting
max-jump-thread-duplication-stmts to 20 and
fsm-scale-path-stmts to 3, not only removed the spill but also
resulted in 9 more hoistings,
but regressed some other test on jump threading.
So I was wondering whether we should look into fine-tuning relevant params,
instead of disabling code hoisting entirely (we will end up regressing
some test cases either way) ?

Thanks,
Prathamesh
>
> ChangeLog:
> 2019-09-11  Wilco Dijkstra  
>
>
> PR tree-optimization/80155
> * common/config/arm/arm-common.c (arm_option_optimization_table):
> Enable -fcode-hoisting with -Os.
>
> --
> diff --git a/gcc/common/config/arm/arm-common.c 
> b/gcc/common/config/arm/arm-common.c
> index 
> 41a920f6dc96833e778faa8dbcc19beac483734c..b0d5fb300bf01acc1fb6f4631635f8a1bfe6441c
>  100644
> --- a/gcc/common/config/arm/arm-common.c
> +++ b/gcc/common/config/arm/arm-common.c
> @@ -39,6 +39,8 @@ static const struct default_options 
> arm_option_optimization_table[] =
>  /* Enable section anchors by default at -O1 or higher.  */
>  { OPT_LEVELS_1_PLUS, OPT_fsection_anchors, NULL, 1 },
>  { OPT_LEVELS_1_PLUS, OPT_fsched_pressure, NULL, 1 },
> +/* Enable code hoisting only with -Os.  */
> +{ OPT_LEVELS_SIZE, OPT_fcode_hoisting, NULL, 1 },
>  { OPT_LEVELS_NONE, 0, NULL, 0 }
>};
>
>


Re: [PATCH, AArch64, v3 0/6] LSE atomics out-of-line

2019-09-14 Thread Richard Henderson
On 9/5/19 10:35 AM, Wilco Dijkstra wrote:
> Agreed. I've got a couple of general comments:
> 
> * The option name -matomic-ool sounds too abbreviated. I think eg.
> -moutline-atomics is more descriptive and user friendlier.

Changed.

> * Similarly the exported __aa64_have_atomics variable could be named
>   __aarch64_have_lse_atomics so it's clear that it is about LSE atomics.

Changed.

> +@item -matomic-ool
> +@itemx -mno-atomic-ool
> +Enable or disable calls to out-of-line helpers to implement atomic 
> operations.
> +These helpers will, at runtime, determine if ARMv8.1-Atomics instructions
> +should be used; if not, they will use the load/store-exclusive instructions
> +that are present in the base ARMv8.0 ISA.
> +
> +This option is only applicable when compiling for the base ARMv8.0
> +instruction set.  If using a later revision, e.g. @option{-march=armv8.1-a}
> +or @option{-march=armv8-a+lse}, the ARMv8.1-Atomics instructions will be
> +used directly. 
> 
> So what is the behaviour when you explicitly select a specific CPU?

Selecting a specific cpu selects the specific architecture that the cpu
supports, does it not?  Thus the architecture example above still applies.

Unless I don't understand what distinction that you're making?

> +/* Branch to LABEL if LSE is enabled.
> +   The branch should be easily predicted, in that it will, after 
> constructors,
> +   always branch the same way.  The expectation is that systems that 
> implement
> +   ARMv8.1-Atomics are "beefier" than those that omit the extension.
> +   By arranging for the fall-through path to use load-store-exclusive insns,
> +   we aid the branch predictor of the smallest cpus.  */ 
> 
> I'd say that by the time GCC10 is released and used in distros, systems 
> without
> LSE atomics would be practically non-existent. So we should favour LSE atomics
> by default.

I suppose.  Does it not continue to be true that an a53 is more impacted by the
branch prediction than an a76?


r~


[testsuite, obvious] Remove explicit "dg-do run" from g++.dg/vect/pr87914.cc

2019-09-14 Thread Sandra Loosemore
I found another instance of PR testsuite/83889 in the g++ testsuite. 
I've committed the attached patch to fix it in the same way as all the 
others.


-Sandra
2019-09-14  Sandra Loosemore  

	PR testsuite/83889

	gcc/testsuite/
	* g++.dg/vect/pr87914.cc: Remove explicit dg-do run.
Index: gcc/testsuite/g++.dg/vect/pr87914.cc
===
--- gcc/testsuite/g++.dg/vect/pr87914.cc	(revision 275717)
+++ gcc/testsuite/g++.dg/vect/pr87914.cc	(working copy)
@@ -1,4 +1,3 @@
-// { dg-do run }
 // { dg-additional-options "-fopenmp-simd" }
 // { dg-additional-options "-mavx2" { target { avx2_runtime } } }
 


Re: [PATCH V3] Generalized predicate/condition for parameter reference in IPA (PR ipa/91088)

2019-09-14 Thread Jan Hubicka
> +/* Analyze EXPR if it represents a series of simple operations performed on
> +   a function parameter and return true if so.  FBI, STMT, EXPR, INDEX_P and
> +   AGGPOS have the same meaning like in unmodified_parm_or_parm_agg_item.
> +   Type of the parameter or load from an aggregate via the parameter is
> +   stored in *TYPE_P.  Operations on the parameter are recorded to
> +   PARAM_OPS_P if it is not NULL.  */
> +
> +static bool
> +decompose_param_expr (struct ipa_func_body_info *fbi,
> +   gimple *stmt, tree expr,
> +   int *index_p, tree *type_p,
> +   struct agg_position_info *aggpos,
> +   expr_eval_ops *param_ops_p = NULL)
> +{
> +  int op_limit = PARAM_VALUE (PARAM_IPA_MAX_PARAM_EXPR_OPS);
> +  int op_count = 0;
> +
> +  if (param_ops_p)
> +*param_ops_p = NULL;
> +
> +  while (true)
> +{
> +  expr_eval_op eval_op;
> +  poly_int64 size;
> +
> +  if (unmodified_parm_or_parm_agg_item (fbi, stmt, expr, index_p, ,
> + aggpos))
> + {
> +  tree type = TREE_TYPE (expr);
> +
> +   /* Stop if found bit-field whose size does not equal any natural
> +  integral type.  */
> +   if (maybe_ne (tree_to_poly_int64 (TYPE_SIZE (type)), size))
> + break;

Why is this needed?
> +
> +   *type_p = type;
> +   return true;
> + }
> +
> +  if (TREE_CODE (expr) != SSA_NAME || SSA_NAME_IS_DEFAULT_DEF (expr))
> + break;
> +
> +  if (!is_gimple_assign (stmt = SSA_NAME_DEF_STMT (expr)))
> + break;
> +
> +  switch (gimple_assign_rhs_class (stmt))
> + {
> +   case GIMPLE_SINGLE_RHS:
> + expr = gimple_assign_rhs1 (stmt);
> + continue;
> +
> +   case GIMPLE_UNARY_RHS:
> + expr = gimple_assign_rhs1 (stmt);
> +
> + eval_op.val_is_rhs = false;

I find val_is_rhs to be confusing, lhs/rhs is usually used for the
gimple assignments.  Here everything is rhs, but it depends whether
the val is operand0 or operand1.  So I guess we could do val_is_op1.
> @@ -1183,10 +1301,8 @@ set_cond_stmt_execution_predicate (struct 
> ipa_func_body_info *fbi,
>if (!is_gimple_ip_invariant (gimple_cond_rhs (last)))
>  return;
>op = gimple_cond_lhs (last);
> -  /* TODO: handle conditionals like
> - var = op0 < 4;
> - if (var != 0).  */
> -  if (unmodified_parm_or_parm_agg_item (fbi, last, op, , , 
> ))
> +
> +  if (decompose_param_expr (fbi, last, op, , , , 
> _ops))
>  {
>code = gimple_cond_code (last);
>inverted_code = invert_tree_comparison (code, HONOR_NANS (op));
> @@ -1201,13 +1317,13 @@ set_cond_stmt_execution_predicate (struct 
> ipa_func_body_info *fbi,
> if (this_code != ERROR_MARK)
>   {
> predicate p
> - = add_condition (summary, index, size, , this_code,
> -  unshare_expr_without_location
> -  (gimple_cond_rhs (last)));
> + = add_condition (summary, index, type, , this_code,
> +  gimple_cond_rhs (last), param_ops);

An why unsharing is no longer needed here?
It is importnat to avoid anything which reffers to function local blocks
to get into the global LTO stream.
> +/* Check whether two set of operations have same effects.  */
> +static bool
> +expr_eval_ops_equal_p (expr_eval_ops ops1, expr_eval_ops ops2)
> +{
> +  if (ops1)
> +{
> +  if (!ops2 || ops1->length () != ops2->length ())
> + return false;
> +
> +  for (unsigned i = 0; i < ops1->length (); i++)
> + {
> +   expr_eval_op  = (*ops1)[i];
> +   expr_eval_op  = (*ops2)[i];
> +
> +   if (op1.code != op2.code
> +   || op1.val_is_rhs != op2.val_is_rhs
> +   || !vrp_operand_equal_p (op1.val, op2.val)

Why you need vrp_operand_equal_p instead of operand_equal_p here?

Overall the patch looks very reasonable. We may end up with bit more
general expressions (i.e. supporting arithmetics involving more than one
operand).

I see you also fixed a lot of typos in comments, please go head and
commit them separately as obvious.

Thank for all the work!
Honza


Re: [PATCH V2] Setup predicate for switch default case in IPA (PR ipa/91089)

2019-09-14 Thread Jan Hubicka
> It is somewhat hard to write a testcase to show role of range info only with
> this patch. If another patch "Generalized predicate/condition for parameter
> reference in IPA (PR ipa/91088)" is accepted, it will become easy and I will
> update this testcase to show that.
> 
> And this new version also resolves a problem that we might generate very
> complicated predicate for basic block in convergence point reached from
> all switch cases.
> 
>switch (index)
> /   |   \
> case1   case2  ... default
> \   |   /
>convergence point
> 
> For switch/if statement, current algorithm synthesizes predicate of
> convergence point by OR-combining predicates of all its cases/branches, which
> might give us a very complicated one.  Actually, we can get simplified 
> predicate
> by using the fact that predicate for a basic block must also hold true for 
> its post
> dominators.  To be specific, convergence point should include (at least equal 
> to)
> predicate of the switch/if statement.
> 
> Honza & Martin,
> 
>   Would please review this new patch when you have time? Thanks.
> 
> Feng
> 
> -
> diff --git a/gcc/ipa-fnsummary.c b/gcc/ipa-fnsummary.c
> index 278bf606661..dc5752fc005 100644
> --- a/gcc/ipa-fnsummary.c
> +++ b/gcc/ipa-fnsummary.c
> @@ -1198,7 +1198,8 @@ set_cond_stmt_execution_predicate (struct 
> ipa_func_body_info *fbi,
> /* invert_tree_comparison will return ERROR_MARK on FP
>comparsions that are not EQ/NE instead of returning proper
>unordered one.  Be sure it is not confused with NON_CONSTANT.  */
> -   if (this_code != ERROR_MARK)
> +   if (this_code != ERROR_MARK
> +   && !dominated_by_p (CDI_POST_DOMINATORS, bb, e->dest))
>   {
> predicate p
>   = add_condition (summary, index, size, , this_code,

So this change is handling the diamond conditional you describe above?
This is bit of hack since it leaves the edge predicate unnecesarily
conservative though I see it saves some conditions to be inserted and
makes things to go smoother.

Please add a comment that explain this and reffers to the other places
where we do this (in the switch handling below).

> @@ -1269,13 +1270,31 @@ set_switch_stmt_execution_predicate (struct 
> ipa_func_body_info *fbi,
>if (!unmodified_parm_or_parm_agg_item (fbi, last, op, , , 
> ))
>  return;
>  
> +  auto_vec > ranges;
> +  tree type = TREE_TYPE (op);
> +  tree one_cst = build_one_cst (type);
> +  int bound_limit = PARAM_VALUE (PARAM_IPA_MAX_SWITCH_PREDICATE_BOUNDS);
> +  int bound_count = 0;
> +  value_range_base vr;
> +
> +  get_range_info (op, vr);
> +
>FOR_EACH_EDGE (e, ei, bb->succs)
>  {
>e->aux = edge_predicate_pool.allocate ();
>*(predicate *) e->aux = false;
>  }
> +
> +  e = gimple_switch_edge (cfun, last, 0);
> +  /* Set BOUND_COUNT to maximum count to bypass computing predicate for
> + default case if its target basic block is in convergence point of all
> + switch cases, which can be determined by checking whether it
> + post-dominates the switch statement.  */
> +  if (dominated_by_p (CDI_POST_DOMINATORS, bb, e->dest))
> +bound_count = INT_MAX;
> +
>n = gimple_switch_num_labels (last);
> -  for (case_idx = 0; case_idx < n; ++case_idx)
> +  for (case_idx = 1; case_idx < n; ++case_idx)
>  {
>tree cl = gimple_switch_label (last, case_idx);
>tree min, max;
> @@ -1285,10 +1304,10 @@ set_switch_stmt_execution_predicate (struct 
> ipa_func_body_info *fbi,
>min = CASE_LOW (cl);
>max = CASE_HIGH (cl);
>  
> -  /* For default we might want to construct predicate that none
> - of cases is met, but it is bit hard to do not having negations
> - of conditionals handy.  */
> -  if (!min && !max)
> +  /* The case's target basic block is in convergence point of all switch
> +  cases, its predicate should be at least as that of the switch
> +  statement.  */
> +  if (dominated_by_p (CDI_POST_DOMINATORS, bb, e->dest))
>   p = true;
>else if (!max)
>   p = add_condition (summary, index, size, , EQ_EXPR,
> @@ -1304,7 +1323,111 @@ set_switch_stmt_execution_predicate (struct 
> ipa_func_body_info *fbi,
>   }
>*(class predicate *) e->aux
>   = p.or_with (summary->conds, *(class predicate *) e->aux);
> +
> +  /* If there are too many disjoint case ranges, predicate for default
> +  case might become too complicated.  So add a limit here.  */
> +  if (bound_count > bound_limit)
> + continue;
> +
> +  bool new_range = true;
> +
> +  if (!ranges.is_empty ())
> + {
> +   tree last_max_plus_one
> + = int_const_binop (PLUS_EXPR, ranges.last ().second, one_cst);
> +
> +   /* Merge case ranges if they are continuous.  */
> +   if (tree_int_cst_equal (min, last_max_plus_one))
> + new_range = false;
> +   else if (vr.kind () == 

Re: [patch, fortran] Improve error messages for mismatched arguments

2019-09-14 Thread Steve Kargl
On Sat, Sep 14, 2019 at 02:27:15PM +0200, Thomas Koenig wrote:
> 
> the attached patch improves the rather hard to read error
> messages for argument mismatches.  With this patch, this reads
> 
> argument_checking_21.f90:7:11:
> 
>  6 |   call foo(1.0) ! { dg-warning "Rank mismatch" }
>|   2
>  7 |   call foo(b)   ! { dg-warning "Rank mismatch" }
>|   1
> Fehler: Rank mismatch between actual argument at (1) and actual argument 
> at (2) (scalar and rank-2)
> 
> which I think is fairly clear.  It also makes sure that warnings are
> always emitted by -fallow-argument-mismatch by removing
> -Wargument-mismatch.  Finally, for people who do not want to have too
> many warnings cluttering up their logs, a type mismatch is only
> shown once if it is a warning.
> 
> While I was going on about fixing warnings, I also fixed PR 91557 with
> the bit in trans-expr.c.  This part is trivial, I will backport it
> to the other affected branches.
> 
> After this, I think we can close PR 91556. Regression-tested. OK for
> trunk?
> 

Looks good to me.  Thanks for working of this issue.

-- 
Steve
20170425 https://www.youtube.com/watch?v=VWUpyCsUKR4
20161221 https://www.youtube.com/watch?v=IbCHE-hONow


[PR target/85401] initialize the move cost table before using it

2019-09-14 Thread coypu
This seems to be the way the rest of ira-color.c does it.
I hope it's OK. It does fix the segfault.

2019-09-10  Maya Rashish 

PR target/85401
* ira-color.c: (allocno_copy_cost_saving) Call
ira_init_register_move_cost_if_necessary
diff --git a/gcc/ira-color.c b/gcc/ira-color.c
index 99236994d64..5d721102e19 100644
--- a/gcc/ira-color.c
+++ b/gcc/ira-color.c
@@ -2828,6 +2828,7 @@ allocno_copy_cost_saving (ira_allocno_t allocno, int 
hard_regno)
}
   else
gcc_unreachable ();
+  ira_init_register_move_cost_if_necessary(allocno_mode);
   cost += cp->freq * ira_register_move_cost[allocno_mode][rclass][rclass];
 }
   return cost;


[patch, fortran] Improve error messages for mismatched arguments

2019-09-14 Thread Thomas Koenig

Hello world,

the attached patch improves the rather hard to read error
messages for argument mismatches.  With this patch, this reads

argument_checking_21.f90:7:11:

6 |   call foo(1.0) ! { dg-warning "Rank mismatch" }
  |   2
7 |   call foo(b)   ! { dg-warning "Rank mismatch" }
  |   1
Fehler: Rank mismatch between actual argument at (1) and actual argument 
at (2) (scalar and rank-2)


which I think is fairly clear.  It also makes sure that warnings are
always emitted by -fallow-argument-mismatch by removing
-Wargument-mismatch.  Finally, for people who do not want to have too
many warnings cluttering up their logs, a type mismatch is only
shown once if it is a warning.

While I was going on about fixing warnings, I also fixed PR 91557 with
the bit in trans-expr.c.  This part is trivial, I will backport it
to the other affected branches.

After this, I think we can close PR 91556. Regression-tested. OK for
trunk?

2019-09-13  Thomas Koenig  

PR fortran/91557
PR fortran/91556
* frontend-passes.c (check_externals_procedure): Reformat argument
list. Use gfc_compare_actual_formal instead of gfc_procedure_use.
* gfortran.h (gfc_symbol): Add flag error.
* interface.c (gfc_compare_interfaces): Reformat.
(argument_rank_mismatch): Add where_formal argument. If it is
present, note that the error is between different calls.
(compare_parameter): Change warnings that previously dependended
on -Wargument-mismatch to unconditional.  Issue an error / warning
on type mismatch only once.  Pass where_formal to
argument_rank_mismatch for artificial variables.
(compare_actual_formal): Change warnings that previously
dependeded on -Wargument-mismatch to unconditional.
(gfc_check_typebound_override): Likewise.
(gfc_get_formal_from_actual_arglist): Set declared_at for
artificial symbol.
* invoke.texi: Extend description of -fallow-argument-mismatch.
Delete -Wargument-mismatch.
* lang.opt: Change -Wargument-mismatch to do-nothing option.
* resolve.c (resolve_structure_cons): Change warnings that
previously depended on -Wargument-mismatch to unconditional.
* trans-decl.c (generate_local_decl): Do not warn if the symbol is
artificial.

2019-09-13  Thomas Koenig  

PR fortran/91557
PR fortran/91556
* gfortran.dg/argument_checking_20.f90: New test.
* gfortran.dg/argument_checking_21.f90: New test.
* gfortran.dg/argument_checking_22.f90: New test.
* gfortran.dg/argument_checking_23.f90: New test.
* gfortran.dg/warn_unused_dummy_argument_5.f90: New test.
* gfortran.dg/bessel_3.f90: Add pattern for type mismatch.
* gfortran.dg/g77/20010519-1.f: Adjust dg-warning messages to new
handling.
* gfortran.dg/pr24823.f: Likewise.
* gfortran.dg/pr39937.f: Likewise.
Index: fortran/frontend-passes.c
===
--- fortran/frontend-passes.c	(Revision 275713)
+++ fortran/frontend-passes.c	(Arbeitskopie)
@@ -5373,7 +5373,8 @@ gfc_code_walker (gfc_code **c, walk_code_fn_t code
 /* Common tests for argument checking for both functions and subroutines.  */
 
 static int
-check_externals_procedure (gfc_symbol *sym, locus *loc, gfc_actual_arglist *actual)
+check_externals_procedure (gfc_symbol *sym, locus *loc,
+			   gfc_actual_arglist *actual)
 {
   gfc_gsymbol *gsym;
   gfc_symbol *def_sym = NULL;
@@ -5396,7 +5397,7 @@ static int
 
   if (def_sym)
 {
-  gfc_procedure_use (def_sym, , loc);
+  gfc_compare_actual_formal (, def_sym->formal, 0, 0, 0, loc);
   return 0;
 }
 
Index: fortran/gfortran.h
===
--- fortran/gfortran.h	(Revision 275713)
+++ fortran/gfortran.h	(Arbeitskopie)
@@ -1610,6 +1610,9 @@ typedef struct gfc_symbol
   /* Set if this is a module function or subroutine with the
  abreviated declaration in a submodule.  */
   unsigned abr_modproc_decl:1;
+  /* Set if a previous error or warning has occurred and no other
+ should be reported.  */
+  unsigned error:1;
 
   int refs;
   struct gfc_namespace *ns;	/* namespace containing this symbol */
Index: fortran/interface.c
===
--- fortran/interface.c	(Revision 275713)
+++ fortran/interface.c	(Arbeitskopie)
@@ -1807,9 +1807,9 @@ gfc_compare_interfaces (gfc_symbol *s1, gfc_symbol
 	if (!compare_rank (f2->sym, f1->sym))
 	  {
 		if (errmsg != NULL)
-		  snprintf (errmsg, err_len, "Rank mismatch in argument '%s' "
-			"(%i/%i)", f1->sym->name, symbol_rank (f1->sym),
-			symbol_rank (f2->sym));
+		  snprintf (errmsg, err_len, "Rank mismatch in argument "
+			"'%s' (%i/%i)", f1->sym->name,
+			symbol_rank (f1->sym), symbol_rank (f2->sym));
 		return 

[patch,committed][og9][openacc-gcc-9-branch] Fix segfault with plugin-gcn under libgomp

2019-09-14 Thread Tobias Burnus

Hi all,

exactly the same as for plugin-hsa on the trunk: 
https://gcc.gnu.org/ml/gcc-patches/2019-09/msg00718.html


Only a different file (which is only in this branch).

Tobias


commit 392044a8db285d9aea0a280983ce7c5014a4e99c
Author: Tobias Burnus 
Date:   Thu Sep 12 18:07:53 2019 +0200

libgomp plugin-gcn - init string

libgomp/
2019-09-13  Tobias Burnus  

* plugin/plugin-gcn.c (hsa_warn, hsa_fatal, hsa_error): Ensure
string is initialized.

diff --git a/libgomp/ChangeLog.openacc b/libgomp/ChangeLog.openacc
index 355e406d4e3..14ed4e0ec2c 100644
--- a/libgomp/ChangeLog.openacc
+++ b/libgomp/ChangeLog.openacc
@@ -1,3 +1,8 @@
+2019-09-13  Tobias Burnus  
+
+   * plugin/plugin-gcn.c (hsa_warn, hsa_fatal, hsa_error): Ensure
+   string is initialized.
+
 2019-09-10  Julian Brown  
 
* plugin/plugin-gcn.c (GOMP_hsa_kernel_dispatch): Remove
diff --git a/libgomp/plugin/plugin-gcn.c b/libgomp/plugin/plugin-gcn.c
index f7e3554f297..b8ec96391f7 100644
--- a/libgomp/plugin/plugin-gcn.c
+++ b/libgomp/plugin/plugin-gcn.c
@@ -489,7 +489,7 @@ hsa_warn (const char *str, hsa_status_t status)
   if (!debug)
 return;
 
-  const char *hsa_error_msg;
+  const char *hsa_error_msg = "[unknown]";
   hsa_fns.hsa_status_string_fn (status, _error_msg);
 
   fprintf (stderr, "GCN warning: %s\nRuntime message: %s\n", str,
@@ -502,7 +502,7 @@ hsa_warn (const char *str, hsa_status_t status)
 static void
 hsa_fatal (const char *str, hsa_status_t status)
 {
-  const char *hsa_error_msg;
+  const char *hsa_error_msg = "[unknown]";
   hsa_fns.hsa_status_string_fn (status, _error_msg);
   GOMP_PLUGIN_fatal ("GCN fatal error: %s\nRuntime message: %s\n", str,
 hsa_error_msg);
@@ -514,7 +514,7 @@ hsa_fatal (const char *str, hsa_status_t status)
 static bool
 hsa_error (const char *str, hsa_status_t status)
 {
-  const char *hsa_error_msg;
+  const char *hsa_error_msg = "[unknown]";
   hsa_fns.hsa_status_string_fn (status, _error_msg);
   GOMP_PLUGIN_error ("GCN fatal error: %s\nRuntime message: %s\n", str,
 hsa_error_msg);


Re: [PATCH v6 3/3] PR80791 Consider doloop cmp use in ivopts

2019-09-14 Thread Kewen.Lin
on 2019/9/12 下午4:14, Richard Biener wrote:
> On Wed, 11 Sep 2019, Kewen.Lin wrote:
> 
>> Hi,
>>
>> Sorry for the late update.  I've updated the words of target hooks part.
>>
>> Could someone help to review it?  Thanks in advance!
>>
>> By the way, as previous emails in this thread, Bin has approved the IVOPTs
>> part, while Segher has approved the rs6000 part.
> 
> The target hooks part is OK.  I guess we'll have to extend it eventually
> in case other targets want to make use of it.
> 

Thanks Richard!  Committed by r275713.

Yes, it's enough when doloop IV costs zero or infinite for generic/address use,
but if one target wants some other values, we may have to take it as one common
cost shared for all generic/address uses.  It's like IV candidate cost but not
the same since it's only needed when doloop IV is used for generic/address uses,
I guess it requires some changes in candidate set cost calculation.  I chose to
keep it simple at the first place, but radar on for any other target adoptions.


Thanks,
Kewen

> Thanks,
> Richard.
> 
>>
>> Thanks,
>> Kewen
>>
>> -
>>
>> gcc/ChangeLog
>>
>> 2019-09-11  Kewen Lin  
>>
>>  PR middle-end/80791
>>  * config/rs6000/rs6000.c (TARGET_HAVE_COUNT_REG_DECR_P): New macro.
>>  (TARGET_DOLOOP_COST_FOR_GENERIC): Likewise.
>>  (TARGET_DOLOOP_COST_FOR_ADDRESS): Likewise.
>>  * target.def (have_count_reg_decr_p): New hook.
>>  (doloop_cost_for_generic): Likewise.
>>  (doloop_cost_for_address): Likewise.
>>  * doc/tm.texi.in (TARGET_HAVE_COUNT_REG_DECR_P): Likewise.
>>  (TARGET_DOLOOP_COST_FOR_GENERIC): Likewise.
>>  (TARGET_DOLOOP_COST_FOR_ADDRESS): Likewise.
>>  * doc/tm.texi: Regenerate.
>>  * tree-ssa-loop-ivopts.c (comp_cost::operator+=): Consider infinite cost
>>  addend.
>>  (record_group): Init doloop_p.
>>  (add_candidate_1): Add optional argument doloop, change the handlings
>>  accordingly.
>>  (add_candidate): Likewise.
>>  (generic_predict_doloop_p): Update attribute.
>>  (force_expr_to_var_cost): Add costing for expressions COND_EXPR/LT_EXPR/
>>  LE_EXPR/GT_EXPR/GE_EXPR/EQ_EXPR/NE_EXPR/UNORDERED_EXPR/ORDERED_EXPR/
>>  UNLT_EXPR/UNLE_EXPR/UNGT_EXPR/UNGE_EXPR/UNEQ_EXPR/LTGT_EXPR/MAX_EXPR/
>>  MIN_EXPR.
>>  (get_computation_cost): Update for doloop IV cand extra cost.   
>>  (determine_group_iv_cost_cond): Update for doloop IV cand.
>>  (determine_iv_cost): Likewise.
>>  (ivopts_estimate_reg_pressure): Likewise.
>>  (may_eliminate_iv): Update handlings for doloop IV cand.
>>  (add_iv_candidate_for_doloop): New function.
>>  (find_iv_candidates): Call function add_iv_candidate_for_doloop.
>>  (iv_ca_set_no_cp): Update for doloop IV cand.
>>  (iv_ca_set_cp): Likewise.
>>  (iv_ca_dump): Dump register cost.
>>  (find_doloop_use): New function.
>>  (analyze_and_mark_doloop_use): Likewise.
>>  (tree_ssa_iv_optimize_loop): Call function analyze_and_mark_doloop_use.
>>
>> gcc/testsuite/ChangeLog
>>
>> 2019-09-11  Kewen Lin  
>>
>>  PR middle-end/80791
>>  * gcc.dg/tree-ssa/ivopts-3.c: Adjust for doloop change.
>>  * gcc.dg/tree-ssa/ivopts-lt.c: Likewise.
>>  * gcc.dg/tree-ssa/pr32044.c: Likewise.
>>
>>
>> on 2019/8/23 下午6:18, Segher Boessenkool wrote:
>>> Hi!
>>>
>>> On Fri, Aug 23, 2019 at 05:43:32PM +0800, Bin.Cheng wrote:
 On Fri, Aug 23, 2019 at 4:27 PM Kewen.Lin  wrote:
 Not sure if non-ivopts parts are already approved?  If so, the patch
 is okay with above issues addressed.
>>>
>>> The rs6000 part is fine.  The target.def entries need some spell check
>>> and copy-editing, but are obvious and trivial otherwise, and/or you can
>>> approve it as ivopts maintainer.
>>>
 Thanks very much for your time!
>>>
>>> And thank you as well Bin :-)
>>>
>>>
>>> Segher
>>>
>>
>