Re: [PATCH v2] Missed function specialization + partial devirtualization

2019-07-14 Thread Martin Liška
On 7/12/19 10:51 AM, Xiong Hu Luo wrote:
>  2. Remove duplicate logic in ipa-profile.  As
>  get_most_common_single_value can only return a single value, while this
>  multiple indirect call needs to store each hist value, I will
>  consider specializing it later.

Hi.

I would like to see this patch rebased on the suggested patch that will
change get_most_common_value.

Thanks,
Martin


Re: [PATCH] Generalize get_most_common_single_value to return k_th value & count

2019-07-14 Thread Martin Liška
On 7/15/19 4:42 AM, Xiong Hu Luo wrote:
> Currently get_most_common_single_value can only return the max hist
> value.  Add two parameters to enable this function to return the kth
> value if needed.
> 
> gcc/ChangeLog:
> 
>   2019-07-15  Xiong Hu Luo  
> 
>   * value-prof.c (get_most_common_single_value): Add input params
>   k_th and k, return the k_th value if k_th is true.
>   * value-prof.h (get_most_common_single_value): Add input params
>   k_th and k, default to false.
> ---
>  gcc/value-prof.c | 16 
>  gcc/value-prof.h |  7 +++
>  2 files changed, 15 insertions(+), 8 deletions(-)
> 
> diff --git a/gcc/value-prof.c b/gcc/value-prof.c
> index 32e6ddd8165..e1a3e0bd4b5 100644
> --- a/gcc/value-prof.c
> +++ b/gcc/value-prof.c
> @@ -719,9 +719,9 @@ gimple_divmod_fixed_value (gassign *stmt, tree value, 
> profile_probability prob,
>  
>  bool
>  get_most_common_single_value (gimple *stmt, const char *counter_type,

Hi.

I would rename the function as it's not going to return only the most common 
value.

> -   histogram_value hist,
> -   gcov_type *value, gcov_type *count,
> -   gcov_type *all)
> +   histogram_value hist, gcov_type *value,
> +   gcov_type *count, gcov_type *all, bool k_th,
> +   unsigned k)
>  {
>if (hist->hvalue.counters[2] == -1)
>  return false;
> @@ -743,7 +743,15 @@ get_most_common_single_value (gimple *stmt, const char 
> *counter_type,
>  
>*all = read_all;
>  
> -  if (c > *count)
> +  /* Return the kth value in hist instead of the max value for indirect
> +  multiple call usage.  */
> +  if (k_th && i == k)

This is probably wrong as the tuples in a histogram are not sorted by count.
I would recommend sorting them when we read them.  Then this function can
simply return the N-th tuple (see the sketch after the quoted patch below).

Thanks,
Martin

> + {
> +   *value = v;
> +   *count = c;
> +   break;
> +  }
> +  else if (c > *count)
>   {
> *value = v;
> *count = c;
> diff --git a/gcc/value-prof.h b/gcc/value-prof.h
> index ca846d08cbd..0a064a71f7d 100644
> --- a/gcc/value-prof.h
> +++ b/gcc/value-prof.h
> @@ -90,10 +90,9 @@ void stringop_block_profile (gimple *, unsigned int *, 
> HOST_WIDE_INT *);
>  gcall *gimple_ic (gcall *, struct cgraph_node *, profile_probability);
>  bool check_ic_target (gcall *, struct cgraph_node *);
>  bool get_most_common_single_value (gimple *stmt, const char *counter_type,
> -histogram_value hist,
> -gcov_type *value, gcov_type *count,
> -gcov_type *all);
> -
> +histogram_value hist, gcov_type *value,
> +gcov_type *count, gcov_type *all,
> +bool k_th = false, unsigned k = 0);
>  
>  /* In tree-profile.c.  */
>  extern void gimple_init_gcov_profiler (void);
> 
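
To make the suggestion above concrete, here is a minimal sketch (hypothetical
names and layout, not taken from the patch) of what the accessor could look
like once the (value, count) tuples are stored sorted by descending count:

#include <stdbool.h>

typedef long long gcov_type;   /* stand-in for GCC's gcov_type */

/* Sketch only: assumes the (value, count) pairs were sorted by descending
   count when the profile was read in, so the k-th most common value is a
   direct lookup instead of a scan.  */
static bool
get_nth_common_value (const gcov_type *pairs, unsigned n_pairs, unsigned k,
                      gcov_type *value, gcov_type *count)
{
  if (k >= n_pairs)
    return false;
  *value = pairs[2 * k];       /* flat layout: value0, count0, value1, ...  */
  *count = pairs[2 * k + 1];
  return true;
}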



Re: [PING^1][PATCH v4 3/3] PR80791 Consider doloop cmp use in ivopts

2019-07-14 Thread Bin.Cheng
On Fri, Jul 12, 2019 at 8:11 PM Richard Biener  wrote:
>
> On Wed, 10 Jul 2019, Kewen.Lin wrote:
>
> > Hi all,
> >
> > I'd like to gentle ping the below patch:
> > https://gcc.gnu.org/ml/gcc-patches/2019-06/msg01225.html
> >
> > The previous version for more context/background:
> > https://gcc.gnu.org/ml/gcc-patches/2019-06/msg01126.html
> >
> > Thanks a lot in advance!
>
> Again I would have hoped Bin to chime in here.
Sorry for missing this one, will get to the patch this week.  Sorry
again for the inconvenience.

Thanks,
bin
>
> Am I correct that doloop HW implementations are constrained
> by a decrement of one?  I see no code in the patch to constrain
> things this way.  I'm not familiar with the group code at all
> but I would have expected the patch to only affect costing
> of IVs of the appropriate form (decrement one and possibly
> no uses besides the one in the compare/decrement).  Since
> ivcanon already adds a canonical counter IV it's not
> necessary to generate an artificial candidate IV of the
> wanted style (that's something I might have expected as well).
>
> The rest should be just magic from the IVOPTs side?
>
> There might be the need to only consider at most one counter IV
> in the costing code.
>
> Richard.
>
> >
> > > on 2019/6/20 8:16 PM, Kewen.Lin wrote:
> > > Hi,
> > >
> > > Sorry, the previous patch is incomplete.
> > > New one attached.  Sorry for inconvenience.
> > >
> > >> on 2019/6/20 8:08 PM, Kewen.Lin wrote:
> > >> Hi Segher,
> > >>
> > >>> On Wed, Jun 19, 2019 at 07:47:34PM +0800, Kewen.Lin wrote:
> >  +/* Return true if count register for branch is supported.  */
> >  +
> >  +static bool
> >  +rs6000_have_count_reg_decr_p ()
> >  +{
> >  +  return flag_branch_on_count_reg;
> >  +}
> > >>>
> > >>> rs6000 unconditionally supports these instructions, not just when that
> > >>> flag is set.  If you need to look at the flag, the *caller* of this new
> > >>> hook should, not every implementation of the hook.  So just "return 
> > >>> true"
> > >>> here?
> > >>
> > >> Good point!  Updated it as hookpod.
> > >>
> >  +/* For doloop use, if the algothrim selects some candidate which 
> >  invalid for
> > >>>
> > >>> "algorithm", "which is invalid".
> > >>
> >  +   some cost like zero rather than original inifite cost.  The point 
> >  is to
> > >>>
> > >>> "infinite"
> > >>>
> > >>
> > >> Thanks for catching!  I should run spelling check next time.  :)
> > >>
> > >> New version attached with comments addressed.
> > >>
> > >>
> > >> Thanks,
> > >> Kewen
> > >>
> >
> >
>
> --
> Richard Biener 
> SUSE Linux GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany;
> GF: Felix Imendörffer, Mary Higgins, Sri Rasiah; HRB 21284 (AG Nürnberg)


Re: [PING^1][PATCH v4 3/3] PR80791 Consider doloop cmp use in ivopts

2019-07-14 Thread Kewen.Lin
Hi Richard,

on 2019/7/12 8:11 PM, Richard Biener wrote:
> On Wed, 10 Jul 2019, Kewen.Lin wrote:
> 
>> Hi all,
>>
>> I'd like to gentle ping the below patch:
>> https://gcc.gnu.org/ml/gcc-patches/2019-06/msg01225.html
>>
>> The previous version for more context/background:
>> https://gcc.gnu.org/ml/gcc-patches/2019-06/msg01126.html
>>
>> Thanks a lot in advance!
> 
> Again I would have hoped Bin to chime in here.
> 
> Am I correct that doloop HW implementations are constrained
> by a decrement of one?  I see no code in the patch to constrain
> things this way.  

If my understanding is correct, under have_count_reg_decr_p
we don't need to check for the decrement-by-one pattern; doloop
can transform the loop closing to a decrement by 1 since it knows
the total iteration count.  Since it uses a special hardware register
like the Power count register, we don't expect it to be shared with
other uses.  Btw, it also doesn't require the compare to be the
compare/decrement pattern, so this patch focuses more on whether this
compare is needed or not (whether it should be considered in selection or not).

> I'm not familiar with the group code at all
> but I would have expected the patch to only affect costing
> of IVs of the appropriate form (decrement one and possibly
> no uses besides the one in the compare/decrement).  

But since we select an IV cand for every IV use, we never know
whether this IV cand will have only that one use until the whole
selection is done.

> Since
> ivcanon already adds a canonical counter IV it's not
> necessary to generate an artificial candidate IV of the
> wanted style (that's something I might have expected as well).

This patch is only for the case guarded by have_count_reg_decr_p.
It doesn't require the artificial candidate IV or the
decrement-compare-jump code sequence.  The code on Power looks like:

  mtctr Rx     // move Rx (which holds total_counter) to the ctr register
  L:
    loop body...
    bnze L     // decrement the ctr register and jump to L if ctr is nonzero
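
As an illustrative aside (not from the patch), the kind of counted loop
this maps to at the source level is simply:

  /* Illustration only: the iteration count n is known on entry, so the
     doloop transform can move it into the count register up front and close
     the loop with one decrement-and-branch-if-nonzero, leaving the IV that
     feeds the exit compare with no other use.  */
  void
  scale (double *x, long n, double c)
  {
    for (long i = 0; i < n; i++)
      x[i] *= c;
  }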

> 
> The rest should be just magic from the IVOPTs side?
> 
> There might be the need to only consider at most one counter IV
> in the costing code.

The current patch doesn't introduce any IV cands but focuses on
zeroing the cost of the comp IV use, since we know it will be eliminated.
It still leverages the existing candidate selection algorithm to decide
the final optimal IV set.  The canonical counter IV is brought back only
if it isn't selected by any IV use, to keep the doloop comp use
rewriting correct; but that shouldn't affect anything, since the use will
be eliminated and is the only one, so the IV and everything related to it
will be removed as well.


Thanks,
Kewen



[PATCH] Generalize get_most_common_single_value to return k_th value & count

2019-07-14 Thread Xiong Hu Luo
Currently get_most_common_single_value can only return the max hist
value.  Add two parameters to enable this function to return the kth
value if needed.

gcc/ChangeLog:

2019-07-15  Xiong Hu Luo  

* value-prof.c (get_most_common_single_value): Add input params
k_th and k, return the k_th value if k_th is true.
* value-prof.h (get_most_common_single_value): Add input params
k_th and k, default to false.
---
 gcc/value-prof.c | 16 
 gcc/value-prof.h |  7 +++
 2 files changed, 15 insertions(+), 8 deletions(-)

diff --git a/gcc/value-prof.c b/gcc/value-prof.c
index 32e6ddd8165..e1a3e0bd4b5 100644
--- a/gcc/value-prof.c
+++ b/gcc/value-prof.c
@@ -719,9 +719,9 @@ gimple_divmod_fixed_value (gassign *stmt, tree value, 
profile_probability prob,
 
 bool
 get_most_common_single_value (gimple *stmt, const char *counter_type,
- histogram_value hist,
- gcov_type *value, gcov_type *count,
- gcov_type *all)
+ histogram_value hist, gcov_type *value,
+ gcov_type *count, gcov_type *all, bool k_th,
+ unsigned k)
 {
   if (hist->hvalue.counters[2] == -1)
 return false;
@@ -743,7 +743,15 @@ get_most_common_single_value (gimple *stmt, const char 
*counter_type,
 
   *all = read_all;
 
-  if (c > *count)
+  /* Return the kth value in hist instead of the max value for indirect
+multiple call usage.  */
+  if (k_th && i == k)
+   {
+ *value = v;
+ *count = c;
+ break;
+  }
+  else if (c > *count)
{
  *value = v;
  *count = c;
diff --git a/gcc/value-prof.h b/gcc/value-prof.h
index ca846d08cbd..0a064a71f7d 100644
--- a/gcc/value-prof.h
+++ b/gcc/value-prof.h
@@ -90,10 +90,9 @@ void stringop_block_profile (gimple *, unsigned int *, 
HOST_WIDE_INT *);
 gcall *gimple_ic (gcall *, struct cgraph_node *, profile_probability);
 bool check_ic_target (gcall *, struct cgraph_node *);
 bool get_most_common_single_value (gimple *stmt, const char *counter_type,
-  histogram_value hist,
-  gcov_type *value, gcov_type *count,
-  gcov_type *all);
-
+  histogram_value hist, gcov_type *value,
+  gcov_type *count, gcov_type *all,
+  bool k_th = false, unsigned k = 0);
 
 /* In tree-profile.c.  */
 extern void gimple_init_gcov_profiler (void);
-- 
2.22.0.428.g6d5b264208



Re: [Patch, fortran] PR90903 - Implement runtime checks for bit manipulation intrinsics

2019-07-14 Thread Steve Kargl
Harald, thanks for the patch.  I'm not the best person
for reading the trans-* files, but your patch and changes
look good to me.  If no one else speaks up in the next
day or so, please commit.

-- 
steve

On Sun, Jul 14, 2019 at 09:37:27PM +0200, Harald Anlauf wrote:
> Ping!
> 
> On 06/23/19 23:36, Harald Anlauf wrote:
> > Dear all,
> >
> > the attached patch provides run-time checks for the bit manipulation
> > intrinsic functions (IBSET/IBCLR/BTEST/SHIFT[RLA]/ISHFT/ISHFTC).
> > I am using only one testcase whose purpose is mainly to verify that
> > there are no false positives, which I consider essential, and one
> > "failing" test at the end.
> >
> > What is still missing are run-time checks for the subroutine MVBITS.
> > I am not sure yet how to handle that case (frontend or library?),
> > and I am open to suggestions.  For this purpose I intend to leave
> > the PR open until a good solution is found.
> >
> > Regtested on x86_64-pc-linux-gnu.  OK for trunk?
> >
> > Harald
> >
> > 2019-06-23  Harald Anlauf  
> >
> > PR fortran/90903
> > * libgfortran.h: Add mask for -fcheck=bits option.
> > * options.c (gfc_handle_runtime_check_option): Add option "bits"
> > to run-time checks selectable via -fcheck.
> > * trans-intrinsic.c (gfc_conv_intrinsic_btest)
> > (gfc_conv_intrinsic_singlebitop, gfc_conv_intrinsic_ibits)
> > (gfc_conv_intrinsic_shift, gfc_conv_intrinsic_ishft)
> > (gfc_conv_intrinsic_ishftc): Implement run-time checks for the
> > POS, LEN, SHIFT, and SIZE arguments.
> > * gfortran.texi: Document run-time checks for bit manipulation
> > intrinsics.
> > * invoke.texi: Document new -fcheck=bits option.
> >
> > 2019-06-23  Harald Anlauf  
> >
> > PR fortran/90903
> > * gfortran.dg/check_bits_1.f90: New testcase.
> >
> 

-- 
Steve
20170425 https://www.youtube.com/watch?v=VWUpyCsUKR4
20161221 https://www.youtube.com/watch?v=IbCHE-hONow


Ping again: [PATCH V2] Loop split upon semi-invariant condition (PR tree-optimization/89134)

2019-07-14 Thread Feng Xue OS
Some time has passed, so ping again.  I made this patch because it can
reward us with a 7% performance benefit in some real applications.  For
convenience, the optimization to be implemented is listed below again.
I hope for your comments on the patch, or design suggestions.  Thanks!


Suppose a loop as:

void f (std::map m)
{
  for (auto it = m.begin (); it != m.end (); ++it) {
    /* if (b) is semi-invariant. */
    if (b) {
      b = do_something();  /* Has effect on b */
    } else {
      /* No effect on b */
    }
    statements;  /* Also no effect on b */
  }
}

A transformation, kind of loop split, could be:

void f (std::map m)
{
  for (auto it = m.begin (); it != m.end (); ++it) {
    if (b) {
      b = do_something();
    } else {
      ++it;
      statements;
      break;
    }
    statements;
  }

  for (; it != m.end (); ++it) {
    statements;
  }
}

If "statements" contains nothing, the second loop becomes an empty one, which 
can be removed.
And if "statements" are straight line instructions, we get an opportunity to 
vectorize the second loop.
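
As a concrete illustration (a made-up example, not from the patch or its
testcases), here is a loop with a semi-invariant condition whose split-off
second loop is plain straight-line code:

/* Hypothetical example: once "budget" reaches zero it can never become
   nonzero again, so after the split the second loop is a plain sum that
   the vectorizer can handle.  */
int
sum_with_budget (const int *a, int n, int budget)
{
  int s = 0;
  for (int i = 0; i < n; i++)
    {
      if (budget)                        /* semi-invariant condition */
        budget = a[i] > 0 ? budget - 1 : 0;
      s += a[i];                         /* no effect on budget */
    }
  return s;
}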


Feng


From: Feng Xue OS
Sent: Tuesday, June 18, 2019 3:00 PM
To: Richard Biener; Michael Matz
Cc: gcc-patches@gcc.gnu.org
Subject: Ping: [PATCH V2] Loop split upon semi-invariant condition (PR 
tree-optimization/89134)

Richard & Michael,

   I made some adjustments to coding style and added test cases for this
version.

   Would you please take a look at the patch?  It is a little bit long and
might take some of your time.

Thanks a lot.


diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 9a46f93d89d..2334b184945 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,23 @@
+2019-06-18  Feng Xue 
+
+   PR tree-optimization/89134
+   * doc/invoke.texi (max-cond-loop-split-insns): Document new --params.
+   (min-cond-loop-split-prob): Likewise.
+   * params.def: Add max-cond-loop-split-insns, min-cond-loop-split-prob.
+   * passes.def (pass_cond_loop_split) : New pass.
+   * timevar.def (TV_COND_LOOP_SPLIT): New time variable.
+   * tree-pass.h (make_pass_cond_loop_split): New declaration.
+   * tree-ssa-loop-split.c (split_info): New class.
+   (find_vdef_in_loop, vuse_semi_invariant_p): New functions.
+   (ssa_semi_invariant_p, stmt_semi_invariant_p): Likewise.
+   (branch_removable_p, get_cond_invariant_branch): Likewise.
+   (is_cond_in_hidden_loop, compute_added_num_insns): Likewise.
+   (can_split_loop_on_cond, mark_cond_to_split_loop): Likewise.
+   (split_loop_for_cond, tree_ssa_split_loops_for_cond): Likewise.
+   (pass_data_cond_loop_split): New variable.
+   (pass_cond_loop_split): New class.
+   (make_pass_cond_loop_split): New function.
+
 2019-06-18  Kewen Lin  

 PR middle-end/80791
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index eaef4cd63d2..0427fede3d6 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -11352,6 +11352,14 @@ The maximum number of branches unswitched in a single 
loop.
 @item lim-expensive
 The minimum cost of an expensive expression in the loop invariant motion.

+@item max-cond-loop-split-insns
+The maximum number of insns to be increased due to loop split on
+semi-invariant condition statement.
+
+@item min-cond-loop-split-prob
+The minimum threshold for probability of semi-invaraint condition
+statement to trigger loop split.
+
 @item iv-consider-all-candidates-bound
 Bound on number of candidates for induction variables, below which
 all candidates are considered for each use in induction variable
diff --git a/gcc/params.def b/gcc/params.def
index 0db60951413..5384f7d1c4d 100644
--- a/gcc/params.def
+++ b/gcc/params.def
@@ -386,6 +386,20 @@ DEFPARAM(PARAM_MAX_UNSWITCH_LEVEL,
 "The maximum number of unswitchings in a single loop.",
 3, 0, 0)

+/* The maximum number of increased insns due to loop split on semi-invariant
+   condition statement.  */
+DEFPARAM(PARAM_MAX_COND_LOOP_SPLIT_INSNS,
+   "max-cond-loop-split-insns",
+   "The maximum number of insns to be increased due to loop split on "
+   "semi-invariant condition statement.",
+   100, 0, 0)
+
+DEFPARAM(PARAM_MIN_COND_LOOP_SPLIT_PROB,
+   "min-cond-loop-split-prob",
+   "The minimum threshold for probability of semi-invaraint condition "
+   "statement to trigger loop split.",
+   30, 0, 100)
+
 /* The maximum number of insns in loop header duplicated by the copy loop
headers pass.  */
 DEFPARAM(PARAM_MAX_LOOP_HEADER_INSNS,
diff --git a/gcc/passes.def b/gcc/passes.def
index ad2efabd385..bb32b88738e 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -261,6 +261,7 @@ along with GCC; see the file COPYI

Re: [patch, fortran] Pr87233 Constraint C1279 still followed after f2008 standard revision

2019-07-14 Thread Jerry DeLisle
Could not get the use of gfc_get_errors to work right; it missed one of the
errors in initialization_30.f90.  So I just did the deed.


Regards,

Jerry

Committing to svn+ssh://jvdeli...@gcc.gnu.org/svn/gcc/trunk ...
A   gcc/testsuite/gfortran.dg/initialization_30.f90
M   gcc/fortran/ChangeLog
M   gcc/fortran/expr.c
M   gcc/testsuite/ChangeLog
M   gcc/testsuite/gfortran.dg/initialization_14.f90
Committed r273484




Re: [PATCH] i386: Expand roundeven for SSE4.1+

2019-07-14 Thread Uros Bizjak
> This patch is for expanding roundeven inline for SSE4.1 and later.
> Note that this patch is to be applied on top of
> . The patch
> is bootstrapped and regression tested on x86_64-linux-gnu.

Actually, your patch at [1] is the way to go, but you need several
other changes to get x87 mode switching in order. Please also note
that there is no corresponding non-SSE4 ix86_expand_... function for
roundeven, so non-SSE4 SSE FP-math 2 expander has
to be disabled for the ROUNDEVEN int iterator. Please see the (otherwise
untested) attached patch, which fixes both issues.

[1] https://gcc.gnu.org/ml/gcc/2019-06/msg00352.html

Uros.
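
As a reference aside (not part of the patch), the round-half-to-even
semantics being expanded can be checked with nearbyint under the default
FE_TONEAREST rounding mode, which resolves halfway cases the same way
roundeven does:

#include <fenv.h>
#include <math.h>
#include <stdio.h>

int
main (void)
{
  fesetround (FE_TONEAREST);   /* default mode: ties go to the even integer */
  const double vals[] = { 0.5, 1.5, 2.5, -2.5, 3.5 };
  for (unsigned i = 0; i < sizeof vals / sizeof vals[0]; i++)
    printf ("roundeven (%4.1f) -> %4.1f\n", vals[i], nearbyint (vals[i]));
  /* Prints 0.0, 2.0, 2.0, -2.0, 4.0.  */
  return 0;
}
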
Index: builtins.c
===
--- builtins.c  (revision 273480)
+++ builtins.c  (working copy)
@@ -2056,6 +2056,7 @@ mathfn_built_in_2 (tree type, combined_fn fn)
 CASE_MATHFN (REMQUO)
 CASE_MATHFN_FLOATN (RINT)
 CASE_MATHFN_FLOATN (ROUND)
+CASE_MATHFN_FLOATN (ROUNDEVEN)
 CASE_MATHFN (SCALB)
 CASE_MATHFN (SCALBLN)
 CASE_MATHFN (SCALBN)
Index: builtins.def
===
--- builtins.def(revision 273480)
+++ builtins.def(working copy)
@@ -548,6 +548,12 @@ DEF_C99_BUILTIN(BUILT_IN_ROUNDL, "roundl",
 #define ROUND_TYPE(F) BT_FN_##F##_##F
 DEF_EXT_LIB_FLOATN_NX_BUILTINS (BUILT_IN_ROUND, "round", ROUND_TYPE, 
ATTR_CONST_NOTHROW_LEAF_LIST)
 #undef ROUND_TYPE
+DEF_EXT_LIB_BUILTIN(BUILT_IN_ROUNDEVEN, "roundeven", BT_FN_DOUBLE_DOUBLE, 
ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_EXT_LIB_BUILTIN(BUILT_IN_ROUNDEVENF, "roundevenf", BT_FN_FLOAT_FLOAT, 
ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_EXT_LIB_BUILTIN(BUILT_IN_ROUNDEVENL, "roundevenl", 
BT_FN_LONGDOUBLE_LONGDOUBLE, ATTR_CONST_NOTHROW_LEAF_LIST)
+#define ROUNDEVEN_TYPE(F) BT_FN_##F##_##F
+DEF_EXT_LIB_FLOATN_NX_BUILTINS (BUILT_IN_ROUNDEVEN, "roundeven", 
ROUNDEVEN_TYPE, ATTR_CONST_NOTHROW_LEAF_LIST)
+#undef ROUNDEVEN_TYPE
 DEF_EXT_LIB_BUILTIN(BUILT_IN_SCALB, "scalb", BT_FN_DOUBLE_DOUBLE_DOUBLE, 
ATTR_MATHFN_FPROUNDING_ERRNO)
 DEF_EXT_LIB_BUILTIN(BUILT_IN_SCALBF, "scalbf", BT_FN_FLOAT_FLOAT_FLOAT, 
ATTR_MATHFN_FPROUNDING_ERRNO)
 DEF_EXT_LIB_BUILTIN(BUILT_IN_SCALBL, "scalbl", 
BT_FN_LONGDOUBLE_LONGDOUBLE_LONGDOUBLE, ATTR_MATHFN_FPROUNDING_ERRNO)
Index: config/i386/i386.c
===
--- config/i386/i386.c  (revision 273480)
+++ config/i386/i386.c  (working copy)
@@ -13593,6 +13593,11 @@ ix86_i387_mode_needed (int entity, rtx_insn *insn)
 
   switch (entity)
 {
+case I387_ROUNDEVEN:
+  if (mode == I387_CW_ROUNDEVEN)
+   return mode;
+  break;
+
 case I387_TRUNC:
   if (mode == I387_CW_TRUNC)
return mode;
@@ -13627,6 +13632,7 @@ ix86_mode_needed (int entity, rtx_insn *insn)
   return ix86_dirflag_mode_needed (insn);
 case AVX_U128:
   return ix86_avx_u128_mode_needed (insn);
+case I387_ROUNDEVEN:
 case I387_TRUNC:
 case I387_FLOOR:
 case I387_CEIL:
@@ -13687,6 +13693,7 @@ ix86_mode_after (int entity, int mode, rtx_insn *i
   return mode;
 case AVX_U128:
   return ix86_avx_u128_mode_after (mode, insn);
+case I387_ROUNDEVEN:
 case I387_TRUNC:
 case I387_FLOOR:
 case I387_CEIL:
@@ -13739,6 +13746,7 @@ ix86_mode_entry (int entity)
   return ix86_dirflag_mode_entry ();
 case AVX_U128:
   return ix86_avx_u128_mode_entry ();
+case I387_ROUNDEVEN:
 case I387_TRUNC:
 case I387_FLOOR:
 case I387_CEIL:
@@ -13776,6 +13784,7 @@ ix86_mode_exit (int entity)
   return X86_DIRFLAG_ANY;
 case AVX_U128:
   return ix86_avx_u128_mode_exit ();
+case I387_ROUNDEVEN:
 case I387_TRUNC:
 case I387_FLOOR:
 case I387_CEIL:
@@ -13810,6 +13819,12 @@ emit_i387_cw_initialization (int mode)
 
   switch (mode)
 {
+case I387_CW_ROUNDEVEN:
+  /* round to nearest */
+  emit_insn (gen_andhi3 (reg, reg, GEN_INT (~0x0c00)));
+  slot = SLOT_CW_ROUNDEVEN;
+  break;
+
 case I387_CW_TRUNC:
   /* round toward zero (truncate) */
   emit_insn (gen_iorhi3 (reg, reg, GEN_INT (0x0c00)));
@@ -13856,6 +13871,7 @@ ix86_emit_mode_set (int entity, int mode, int prev
   if (mode == AVX_U128_CLEAN)
emit_insn (gen_avx_vzeroupper ());
   break;
+case I387_ROUNDEVEN:
 case I387_TRUNC:
 case I387_FLOOR:
 case I387_CEIL:
Index: config/i386/i386.h
===
--- config/i386/i386.h  (revision 273480)
+++ config/i386/i386.h  (working copy)
@@ -2471,6 +2471,7 @@ enum ix86_stack_slot
 {
   SLOT_TEMP = 0,
   SLOT_CW_STORED,
+  SLOT_CW_ROUNDEVEN,
   SLOT_CW_TRUNC,
   SLOT_CW_FLOOR,
   SLOT_CW_CEIL,
@@ -2482,6 +2483,7 @@ enum ix86_entity
 {
   X86_DIRFLAG = 0,
   AVX_U128,
+  I387_ROUNDEVEN,
   I387_TRUNC,
   I387_FLOOR,
   I387_CEIL,
@@ -2517,7 +2519,7 @@ enum avx_u128_state
 

Re: [Patch, fortran] PR90903 - Implement runtime checks for bit manipulation intrinsics

2019-07-14 Thread Harald Anlauf
Ping!

On 06/23/19 23:36, Harald Anlauf wrote:
> Dear all,
>
> the attached patch provides run-time checks for the bit manipulation
> intrinsic functions (IBSET/IBCLR/BTEST/SHIFT[RLA]/ISHFT/ISHFTC).
> I am using only one testcase whose purpose is mainly to verify that
> there are no false positives, which I consider essential, and one
> "failing" test at the end.
>
> What is still missing are run-time checks for the subroutine MVBITS.
> I am not sure yet how to handle that case (frontend or library?),
> and I am open to suggestions.  For this purpose I intend to leave
> the PR open until a good solution is found.
>
> Regtested on x86_64-pc-linux-gnu.  OK for trunk?
>
> Harald
>
> 2019-06-23  Harald Anlauf  
>
>   PR fortran/90903
>   * libgfortran.h: Add mask for -fcheck=bits option.
>   * options.c (gfc_handle_runtime_check_option): Add option "bits"
>   to run-time checks selectable via -fcheck.
>   * trans-intrinsic.c (gfc_conv_intrinsic_btest)
>   (gfc_conv_intrinsic_singlebitop, gfc_conv_intrinsic_ibits)
>   (gfc_conv_intrinsic_shift, gfc_conv_intrinsic_ishft)
>   (gfc_conv_intrinsic_ishftc): Implement run-time checks for the
>   POS, LEN, SHIFT, and SIZE arguments.
>   * gfortran.texi: Document run-time checks for bit manipulation
>   intrinsics.
>   * invoke.texi: Document new -fcheck=bits option.
>
> 2019-06-23  Harald Anlauf  
>
>   PR fortran/90903
>   * gfortran.dg/check_bits_1.f90: New testcase.
>



Re: Rewrite some jump.c routines to use flags

2019-07-14 Thread Eric Botcazou
> AIUI, neither ORDERED nor UNEQ trap on signalling NaNs.  Without this,
> the follow-on patch would fold
> 
>(and (ordered x y) (uneq x y)) -> (eq x y)
> 
> which is the same thing for quiet NaNs but not for signalling NaNs.

Note that GCC defaults to -fno-signaling-nans and the transformation would be 
valid in this mode.
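
A small illustration (added here for reference, not from the thread) of why
the fold is an identity for quiet NaNs: ORDERED means "neither operand is a
NaN" and UNEQ means "unordered or equal", so their conjunction collapses to
plain equality, and only a signalling NaN could tell the two forms apart via
the invalid-operation exception:

#include <math.h>
#include <stdio.h>

/* (and (ordered x y) (uneq x y)) spelled with the ISO C classification
   macro; for quiet NaNs this is indistinguishable from x == y.  */
static int
ordered_and_uneq (double x, double y)
{
  return !isunordered (x, y) && (isunordered (x, y) || x == y);
}

int
main (void)
{
  double qnan = NAN;
  printf ("%d %d %d\n",
          ordered_and_uneq (1.0, 1.0) == (1.0 == 1.0),    /* 1 */
          ordered_and_uneq (1.0, 2.0) == (1.0 == 2.0),    /* 1 */
          ordered_and_uneq (qnan, 1.0) == (qnan == 1.0)); /* 1 */
  return 0;
}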

-- 
Eric Botcazou


[PATCH, i386]: Adjust operand predicate of several test instructions

2019-07-14 Thread Uros Bizjak
Operand constraints accept only register and immediate operands, so
adjust operand predicate to reject memory operands.

2019-07-14  Uroš Bizjak  

* config/i386/i386.md (nonmemory_szext_operand): New mode attribute.
(test_ccno_1): Macroize insn pattern from testsi_ccno_1
and testdi_ccno_1 using SWI48 mode attribute.
(*testdi_1): Use x86_64_szext_nonmemory_operand instead of
x86_64_szext_general_operand.
(*testqi_1_maybe_si): Use nonmemory_operand instead of general_operand.
(*test_1): Use nonmemory_szext_operand mode attribute
instead of general_operand mode attribute.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Committed to mainline SVN.

Uros.
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index db5fa9ae3cae..58797baa6dc5 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -1122,6 +1122,12 @@
 (SI "x86_64_szext_general_operand")
 (DI "x86_64_szext_general_operand")])
 
+(define_mode_attr nonmemory_szext_operand
+   [(QI "nonmemory_operand")
+(HI "nonmemory_operand")
+(SI "x86_64_szext_nonmemory_operand")
+(DI "x86_64_szext_nonmemory_operand")])
+
 ;; Immediate operand predicate for integer modes.
 (define_mode_attr immediate_operand
[(QI "immediate_operand")
@@ -8118,11 +8124,12 @@
 ;; On Pentium, "test imm, reg" is pairable only with eax, ax, and al.
 ;; Note that this excludes ah.
 
-(define_expand "testsi_ccno_1"
+(define_expand "test_ccno_1"
   [(set (reg:CCNO FLAGS_REG)
(compare:CCNO
- (and:SI (match_operand:SI 0 "nonimmediate_operand")
- (match_operand:SI 1 "x86_64_nonmemory_operand"))
+ (and:SWI48
+   (match_operand:SWI48 0 "nonimmediate_operand")
+   (match_operand:SWI48 1 ""))
  (const_int 0)))])
 
 (define_expand "testqi_ccz_1"
@@ -8131,23 +8138,14 @@
 (match_operand:QI 1 "nonmemory_operand"))
 (const_int 0)))])
 
-(define_expand "testdi_ccno_1"
-  [(set (reg:CCNO FLAGS_REG)
-   (compare:CCNO
- (and:DI (match_operand:DI 0 "nonimmediate_operand")
- (match_operand:DI 1 "x86_64_szext_general_operand"))
- (const_int 0)))]
-  "TARGET_64BIT && !(MEM_P (operands[0]) && MEM_P (operands[1]))")
-
 (define_insn "*testdi_1"
   [(set (reg FLAGS_REG)
(compare
 (and:DI
  (match_operand:DI 0 "nonimmediate_operand" "%!*a,r,!*a,r,rm")
- (match_operand:DI 1 "x86_64_szext_general_operand" "Z,Z,e,e,re"))
+ (match_operand:DI 1 "x86_64_szext_nonmemory_operand" "Z,Z,e,e,re"))
 (const_int 0)))]
-  "TARGET_64BIT && ix86_match_ccmode (insn, CCNOmode)
-   && !(MEM_P (operands[0]) && MEM_P (operands[1]))"
+  "TARGET_64BIT && ix86_match_ccmode (insn, CCNOmode)"
   "@
test{l}\t{%k1, %k0|%k0, %k1}
test{l}\t{%k1, %k0|%k0, %k1}
@@ -8163,12 +8161,12 @@
 (compare
  (and:QI
(match_operand:QI 0 "nonimmediate_operand" "%!*a,q,qm,r")
-   (match_operand:QI 1 "general_operand" "n,n,qn,n"))
+   (match_operand:QI 1 "nonmemory_operand" "n,n,qn,n"))
  (const_int 0)))]
-   "!(MEM_P (operands[0]) && MEM_P (operands[1]))
-&& ix86_match_ccmode (insn,
-CONST_INT_P (operands[1])
-&& INTVAL (operands[1]) >= 0 ? CCNOmode : CCZmode)"
+
+  "ix86_match_ccmode (insn,
+ CONST_INT_P (operands[1])
+ && INTVAL (operands[1]) >= 0 ? CCNOmode : CCZmode)"
 {
   if (which_alternative == 3)
 {
@@ -8188,10 +8186,9 @@
(compare
 (and:SWI124
  (match_operand:SWI124 0 "nonimmediate_operand" "%!*a,,m")
- (match_operand:SWI124 1 "" ",,"))
+ (match_operand:SWI124 1 "" ",,"))
 (const_int 0)))]
-  "ix86_match_ccmode (insn, CCNOmode)
-   && !(MEM_P (operands[0]) && MEM_P (operands[1]))"
+  "ix86_match_ccmode (insn, CCNOmode)"
   "test{}\t{%1, %0|%0, %1}"
   [(set_attr "type" "test")
(set_attr "modrm" "0,1,1")


Re: [patch, libfortran] Adjust block size for libgfortran for unformatted reads

2019-07-14 Thread Steve Kargl
On Sun, Jul 14, 2019 at 12:07:58PM +0200, Thomas Koenig wrote:
> OK, so here is a new version.
> 
> I think the discussion has shown that enlaring the buffer makes sense,
> and that the buffer size for unformatted seems to be too bad.
> 
> I've reversed the names of the environment variables according to
> Bernhard's suggestion.
> 
> So, OK for trunk?
> 
> Also, what should we do about gcc-9?  I have now come to think
> that we should add the environment variables to set the buffer lengths,
> but leave the old default (8192).
> 
> What do you think?
> 

If you are inclined to back port a portion of the patch to 9-branch,
then bumping up the old default would seem to be the most important
part.  As dje noted, users seem to have an aversion to reading the
documentation, so finding the environment variables may not happen.

Isn't 8192 an internal implementation detail for libgfortran?  Can
bumping it to a larger value in 9-branch cause an issue for a normal
user?

-- 
Steve


Remove array_index inliner hint

2019-07-14 Thread Jan Hubicka
Hi,
The array_index hint marks functions that contain an array reference
indexed by a value that will become constant after inlining.
This hint is later used by ipa-inline in a rather aggressive way, updating
inline-insns-auto to inline-insns-single for such functions.  While tuning
-finline-functions for -O2 I noticed that this leads to 30% code size
growth of the SPEC version of GCC.  This is because all predicates like
register_operand contain such array references based on the mode parameter,
and this makes GCC conclude that they should all be inlined aggressively,
which bloats insn-* a lot.

This hint was quite an early experiment with propagating additional
information through the inliner.  I think it is better to actually account
for the instructions which will be generated for array indexing rather than
handling this specially.

This patch makes us account 1 instruction for every non-constant array
access.  This probably still makes the inliner overly optimistic about
inlining benefits, since these accesses will later be CSEed, but it kills a
bit of magic and makes things more robust.

This will make the inliner notice that the function will simplify and give
it higher priority for inlining, possibly still bypassing the bounds if a
big speedup is achieved.  This is however a lot rarer than before.
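
To illustrate the pattern involved (the example below is made up, not GCC
source): a helper whose array index becomes constant once its caller is
inlined is exactly what the old hint rewarded; it is now simply accounted
as one instruction while the index is non-constant.

/* Illustrative only: after inlining check_limit (v, 3) the index is
   constant and the load folds to limits[3]; before inlining it is a
   variable-indexed access, now costed as one instruction instead of
   getting a special hint.  */
static const int limits[8] = { 1, 2, 4, 8, 16, 32, 64, 128 };

static int
check_limit (int value, int mode)
{
  return value < limits[mode];
}

int
caller (int value)
{
  return check_limit (value, 3);
}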

Bootstrapped/regtested x86_64-linux, committed.

Honza

* ipa-fnsummary.c (ipa_dump_hints): Do not dump array_index.
(ipa_fn_summary::~ipa_fn_summary): Do not destroy array_index.
(ipa_fn_summary_t::duplicate): Do not duplicate array_index.
(array_index_predicate): Remove.
(analyze_function_body): Account cost for variable ofsetted array
indexing.
(estimate_node_size_and_time): Do not compute array index hint.
(ipa_merge_fn_summary_after_inlining): Do not merge array index hint.
(inline_read_section): Do not read array index hint.
(ipa_fn_summary_write): Do not write array index hint.
* doc/invoke.texi (ipa-cp-array-index-hint-bonus): Remove.
* ipa-cp.c (hint_time_bonus): Remove.
* ipa-fnsummary.h (ipa_hints_vals): Remove array_index.
(ipa_fnsummary): Remove array_index.
* ipa-inline.c (want_inline_small_function_p): Do not use
array_index.
(edge_badness): Likewise.
* params.def (PARAM_IPA_CP_ARRAY_INDEX_HINT_BONUS): Remove.

Index: doc/invoke.texi
===
--- doc/invoke.texi (revision 273450)
+++ doc/invoke.texi (working copy)
@@ -11895,12 +11895,6 @@ of iterations of a loop known, it adds a
 @option{ipa-cp-loop-hint-bonus} to the profitability score of
 the candidate.
 
-@item ipa-cp-array-index-hint-bonus
-When IPA-CP determines that a cloning candidate would make the index of
-an array access known, it adds a bonus of
-@option{ipa-cp-array-index-hint-bonus} to the profitability
-score of the candidate.
-
 @item ipa-max-aa-steps
 During its analysis of function bodies, IPA-CP employs alias analysis
 in order to track values pointed to by function parameters.  In order
Index: ipa-cp.c
===
--- ipa-cp.c(revision 273450)
+++ ipa-cp.c(working copy)
@@ -2607,8 +2607,6 @@ hint_time_bonus (ipa_hints hints)
   int result = 0;
   if (hints & (INLINE_HINT_loop_iterations | INLINE_HINT_loop_stride))
 result += PARAM_VALUE (PARAM_IPA_CP_LOOP_HINT_BONUS);
-  if (hints & INLINE_HINT_array_index)
-result += PARAM_VALUE (PARAM_IPA_CP_ARRAY_INDEX_HINT_BONUS);
   return result;
 }
 
Index: ipa-fnsummary.c
===
--- ipa-fnsummary.c (revision 273450)
+++ ipa-fnsummary.c (working copy)
@@ -134,11 +134,6 @@ ipa_dump_hints (FILE *f, ipa_hints hints
   hints &= ~INLINE_HINT_declared_inline;
   fprintf (f, " declared_inline");
 }
-  if (hints & INLINE_HINT_array_index)
-{
-  hints &= ~INLINE_HINT_array_index;
-  fprintf (f, " array_index");
-}
   if (hints & INLINE_HINT_known_hot)
 {
   hints &= ~INLINE_HINT_known_hot;
@@ -549,8 +544,6 @@ ipa_fn_summary::~ipa_fn_summary ()
 edge_predicate_pool.remove (loop_iterations);
   if (loop_stride)
 edge_predicate_pool.remove (loop_stride);
-  if (array_index)
-edge_predicate_pool.remove (array_index);
   vec_free (conds);
   vec_free (size_time_table);
 }
@@ -703,8 +696,6 @@ ipa_fn_summary_t::duplicate (cgraph_node
  possible_truths);
   remap_hint_predicate_after_duplication (&info->loop_stride,
  possible_truths);
-  remap_hint_predicate_after_duplication (&info->array_index,
- possible_truths);
 
   /* If inliner or someone after inliner will ever start producing
  non-trivial clones, we will get trouble with lack of information
@@ -727,12 +718,6 @@ ipa_fn_summary_t::duplic

Re: [PATCH] gdbhooks.py: dump-fn, dot-fn: cast ret values of fopen/fclose

2019-07-14 Thread Tom de Vries
On 09-07-19 16:10, Vladislav Ivanishin wrote:
> Hi,
> 
> Without the patch, I see these error messages with gdb 8.3:
> 
> (gdb) Python Exception  'fclose@@GLIBC_2.2.5' has
> unknown return type; cast the call to its declared return type:
> (gdb) Error occurred in Python: 'fclose@@GLIBC_2.2.5' has unknown
> return type; cast the call to its declared return type
> 
> One doesn't have to use python to reproduce that: start debugging cc1
> and issue
> 
> (gdb) call fopen ("", "")
> 
> This actually looks like a GDB bug: from looking at cc1's (built with
> either -g or -ggdb3) DWARF with either dwarfdump or readelf, I see that
> there is info about the return type (for fopen it's FILE *, and `ptype
> FILE` in gdb gives the full struct).
> 
> Tom, you contributed the {dot,dump}-fn functions.  Do they still work
> for you without the patch?  (And if so, do you happen to have debuginfo
> for libc installed on your machine?)
> 

I just observed the same problem, and managed to make it go away by
installing libc debuginfo.

> I think the patch itself is obvious (as a workaround).  I've only
> tested it with the version of GDB I have (8.3, which is the latest
> release), but expect this to work for older versions as well.
> 
> (Comparisons of gdb.Value's returned from parse_and_eval, like fp == 0
> and their conversion to python strings in "%s" % fp work automagically.)
> 
> 

LGTM. I think you can check this in under the obvious rule.

Thanks,
- Tom


Re: [patch, libfortran] Adjust block size for libgfortran for unformatted reads

2019-07-14 Thread Thomas Koenig

... of course, better with the actual patch.

Index: gcc/fortran/gfortran.texi
===
--- gcc/fortran/gfortran.texi	(Revision 273183)
+++ gcc/fortran/gfortran.texi	(Arbeitskopie)
@@ -611,6 +611,8 @@ Malformed environment variables are silently ignor
 * GFORTRAN_LIST_SEPARATOR::  Separator for list output
 * GFORTRAN_CONVERT_UNIT::  Set endianness for unformatted I/O
 * GFORTRAN_ERROR_BACKTRACE:: Show backtrace on run-time errors
+* GFORTRAN_FORMATTED_BUFFER_SIZE:: Buffer size for formatted files.
+* GFORTRAN_UNFORMATTED_BUFFER_SIZE:: Buffer size for unformatted files.
 @end menu
 
 @node TMPDIR
@@ -782,6 +784,20 @@ the backtracing, set the variable to @samp{n}, @sa
 Default is to print a backtrace unless the @option{-fno-backtrace}
 compile option was used.
 
+@node GFORTRAN_FORMATTED_BUFFER_SIZE
+@section @env{GFORTRAN_FORMATTED_BUFFER_SIZE}---Set buffer size for formatted I/O
+
+The @env{GFORTRAN_FORMATTED_BUFFER_SIZE} environment variable
+specifies buffer size in bytes to be used for formatted output.
+The default value is 8192.
+
+@node GFORTRAN_UNFORMATTED_BUFFER_SIZE
+@section @env{GFORTRAN_UNFORMATTED_BUFFER_SIZE}---Set buffer size for unformatted I/O
+
+The @env{GFORTRAN_UNFORMATTED_BUFFER_SIZE} environment variable
+specifies buffer size in bytes to be used for unformatted output.
+The default value is 131072.
+
 @c =
 @c PART II: LANGUAGE REFERENCE
 @c =
Index: libgfortran/io/unix.c
===
--- libgfortran/io/unix.c	(Revision 273183)
+++ libgfortran/io/unix.c	(Arbeitskopie)
@@ -193,7 +193,8 @@ fallback_access (const char *path, int mode)
 
 /* Unix and internal stream I/O module */
 
-static const int BUFFER_SIZE = 8192;
+static const int FORMATTED_BUFFER_SIZE_DEFAULT = 8192;
+static const int UNFORMATTED_BUFFER_SIZE_DEFAULT = 128*1024;
 
 typedef struct
 {
@@ -205,6 +206,7 @@ typedef struct
   gfc_offset file_length;	/* Length of the file. */
 
   char *buffer; /* Pointer to the buffer.  */
+  ssize_t buffer_size;   /* Length of the buffer.  */
   int fd;   /* The POSIX file descriptor.  */
 
   int active;			/* Length of valid bytes in the buffer */
@@ -592,9 +594,9 @@ buf_read (unix_stream *s, void *buf, ssize_t nbyte
   && raw_seek (s, new_logical, SEEK_SET) < 0)
 return -1;
   s->buffer_offset = s->physical_offset = new_logical;
-  if (to_read <= BUFFER_SIZE/2)
+  if (to_read <= s->buffer_size/2)
 {
-  did_read = raw_read (s, s->buffer, BUFFER_SIZE);
+  did_read = raw_read (s, s->buffer, s->buffer_size);
 	  if (likely (did_read >= 0))
 	{
 	  s->physical_offset += did_read;
@@ -632,11 +634,11 @@ buf_write (unix_stream *s, const void *buf, ssize_
 s->buffer_offset = s->logical_offset;
 
   /* Does the data fit into the buffer?  As a special case, if the
- buffer is empty and the request is bigger than BUFFER_SIZE/2,
+ buffer is empty and the request is bigger than s->buffer_size/2,
  write directly. This avoids the case where the buffer would have
  to be flushed at every write.  */
-  if (!(s->ndirty == 0 && nbyte > BUFFER_SIZE/2)
-  && s->logical_offset + nbyte <= s->buffer_offset + BUFFER_SIZE
+  if (!(s->ndirty == 0 && nbyte > s->buffer_size/2)
+  && s->logical_offset + nbyte <= s->buffer_offset + s->buffer_size
   && s->buffer_offset <= s->logical_offset
   && s->buffer_offset + s->ndirty >= s->logical_offset)
 {
@@ -651,7 +653,7 @@ buf_write (unix_stream *s, const void *buf, ssize_
  the request is bigger than the buffer size, write directly
  bypassing the buffer.  */
   buf_flush (s);
-  if (nbyte <= BUFFER_SIZE/2)
+  if (nbyte <= s->buffer_size/2)
 {
   memcpy (s->buffer, buf, nbyte);
   s->buffer_offset = s->logical_offset;
@@ -688,7 +690,7 @@ buf_write (unix_stream *s, const void *buf, ssize_
 static int
 buf_markeor (unix_stream *s)
 {
-  if (s->unbuffered || s->ndirty >= BUFFER_SIZE / 2)
+  if (s->unbuffered || s->ndirty >= s->buffer_size / 2)
 return buf_flush (s);
   return 0;
 }
@@ -765,11 +767,32 @@ static const struct stream_vtable buf_vtable = {
 };
 
 static int
-buf_init (unix_stream *s)
+buf_init (unix_stream *s, bool unformatted)
 {
   s->st.vptr = &buf_vtable;
 
-  s->buffer = xmalloc (BUFFER_SIZE);
+  /* Try to guess a good value for the buffer size.  For formatted
+ I/O, we use so many CPU cycles converting the data that there is
+ more sense in converving memory and especially cache.  For
+ unformatted, a bigger block can have a large impact in some
+ environments.  */
+
+  if (unformatted)
+{
+  if (options.unformatted_buffer_size > 0)
+	s->buffer_size = options.unformatted_buffer_size;
+  

Re: [patch, libfortran] Adjust block size for libgfortran for unformatted reads

2019-07-14 Thread Thomas Koenig

OK, so here is a new version.

I think the discussion has shown that enlarging the buffer makes sense,
and that the buffer size for unformatted seems to have been too small.

I've reversed the names of the environment variables according to
Bernhard's suggestion.

So, OK for trunk?

Also, what should we do about gcc-9?  I have now come to think
that we should add the environment variables to set the buffer lengths,
but leave the old default (8192).

What do you think?

Regards

Thomas

2019-07-14  Thomas König  

PR libfortran/91030
* gfortran.texi (GFORTRAN_FORMATTED_BUFFER_SIZE): Document
(GFORTRAN_UNFORMATTED_BUFFER_SIZE): Likewise.

2019-07-14  Thomas König  

PR libfortran/91030
* io/unix.c (BUFFER_SIZE): Delete.
(BUFFER_FORMATTED_SIZE_DEFAULT): New variable.
(BUFFER_UNFORMATTED_SIZE_DEFAULT): New variable.
(unix_stream): Add buffer_size.
(buf_read): Use s->buffer_size instead of BUFFER_SIZE.
(buf_write): Likewise.
(buf_init): Add argument unformatted.  Handle block sizes
for unformatted vs. formatted, using defaults if provided.
(fd_to_stream): Add argument unformatted in call to buf_init.
* libgfortran.h (options_t): Add buffer_size_formatted and
buffer_size_unformatted.
* runtime/environ.c (variable_table): Add
GFORTRAN_UNFORMATTED_BUFFER_SIZE and
GFORTRAN_FORMATTED_BUFFER_SIZE.


Backports to 9.2

2019-07-14 Thread Jakub Jelinek
Hi!

I've backported 3 patches from trunk to gcc-9-branch, bootstrapped/regtested
them on x86_64-linux and i686-linux, committed to gcc-9-branch.

Jakub
2019-07-14  Jakub Jelinek  

Backported from mainline
2019-07-04  Jakub Jelinek  

PR rtl-optimization/90756
* explow.c (promote_ssa_mode): Always use TYPE_MODE, don't bypass it
for VECTOR_TYPE_P.

* gcc.dg/pr90756.c: New test.

--- gcc/explow.c(revision 273035)
+++ gcc/explow.c(revision 273036)
@@ -892,16 +892,7 @@ promote_ssa_mode (const_tree name, int *
 
   tree type = TREE_TYPE (name);
   int unsignedp = TYPE_UNSIGNED (type);
-  machine_mode mode = TYPE_MODE (type);
-
-  /* Bypass TYPE_MODE when it maps vector modes to BLKmode.  */
-  if (mode == BLKmode)
-{
-  gcc_assert (VECTOR_TYPE_P (type));
-  mode = type->type_common.mode;
-}
-
-  machine_mode pmode = promote_mode (type, mode, &unsignedp);
+  machine_mode pmode = promote_mode (type, TYPE_MODE (type), &unsignedp);
   if (punsignedp)
 *punsignedp = unsignedp;
 
--- gcc/testsuite/gcc.dg/pr90756.c  (nonexistent)
+++ gcc/testsuite/gcc.dg/pr90756.c  (revision 273036)
@@ -0,0 +1,26 @@
+/* PR rtl-optimization/90756 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -Wno-psabi" } */
+/* { dg-additional-options "-mno-sse" { target ia32 } } */
+
+typedef float B __attribute__((vector_size(4 * sizeof (float))));
+typedef unsigned long long C __attribute__((vector_size(4 * sizeof (long long))));
+typedef short D __attribute__((vector_size(4 * sizeof (short))));
+B z;
+void foo (C);
+C bar (D);
+B baz ();
+D qux (B);
+
+void
+quux (int x)
+{
+  B n = z, b = z;
+  while (1)
+switch (x)
+  {
+  case 0: n = baz (); /* FALLTHRU */
+  case 1: { B o = n; n = b; b = o; } /* FALLTHRU */
+  case 2: { D u = qux (b); C v = bar (u); foo (v); }
+  }
+}
2019-07-14  Jakub Jelinek  

Backported from mainline
2019-07-04  Jakub Jelinek  

PR middle-end/78884
* gimplify.c (struct gimplify_omp_ctx): Add add_safelen1 member.
(gimplify_bind_expr): If seeing TREE_ADDRESSABLE VLA inside of simd
loop body, set ctx->add_safelen1 instead of making it GOVD_PRIVATE.
(gimplify_adjust_omp_clauses): Add safelen (1) clause if
ctx->add_safelen1 is set.

* gcc.dg/gomp/pr78884.c: New test.

--- gcc/gimplify.c  (revision 273095)
+++ gcc/gimplify.c  (revision 273096)
@@ -210,6 +210,7 @@ struct gimplify_omp_ctx
   bool combined_loop;
   bool distribute;
   bool target_firstprivatize_array_bases;
+  bool add_safelen1;
   int defaultmap[4];
 };
 
@@ -1319,12 +1320,17 @@ gimplify_bind_expr (tree *expr_p, gimple
  || splay_tree_lookup (ctx->variables,
(splay_tree_key) t) == NULL))
{
+ int flag = GOVD_LOCAL;
  if (ctx->region_type == ORT_SIMD
  && TREE_ADDRESSABLE (t)
  && !TREE_STATIC (t))
-   omp_add_variable (ctx, t, GOVD_PRIVATE | GOVD_SEEN);
- else
-   omp_add_variable (ctx, t, GOVD_LOCAL | GOVD_SEEN);
+   {
+ if (TREE_CODE (DECL_SIZE_UNIT (t)) != INTEGER_CST)
+   ctx->add_safelen1 = true;
+ else
+   flag = GOVD_PRIVATE;
+   }
+ omp_add_variable (ctx, t, flag | GOVD_SEEN);
}
 
  DECL_SEEN_IN_BIND_EXPR_P (t) = 1;
@@ -9684,6 +9690,19 @@ gimplify_adjust_omp_clauses (gimple_seq
   omp_find_stores_op, &wi);
}
 }
+
+  if (ctx->add_safelen1)
+{
+  /* If there are VLAs in the body of simd loop, prevent
+vectorization.  */
+  gcc_assert (ctx->region_type == ORT_SIMD);
+  c = build_omp_clause (UNKNOWN_LOCATION, OMP_CLAUSE_SAFELEN);
+  OMP_CLAUSE_SAFELEN_EXPR (c) = integer_one_node;
+  OMP_CLAUSE_CHAIN (c) = *list_p;
+  *list_p = c;
+  list_p = &OMP_CLAUSE_CHAIN (c);
+}
+
   while ((c = *list_p) != NULL)
 {
   splay_tree_node n;
--- gcc/testsuite/gcc.dg/gomp/pr78884.c (nonexistent)
+++ gcc/testsuite/gcc.dg/gomp/pr78884.c (revision 273096)
@@ -0,0 +1,16 @@
+/* PR middle-end/78884 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fopenmp" } */
+
+void bar (int *);
+
+void
+foo (int n)
+{
+#pragma omp simd
+  for (int i = 0; i < 1024; i++)
+{
+  int vla[n];
+  bar (vla);
+}
+}
2019-07-14  Jakub Jelinek  

Backported from mainline
2019-07-13  Jakub Jelinek  

PR c/91149
* c-omp.c (c_omp_split_clauses): Fix a pasto in
OMP_CLAUSE_REDUCTION_TASK handling.

* c-c++-common/gomp/reduction-task-3.c: New test.

--- gcc/c-family/c-omp.c(revision 273464)
+++ gcc/c-family/c-omp.c(revision 273465)
@@ -1667,7 +1667,7 @@ c_omp_split_clauses (location_t loc, enu
}
  else if (code != OMP_SECTIONS
 

[PATCH] rs6000: Shut up -Wformat-diag a little more

2019-07-14 Thread Segher Boessenkool
2019-07-14  Segher Boessenkool  

PR target/91148
* config/rs6000/rs6000-c.c (altivec_resolve_overloaded_builtin): Remove
superfluous "builtin function" phrasing.

gcc/testsuite/
* gcc.target/powerpc/bfp/scalar-extract-exp-2.c: Adjust.
* gcc.target/powerpc/bfp/scalar-extract-sig-2.c: Adjust.
* gcc.target/powerpc/bfp/scalar-insert-exp-2.c: Adjust.
* gcc.target/powerpc/bfp/scalar-insert-exp-5.c: Adjust.
* gcc.target/powerpc/bfp/scalar-insert-exp-8.c: Adjust.
* gcc.target/powerpc/byte-in-set-2.c: Adjust.
* gcc.target/powerpc/cmpb-3.c: Adjust.
* gcc.target/powerpc/vsu/vec-all-nez-7.c: Adjust.
* gcc.target/powerpc/vsu/vec-any-eqz-7.c: Adjust.
* gcc.target/powerpc/vsu/vec-xl-len-13.c: Adjust.
* gcc.target/powerpc/vsu/vec-xst-len-12.c: Adjust.

---
 gcc/config/rs6000/rs6000-c.c| 3 +--
 gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-2.c | 2 +-
 gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-sig-2.c | 2 +-
 gcc/testsuite/gcc.target/powerpc/bfp/scalar-insert-exp-2.c  | 2 +-
 gcc/testsuite/gcc.target/powerpc/bfp/scalar-insert-exp-5.c  | 2 +-
 gcc/testsuite/gcc.target/powerpc/bfp/scalar-insert-exp-8.c  | 2 +-
 gcc/testsuite/gcc.target/powerpc/byte-in-set-2.c| 2 +-
 gcc/testsuite/gcc.target/powerpc/cmpb-3.c   | 2 +-
 gcc/testsuite/gcc.target/powerpc/vsu/vec-all-nez-7.c| 2 +-
 gcc/testsuite/gcc.target/powerpc/vsu/vec-any-eqz-7.c| 2 +-
 gcc/testsuite/gcc.target/powerpc/vsu/vec-xl-len-13.c| 2 +-
 gcc/testsuite/gcc.target/powerpc/vsu/vec-xst-len-12.c   | 2 +-
 12 files changed, 12 insertions(+), 13 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-c.c b/gcc/config/rs6000/rs6000-c.c
index 7854082..7f0cdc7 100644
--- a/gcc/config/rs6000/rs6000-c.c
+++ b/gcc/config/rs6000/rs6000-c.c
@@ -7036,8 +7036,7 @@ altivec_resolve_overloaded_builtin (location_t loc, tree 
fndecl,
name, internal_name);
  }
else
- error ("builtin function %qs not supported in this compiler "
-"configuration", name);
+ error ("%qs is not supported in this compiler configuration", name);
/* If an error-representing  result tree was returned from
   altivec_build_resolved_builtin above, use it.  */
return (result != NULL) ? result : error_mark_node;
diff --git a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-2.c 
b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-2.c
index b04462f..9221806 100644
--- a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-2.c
+++ b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-2.c
@@ -14,7 +14,7 @@ get_exponent (double *p)
 {
   double source = *p;
 
-  return scalar_extract_exp (source);  /* { dg-error 
"'__builtin_vec_scalar_extract_exp' not supported in this compiler 
configuration" } */
+  return scalar_extract_exp (source);  /* { dg-error 
"'__builtin_vec_scalar_extract_exp' is not supported in this compiler 
configuration" } */
 }
 
 
diff --git a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-sig-2.c 
b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-sig-2.c
index c912879..e24d4bd 100644
--- a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-sig-2.c
+++ b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-sig-2.c
@@ -12,5 +12,5 @@ get_significand (double *p)
 {
   double source = *p;
 
-  return __builtin_vec_scalar_extract_sig (source); /* { dg-error 
"'__builtin_vec_scalar_extract_sig' not supported in this compiler 
configuration" } */
+  return __builtin_vec_scalar_extract_sig (source); /* { dg-error 
"'__builtin_vec_scalar_extract_sig' is not supported in this compiler 
configuration" } */
 }
diff --git a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-insert-exp-2.c 
b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-insert-exp-2.c
index af27099..feb9431 100644
--- a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-insert-exp-2.c
+++ b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-insert-exp-2.c
@@ -16,5 +16,5 @@ insert_exponent (unsigned long long int *significand_p,
   unsigned long long int significand = *significand_p;
   unsigned long long int exponent = *exponent_p;
 
-  return scalar_insert_exp (significand, exponent); /* { dg-error 
"'__builtin_vec_scalar_insert_exp' not supported in this compiler 
configuration" } */
+  return scalar_insert_exp (significand, exponent); /* { dg-error 
"'__builtin_vec_scalar_insert_exp' is not supported in this compiler 
configuration" } */
 }
diff --git a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-insert-exp-5.c 
b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-insert-exp-5.c
index 05b98d9..0e5683d 100644
--- a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-insert-exp-5.c
+++ b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-insert-exp-5.c
@@ -16,5 +16,5 @@ insert_exponent (double *significand_p,
   double significand = *significand_p;
   unsigned long