[PING 3] Ability to remap file names in __FILE__, etc (PR other/70268)

2017-12-13 Thread Boris Kolpackov
Hi,

I would like to again ping this patch:

https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01451.html

It has been reviewed (with thanks) by David Malcolm[1] and
Martin Sebor[2]. Their concerns are addressed in the latest
revision of the patch:

https://gcc.gnu.org/ml/gcc-patches/2017-12/msg00544.html

I am still hoping this will make it into GCC 8.

[1] https://gcc.gnu.org/ml/gcc-patches/2017-12/msg00322.html
[2] https://gcc.gnu.org/ml/gcc-patches/2017-12/msg00398.html

Thanks,
Boris


Re: [PATCH][Middle-end]79538 missing -Wformat-overflow with %s and non-member array arguments

2017-12-13 Thread Qing Zhao
Hi,

I updated gimple-fold.c as you suggested, then bootstrapped and re-tested on
both x86 and aarch64, with no issues.


diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index 353a46e..eb6a87a 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -1323,6 +1323,19 @@ get_range_strlen (tree arg, tree length[2], bitmap *visited, int type,
 the array could have zero length.  */
  *minlen = ssize_int (0);
}
+
+ if (VAR_P (arg) 
+ && TREE_CODE (TREE_TYPE (arg)) == ARRAY_TYPE)
+   {
+ val = TYPE_SIZE_UNIT (TREE_TYPE (arg));
+ if (!val || TREE_CODE (val) != INTEGER_CST || integer_zerop (val))
+   return false;
+ val = wide_int_to_tree (TREE_TYPE (val), 
+ wi::sub(wi::to_wide (val), 1));
+ /* Set the minimum size to zero since the string in
+the array could have zero length.  */
+ *minlen = ssize_int (0);
+   }
}


I plan to commit the change very soon.
Let me know if you have any further comments.
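
For anyone following along, the effect of the new hunk can be sketched
outside GCC in plain C.  This is only an illustration under my reading of
the diff, not the GCC code: struct strlen_range and array_strlen_range are
names made up here.  For a char array object of N bytes, the string length
is taken to lie in [0, N - 1], and a zero size is rejected, much as
integer_zerop (val) is in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the [min, max] length pair that
   get_range_strlen computes; not a GCC type.  */
struct strlen_range
{
  size_t min;
  size_t max;
};

/* Given the size in bytes of a char array object, compute the range of
   possible string lengths.  Fail for a zero size; otherwise the maximum
   is size - 1 (room for the terminating NUL) and the minimum is zero
   because the array may hold an empty string.  */
static bool
array_strlen_range (size_t array_bytes, struct strlen_range *out)
{
  assert (out != NULL);
  if (array_bytes == 0)
    return false;
  out->min = 0;
  out->max = array_bytes - 1;
  return true;
}
```

With char a[4], array_strlen_range (sizeof a, &r) yields r.max == 3, which
is the bound a %s directive needs in order to be checked against the
destination size.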

thanks.

Qing

==

The updated full patch is as follows:

gcc/ChangeLog

2017-12-13  Qing Zhao  

 PR middle_end/79538
 * gimple-fold.c (get_range_strlen): Add the handling of non-member array.

gcc/fortran/ChangeLog

2017-12-13  Qing Zhao  

  PR middle_end/79538
  * class.c (gfc_build_class_symbol): Replace call to
  sprintf with xasprintf to avoid format-overflow warning.
  (generate_finalization_wrapper): Likewise.
  (gfc_find_derived_vtab): Likewise.
  (find_intrinsic_vtab): Likewise.
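
The sprintf-to-xasprintf replacement trades a fixed-size stack buffer for a
heap buffer sized to the formatted output.  A minimal standalone sketch of
that idea follows; alloc_printf is a hypothetical helper written for this
note with standard C only, and is not libiberty's implementation.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper in the spirit of xasprintf: measure the output
   with a first vsnprintf pass, allocate exactly that much, then format
   for real.  Aborts on failure.  The caller frees the result.  */
static char *
alloc_printf (const char *fmt, ...)
{
  va_list ap, ap2;
  va_start (ap, fmt);
  va_copy (ap2, ap);
  int len = vsnprintf (NULL, 0, fmt, ap);	/* Measure only.  */
  va_end (ap);
  if (len < 0)
    abort ();

  char *buf = malloc ((size_t) len + 1);
  if (buf == NULL)
    abort ();

  int written = vsnprintf (buf, (size_t) len + 1, fmt, ap2);
  va_end (ap2);
  assert (written == len);
  (void) written;
  return buf;
}
```

Because the buffer is sized from the arguments, there is no fixed bound for
-Wformat-overflow to complain about; the cost is that the caller now owns
the memory and has to free it.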


gcc/c-family/ChangeLog

2017-12-13  Qing Zhao  

 PR middle_end/79538
 * c-cppbuiltin.c (builtin_define_with_hex_fp_value):
 Adjust the size of buf1 and buf2, add a new buf to avoid
 format-overflow warning.

gcc/testsuite/ChangeLog

2017-12-13  Qing Zhao  

 PR middle_end/79538
 * gcc.dg/pr79538.c: New test.

---
 gcc/c-family/c-cppbuiltin.c| 10 -
 gcc/fortran/class.c| 49 --
 gcc/gimple-fold.c  | 13 +++
 gcc/testsuite/gcc.dg/pr79538.c | 23 
 4 files changed, 69 insertions(+), 26 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr79538.c

diff --git a/gcc/c-family/c-cppbuiltin.c b/gcc/c-family/c-cppbuiltin.c
index 2ac9616..9e33aed 100644
--- a/gcc/c-family/c-cppbuiltin.c
+++ b/gcc/c-family/c-cppbuiltin.c
@@ -1613,7 +1613,7 @@ builtin_define_with_hex_fp_value (const char *macro,
  const char *fp_cast)
 {
   REAL_VALUE_TYPE real;
-  char dec_str[64], buf1[256], buf2[256];
+  char dec_str[64], buf[256], buf1[128], buf2[64];
 
   /* This is very expensive, so if possible expand them lazily.  */
   if (lazy_hex_fp_value_count < 12
@@ -1656,11 +1656,11 @@ builtin_define_with_hex_fp_value (const char *macro,
 
   /* Assemble the macro in the following fashion
  macro = fp_cast [dec_str fp_suffix] */
-  sprintf (buf1, "%s%s", dec_str, fp_suffix);
-  sprintf (buf2, fp_cast, buf1);
-  sprintf (buf1, "%s=%s", macro, buf2);
+  sprintf (buf2, "%s%s", dec_str, fp_suffix);
+  sprintf (buf1, fp_cast, buf2);
+  sprintf (buf, "%s=%s", macro, buf1);
 
-  cpp_define (parse_in, buf1);
+  cpp_define (parse_in, buf);
 }
 
 /* Return a string constant for the suffix for a value of type TYPE
diff --git a/gcc/fortran/class.c b/gcc/fortran/class.c
index ebbd41b..a08fb8d 100644
--- a/gcc/fortran/class.c
+++ b/gcc/fortran/class.c
@@ -602,7 +602,8 @@ bool
 gfc_build_class_symbol (gfc_typespec *ts, symbol_attribute *attr,
gfc_array_spec **as)
 {
-  char name[GFC_MAX_SYMBOL_LEN+1], tname[GFC_MAX_SYMBOL_LEN+1];
+  char tname[GFC_MAX_SYMBOL_LEN+1];
+  char *name;
   gfc_symbol *fclass;
   gfc_symbol *vtab;
   gfc_component *c;
@@ -633,17 +634,17 @@ gfc_build_class_symbol (gfc_typespec *ts, symbol_attribute *attr,
   rank = !(*as) || (*as)->rank == -1 ? GFC_MAX_DIMENSIONS : (*as)->rank;
   get_unique_hashed_string (tname, ts->u.derived);
   if ((*as) && attr->allocatable)
-sprintf (name, "__class_%s_%d_%da", tname, rank, (*as)->corank);
+name = xasprintf ("__class_%s_%d_%da", tname, rank, (*as)->corank);
   else if ((*as) && attr->pointer)
-sprintf (name, "__class_%s_%d_%dp", tname, rank, (*as)->corank);
+name = xasprintf ("__class_%s_%d_%dp", tname, rank, (*as)->corank);
   else if ((*as))
-sprintf (name, "__class_%s_%d_%dt", tname, rank, (*as)->corank);
+name = xasprintf ("__class_%s_%d_%dt", tname, rank, (*as)->corank);
   else if (attr->pointer)
-sprintf (name, "__class_%s_p", tname);
+name = xasprintf ("__class_%s_p", tname);
   else if (attr->allocatable)
-sprintf (name, "__class_%s_a", tname);
+name = xasprintf ("__class_%s_a", tname);
   else
-sprintf (name, "__class_%s_t", tname);
+name = xasprintf ("__class_%s_t", tname);
 
   if (ts->u.derived->attr.unlimited_poly

Re: Add support for SVE gather loads

2017-12-13 Thread Jeff Law
On 11/17/2017 02:58 PM, Richard Sandiford wrote:
> This patch adds support for SVE gather loads.  It uses basically the
> the same analysis code as the AVX gather support, but after that
> there are two major differences:
> 
> - It uses new internal functions rather than target built-ins.
>   The interface is:
> 
>  IFN_GATHER_LOAD (base, offsets, scale)
>  IFN_MASK_GATHER_LOAD (base, offsets, scale, mask)
> 
>   which should be reasonably generic.  One of the advantages of
>   using internal functions is that other passes can understand what
>   the functions do, but a more immediate advantage is that we can
>   query the underlying target pattern to see which scales it supports.
> 
> - It uses pattern recognition to convert the offset to the right width,
>   if it was originally narrower than that.  This avoids having to do
>   a widening operation as part of the gather expansion itself.
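
As a rough scalar model of the interface described above (an illustration
only, not GCC code): each lane loads from the byte address
base + offsets[i] * scale.  The function name, the fixed int32_t element
type, the byte-per-lane mask and the choice to leave inactive lanes
untouched are all assumptions of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Scalar model of a (masked) gather load of int32_t elements.  Lane I
   reads from BASE + OFFSETS[I] * SCALE bytes.  A null MASK means all
   lanes are active; inactive lanes of DST are left unchanged, which is
   a simplification made for this sketch.  */
static void
gather_load_i32 (int32_t *dst, const void *base, const int64_t *offsets,
		 int scale, const unsigned char *mask, size_t lanes)
{
  assert (scale > 0);
  for (size_t i = 0; i < lanes; i++)
    if (mask == NULL || mask[i])
      memcpy (&dst[i], (const char *) base + offsets[i] * scale,
	      sizeof dst[i]);
}
```

With scale equal to sizeof (int32_t) the offsets act as plain array
indices, which is the case a vectorized a[b[i]] loop would map to.
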
> 
> Tested on aarch64-linux-gnu (with and without SVE), x86_64-linux-gnu
> and powerpc64le-linux-gnu.  OK to install?
> 
> Richard
> 
> 
> 2017-11-17  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
> 
> gcc/
>   * doc/md.texi (gather_load@var{m}): Document.
>   (mask_gather_load@var{m}): Likewise.
>   * genopinit.c (main): Add supports_vec_gather_load and
>   supports_vec_gather_load_cached to target_optabs.
>   * optabs-tree.c (init_tree_optimization_optabs): Use
>   ggc_cleared_alloc to allocate target_optabs.
>   * optabs.def (gather_load_optab, mask_gather_load_optab): New optabs.
>   * internal-fn.def (GATHER_LOAD, MASK_GATHER_LOAD): New internal
>   functions.
>   * internal-fn.h (internal_load_fn_p): Declare.
>   (internal_gather_scatter_fn_p): Likewise.
>   (internal_fn_mask_index): Likewise.
>   (internal_gather_scatter_fn_supported_p): Likewise.
>   * internal-fn.c (gather_load_direct): New macro.
>   (expand_gather_load_optab_fn): New function.
>   (direct_gather_load_optab_supported_p): New macro.
>   (direct_internal_fn_optab): New function.
>   (internal_load_fn_p): Likewise.
>   (internal_gather_scatter_fn_p): Likewise.
>   (internal_fn_mask_index): Likewise.
>   (internal_gather_scatter_fn_supported_p): Likewise.
>   * optabs-query.c (supports_at_least_one_mode_p): New function.
>   (supports_vec_gather_load_p): Likewise.
>   * optabs-query.h (supports_vec_gather_load_p): Declare.
>   * tree-vectorizer.h (gather_scatter_info): Add ifn, element_type
>   and memory_type field.
>   (NUM_PATTERNS): Bump to 15.
>   * tree-vect-data-refs.c (vect_gather_scatter_fn_p): New function.
>   (vect_describe_gather_scatter_call): Likewise.
>   (vect_check_gather_scatter): Try using internal functions for
>   gather loads.  Recognize existing calls to a gather load function.
>   (vect_analyze_data_refs): Consider using gather loads if
>   supports_vec_gather_load_p.
>   * tree-vect-patterns.c (vect_get_load_store_mask): New function.
>   (vect_get_gather_scatter_offset_type): Likewise.
>   (vect_convert_mask_for_vectype): Likewise.
>   (vect_add_conversion_to_patterm): Likewise.
>   (vect_try_gather_scatter_pattern): Likewise.
>   (vect_recog_gather_scatter_pattern): New pattern recognizer.
>   (vect_vect_recog_func_ptrs): Add it.
>   * tree-vect-stmts.c (exist_non_indexing_operands_for_use_p): Use
>   internal_fn_mask_index and internal_gather_scatter_fn_p.
>   (check_load_store_masking): Take the gather_scatter_info as an
>   argument and handle gather loads.
>   (vect_get_gather_scatter_ops): New function.
>   (vectorizable_call): Check internal_load_fn_p.
>   (vectorizable_load): Likewise.  Handle gather load internal
>   functions.
>   (vectorizable_store): Update call to check_load_store_masking.
>   * config/aarch64/aarch64.md (UNSPEC_LD1_GATHER): New unspec.
>   * config/aarch64/iterators.md (SVE_S, SVE_D): New mode iterators.
>   * config/aarch64/predicates.md (aarch64_gather_scale_operand_w)
>   (aarch64_gather_scale_operand_d): New predicates.
>   * config/aarch64/aarch64-sve.md (gather_load): New expander.
>   (mask_gather_load): New insns.
> 
> gcc/testsuite/
>   * gcc.target/aarch64/sve_gather_load_1.c: New test.
>   * gcc.target/aarch64/sve_gather_load_2.c: Likewise.
>   * gcc.target/aarch64/sve_gather_load_3.c: Likewise.
>   * gcc.target/aarch64/sve_gather_load_4.c: Likewise.
>   * gcc.target/aarch64/sve_gather_load_5.c: Likewise.
>   * gcc.target/aarch64/sve_gather_load_6.c: Likewise.
>   * gcc.target/aarch64/sve_gather_load_7.c: Likewise.
>   * gcc.target/aarch64/sve_mask_gather_load_1.c: Likewise.
>   * gcc.target/aarch64/sve_mask_gather_load_2.c: Likewise.
>   * gcc.target/aarch64/sve_mask_gather_load_3.c: Likewise.
>   * gcc.target/aarch64/sve_mask_gather_load_4.c: Likewise.
>   * gcc.target/aarch64/

[wwwdocs] readings.html - adjust another Donald Knuth link

2017-12-13 Thread Gerald Pfeifer
Somehow we must have missed this one while updating the other
related links on this page.

Committed.

Gerald

Index: readings.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/readings.html,v
retrieving revision 1.286
diff -u -r1.286 readings.html
--- readings.html   10 Dec 2017 11:44:38 -  1.286
+++ readings.html   10 Dec 2017 14:11:23 -
@@ -204,7 +204,7 @@
actual computers very similar to MMIX".  The name may also be due to a
predecessor appropriately named MIX.
MMIX is used in program examples in Donald E. Knuth's
-   http://www-cs-faculty.stanford.edu/~uno/taocp.html";>The Art
+   http://www-cs-faculty.stanford.edu/~knuth/taocp.html";>The Art
of Computer Programming (ISBN 0-201-89683-4).
The http://www-cs-faculty.stanford.edu/~knuth/mmix.html";>MMIX


Re: SLP reductions with variable-length vectors

2017-12-13 Thread Jeff Law
On 11/22/2017 11:10 AM, Richard Sandiford wrote:
> Richard Sandiford  writes:
>> Two things stopped us using SLP reductions with variable-length vectors:
>>
>> (1) We didn't have a way of constructing the initial vector.
>> This patch does it by creating a vector full of the neutral
>> identity value and then using a shift-and-insert function
>> to insert any non-identity inputs into the low-numbered elements.
>> (The non-identity values are needed for double reductions.)
>> Alternatively, for unchained MIN/MAX reductions that have no neutral
>> value, we instead use the same duplicate-and-interleave approach as
>> for SLP constant and external definitions (added by a previous
>> patch).
>>
>> (2) The epilogue for constant-length vectors would extract the vector
>> elements associated with each SLP statement and do scalar arithmetic
>> on these individual elements.  For variable-length vectors, the patch
>> instead creates a reduction vector for each SLP statement, replacing
>> the elements for other SLP statements with the identity value.
>> It then uses a hardware reduction instruction on each vector.
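
Point (2) can be pictured with a scalar sketch (an illustration only, not
the vectorizer's code).  It assumes an addition reduction, whose identity
is 0, a vector whose lanes interleave the elements of group_size SLP
statements, and a lane count that is a multiple of group_size.  The names
reduce_add, slp_reduce_add and MAX_LANES are invented here.

```c
#include <assert.h>
#include <stddef.h>

enum { MAX_LANES = 64 };	/* Arbitrary cap for this sketch.  */

/* Stand-in for a whole-vector hardware reduction instruction.  */
static int
reduce_add (const int *v, size_t lanes)
{
  int sum = 0;
  for (size_t i = 0; i < lanes; i++)
    sum += v[i];
  return sum;
}

/* For each SLP statement K, build a vector in which the lanes that
   belong to other statements are replaced by the identity value, then
   reduce the whole vector.  RESULTS receives GROUP_SIZE sums.  */
static void
slp_reduce_add (const int *vec, size_t lanes, size_t group_size,
		int *results)
{
  assert (group_size > 0 && lanes % group_size == 0 && lanes <= MAX_LANES);
  for (size_t k = 0; k < group_size; k++)
    {
      int masked[MAX_LANES];
      for (size_t i = 0; i < lanes; i++)
	masked[i] = (i % group_size == k) ? vec[i] : 0;
      results[k] = reduce_add (masked, lanes);
    }
}
```

The point of the design is that nothing here needs to extract an individual
lane, so it works without knowing the vector length at compile time.
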
>>
>> Tested on aarch64-linux-gnu (with and without SVE), x86_64-linux-gnu
>> and powerpc64le-linux-gnu.
> 
> Here's an updated version that applies on top of the recent
> removal of REDUC_*_EXPR.  Tested as before.
> 
> Thanks,
> Richard
> 
> 
> 2017-11-22  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
> 
> gcc/
>   * doc/md.texi (vec_shl_insert_@var{m}): New optab.
>   * internal-fn.def (VEC_SHL_INSERT): New internal function.
>   * optabs.def (vec_shl_insert_optab): New optab.
>   * tree-vectorizer.h (can_duplicate_and_interleave_p): Declare.
>   (duplicate_and_interleave): Likewise.
>   * tree-vect-loop.c: Include internal-fn.h.
>   (neutral_op_for_slp_reduction): New function, split out from
>   get_initial_defs_for_reduction.
>   (get_initial_def_for_reduction): Handle option 2 for variable-length
>   vectors by loading the neutral value into a vector and then shifting
>   the initial value into element 0.
>   (get_initial_defs_for_reduction): Replace the code argument with
>   the neutral value calculated by neutral_op_for_slp_reduction.
>   Use gimple_build_vector for constant-length vectors.
>   Use IFN_VEC_SHL_INSERT for variable-length vectors if all
>   but the first group_size elements have a neutral value.
>   Use duplicate_and_interleave otherwise.
>   (vect_create_epilog_for_reduction): Take a neutral_op parameter.
>   Update call to get_initial_defs_for_reduction.  Handle SLP
>   reductions for variable-length vectors by creating one vector
>   result for each scalar result, with the elements associated
>   with other scalar results stubbed out with the neutral value.
>   (vectorizable_reduction): Call neutral_op_for_slp_reduction.
>   Require IFN_VEC_SHL_INSERT for double reductions on
>   variable-length vectors, or SLP reductions that have
>   a neutral value.  Require can_duplicate_and_interleave_p
>   support for variable-length unchained SLP reductions if there
>   is no neutral value, such as for MIN/MAX reductions.  Also require
>   the number of vector elements to be a multiple of the number of
>   SLP statements when doing variable-length unchained SLP reductions.
>   Update call to vect_create_epilog_for_reduction.
>   * tree-vect-slp.c (can_duplicate_and_interleave_p): Make public
>   and remove initial values.
>   (duplicate_and_interleave): Use IFN_VEC_SHL_INSERT for
>   variable-length vectors if all but the first group_size elements
>   have a neutral value.
>   * config/aarch64/aarch64.md (UNSPEC_INSR): New unspec.
>   * config/aarch64/aarch64-sve.md (vec_shl_insert_): New insn.
> 
> gcc/testsuite/
>   * gcc.dg/vect/pr37027.c: Remove XFAIL for variable-length vectors.
>   * gcc.dg/vect/pr67790.c: Likewise.
>   * gcc.dg/vect/slp-reduc-1.c: Likewise.
>   * gcc.dg/vect/slp-reduc-2.c: Likewise.
>   * gcc.dg/vect/slp-reduc-3.c: Likewise.
>   * gcc.dg/vect/slp-reduc-5.c: Likewise.
>   * gcc.target/aarch64/sve_slp_5.c: New test.
>   * gcc.target/aarch64/sve_slp_5_run.c: Likewise.
>   * gcc.target/aarch64/sve_slp_6.c: Likewise.
>   * gcc.target/aarch64/sve_slp_6_run.c: Likewise.
>   * gcc.target/aarch64/sve_slp_7.c: Likewise.
>   * gcc.target/aarch64/sve_slp_7_run.c: Likewise.
OK
jeff


Re: Add support for bitwise reductions

2017-12-13 Thread Jeff Law
On 11/22/2017 11:12 AM, Richard Sandiford wrote:
> Richard Sandiford  writes:
>> This patch adds support for the SVE bitwise reduction instructions
>> (ANDV, ORV and EORV).  It's a fairly mechanical extension of existing
>> REDUC_* operators.
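
For reference, the three reductions behave like the scalar loops below.
This is a sketch only: the function names are made up, the element type is
fixed to uint32_t for brevity, and the starting values are the usual
identities (all ones for AND, zero for IOR and XOR).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* AND of all elements; the identity is all ones.  */
static uint32_t
reduc_and (const uint32_t *v, size_t n)
{
  assert (v != NULL || n == 0);
  uint32_t acc = UINT32_MAX;
  for (size_t i = 0; i < n; i++)
    acc &= v[i];
  return acc;
}

/* Inclusive OR of all elements; the identity is zero.  */
static uint32_t
reduc_ior (const uint32_t *v, size_t n)
{
  assert (v != NULL || n == 0);
  uint32_t acc = 0;
  for (size_t i = 0; i < n; i++)
    acc |= v[i];
  return acc;
}

/* Exclusive OR of all elements; the identity is zero.  */
static uint32_t
reduc_xor (const uint32_t *v, size_t n)
{
  assert (v != NULL || n == 0);
  uint32_t acc = 0;
  for (size_t i = 0; i < n; i++)
    acc ^= v[i];
  return acc;
}
```
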
>>
>> Tested on aarch64-linux-gnu (with and without SVE), x86_64-linux-gnu
>> and powerpc64le-linux-gnu.
> 
> Here's an updated version that applies on top of the recent
> removal of REDUC_*_EXPR.  Tested as before.
> 
> Thanks,
> Richard
> 
> 
> 2017-11-22  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
> 
> gcc/
>   * optabs.def (reduc_and_scal_optab, reduc_ior_scal_optab)
>   (reduc_xor_scal_optab): New optabs.
>   * doc/md.texi (reduc_and_scal_@var{m}, reduc_ior_scal_@var{m})
>   (reduc_xor_scal_@var{m}): Document.
>   * doc/sourcebuild.texi (vect_logical_reduc): Likewise.
>   * internal-fn.def (IFN_REDUC_AND, IFN_REDUC_IOR, IFN_REDUC_XOR): New
>   internal functions.
>   * fold-const-call.c (fold_const_call): Handle them.
>   * tree-vect-loop.c (reduction_fn_for_scalar_code): Return the new
>   internal functions for BIT_AND_EXPR, BIT_IOR_EXPR and BIT_XOR_EXPR.
>   * config/aarch64/aarch64-sve.md (reduc__scal_):
>   (*reduc__scal_): New patterns.
>   * config/aarch64/iterators.md (UNSPEC_ANDV, UNSPEC_ORV)
>   (UNSPEC_XORV): New unspecs.
>   (optab): Add entries for them.
>   (BITWISEV): New int iterator.
>   (bit_reduc_op): New int attributes.
> 
> gcc/testsuite/
>   * lib/target-supports.exp (check_effective_target_vect_logical_reduc):
>   New proc.
>   * gcc.dg/vect/vect-reduc-or_1.c: Also run for vect_logical_reduc
>   and add an associated scan-dump test.  Prevent vectorization
>   of the first two loops.
>   * gcc.dg/vect/vect-reduc-or_2.c: Likewise.
>   * gcc.target/aarch64/sve_reduc_1.c: Add AND, IOR and XOR reductions.
>   * gcc.target/aarch64/sve_reduc_2.c: Likewise.
>   * gcc.target/aarch64/sve_reduc_1_run.c: Likewise.
>   (INIT_VECTOR): Tweak initial value so that some bits are always set.
>   * gcc.target/aarch64/sve_reduc_2_run.c: Likewise.
OK.
Jeff


Re: Add support for SVE scatter stores

2017-12-13 Thread Jeff Law
On 11/17/2017 03:10 PM, Richard Sandiford wrote:
> This is mostly a mechanical extension of the previous gather load
> support to scatter stores.  The internal functions in this case are:
> 
>   IFN_SCATTER_STORE (base, offsets, scale, values)
>   IFN_MASK_SCATTER_STORE (base, offsets, scale, values, mask)
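
A scalar model of these two functions, as a sketch only: lane i stores
values[i] to the byte address base + offsets[i] * scale.  The function
name, the int32_t element type and the byte-per-lane mask are assumptions
made here, as is the in-order lane processing, which means the
highest-numbered active lane wins when two lanes hit the same address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Scalar model of a (masked) scatter store of int32_t elements.  Lane I
   writes VALUES[I] to BASE + OFFSETS[I] * SCALE bytes.  A null MASK
   means all lanes are active.  Lanes are processed in increasing order,
   so later lanes overwrite earlier ones on address collisions.  */
static void
scatter_store_i32 (void *base, const int64_t *offsets, int scale,
		   const int32_t *values, const unsigned char *mask,
		   size_t lanes)
{
  assert (scale > 0);
  for (size_t i = 0; i < lanes; i++)
    if (mask == NULL || mask[i])
      memcpy ((char *) base + offsets[i] * scale, &values[i],
	      sizeof values[i]);
}
```
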
> 
> However, one nonobvious change is to vect_analyze_data_ref_access.
> If we're treating an access as a gather load or scatter store
> (i.e. if STMT_VINFO_GATHER_SCATTER_P is true), the existing code
> would create a dummy data_reference whose step is 0.  There's not
> really much else it could do, since the whole point is that the
> step isn't predictable from iteration to iteration.  We then
> went into this code in vect_analyze_data_ref_access:
> 
>   /* Allow loads with zero step in inner-loop vectorization.  */
>   if (loop_vinfo && integer_zerop (step))
> {
>   GROUP_FIRST_ELEMENT (vinfo_for_stmt (stmt)) = NULL;
>   if (!nested_in_vect_loop_p (loop, stmt))
>   return DR_IS_READ (dr);
> 
> I.e. we'd take the step literally and assume that this is a load
> or store to an invariant address.  Loads from invariant addresses
> are supported but stores to them aren't.
> 
> The code therefore had the effect of disabling all scatter stores.
> AFAICT this is true of AVX too: although tests like avx512f-scatter-1.c
> test for the correctness of a scatter-like loop, they don't seem to
> check whether a scatter instruction is actually used.
> 
> The patch therefore makes vect_analyze_data_ref_access return true
> for scatters.  We do seem to handle the aliasing correctly;
> that's tested by other functions, and is symmetrical to the
> already-working gather case.
> 
> Tested on aarch64-linux-gnu (with and without SVE), x86_64-linux-gnu
> and powerpc64le-linux-gnu.  OK to install?
> 
> Richard
> 
> 
> 2017-11-17  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
> 
> gcc/
>   * optabs.def (scatter_store_optab, mask_scatter_store_optab): New
>   optabs.
>   * doc/md.texi (scatter_store@var{m}, mask_scatter_store@var{m}):
>   Document.
>   * genopinit.c (main): Add supports_vec_scatter_store and
>   supports_vec_scatter_store_cached to target_optabs.
>   * gimple.h (gimple_expr_type): Handle IFN_SCATTER_STORE and
>   IFN_MASK_SCATTER_STORE.
>   * internal-fn.def (SCATTER_STORE, MASK_SCATTER_STORE): New internal
>   functions.
>   * internal-fn.h (internal_store_fn_p): Declare.
>   (internal_fn_stored_value_index): Likewise.
>   * internal-fn.c (scatter_store_direct): New macro.
>   (expand_scatter_store_optab_fn): New function.
>   (direct_scatter_store_optab_supported_p): New macro.
>   (internal_store_fn_p): New function.
>   (internal_gather_scatter_fn_p): Handle IFN_SCATTER_STORE and
>   IFN_MASK_SCATTER_STORE.
>   (internal_fn_mask_index): Likewise.
>   (internal_fn_stored_value_index): New function.
>   (internal_gather_scatter_fn_supported_p): Adjust operand numbers
>   for scatter stores.
>   * optabs-query.h (supports_vec_scatter_store_p): Declare.
>   * optabs-query.c (supports_vec_scatter_store_p): New function.
>   * tree-vectorizer.h (vect_get_store_rhs): Declare.
>   * tree-vect-data-refs.c (vect_analyze_data_ref_access): Return
>   true for scatter stores.
>   (vect_gather_scatter_fn_p): Handle scatter stores too.
>   (vect_check_gather_scatter): Consider using scatter stores if
>   supports_vec_scatter_store_p.
>   * tree-vect-patterns.c (vect_try_gather_scatter_pattern): Handle
>   scatter stores too.
>   * tree-vect-stmts.c (exist_non_indexing_operands_for_use_p): Use
>   internal_fn_stored_value_index.
>   (check_load_store_masking): Handle scatter stores too.
>   (vect_get_store_rhs): Make public.
>   (vectorizable_call): Use internal_store_fn_p.
>   (vectorizable_store): Handle scatter store internal functions.
>   (vect_transform_stmt): Compare GROUP_STORE_COUNT with GROUP_SIZE
>   when deciding whether the end of the group has been reached.
>   * config/aarch64/aarch64.md (UNSPEC_ST1_SCATTER): New unspec.
>   * config/aarch64/aarch64-sve.md (scatter_store): New expander.
>   (mask_scatter_store): New insns.
> 
> gcc/testsuite/
>   * gcc.target/aarch64/sve_mask_scatter_store_1.c: New test.
>   * gcc.target/aarch64/sve_mask_scatter_store_2.c: Likewise.
>   * gcc.target/aarch64/sve_scatter_store_1.c: Likewise.
>   * gcc.target/aarch64/sve_scatter_store_2.c: Likewise.
>   * gcc.target/aarch64/sve_scatter_store_3.c: Likewise.
>   * gcc.target/aarch64/sve_scatter_store_4.c: Likewise.
>   * gcc.target/aarch64/sve_scatter_store_5.c: Likewise.
>   * gcc.target/aarch64/sve_scatter_store_6.c: Likewise.
>   * gcc.target/aarch64/sve_scatter_store_7.c: Likewise.
>   * gcc.target/aarch64/sve_strided_store_1.c: Likewise.
>   * gcc.target/aarch64/sve_s

Re: Handle peeling for alignment with masking

2017-12-13 Thread Jeff Law
On 11/17/2017 08:13 AM, Richard Sandiford wrote:
> This patch adds support for aligning vectors by using a partial
> first iteration.  E.g. if the start pointer is 3 elements beyond
> an aligned address, the first iteration will have a mask in which
> the first three elements are false.
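
The example in the paragraph above can be worked through with a small
sketch (not the vectorizer's code).  It assumes a power-of-2 vector size in
bytes; misalign_in_elems here is a standalone function written for this
note, first_iteration_mask is an invented name, and the mask is modelled as
one bool per lane.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Number of elements by which ADDR lies beyond the previous address
   that is aligned to VECTOR_BYTES.  */
static size_t
misalign_in_elems (uintptr_t addr, size_t vector_bytes, size_t elem_bytes)
{
  assert (vector_bytes > 0 && (vector_bytes & (vector_bytes - 1)) == 0);
  assert (elem_bytes > 0 && vector_bytes % elem_bytes == 0);
  return (addr & (vector_bytes - 1)) / elem_bytes;
}

/* Fill MASK for the first vector iteration, which starts SKIP elements
   before the real start pointer.  The first SKIP lanes are inactive,
   and so are any lanes beyond the NITERS scalar iterations.  */
static void
first_iteration_mask (bool *mask, size_t lanes, size_t skip, size_t niters)
{
  for (size_t i = 0; i < lanes; i++)
    mask[i] = i >= skip && i - skip < niters;
}
```

For a start address of 0x100c, 16-byte vectors and 4-byte elements, the
skip count is 3, so the first iteration runs from 0x1000 with only the last
lane active.
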
> 
> On SVE, the optimisation is only useful for vector-length-specific
> code.  Vector-length-agnostic code doesn't try to align vectors
> since the vector length might not be a power of 2.
> 
> Tested on aarch64-linux-gnu (with and without SVE), x86_64-linux-gnu
> and powerpc64le-linux-gnu.  OK to install?
> 
> Richard
> 
> 
> 2017-11-17  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
> 
> gcc/
>   * tree-vectorizer.h (_loop_vec_info::mask_skip_niters): New field.
>   (LOOP_VINFO_MASK_SKIP_NITERS): New macro.
>   (vect_use_loop_mask_for_alignment_p): New function.
>   (vect_prepare_for_masked_peels, vect_gen_while_not): Declare.
>   * tree-vect-loop-manip.c (vect_set_loop_masks_directly): Add an
>   niters_skip argument.  Make sure that the first niters_skip elements
>   of the first iteration are inactive.
>   (vect_set_loop_condition_masked): Handle LOOP_VINFO_MASK_SKIP_NITERS.
>   Update call to vect_set_loop_masks_directly.
>   (get_misalign_in_elems): New function, split out from...
>   (vect_gen_prolog_loop_niters): ...here.
>   (vect_update_init_of_dr): Take a code argument that specifies whether
>   the adjustment should be added or subtracted.
>   (vect_update_init_of_drs): Likewise.
>   (vect_prepare_for_masked_peels): New function.
>   (vect_do_peeling): Skip prologue peeling if we're using a mask
>   instead.  Update call to vect_update_inits_of_drs.
>   * tree-vect-loop.c (_loop_vec_info::_loop_vec_info): Initialize
>   mask_skip_niters.
>   (vect_analyze_loop_2): Allow fully-masked loops with peeling for
>   alignment.  Do not include the number of peeled iterations in
>   the minimum threshold in that case.
>   (vectorizable_induction): Adjust the start value down by
>   LOOP_VINFO_MASK_SKIP_NITERS iterations.
>   (vect_transform_loop): Call vect_prepare_for_masked_peels.
>   Take the number of skipped iterations into account when calculating
>   the loop bounds.
>   * tree-vect-stmts.c (vect_gen_while_not): New function.
OK.
jeff


Re: [PATCH] Improve alloca alignment

2017-12-13 Thread Eric Botcazou
> No clue, but ISTM that it should.  Eric, can you try that and see if it
> addresses these problems?  I'd really like to get this wrapped up, but I
> don't have access to any sparc systems to test it myself.

Yes, the INIT_EXPANDERS trick works for SPARC (but this has nothing to do with 
SPARC_STACK_BIAS) and avoids hardcoding the bogus alignment assumption in the 
get_dynamic_stack_size function.  As a matter of fact, this was the approach 
originally used by Dominik Vogt last year.
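
The arithmetic that the explow.c hunk in this message switches to can be
tried in isolation.  This is a sketch only: extra_bytes_for_alignment is a
name made up here, BITS_PER_UNIT is fixed to 8 as an assumption, and all
alignments are in bits as in the patch.

```c
#include <assert.h>

enum { BITS_PER_UNIT = 8 };	/* Assumption for this sketch.  */

/* Extra bytes to reserve so that a block can be aligned to
   REQUIRED_ALIGN bits when its base is only known to be aligned to
   KNOWN_ALIGN bits.  An unknown (zero) alignment is treated as byte
   alignment.  */
static unsigned
extra_bytes_for_alignment (unsigned required_align, unsigned known_align)
{
  assert (required_align % BITS_PER_UNIT == 0);
  assert (known_align % BITS_PER_UNIT == 0);
  if (known_align == 0)
    known_align = BITS_PER_UNIT;
  if (required_align > known_align)
    return (required_align - known_align) / BITS_PER_UNIT;
  return 0;
}
```

Taking hypothetical values, a 128-bit request needs 8 extra bytes if the
base is assumed to be 64-bit aligned but 12 if it is really only 32-bit
aligned, which is the kind of gap the hardcoded assumption could hide.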

Of course this doesn't address the same potential issue on other targets but 
you don't seem to care much about that, so who am I to do it after all? ;-)

Tested on x86_64-suse-linux and SPARC/Solaris, applied on the mainline.


2017-12-13  Eric Botcazou  
Dominik Vogt  

PR middle-end/78468
* emit-rtl.c (init_emit): Remove ??? comment.
* explow.c (get_dynamic_stack_size): Take known alignment of stack
pointer + STACK_DYNAMIC_OFFSET into account in lieu of STACK_BOUNDARY
* config/sparc/sparc.h (INIT_EXPANDERS): In 32-bit mode, lower the
alignment of 3 virtual registers to BITS_PER_WORD.

* config/sparc/sparc.c (sparc_compute_frame_size): Simplify.

-- 
Eric Botcazou
Index: emit-rtl.c
===
--- emit-rtl.c	(revision 255578)
+++ emit-rtl.c	(working copy)
@@ -5764,8 +5764,6 @@ init_emit (void)
   REGNO_POINTER_ALIGN (HARD_FRAME_POINTER_REGNUM) = STACK_BOUNDARY;
   REGNO_POINTER_ALIGN (ARG_POINTER_REGNUM) = STACK_BOUNDARY;
 
-  /* ??? These are problematic (for example, 3 out of 4 are wrong on
- 32-bit SPARC and cannot be all fixed because of the ABI).  */
   REGNO_POINTER_ALIGN (VIRTUAL_INCOMING_ARGS_REGNUM) = STACK_BOUNDARY;
   REGNO_POINTER_ALIGN (VIRTUAL_STACK_VARS_REGNUM) = STACK_BOUNDARY;
   REGNO_POINTER_ALIGN (VIRTUAL_STACK_DYNAMIC_REGNUM) = STACK_BOUNDARY;
Index: explow.c
===
--- explow.c	(revision 255578)
+++ explow.c	(working copy)
@@ -1206,7 +1206,6 @@ get_dynamic_stack_size (rtx *psize, unsi
 			unsigned required_align,
 			HOST_WIDE_INT *pstack_usage_size)
 {
-  unsigned extra = 0;
   rtx size = *psize;
 
   /* Ensure the size is in the proper mode.  */
@@ -1242,16 +1241,16 @@ get_dynamic_stack_size (rtx *psize, unsi
  example), so we must preventively align the value.  We leave space
  in SIZE for the hole that might result from the alignment operation.  */
 
-  /* Since the stack is presumed to be aligned before this allocation,
- we only need to increase the size of the allocation if the required
- alignment is more than the stack alignment.  */
-  if (required_align > STACK_BOUNDARY)
+  unsigned known_align = REGNO_POINTER_ALIGN (VIRTUAL_STACK_DYNAMIC_REGNUM);
+  if (known_align == 0)
+known_align = BITS_PER_UNIT;
+  if (required_align > known_align)
 {
-  extra = (required_align - STACK_BOUNDARY) / BITS_PER_UNIT;
+  unsigned extra = (required_align - known_align) / BITS_PER_UNIT;
   size = plus_constant (Pmode, size, extra);
   size = force_operand (size, NULL_RTX);
-  if (size_align > STACK_BOUNDARY)
-	size_align = STACK_BOUNDARY;
+  if (size_align > known_align)
+	size_align = known_align;
 
   if (flag_stack_usage_info && pstack_usage_size)
 	*pstack_usage_size += extra;
Index: config/sparc/sparc.c
===
--- config/sparc/sparc.c	(revision 255578)
+++ config/sparc/sparc.c	(working copy)
@@ -5483,10 +5483,8 @@ sparc_compute_frame_size (HOST_WIDE_INT
 frame_size = apparent_frame_size = 0;
   else
 {
-  /* We subtract TARGET_STARTING_FRAME_OFFSET, remember it's negative.  */
-  apparent_frame_size
-	= ROUND_UP (size - targetm.starting_frame_offset (), 8);
-  apparent_frame_size += n_global_fp_regs * 4;
+  /* Start from the apparent frame size.  */
+  apparent_frame_size = ROUND_UP (size, 8) + n_global_fp_regs * 4;
 
   /* We need to add the size of the outgoing argument area.  */
   frame_size = apparent_frame_size + ROUND_UP (args_size, 8);
Index: config/sparc/sparc.h
===
--- config/sparc/sparc.h	(revision 255578)
+++ config/sparc/sparc.h	(working copy)
@@ -771,13 +771,29 @@ extern enum cmodel sparc_cmodel;
 /* The soft frame pointer does not have the stack bias applied.  */
 #define FRAME_POINTER_REGNUM 101
 
-/* Given the stack bias, the stack pointer isn't actually aligned.  */
 #define INIT_EXPANDERS			 \
   do {	 \
-if (crtl->emit.regno_pointer_align && SPARC_STACK_BIAS)	 \
+if (crtl->emit.regno_pointer_align)	 \
   {	 \
-	REGNO_POINTER_ALIGN (STACK_POINTER_REGNUM) = BITS_PER_UNIT;	 \
-	REGNO_POINTER_ALIGN (HARD_FRAME_POINTER_REGNUM) = BITS_PER_UNIT; \
+	/* The biased stack pointer is only aligned on BITS_PER_UNIT.  */\
+	if (SPARC_STACK_BIAS)		 \
+	  { \
+

Re: [PATCH] Fix gimple-ssa-sprintf.c ICE (PR tree-optimization/83198)

2017-12-13 Thread Martin Sebor

On 12/13/2017 03:16 PM, Jakub Jelinek wrote:

Hi!

This patch fixes 2 issues in format_floating.  One is that when determining
precision, we should consider solely the type *printf* will read the
argument as (i.e. double unless L or ll modifier is used, in which case
long double), not the type of the argument, because the corresponding
argument could have any type, even not floating, or say __float128 etc.

This is fixed in the first 2 hunks.

The last hunk treats REAL_CST arguments of incompatible types as unknown
arguments: we really don't know what, say, __float128 passed to %f or
double passed to %La will do.  That is something diagnosed by -Wformat,
so the patch just treats it as an arbitrary value of the type.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?


Thanks for the fix.

For the second part, can you please also add a compile-time test
to verify that the result isn't constrained to the same range as
with a real argument?  Checking that the abort below isn't
eliminated would do it for %f:

  void f (char *d)
  {
int n = __builtin_sprintf (d, "%f", 1.0Q);
if (n < 8 || 13 < n)
  __builtin_abort ();
  }

A test that has a convenient setup for this is
tree-ssa/builtin-sprintf-2.c in case you want to add to it.
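
To make the motivation concrete, here is a small standalone sketch (not
part of any testsuite file) that measures how many bytes "%f" produces at
run time.  fmt_f_length is a name invented for this note.  The idea is
that an argument of unknown value can produce far more output than a known
constant such as 1.0 does, so a range derived from the constant must not
be applied to it.

```c
#include <assert.h>
#include <float.h>
#include <stdio.h>

/* Number of bytes "%f" produces for X, excluding the terminating NUL.
   snprintf with a null buffer and zero size only measures.  */
static int
fmt_f_length (double x)
{
  int n = snprintf (NULL, 0, "%f", x);
  assert (n >= 0);
  return n;
}
```

fmt_f_length (1.0) is 8 ("1.000000"), while fmt_f_length (DBL_MAX) is much
larger because every integer digit is printed.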

Martin



2017-12-13  Jakub Jelinek  

PR tree-optimization/83198
* gimple-ssa-sprintf.c (format_floating): Set type solely based on
dir.modifier, regardless of TREE_TYPE (arg).  Assume non-REAL_CST
value if arg is a REAL_CST with incompatible type.

* gcc.dg/pr83198.c: New test.

--- gcc/gimple-ssa-sprintf.c.jj 2017-11-03 15:37:04.0 +0100
+++ gcc/gimple-ssa-sprintf.c 2017-12-13 13:37:59.289435623 +0100
@@ -1885,6 +1885,8 @@ static fmtresult
 format_floating (const directive &dir, tree arg)
 {
   HOST_WIDE_INT prec[] = { dir.prec[0], dir.prec[1] };
+  tree type = (dir.modifier == FMT_LEN_L || dir.modifier == FMT_LEN_ll
+  ? long_double_type_node : double_type_node);

   /* For an indeterminate precision the lower bound must be assumed
  to be zero.  */
@@ -1892,10 +1894,6 @@ format_floating (const directive &dir, t
 {
   /* Get the number of fractional decimal digits needed to represent
 the argument without a loss of accuracy.  */
-  tree type = arg ? TREE_TYPE (arg) :
-   (dir.modifier == FMT_LEN_L || dir.modifier == FMT_LEN_ll
-? long_double_type_node : double_type_node);
-
   unsigned fmtprec
= REAL_MODE_FORMAT (TYPE_MODE (type))->p;

@@ -1946,7 +1944,9 @@ format_floating (const directive &dir, t
}
 }

-  if (!arg || TREE_CODE (arg) != REAL_CST)
+  if (!arg
+  || TREE_CODE (arg) != REAL_CST
+  || !useless_type_conversion_p (type, TREE_TYPE (arg)))
 return format_floating (dir, prec);

   /* The minimum and maximum number of bytes produced by the directive.  */
--- gcc/testsuite/gcc.dg/pr83198.c.jj   2017-12-13 13:43:36.056192309 +0100
+++ gcc/testsuite/gcc.dg/pr83198.c  2017-12-13 13:47:11.716474956 +0100
@@ -0,0 +1,18 @@
+/* PR tree-optimization/83198 */
+/* { dg-do compile } */
+/* { dg-options "-Wall -Wno-format" } */
+
+int
+foo (char *d[6], int x)
+{
+  int r = 0;
+  r += __builtin_sprintf (d[0], "%f", x);
+  r += __builtin_sprintf (d[1], "%a", x);
+  r += __builtin_sprintf (d[2], "%f", "foo");
+  r += __builtin_sprintf (d[3], "%a", "bar");
+#ifdef __SIZEOF_FLOAT128__
+  r += __builtin_sprintf (d[4], "%a", 1.0Q);
+  r += __builtin_sprintf (d[5], "%Lf", 1.0Q);
+#endif
+  return r;
+}

Jakub





Re: [PATCH v2] vrp_prop: Use dom_walker for -Warray-bounds (PR tree-optimization/83312)

2017-12-13 Thread Bernhard Reutner-Fischer
On 13 December 2017 22:30:12 CET, David Malcolm  wrote:
 
>-/* Walk over all statements of all reachable BBs and call
>check_array_bounds
>-   on them.  */
>+/* A dom_walker subclass for use by vrp_prop::check_all_array_refs,
>+   to walk over all statements of all reachable BBs and call

Not all statements, see below.

>+   check_array_bounds on them.  */
> 
>-void
>-vrp_prop::check_all_array_refs ()
>+class check_array_bounds_dom_walker : public dom_walker
> {
>-  basic_block bb;
>-  gimple_stmt_iterator si;
>+ public:
>+  check_array_bounds_dom_walker (vrp_prop *prop)
>+: dom_walker (CDI_DOMINATORS, true), m_prop (prop) {}
>+  ~check_array_bounds_dom_walker () {}
> 
>-  FOR_EACH_BB_FN (bb, cfun)
>-{
>-  edge_iterator ei;
>-  edge e;
>-  bool executable = false;
>+  edge before_dom_children (basic_block) FINAL OVERRIDE;
> 
>-  /* Skip blocks that were found to be unreachable.  */
>-  FOR_EACH_EDGE (e, ei, bb->preds)
>-  executable |= !!(e->flags & EDGE_EXECUTABLE);
>-  if (!executable)
>-  continue;
>+ private:
>+  vrp_prop *m_prop;
>+};
> 
>-  for (si = gsi_start_bb (bb); !gsi_end_p (si); gsi_next (&si))
>-  {
>-gimple *stmt = gsi_stmt (si);
>-struct walk_stmt_info wi;
>-if (!gimple_has_location (stmt)
>-|| is_gimple_debug (stmt))
>-  continue;
>+/* Implementation of dom_walker::before_dom_children.
> 
>-memset (&wi, 0, sizeof (wi));
>+   Walk over all statements of BB and call check_array_bounds on them,

Not all but all non-debug statements of BB with location (which statements 
don't, here?)

>+   and determine if there's a unique successor edge.  */
> 
>-wi.info = this;
>+edge
>+check_array_bounds_dom_walker::before_dom_children (basic_block bb)
>+{
>+  gimple_stmt_iterator si;
>+  for (si = gsi_start_bb (bb); !gsi_end_p (si); gsi_next (&si))

for (si = gsi_start_nondebug_bb (bb); !gsi_end_p (si); gsi_next_nondebug (&si))

assuming you want to walk also phis(?).

>+{
>+  gimple *stmt = gsi_stmt (si);
>+  struct walk_stmt_info wi;
>+  if (!gimple_has_location (stmt)

Hence:
- >+  || is_gimple_debug (stmt))
>+  continue;

thanks, 
> 
>-walk_gimple_op (gsi_stmt (si),
>-check_array_bounds,
>-&wi);
>-  }
>+  memset (&wi, 0, sizeof (wi));
>+
>+  wi.info = m_prop;
>+
>+  walk_gimple_op (stmt, check_array_bounds, &wi);
> }



Re: [C++ PATCH] Fix ICE with label difference as template non-type argument (PR c++/79650)

2017-12-13 Thread Jason Merrill
OK.

On Wed, Dec 13, 2017 at 5:21 PM, Jakub Jelinek  wrote:
> Hi!
>
> reduced_constant_expression_p uses initializer_constant_valid_p
> to determine what is a valid constant expression.  That accepts several
> cases which aren't compile time INTEGER_CST, just something that the assembler
> can finalize into a constant, e.g. difference of labels, difference of
> STRING_CSTs, something plus ADDR_EXPR of a static var etc.
> But for template non-type arguments my understanding is we really need
> an INTEGER_CST, we can hardly instantiate on something that only during
> assembly will become a constant.
>
> The following patch attempts to diagnose it.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> Or do you have better suggestions for the diagnostics wording?
>
> 2017-12-13  Jakub Jelinek  
>
> PR c++/79650
> * pt.c (convert_nontype_argument): Diagnose
> reduced_constant_expression_p expressions that aren't INTEGER_CST.
>
> * g++.dg/template/pr79650.C: New test.
>
> --- gcc/cp/pt.c.jj  2017-12-13 15:52:51.0 +0100
> +++ gcc/cp/pt.c 2017-12-13 19:21:35.861825357 +0100
> @@ -6577,7 +6577,19 @@ convert_nontype_argument (tree type, tre
> return NULL_TREE;
>   /* else cxx_constant_value complained but gave us
>  a real constant, so go ahead.  */
> - gcc_assert (TREE_CODE (expr) == INTEGER_CST);
> + if (TREE_CODE (expr) != INTEGER_CST)
> +   {
> + /* Some assemble time constant expressions like
> +(intptr_t)&&lab1 - (intptr_t)&&lab2 or
> +4 + (intptr_t)&&var satisfy reduced_constant_expression_p
> +as we can emit them into .rodata initializers of
> +variables, yet they can't fold into an INTEGER_CST at
> +compile time.  Refuse them here.  */
> + gcc_checking_assert (reduced_constant_expression_p (expr));
> + error_at (loc, "template argument %qE for type %qT not "
> +"a constant integer", expr, type);
> + return NULL_TREE;
> +   }
> }
>   else
> return NULL_TREE;
> --- gcc/testsuite/g++.dg/template/pr79650.C.jj  2017-12-13 19:29:04.549196268 +0100
> +++ gcc/testsuite/g++.dg/template/pr79650.C 2017-12-13 19:34:15.202298913 +0100
> @@ -0,0 +1,20 @@
> +// PR c++/79650
> +// { dg-do compile { target c++11 } }
> +// { dg-options "" }
> +
> +typedef __INTPTR_TYPE__ intptr_t;
> +template<intptr_t> struct A {};
> +
> +void
> +foo ()
> +{
> +  static int a, b;
> +lab1:
> +lab2:
> +  A<(intptr_t)&&lab1 - (__INTPTR_TYPE__)&&lab2> c; // { dg-error "not a constant integer" }
> +  A<(intptr_t)&&lab1 - (__INTPTR_TYPE__)&&lab1> d;
> +  A<(intptr_t)&a - (intptr_t)&b> e;                // { dg-error "is not a constant expression" }
> +  A<(intptr_t)&a - (intptr_t)&a> f;
> +  A<(intptr_t)sizeof(a) + (intptr_t)&a> g;         // { dg-error "not a constant integer" }
> +  A<(intptr_t)&a> h;                               // { dg-error "conversion from pointer type" }
> +}
>
> Jakub


Re: [C++ Patch PING] [C++ Patch] PR 82235 (Copy ctor is not found for copying array of an object when it's marked explicit)

2017-12-13 Thread Jason Merrill

On 12/12/2017 03:20 PM, Paolo Carlini wrote:

Hi,

On 15/11/2017 00:54, Mukesh Kapoor wrote:

Hi,

This patch fixes https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82235
For the following test case

struct Foo {
    Foo() {}
    explicit Foo(const Foo& aOther) {}
};
struct Bar {
    Foo m[1];
};
void test() {
    Bar a;
    Bar b = a;
}

the compiler issues an error when the compiler generated copy 
constructor of class Bar calls the explicit copy constructor of class 
Foo. The fix is to implement ISO C++/17 16.3.1.4 (over.match.copy) 
correctly.
I'm pinging this patch sent a while ago by Mukesh (I'm taking over from him 
about it). Any comments?


     https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01133.html


These two don't match:


+   When initializing a temporary to be bound to the first
+   parameter of a constructor where the parameter is of type



+/* Return true if current_function_decl is a constructor
+   and its first argument is a reference type and it is


The language is talking about the function being called, and 
ref_first_parm_of_constructor is looking at the function we're currently in.


Jason


[PATCH] Fix https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83396#c28 on ia64 (PR bootstrap/83396, take 2)

2017-12-13 Thread Jakub Jelinek
On Wed, Dec 13, 2017 at 03:25:07PM +0100, Jakub Jelinek wrote:
> I think there are 2 issues.  One is that the ia64 backend emits
> the group barrier insns before BB_HEAD label, so it isn't part of a bb,
> but has BLOCK_FOR_INSN of the following block, that looks invalid to me
> and the ia64.c hunk ought to fix that, except that I don't have access to
> ia64 anymore and so can't test it.  Andreas, could you try that?
> 
> Another thing is that if we because of this end up with insns outside of
> basic blocks, the vt_initialize asserts will fire again.  Here, first of
> all, IMNSHO we should assert that debug bind insns aren't outside of basic
> blocks, the other patches and checking should ensure that (and if any slips
> in, we want to fix that too rather than work-around).
> Another is that while walking from get_first_insn to one before BB_HEAD 
> (bb->next_bb),
> we can encounter insns outside of bb not just before BB_HEAD (bb), but also
> after BB_END (bb), both cases are outside of a bb and thus we can
> expect BLOCK_FOR_INSN being NULL.
> 
> Bootstrapped/regtested on x86_64-linux, i686-linux, powerpc64le-linux,
> regtest on powerpc64-linux pending.  Ok for trunk perhaps without the
> ia64.c bits until that gets tested?
> 
> Or, in the PR there is a variant patch which just doesn't do the asserts and
> doesn't have to track outside_bb.

Here is another variant, without trying to change ia64 backend which
apparently doesn't bootstrap for other reasons.

This patch instead ignores insns outside of basic blocks during var-tracking
exactly as it has been ignoring them before, and just processes the debug
begin stmt markers in there (and verifies no debug bind stmts appear in
between bbs).
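
To make the intended walk easier to picture, here is a toy model of the outside_bb tracking. It is only a sketch: ToyInsn, insns_to_process and the index-based head/end are made up for illustration and are not GCC's rtx_insn API. It assumes the only thing that matters is whether an insn is a debug insn and whether it sits between the block's head and end.

```cpp
#include <cstddef>
#include <vector>

// Toy stand-in for an insn; not GCC's rtx_insn.
struct ToyInsn
{
  bool is_debug;
};

// Return the indices that the walk would process.  HEAD and END are
// the (hypothetical) positions of BB_HEAD and BB_END in the stream.
std::vector<std::size_t>
insns_to_process (const std::vector<ToyInsn> &insns,
                  std::size_t head, std::size_t end)
{
  std::vector<std::size_t> keep;
  bool outside_bb = true;
  for (std::size_t i = 0; i < insns.size (); ++i)
    {
      if (i == head)
        outside_bb = false;
      else if (i == end + 1)
        outside_bb = true;
      /* Ignore non-debug insns outside of the basic block.  */
      if (outside_bb && !insns[i].is_debug)
        continue;
      keep.push_back (i);
    }
  return keep;
}
```

In this model a debug insn before the head or after the end is still visited, while ordinary insns there are skipped, which is the behaviour the patch is aiming for.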

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2017-12-13  Jakub Jelinek  

PR bootstrap/83396
* var-tracking.c (vt_initialize): Ignore non-DEBUG_INSNs outside of
basic blocks.  Assert debug bind insns don't appear outside of bbs,
don't reset them.  Assert insns without BLOCK_FOR_INSN are outside of
bb.  Simplify.

* gcc.dg/pr83396.c: New test.

--- gcc/var-tracking.c.jj   2017-12-13 13:22:59.651783152 +0100
+++ gcc/var-tracking.c  2017-12-13 19:11:13.895699735 +0100
@@ -10157,25 +10157,31 @@ vt_initialize (void)
 insns that might be before it too.  Unfortunately,
 BB_HEADER and BB_FOOTER are not set while we run this
 pass.  */
- insn = get_first_insn (bb);
- for (rtx_insn *next;
-  insn != BB_HEAD (bb->next_bb)
-? next = NEXT_INSN (insn), true : false;
+ rtx_insn *next;
+ bool outside_bb = true;
+ for (insn = get_first_insn (bb); insn != BB_HEAD (bb->next_bb);
   insn = next)
{
+ if (insn == BB_HEAD (bb))
+   outside_bb = false;
+ else if (insn == NEXT_INSN (BB_END (bb)))
+   outside_bb = true;
+ next = NEXT_INSN (insn);
  if (INSN_P (insn))
{
+ if (outside_bb)
+   {
+ /* Ignore non-debug insns outside of basic blocks.  */
+ if (!DEBUG_INSN_P (insn))
+   continue;
+ /* Debug binds shouldn't appear outside of bbs.  */
+ gcc_assert (!DEBUG_BIND_INSN_P (insn));
+   }
  basic_block save_bb = BLOCK_FOR_INSN (insn);
  if (!BLOCK_FOR_INSN (insn))
{
+ gcc_assert (outside_bb);
  BLOCK_FOR_INSN (insn) = bb;
- gcc_assert (DEBUG_INSN_P (insn));
- /* Reset debug insns between basic blocks.
-Their location is not reliable, because they
-were probably not maintained up to date.  */
- if (DEBUG_BIND_INSN_P (insn))
-   INSN_VAR_LOCATION_LOC (insn)
- = gen_rtx_UNKNOWN_VAR_LOC ();
}
  else
gcc_assert (BLOCK_FOR_INSN (insn) == bb);
--- gcc/testsuite/gcc.dg/pr83396.c.jj   2017-12-13 15:53:15.446687005 +0100
+++ gcc/testsuite/gcc.dg/pr83396.c  2017-12-13 15:53:15.446687005 +0100
@@ -0,0 +1,12 @@
+/* PR bootstrap/83396 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -g" } */
+
+int bar (int);
+int baz (int);
+
+int
+foo (int x)
+{
+  return bar (x) || baz (x) != 0;
+}


Jakub


[C++ PATCH] Fix ICE with label difference as template non-type argument (PR c++/79650)

2017-12-13 Thread Jakub Jelinek
Hi!

reduced_constant_expression_p uses initializer_constant_valid_p
to determine what is a valid constant expression.  That accepts several
cases which aren't compile time INTEGER_CST, just something that the assembler
can finalize into a constant, e.g. difference of labels, difference of
STRING_CSTs, something plus ADDR_EXPR of a static var etc.
But for template non-type arguments my understanding is we really need
an INTEGER_CST, we can hardly instantiate on something that only during
assembly will become a constant.
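
As a rough illustration of the distinction (a sketch only; the names A and Ok here are picked for the example), a non-type argument that folds to an integer in the front end is fine, while one that depends on an address is not:

```cpp
#include <cstdint>

// A class template with an integral non-type parameter.
template <std::intptr_t N>
struct A
{
  static constexpr std::intptr_t value = N;
};

// Fine: sizeof (int) + 4 folds to an integer constant at compile time.
using Ok = A<static_cast<std::intptr_t> (sizeof (int)) + 4>;

// Not fine (left commented out): the address of a static variable is
// only finalized later, so there is no integer to instantiate on.
//   static int a;
//   A<reinterpret_cast<std::intptr_t> (&a)> bad;
```

The label-difference case from the PR falls into the second group: the assembler can compute it, but the front end has no INTEGER_CST to use as the template argument.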

The following patch attempts to diagnose it.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Or do you have better suggestions for the diagnostics wording?

2017-12-13  Jakub Jelinek  

PR c++/79650
* pt.c (convert_nontype_argument): Diagnose
reduced_constant_expression_p expressions that aren't INTEGER_CST.

* g++.dg/template/pr79650.C: New test.

--- gcc/cp/pt.c.jj  2017-12-13 15:52:51.0 +0100
+++ gcc/cp/pt.c 2017-12-13 19:21:35.861825357 +0100
@@ -6577,7 +6577,19 @@ convert_nontype_argument (tree type, tre
return NULL_TREE;
  /* else cxx_constant_value complained but gave us
 a real constant, so go ahead.  */
- gcc_assert (TREE_CODE (expr) == INTEGER_CST);
+ if (TREE_CODE (expr) != INTEGER_CST)
+   {
+ /* Some assemble time constant expressions like
+(intptr_t)&&lab1 - (intptr_t)&&lab2 or
+4 + (intptr_t)&&var satisfy reduced_constant_expression_p
+as we can emit them into .rodata initializers of
+variables, yet they can't fold into an INTEGER_CST at
+compile time.  Refuse them here.  */
+ gcc_checking_assert (reduced_constant_expression_p (expr));
+ error_at (loc, "template argument %qE for type %qT not "
+"a constant integer", expr, type);
+ return NULL_TREE;
+   }
}
  else
return NULL_TREE;
--- gcc/testsuite/g++.dg/template/pr79650.C.jj  2017-12-13 19:29:04.549196268 +0100
+++ gcc/testsuite/g++.dg/template/pr79650.C 2017-12-13 19:34:15.202298913 +0100
@@ -0,0 +1,20 @@
+// PR c++/79650
+// { dg-do compile { target c++11 } }
+// { dg-options "" }
+
+typedef __INTPTR_TYPE__ intptr_t;
+template<intptr_t> struct A {};
+
+void
+foo ()
+{
+  static int a, b;
+lab1:
+lab2:
+  A<(intptr_t)&&lab1 - (__INTPTR_TYPE__)&&lab2> c; // { dg-error "not a constant integer" }
+  A<(intptr_t)&&lab1 - (__INTPTR_TYPE__)&&lab1> d;
+  A<(intptr_t)&a - (intptr_t)&b> e;                // { dg-error "is not a constant expression" }
+  A<(intptr_t)&a - (intptr_t)&a> f;
+  A<(intptr_t)sizeof(a) + (intptr_t)&a> g;         // { dg-error "not a constant integer" }
+  A<(intptr_t)&a> h;                               // { dg-error "conversion from pointer type" }
+}

Jakub


[PATCH] Fix gimple-ssa-sprintf.c ICE (PR tree-optimization/83198)

2017-12-13 Thread Jakub Jelinek
Hi!

This patch fixes 2 issues in format_floating.  One is that when determining
precision, we should consider solely the type *printf* will read the
argument as (i.e. double unless L or ll modifier is used, in which case
long double), not the type of the argument, because the corresponding
argument could have any type, even not floating, or say __float128 etc.

This is fixed in the first 2 hunks.

The last hunk is to treat REAL_CSTs arguments of incompatible types
as unknown argument, we really don't know what say __float128 passed to
%f or double passed to %La will do; that is something diagnosed by -Wformat,
so the patch just treats it as arbitrary value of the type.
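
In other words, the type is chosen from the directive alone. A minimal standalone sketch of that rule follows; LengthModifier and precision_read_by_printf are invented for this example and merely stand in for FMT_LEN_* and the REAL_MODE_FORMAT lookup, with std::numeric_limits used as an assumed equivalent of the mantissa precision:

```cpp
#include <limits>

// Toy stand-in for the FMT_LEN_* length modifiers.
enum class LengthModifier { None, L, ll };

// Mantissa precision of the type *printf will read for a floating
// directive.  It depends only on the length modifier, never on the
// type of the argument that was actually passed.
constexpr int
precision_read_by_printf (LengthModifier m)
{
  return (m == LengthModifier::L || m == LengthModifier::ll)
         ? std::numeric_limits<long double>::digits
         : std::numeric_limits<double>::digits;
}
```

So a __float128 passed to %f would be sized as a double here, and a mismatched constant is simply not used for the range computation.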

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2017-12-13  Jakub Jelinek  

PR tree-optimization/83198
* gimple-ssa-sprintf.c (format_floating): Set type solely based on
dir.modifier, regardless of TREE_TYPE (arg).  Assume non-REAL_CST
value if arg is a REAL_CST with incompatible type.

* gcc.dg/pr83198.c: New test.

--- gcc/gimple-ssa-sprintf.c.jj 2017-11-03 15:37:04.0 +0100
+++ gcc/gimple-ssa-sprintf.c 2017-12-13 13:37:59.289435623 +0100
@@ -1885,6 +1885,8 @@ static fmtresult
 format_floating (const directive &dir, tree arg)
 {
   HOST_WIDE_INT prec[] = { dir.prec[0], dir.prec[1] };
+  tree type = (dir.modifier == FMT_LEN_L || dir.modifier == FMT_LEN_ll
+  ? long_double_type_node : double_type_node);
 
   /* For an indeterminate precision the lower bound must be assumed
  to be zero.  */
@@ -1892,10 +1894,6 @@ format_floating (const directive &dir, t
 {
   /* Get the number of fractional decimal digits needed to represent
 the argument without a loss of accuracy.  */
-  tree type = arg ? TREE_TYPE (arg) :
-   (dir.modifier == FMT_LEN_L || dir.modifier == FMT_LEN_ll
-? long_double_type_node : double_type_node);
-
   unsigned fmtprec
= REAL_MODE_FORMAT (TYPE_MODE (type))->p;
 
@@ -1946,7 +1944,9 @@ format_floating (const directive &dir, t
}
 }
 
-  if (!arg || TREE_CODE (arg) != REAL_CST)
+  if (!arg
+  || TREE_CODE (arg) != REAL_CST
+  || !useless_type_conversion_p (type, TREE_TYPE (arg)))
 return format_floating (dir, prec);
 
   /* The minimum and maximum number of bytes produced by the directive.  */
--- gcc/testsuite/gcc.dg/pr83198.c.jj   2017-12-13 13:43:36.056192309 +0100
+++ gcc/testsuite/gcc.dg/pr83198.c  2017-12-13 13:47:11.716474956 +0100
@@ -0,0 +1,18 @@
+/* PR tree-optimization/83198 */
+/* { dg-do compile } */
+/* { dg-options "-Wall -Wno-format" } */
+
+int
+foo (char *d[6], int x)
+{
+  int r = 0;
+  r += __builtin_sprintf (d[0], "%f", x);
+  r += __builtin_sprintf (d[1], "%a", x);
+  r += __builtin_sprintf (d[2], "%f", "foo");
+  r += __builtin_sprintf (d[3], "%a", "bar");
+#ifdef __SIZEOF_FLOAT128__
+  r += __builtin_sprintf (d[4], "%a", 1.0Q);
+  r += __builtin_sprintf (d[5], "%Lf", 1.0Q);
+#endif
+  return r;
+}

Jakub


Re: [PATCH, rs6000] Allow memmov/memset builtin expansion to use unaligned vsx on p8/p9

2017-12-13 Thread Segher Boessenkool
Hi!

On Wed, Dec 13, 2017 at 02:07:44PM -0600, Aaron Sawdey wrote:
> This patch allows the use of unaligned vsx loads/stores for builtin
> expansion of memset and memcmp on p8/p9. Performance of unaligned vsx
> instructions is good on these processors. 
> 
> OK for trunk if bootstrap/regtest on ppc64le passes?
> 
> 2017-12-13  Aaron Sawdey  
> 
>   * config/rs6000/rs6000-string.c (expand_block_move): Allow the use of
>   unaligned VSX load/store on P8/P9.
>   (expand_block_clear): Allow the use of unaligned VSX load/store on 
> P8/P9.

(The last line is too long.)

Okay for trunk, thanks!

We'll probably want some macro for this isP8||isP9 condition, but this is
fine for now.  Let's not do many more of this though.
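
Something along these lines is what such a helper could look like, written as a standalone sketch rather than real backend code. Processor, fast_unaligned_vsx and use_16_byte_step are hypothetical names; they only mirror the rs6000_cpu tests and the "align >= 128 || isP8 || isP9" condition from the patch:

```cpp
// Toy stand-in for the rs6000_cpu processor enumeration.
enum class Processor { Power7, Power8, Power9 };

// One place for the "unaligned VSX is fast" test.
constexpr bool
fast_unaligned_vsx (Processor cpu)
{
  return cpu == Processor::Power8 || cpu == Processor::Power9;
}

// Mirrors the shape of the condition used for the 16-byte step.
constexpr bool
use_16_byte_step (bool altivec, unsigned align_bits, Processor cpu)
{
  return altivec && (align_bits >= 128 || fast_unaligned_vsx (cpu));
}
```

With one predicate, adding a future processor would mean touching a single definition instead of every call site.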


Segher


Re: [C++] Add support for #pragma GCC unroll v4

2017-12-13 Thread Jason Merrill

On 12/06/2017 03:50 AM, Eric Botcazou wrote:
this is the (hopefully) final implementation of the support for the unrolling 
pragma in the C++ front-end.


This needs some C++ tests, particularly with templates and range-for.  I 
suspect that using the pragma in a template will ICE.


Jason


Re: [PATCH] diagnose attribute conflicts on conversion operators (PR 83394)

2017-12-13 Thread Jason Merrill
On Wed, Dec 13, 2017 at 12:54 PM, Martin Sebor  wrote:
> The attached update also fixes both instances of the ICE
> reported in bug 83322 and supersedes Jakub's patch for that
> bug (https://gcc.gnu.org/ml/gcc-patches/2017-12/msg00765.html).
> This passes bootstrap on x86_64 with no new regressions (there
> are an increasing number of failures on trunk at the moment
> but, AFAICS, none caused by this patch).
>
> Jason, I'm still trying to come up with a test case for templates
> that would illustrate the issue you're concerned about.  If you
> have one that would be great (preferably one showing a regression).

I looked at the case I was concerned about, and found that it isn't an
issue because in that case we call duplicate_decls before applying
attributes.

But it looks like we'll still get this testcase wrong, because the
code assumes that if the old decl is a single _DECL, it must match.

[[gnu::noinline]] void f() { }
[[gnu::always_inline]] void f(int) { }  // OK, not the same function

I think the answer is to use Nathan's new iterators unconditionally,
probably lkp_iterator.

Jason


[PATCH v2] vrp_prop: Use dom_walker for -Warray-bounds (PR tree-optimization/83312)

2017-12-13 Thread David Malcolm
On Wed, 2017-12-13 at 10:47 -0700, Jeff Law wrote:
> On 12/13/2017 09:24 AM, Richard Biener wrote:
> > > 
> > > Alternately we could indicate to the dom_walker ctor that an initial state
> > > of
> > > EDGE_EXECUTABLE is already set.
> > 
> > I'm quite sure that wouldn't help for VRP. 
> 
> Not sure why.  But it's not worth digging deep into.
> 
> I do think the current structure could still fail to pick up some
> secondary cases where blocks become unreachable as a result of both
> not
> needing to visit during the lattice propagation step and the
> substitution step.  But I'd expect this to be rare.
> 
> > I think David's approach is fine just we don't need any other API
> > to get at a known executable outgoing edge. We can improve the
> > existing one or just add the trivial folding required. 
> 
> I think Michael's suggestion to pass in NULL for the value and allow
> find_edge to try and determine the value makes the most sense here.
> 
> Jeff

Michael: thanks for the hint about find_taken_edge; I assumed that such
a "find the relevant out-edge" function would already exist; but I
didn't find it (I'm relatively unfamiliar with this part of the code).

Here's an updated version of the patch, which eliminates the stuff I
added to gimple.h/gimple.c changes in favor of using
find_taken_edge (bb, NULL_TREE),
generalizing it to work with arbitrary bbs, so that the dom_walker
vfunc can simply use:
  return find_taken_edge (bb, NULL_TREE);
without having to check e.g. for there being a last stmt (ENTRY
and EXIT), or having to check that it is indeed a control statement
(is there a requirement at this point of the IR that we don't just
fall off the last statement through an out-edge?)

I handled var == NULL_TREE for GIMPLE_COND and GIMPLE_SWITCH,
but not for computed goto (find_taken_edge already handles that by
bailing out).

I also made some things "const" whilst I was touching it.

Successfully bootstrapped&regrtested on x86_64-pc-linux-gnu.

OK for trunk?

gcc/ChangeLog:
PR tree-optimization/83312
* domwalk.h (dom_walker::dom_walker): Fix typo in comment.
* tree-cfg.c (find_taken_edge): Update to handle NULL_TREE for
"val" param, and to cope with arbitrary basic blocks.
(find_taken_edge_cond_expr): Add "cond_stmt" param and use it to
handle NULL_TREE for "val".
(find_taken_edge_switch_expr): Make "switch_stmt" param const.
Handle NULL_TREE for "val".
(find_case_label_for_value): Make "switch_stmt" param const.
* tree-vrp.c (class check_array_bounds_dom_walker): New subclass
of dom_walker.
(vrp_prop::check_all_array_refs): Reimplement as...
(check_array_bounds_dom_walker::before_dom_children): ...this new
vfunc.  Replace linear search through BB block list, excluding
those with non-executable in-edges via dominator walk.

gcc/testsuite/ChangeLog:
PR tree-optimization/83312
* gcc.dg/pr83312.c: New test case.
---
 gcc/domwalk.h  |  2 +-
 gcc/testsuite/gcc.dg/pr83312.c | 30 +
 gcc/tree-cfg.c | 59 +---
 gcc/tree-vrp.c | 76 ++
 4 files changed, 117 insertions(+), 50 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr83312.c

diff --git a/gcc/domwalk.h b/gcc/domwalk.h
index 6ac93eb..c7e3450 100644
--- a/gcc/domwalk.h
+++ b/gcc/domwalk.h
@@ -32,7 +32,7 @@ class dom_walker
 public:
   static const edge STOP;
 
-  /* Use SKIP_UNREACHBLE_BLOCKS = true when your client can discover
+  /* Use SKIP_UNREACHABLE_BLOCKS = true when your client can discover
  that some edges are not executable.
 
  If a client can discover that a COND, SWITCH or GOTO has a static
diff --git a/gcc/testsuite/gcc.dg/pr83312.c b/gcc/testsuite/gcc.dg/pr83312.c
new file mode 100644
index 000..2eb241d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr83312.c
@@ -0,0 +1,30 @@
+/* { dg-options "-O2 -Warray-bounds" } */
+
+struct ptlrpcd_ctl {
+  char pc_name[20];
+};
+struct ptlrpcd {
+  struct ptlrpcd_ctl pd_threads[6];
+};
+struct ptlrpcd *ptlrpcd_init_pd;
+static void ptlrpcd_ctl_init(struct ptlrpcd_ctl *pc, int index) {
+  if (index < 0)
+__builtin_snprintf(pc->pc_name, sizeof(pc->pc_name), "ptlrpcd_rcv");
+  else
+__builtin_snprintf(pc->pc_name, sizeof(pc->pc_name), "ptlrpcd_%d", index);
+}
+int ptlrpcd_init_ncpts;
+static int ptlrpcd_init(int nthreads) {
+  int j;
+  if (ptlrpcd_init_ncpts) {
+ptlrpcd_ctl_init(&ptlrpcd_init_pd->pd_threads[0], -1);
+for (j = 1; j < nthreads; j++)
+  ptlrpcd_ctl_init(&ptlrpcd_init_pd->pd_threads[j], j);
+  }
+  return 0;
+}
+int ptlrpcd_init_groupsize;
+void ptlrpcd_addref(void) {
+ptlrpcd_init(ptlrpcd_init_groupsize);
+}
+
diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index 4d09b2c..7ecc5c8 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -170,9 +170,9 @@ static void gimple_merge_blocks (basic_block, basic_block);
 static

Re: [PATCH] PR libstdc++/59568 fix error handling for std::complex stream extraction

2017-12-13 Thread Jonathan Wakely

On 13/12/17 18:42 +, Jonathan Wakely wrote:

The bug here is that we called putback even if the initial __is >> __ch
extraction failed and set eofbit, and putback clears the eofbit. I
found a number of other problems though, such as not even trying to
call putback after failing to find the ',' and ')' characters.

I decided to rewrite the function following the proposed resolution
for https://wg21.link/lwg2714 which is a much more precise
specification for much more desirable semantics.

PR libstdc++/59568
* include/std/complex (operator>>): Implement proposed resolution to
LWG 2714. Use putback if and only if a character has been successfully
extracted but isn't a delimiter. Use ctype::widen and traits::eq when
testing if extracted characters match delimiters.
* testsuite/26_numerics/complex/dr2714.cc: New test.

Tested powerpc64le-linux, committed to trunk.

For the release branches I'm considering just fixing the bug that
clears eofbit, and not the whole rewrite of the function.


Actually there's another bug in the original function, which is that
it unconditionally sets "__x = __re_x;" even if extracting a value
into __re_x failed.
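
The two rules (only put back a character that was really extracted, only assign after a successful extraction) can be shown outside the library with a small sketch. read_real_only is a hypothetical helper written for this mail, covering just the non-parenthesised branch and using plain char and double instead of the template parameters:

```cpp
#include <ios>
#include <istream>
#include <sstream>

// Hypothetical helper modelled on the "else" branch of operator>>.
inline std::istream &
read_real_only (std::istream &is, double &x)
{
  char ch = char ();
  is >> ch;                 // skips whitespace, may fail at EOF
  if (is)                   // a character was actually extracted
    {
      is.putback (ch);
      double re;
      if (is >> re)
        x = re;             // assign only on success
      else
        is.setstate (std::ios_base::failbit);
    }
  return is;
}
```

On an input of only whitespace the first extraction fails, putback is never called, so eofbit stays set and x keeps its old value.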

Here's what I plan to commit for the branches.

Tested x86_64-linux.

commit 907f186cd5d2e98f8ddf031b46b2cb0ae520b0d7
Author: Jonathan Wakely 
Date:   Wed Dec 13 20:21:26 2017 +

PR libstdc++/59568 don't use putback or update value when extraction fails

PR libstdc++/59568
* include/std/complex (operator>>): Only use putback if a character
was successfully extracted and only set the value if a number was
successfully extracted.
* testsuite/26_numerics/complex/inserters_extractors/char/59568.cc:
New test.

diff --git a/libstdc++-v3/include/std/complex b/libstdc++-v3/include/std/complex
index 6342c98e88a..22107cb2264 100644
--- a/libstdc++-v3/include/std/complex
+++ b/libstdc++-v3/include/std/complex
@@ -493,7 +493,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 operator>>(basic_istream<_CharT, _Traits>& __is, complex<_Tp>& __x)
 {
   _Tp __re_x, __im_x;
-  _CharT __ch;
+  _CharT __ch = _CharT();
   __is >> __ch;
   if (__ch == '(')
 	{
@@ -511,11 +511,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	  else
 	__is.setstate(ios_base::failbit);
 	}
-  else
+  else if (__is)
 	{
 	  __is.putback(__ch);
-	  __is >> __re_x;
-	  __x = __re_x;
+	  if (__is >> __re_x)
+	__x = __re_x;
+	  else
+	__is.setstate(ios_base::failbit);
 	}
   return __is;
 }
diff --git a/libstdc++-v3/testsuite/26_numerics/complex/inserters_extractors/char/59568.cc b/libstdc++-v3/testsuite/26_numerics/complex/inserters_extractors/char/59568.cc
new file mode 100644
index 000..2bbdb6abae4
--- /dev/null
+++ b/libstdc++-v3/testsuite/26_numerics/complex/inserters_extractors/char/59568.cc
@@ -0,0 +1,166 @@
+// Copyright (C) 2017 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+// { dg-options "-std=gnu++98" }
+
+#include 
+#include 
+#include 
+#include 
+
+void
+test01()
+{
+  std::istringstream in(" 1 (2) ( 2.0 , 0.5 ) ");
+  std::complex c1, c2, c3;
+  in >> c1 >> c2 >> c3;
+  VERIFY( in.good() );
+  VERIFY( c1.real() == 1 && c1.imag() == 0 );
+  VERIFY( c2.real() == 2 && c2.imag() == 0 );
+  VERIFY( c3.real() == 2 && c3.imag() == 0.5 );
+}
+
+void
+test02()
+{
+  std::istringstream in;
+  std::complex c(-1, -1);
+  const std::complex c0 = c;
+
+  in.str("a");
+  in >> c;
+  VERIFY( in.fail() );
+  in.clear();
+  VERIFY( in.get() == 'a' );
+  VERIFY( c == c0 );
+
+  in.str(" ( ) ");
+  in >> c;
+  VERIFY( in.fail() );
+  in.clear();
+  VERIFY( in.get() == ')' );
+  VERIFY( c == c0 );
+
+  in.str("(,");
+  in >> c;
+  VERIFY( in.fail() );
+  in.clear();
+  VERIFY( in.get() == ',' );
+  VERIFY( c == c0 );
+
+  in.str("(b)");
+  in >> c;
+  VERIFY( in.fail() );
+  in.clear();
+  VERIFY( in.get() == 'b' );
+  VERIFY( c == c0 );
+
+  in.str("( c)");
+  in >> c;
+  VERIFY( in.fail() );
+  in.clear();
+  VERIFY( in.get() == 'c' );
+  VERIFY( c == c0 );
+
+  in.str("(99d");
+  in >> c;
+  VERIFY( in.fail() );
+  in.clear();
+  // VERIFY( in.get() == 'd' );
+  VERIFY( c == c0 );
+
+  in.str("(99 e");
+  in >> c;
+  VERIFY( in.fail() );
+  in.clear(

Re: [PATCH] PR libgcc/83112, Fix warnings on libgcc float128-ifunc.c

2017-12-13 Thread Segher Boessenkool
On Tue, Dec 12, 2017 at 04:56:36PM -0500, Michael Meissner wrote:
> On Tue, Dec 12, 2017 at 11:04:55AM -0600, Segher Boessenkool wrote:
> > On Mon, Dec 11, 2017 at 03:57:51PM -0500, Michael Meissner wrote:
> > > > > +extern KCtype __divkc3 (KFtype, KFtype, KFtype, KFtype);
> > > > > +
> > > > >  KCtype
> > > > >  __divkc3 (KFtype a, KFtype b, KFtype c, KFtype d)
> > > > >  {
> > > > 
> > > > How does this warn?  -Wmissing-declarations?  Should this declaration be
> > > > in a header then?

> As Andreas points out, the option -Wmissing-prototypes complains if a global
> function is compiled without prototypes for C/Objective C.
> 
> Before the patch, the internal definition within the compiler meant that
> __mulkc3 would not get the warning.  Now with separate ifunc handlers, both
> __mulkc3_sw and __mulkc3_hw got warnings.

Gotcha.  There isn't a nice header file for it, so sure that is fine
the way you have it.  Thanks for the explanation!


Segher


[PATCH, rs6000] Allow memmov/memset builtin expansion to use unaligned vsx on p8/p9

2017-12-13 Thread Aaron Sawdey
This patch allows the use of unaligned vsx loads/stores for builtin
expansion of memset and memcmp on p8/p9. Performance of unaligned vsx
instructions is good on these processors. 

OK for trunk if bootstrap/regtest on ppc64le passes?

2017-12-13  Aaron Sawdey  

* config/rs6000/rs6000-string.c (expand_block_move): Allow the use of
unaligned VSX load/store on P8/P9.
(expand_block_clear): Allow the use of unaligned VSX load/store on 
P8/P9.


-- 
Aaron Sawdey, Ph.D.  acsaw...@linux.vnet.ibm.com
050-2/C113  (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain

Index: gcc/config/rs6000/rs6000-string.c
===
--- gcc/config/rs6000/rs6000-string.c	(revision 255585)
+++ gcc/config/rs6000/rs6000-string.c	(working copy)
@@ -67,11 +67,14 @@
   if (bytes <= 0)
 return 1;
 
+  bool isP8 = (rs6000_cpu == PROCESSOR_POWER8);
+  bool isP9 = (rs6000_cpu == PROCESSOR_POWER9);
+
   /* Use the builtin memset after a point, to avoid huge code bloat.
  When optimize_size, avoid any significant code bloat; calling
  memset is about 4 instructions, so allow for one instruction to
  load zero and three to do clearing.  */
-  if (TARGET_ALTIVEC && align >= 128)
+  if (TARGET_ALTIVEC && (align >= 128 || isP8 || isP9))
 clear_step = 16;
   else if (TARGET_POWERPC64 && (align >= 64 || !STRICT_ALIGNMENT))
 clear_step = 8;
@@ -88,7 +91,7 @@
   machine_mode mode = BLKmode;
   rtx dest;
 
-  if (bytes >= 16 && TARGET_ALTIVEC && align >= 128)
+  if (bytes >= 16 && TARGET_ALTIVEC && (align >= 128 || isP8 || isP9))
 	{
 	  clear_bytes = 16;
 	  mode = V4SImode;
@@ -1247,6 +1250,9 @@
   if (bytes > rs6000_block_move_inline_limit)
 return 0;
 
+  bool isP8 = (rs6000_cpu == PROCESSOR_POWER8);
+  bool isP9 = (rs6000_cpu == PROCESSOR_POWER9);
+
   for (offset = 0; bytes > 0; offset += move_bytes, bytes -= move_bytes)
 {
   union {
@@ -1258,7 +1264,7 @@
 
   /* Altivec first, since it will be faster than a string move
 	 when it applies, and usually not significantly larger.  */
-  if (TARGET_ALTIVEC && bytes >= 16 && align >= 128)
+  if (TARGET_ALTIVEC && bytes >= 16 && (isP8 || isP9 || align >= 128))
 	{
 	  move_bytes = 16;
 	  mode = V4SImode;


[PATCH] PR libstdc++/59568 fix error handling for std::complex stream extraction

2017-12-13 Thread Jonathan Wakely

The bug here is that we called putback even if the initial __is >> __ch
extraction failed and set eofbit, and putback clears the eofbit. I
found a number of other problems though, such as not even trying to
call putback after failing to find the ',' and ')' characters.

I decided to rewrite the function following the proposed resolution
for https://wg21.link/lwg2714 which is a much more precise
specification for much more desirable semantics.

PR libstdc++/59568
* include/std/complex (operator>>): Implement proposed resolution to
LWG 2714. Use putback if and only if a character has been successfully
extracted but isn't a delimiter. Use ctype::widen and traits::eq when
testing if extracted characters match delimiters.
* testsuite/26_numerics/complex/dr2714.cc: New test.

Tested powerpc64le-linux, committed to trunk.

For the release branches I'm considering just fixing the bug that
clears eofbit, and not the whole rewrite of the function.
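
For readers following along, here is a minimal sketch of the behaviour the rewrite aims for, using only the public std::complex extraction operator. The helper parse_complex and the ParseResult struct are made up for illustration; they are not part of the patch or of dr2714.cc.

```cpp
#include <cassert>
#include <complex>
#include <sstream>
#include <string>

// Hypothetical helper type: outcome of one extraction attempt.
struct ParseResult {
  bool ok;                     // true when the stream did not set failbit
  std::complex<double> value;  // extracted value (meaningful only when ok)
};

// Try to read a single complex number from TEXT with operator>>.
ParseResult parse_complex(const std::string &text) {
  std::istringstream in(text);
  std::complex<double> z;
  in >> z;
  return ParseResult{!in.fail(), z};
}
```

The accepted forms are "(re,im)", "(re)" and a bare "re". Anything else, such as "(1;2)" or an empty stream, should leave failbit set on the stream.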

commit 419381b5d32b5a38c1fe7703dc0400c836106939
Author: redi 
Date:   Wed Dec 13 17:02:14 2017 +

PR libstdc++/59568 fix error handling for std::complex stream extraction

PR libstdc++/59568
* include/std/complex (operator>>): Implement proposed resolution to
LWG 2714. Use putback if and only if a character has been successfully
extracted but isn't a delimiter. Use ctype::widen and traits::eq when
testing if extracted characters match delimiters.
* testsuite/26_numerics/complex/dr2714.cc: New test.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@255608 138bc75d-0d04-0410-961f-82ee72b054a4

diff --git a/libstdc++-v3/include/std/complex b/libstdc++-v3/include/std/complex
index 61f8cc1fce3..bfe10347bd3 100644
--- a/libstdc++-v3/include/std/complex
+++ b/libstdc++-v3/include/std/complex
@@ -492,31 +492,52 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 basic_istream<_CharT, _Traits>&
 operator>>(basic_istream<_CharT, _Traits>& __is, complex<_Tp>& __x)
 {
-  _Tp __re_x, __im_x;
+  bool __fail = true;
   _CharT __ch;
-  __is >> __ch;
-  if (__ch == '(')
+  if (__is >> __ch)
{
- __is >> __re_x >> __ch;
- if (__ch == ',')
+ if (_Traits::eq(__ch, __is.widen('(')))
{
- __is >> __im_x >> __ch;
- if (__ch == ')')
-   __x = complex<_Tp>(__re_x, __im_x);
- else
-   __is.setstate(ios_base::failbit);
+ _Tp __u;
+ if (__is >> __u >> __ch)
+   {
+ const _CharT __rparen = __is.widen(')');
+ if (_Traits::eq(__ch, __rparen))
+   {
+ __x = __u;
+ __fail = false;
+   }
+ else if (_Traits::eq(__ch, __is.widen(',')))
+   {
+ _Tp __v;
+ if (__is >> __v >> __ch)
+   {
+ if (_Traits::eq(__ch, __rparen))
+   {
+ __x = complex<_Tp>(__u, __v);
+ __fail = false;
+   }
+ else
+   __is.putback(__ch);
+   }
+   }
+ else
+   __is.putback(__ch);
+   }
}
- else if (__ch == ')')
-   __x = __re_x;
  else
-   __is.setstate(ios_base::failbit);
-   }
-  else
-   {
- __is.putback(__ch);
- __is >> __re_x;
- __x = __re_x;
+   {
+ __is.putback(__ch);
+ _Tp __u;
+ if (__is >> __u)
+   {
+ __x = __u;
+ __fail = false;
+   }
+   }
}
+  if (__fail)
+   __is.setstate(ios_base::failbit);
   return __is;
 }
 
diff --git a/libstdc++-v3/testsuite/26_numerics/complex/dr2714.cc b/libstdc++-v3/testsuite/26_numerics/complex/dr2714.cc
new file mode 100644
index 000..6b35e8adcf9
--- /dev/null
+++ b/libstdc++-v3/testsuite/26_numerics/complex/dr2714.cc
@@ -0,0 +1,168 @@
+// Copyright (C) 2017 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the fil

Re: [PATCH] diagnose attribute conflicts on conversion operators (PR 83394)

2017-12-13 Thread Martin Sebor

The attached update also fixes both instances of the ICE
reported in bug 83322 and supersedes Jakub's patch for that
bug (https://gcc.gnu.org/ml/gcc-patches/2017-12/msg00765.html).
This passes bootstrap on x86_64 with no new regressions (there
are an increasing number of failures on trunk at the moment
but, AFAICS, none caused by this patch).

Jason, I'm still trying to come up with a test case for templates
that would illustrate the issue you're concerned about.  If you
have one that would be great (preferably one showing a regression).

Martin

On 12/12/2017 06:25 PM, Martin Sebor wrote:

In bug 83394 - always_inline vs. noinline no longer diagnosed,
Jakub provided a test case where the recent enhancement to detect
nonsensical attribute combinations fails to detect a pair of
mutually exclusive attributes on separate declarations of
a conversion member operator (see bug 83322 for the origin of
the test case).  This case was previously diagnosed so this is
a regression introduced by the enhancement.
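
A minimal sketch of the kind of code described above may help: two declarations of the same conversion operator carrying mutually exclusive attributes. This is an illustration written for this note, not the contents of Wattributes-3.C. The names A and convert are made up, and the exact diagnostic wording depends on the compiler version.

```cpp
#include <cassert>

struct A {
  // First declaration: asks that the operator never be inlined.
  __attribute__((noinline)) operator int();
};

// Second declaration of the same member: asks for the opposite.
// A compiler that tracks the previous declaration is expected to
// diagnose the conflict (under -Wattributes in GCC) and ignore one
// of the two attributes; the program itself remains valid.
__attribute__((always_inline)) inline A::operator int() { return 7; }

// Hypothetical helper that exercises the conversion.
int convert(A &a) { return a; }
```

The point of the example is only the pairing of the in-class declaration with the out-of-class definition; the conflict can be noticed only if the earlier declaration of the operator is found.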

The attached patch restores this diagnostic.  I have very little
experience with lookup and scoping in the C++ front end so if
this isn't the right approach I'd be grateful for suggestions
for what API to use.

In a private conversation Jason mentioned there may be cases
involving templates where the current approach won't have access
to the "last declaration" and so won't be able to detect a mismatch.
I am yet to come up with an example where this happens.  If/when
I do I'll look into enhancing or modifying the current solution
to detect those as well.  But until then I'd like to submit this
as an incremental step in that direction.

The attached patch passes regression testing on x86_64-linux.

Thanks
Martin


PR c++/83394 - [8 Regression] always_inline vs. noinline no longer diagnosed
PR c++/83322 - ICE: tree check: expected class ‘type’, have ‘exceptional’

gcc/cp/ChangeLog:

	PR c++/83394
	PR c++/83322
	* decl2.c (cplus_decl_attributes): Look up member functions
	in the scope of their class.

gcc/testsuite/ChangeLog:

	PR c++/83394
	* g++.dg/Wattributes-3.C: New test.
	* g++.dg/Wattributes-4.C: New test.

diff --git a/gcc/cp/decl2.c b/gcc/cp/decl2.c
index 5d30369..ae5dbab 100644
--- a/gcc/cp/decl2.c
+++ b/gcc/cp/decl2.c
@@ -1432,6 +1432,71 @@ cp_omp_mappable_type (tree type)
   return true;
 }
 
+/* Return the last pushed declaration for the symbol DECL or NULL
+   when no such declaration exists.  */
+
+static tree
+find_last_decl (tree decl)
+{
+  tree last_decl = NULL_TREE;
+
+  if (tree name = DECL_P (decl) ? DECL_NAME (decl) : NULL_TREE)
+{
+  /* Look up the declaration in its scope.  */
+  tree pushed_scope = NULL_TREE;
+  if (tree ctype = DECL_CONTEXT (decl))
+	pushed_scope = push_scope (ctype);
+
+  last_decl = lookup_name (name);
+
+  if (pushed_scope)
+	pop_scope (pushed_scope);
+
+  /* The declaration may be a member conversion operator
+	 or a bunch of overloads (handle the latter below).  */
+  if (last_decl && BASELINK_P (last_decl))
+	last_decl = BASELINK_FUNCTIONS (last_decl);
+}
+
+  if (!last_decl)
+return NULL_TREE;
+
+  if (DECL_P (last_decl))
+return last_decl;
+
+  if (TREE_CODE (last_decl) == OVERLOAD)
+{
+  /* A set of overloads of the same function.  */
+  for (ovl_iterator iter (last_decl, true); iter; ++iter)
+	{
+	  if (TREE_CODE (*iter) == OVERLOAD)
+	continue;
+
+	  if (decls_match (decl, *iter, /*record_decls=*/false))
+	return *iter;
+	}
+
+  return NULL_TREE;
+}
+
+  if (TREE_CODE (last_decl) == TREE_LIST)
+{
+  /* The list contains a mix of symbols with the same name
+	 (e.g., functions and data members defined in different
+	 base classes).  */
+  do
+	{
+	  if (decls_match (decl, TREE_VALUE (last_decl)))
+	return TREE_VALUE (last_decl);
+
+	  last_decl = TREE_CHAIN (last_decl);
+	}
+  while (last_decl);
+}
+
+  return NULL_TREE;
+}
+
 /* Like decl_attributes, but handle C++ complexity.  */
 
 void
@@ -1483,28 +1548,7 @@ cplus_decl_attributes (tree *decl, tree attributes, int flags)
 }
   else
 {
-  tree last_decl = (DECL_P (*decl) && DECL_NAME (*decl)
-			? lookup_name (DECL_NAME (*decl)) : NULL_TREE);
-
-  if (last_decl && TREE_CODE (last_decl) == OVERLOAD)
-	for (ovl_iterator iter (last_decl, true); ; ++iter)
-	  {
-	if (!iter)
-	  {
-		last_decl = NULL_TREE;
-		break;
-	  }
-
-	if (TREE_CODE (*iter) == OVERLOAD)
-	  continue;
-
-	if (decls_match (*decl, *iter, /*record_decls=*/false))
-	  {
-		last_decl = *iter;
-		break;
-	  }
-	  }
-
+  tree last_decl = find_last_decl (*decl);
   decl_attributes (decl, attributes, flags, last_decl);
 }
 
diff --git a/gcc/testsuite/g++.dg/Wattributes-3.C b/gcc/testsuite/g++.dg/Wattributes-3.C
new file mode 100644
index 000..2479998
--- /dev/null
+++ b/gcc/testsuite/g++.dg/Wattributes-3.C
@@ -0,0 +1,89 @@
+// PR c++/83394 - always_inline vs. noinline no longer diagno

Re: [PATCH] Verify allowed stmts before labels

2017-12-13 Thread Jeff Law
On 12/13/2017 07:12 AM, Jakub Jelinek wrote:
> Hi!
> 
> PR83391/PR83396 failed because debug bind stmts were put before labels.
> Alex said that is undesirable, and that right now we want to allow
> just debug begin stmt markers before or intermixed with labels.
> 
> This patch ensures that through verification, which is what defines
> what is and isn't valid GIMPLE.  If we ever reconsider it, either allow
> further stmts or disallow even debug begin stmt markers, we can easily also
> tweak the verifier.  The patch has been successfully bootstrapped/regtested as
> part of the:
> 
> https://gcc.gnu.org/ml/gcc-patches/2017-12/msg00811.html
> https://gcc.gnu.org/ml/gcc-patches/2017-12/msg00808.html
> https://gcc.gnu.org/bugzilla/attachment.cgi?id=42861
> https://gcc.gnu.org/bugzilla/attachment.cgi?id=42866
> https://gcc.gnu.org/ml/gcc-patches/2017-12/msg00794.html
> 
> patchset, without the msg00811.html patch it of course doesn't survive
> bootstrap, as we insert debug bind stmts before labels in those cases.
> 
> Ok for trunk once the msg00811.html patch or something similar is committed?
> 
> 2017-12-13  Jakub Jelinek  
> 
>   * tree-cfg.c (verify_gimple_in_cfg): Verify no non-label stmts
>   with the exception of debug begin stmt markers appear before
>   labels.
OK.
jeff


Re: [PATCH] vrp_prop: Use dom_walker for -Warray-bounds (PR tree-optimization/83312)

2017-12-13 Thread Jeff Law
On 12/13/2017 09:24 AM, Richard Biener wrote:
>>
>> Alternately we could indicate to the dom_walker ctor that an initial
>> state of EDGE_EXECUTABLE is already set.
> 
> I'm quite sure that wouldn't help for VRP. 
Not sure why.  But it's not worth digging deep into.

I do think the current structure could still fail to pick up some
secondary cases where blocks become unreachable as a result of both not
needing to visit during the lattice propagation step and the
substitution step.  But I'd expect this to be rare.

> I think David's approach is fine; we just don't need any other API to get at a
> known executable outgoing edge. We can improve the existing one or just add
> the trivial folding required.
I think Michael's suggestion to pass in NULL for the value and allow
find_taken_edge to try and determine the value makes the most sense here.

Jeff


Re: [C++ Patch] PR 81061 ("[7/8 Regression] ICE modifying read-only variable")

2017-12-13 Thread Jason Merrill
OK.

On Wed, Dec 13, 2017 at 5:32 AM, Paolo Carlini  wrote:
> Hi,
>
> in this simple error recovery regression we ICE during gimplification after
> sensible diagnostic about assigning to a read-only location. The problem can
> be avoided by simply returning immediately error_mark_node upon
> cxx_readonly_error - the rest of the function does the same, ie, doesn't try
> to proceed when complain & tf_error. I also noticed that clang appears to
> behave in the same way for this error. Tested x86_64-linux.
>
> Thanks, Paolo
>
> 
>


Re: [PATCH] vrp_prop: Use dom_walker for -Warray-bounds (PR tree-optimization/83312)

2017-12-13 Thread Jeff Law
On 12/13/2017 09:55 AM, Michael Matz wrote:
> Hi,
> 
> On Tue, 12 Dec 2017, David Malcolm wrote:
> 
>> There didn't seem to be a pre-existing way to determine the unique
>> out-edge after a GIMPLE_COND (if it has a constant cond), so I added
>> a new gimple_cond_get_unique_successor_edge function.  Similarly,
>> something similar may apply for switches, so I put in a
>> gimple_get_unique_successor_edge (though I wasn't able to create a
>> reproducer that used a switch).
> 
> Please instead extend find_taken_edge() (its sub routines).  E.g. let it 
> accept a NULL val and initialize it with the cond from a gcond.
Yea.  That seems like the best way to go.

Jeff


Re: [SFN] Bootstrap broken

2017-12-13 Thread Rainer Orth
Hi Jakub,

>> On Wed, Dec 13, 2017 at 02:31:07PM +0100, Rainer Orth wrote:
>>> Hi Jakub,
>>> 
>>> > Here it is everything in patch form, in case some volunteers are
>>> > willing to test it on their targets, because we need faster
>>> > turn-arounds for this.
>>> 
>>> thanks for that: it's easy to lose track in this maze ;-)
>>
>> True.  What I'm regtesting (bootstraps already done) on
>> {x86_64,i686,powerpc64{,le}}-linux now is:
>> https://gcc.gnu.org/ml/gcc-patches/2017-12/msg00811.html
>> https://gcc.gnu.org/ml/gcc-patches/2017-12/msg00808.html
>> https://gcc.gnu.org/bugzilla/attachment.cgi?id=42861
>> https://gcc.gnu.org/bugzilla/attachment.cgi?id=42866
>> https://gcc.gnu.org/ml/gcc-patches/2017-12/msg00794.html
>> set.  Does pr69102.c FAIL with that set?
>
> thanks for the list.  A sparc-sun-solaris2.11 bootstrap with the whole
> set is now running; expect results in about two hours.

completed now and the testsuite regression is gone, too.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: Add support for conditional reductions using SVE CLASTB

2017-12-13 Thread Jeff Law
On 11/17/2017 08:29 AM, Richard Sandiford wrote:
> This patch uses SVE CLASTB to optimise conditional reductions.  It means
> that we no longer need to maintain a separate index vector to record
> the most recent valid value, and no longer need to worry about overflow
> cases.
> 
> Tested on aarch64-linux-gnu (with and without SVE), x86_64-linux-gnu
> and powerpc64le-linux-gnu.  OK to install?
> 
> Richard
> 
> 
> 2017-11-17  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
> 
> gcc/
>   * doc/md.texi (fold_extract_last_@var{m}): Document.
>   * doc/sourcebuild.texi (vect_fold_extract_last): Likewise.
>   * optabs.def (fold_extract_last_optab): New optab.
>   * internal-fn.def (FOLD_EXTRACT_LAST): New internal function.
>   * internal-fn.c (fold_extract_direct): New macro.
>   (expand_fold_extract_optab_fn): Likewise.
>   (direct_fold_extract_optab_supported_p): Likewise.
>   * tree-vectorizer.h (EXTRACT_LAST_REDUCTION): New vect_reduction_type.
>   * tree-vect-loop.c (vect_model_reduction_cost): Handle
>   EXTRACT_LAST_REDUCTION.
>   (get_initial_def_for_reduction): Do not create an initial vector
>   for EXTRACT_LAST_REDUCTION reductions.
>   (vectorizable_reduction): Leave the scalar phi in place for
>   EXTRACT_LAST_REDUCTIONs.  Try using EXTRACT_LAST_REDUCTION
>   ahead of INTEGER_INDUC_COND_REDUCTION.  Do not check for an
>   epilogue code for EXTRACT_LAST_REDUCTION and defer the
>   transform phase to vectorizable_condition.
>   * tree-vect-stmts.c (vect_finish_stmt_generation_1): New function,
>   split out from...
>   (vect_finish_stmt_generation): ...here.
>   (vect_finish_replace_stmt): New function.
>   (vectorizable_condition): Handle EXTRACT_LAST_REDUCTION.
>   * config/aarch64/aarch64-sve.md (fold_extract_last_): New
>   pattern.
>   * config/aarch64/aarch64.md (UNSPEC_CLASTB): New unspec.
> 
> gcc/testsuite/
>   * lib/target-supports.exp
>   (check_effective_target_vect_fold_extract_last): New proc.
>   * gcc.dg/vect/pr65947-1.c: Update dump messages.  Add markup
>   for fold_extract_last.
>   * gcc.dg/vect/pr65947-2.c: Likewise.
>   * gcc.dg/vect/pr65947-3.c: Likewise.
>   * gcc.dg/vect/pr65947-4.c: Likewise.
>   * gcc.dg/vect/pr65947-5.c: Likewise.
>   * gcc.dg/vect/pr65947-6.c: Likewise.
>   * gcc.dg/vect/pr65947-9.c: Likewise.
>   * gcc.dg/vect/pr65947-10.c: Likewise.
>   * gcc.dg/vect/pr65947-12.c: Likewise.
>   * gcc.dg/vect/pr65947-13.c: Likewise.
>   * gcc.dg/vect/pr65947-14.c: Likewise.
>   * gcc.target/aarch64/sve_clastb_1.c: New test.
>   * gcc.target/aarch64/sve_clastb_1_run.c: Likewise.
>   * gcc.target/aarch64/sve_clastb_2.c: Likewise.
>   * gcc.target/aarch64/sve_clastb_2_run.c: Likewise.
>   * gcc.target/aarch64/sve_clastb_3.c: Likewise.
>   * gcc.target/aarch64/sve_clastb_3_run.c: Likewise.
>   * gcc.target/aarch64/sve_clastb_4.c: Likewise.
>   * gcc.target/aarch64/sve_clastb_4_run.c: Likewise.
>   * gcc.target/aarch64/sve_clastb_5.c: Likewise.
>   * gcc.target/aarch64/sve_clastb_5_run.c: Likewise.
>   * gcc.target/aarch64/sve_clastb_6.c: Likewise.
>   * gcc.target/aarch64/sve_clastb_6_run.c: Likewise.
>   * gcc.target/aarch64/sve_clastb_7.c: Likewise.
>   * gcc.target/aarch64/sve_clastb_7_run.c: Likewise.
Like some of the other patches, I focused just on the generic bits and
did not look at the aarch64 target bits.  The generic bits are OK.

jeff
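
For context, a conditional reduction is a loop of roughly the following shape: the result is the value from the last iteration in which a condition held. This is a sketch written for this note, not one of the pr65947 tests. The name last_below and its parameters are made up, and whether a given build vectorises it with CLASTB depends on the target and options.

```cpp
#include <cassert>

// Return the last element of A[0..N-1] that is below LIMIT,
// or FALLBACK when no element qualifies.
int last_below(const int *a, int n, int limit, int fallback) {
  int last = fallback;
  for (int i = 0; i < n; ++i)
    if (a[i] < limit)
      last = a[i];  // conditionally overwritten: the "most recent valid value"
  return last;
}
```

Without a fold-extract-last style operation, a vectoriser has to remember which lane and iteration last matched. This is the separate index vector the patch description says is no longer needed.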


Re: [PATCH PR81228][AARCH64] Fix ICE by adding LTGT in vec_cmp

2017-12-13 Thread James Greenhalgh
On Wed, Dec 13, 2017 at 04:45:33PM +, Sudi Das wrote:
> On 13/12/17 16:42, Sudakshina Das wrote:
> > Hi
> > 
> > This patch is a follow up to the existing discussions on 
> > https://gcc.gnu.org/ml/gcc-patches/2017-07/msg01904.html
> > Bin had earlier submitted a patch to fix the ICE that occurs because of 
> > the missing LTGT in aarch64-simd.md.
> > That discussion opened up a new bug report PR81647 for an inconsistent 
> > behavior.
> > 
> > As discussed earlier on the gcc-patches discussion and on the bug 
> > report, PR81647 was occurring because of how UNEQ was handled in 
> > aarch64-simd.md rather than LTGT. Since __builtin_islessgreater is 
> > guaranteed to not give an FP exception but LTGT might, 
> > __builtin_islessgreater gets converted to ~UNEQ very early on in 
> > fold_builtin_unordered_cmp. Thus I will post a separate patch for 
> > correcting how UNEQ and other unordered comparisons are handled in 
> > aarch64-simd.md.
> > 
> > This patch is only adding the missing LTGT to plug the ICE.
> > 
> > Testing done: Checked for regressions on bootstrapped 
> > aarch64-none-linux-gnu and added a new compile time test case that gives 
> > out LTGT to make sure it doesn't ICE.
> > 
> > Is this ok for trunk?

OK.

Thanks,
James



Re: [PATCH] vrp_prop: Use dom_walker for -Warray-bounds (PR tree-optimization/83312)

2017-12-13 Thread Michael Matz
Hi,

On Tue, 12 Dec 2017, David Malcolm wrote:

> There didn't seem to be a pre-existing way to determine the unique
> out-edge after a GIMPLE_COND (if it has a constant cond), so I added
> a new gimple_cond_get_unique_successor_edge function.  Similarly,
> something similar may apply for switches, so I put in a
> gimple_get_unique_successor_edge (though I wasn't able to create a
> reproducer that used a switch).

Please instead extend find_taken_edge() (its sub routines).  E.g. let it 
accept a NULL val and initialize it with the cond from a gcond.
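
The shape of that suggestion can be sketched with a toy model. Nothing below is GCC code: ToyCond, CondValue and toy_find_taken_edge are hypothetical names, and edges are reduced to plain integers. The only point is the calling convention, where a null value means "work it out from the condition itself".

```cpp
#include <cassert>

// Hypothetical tri-state for a branch condition.
enum class CondValue { kFalse, kTrue, kUnknown };

// Hypothetical miniature of a two-way conditional statement.
struct ToyCond {
  CondValue folded;  // what the condition folds to, if it is constant
  int true_edge;     // id of the edge taken when the condition is true
  int false_edge;    // id of the edge taken when the condition is false
};

// Return the id of the edge known to be taken, or -1 when unknown.
// VAL may be null, in which case the value is derived from COND.
int toy_find_taken_edge(const ToyCond &cond, const CondValue *val) {
  CondValue v = val ? *val : cond.folded;
  switch (v) {
    case CondValue::kTrue:
      return cond.true_edge;
    case CondValue::kFalse:
      return cond.false_edge;
    default:
      return -1;
  }
}
```

Callers that already know the value keep passing it. Callers that only have the statement pass a null pointer and need no second lookup function.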


Ciao,
Michael.


Re: Allow gather loads to be used for grouped accesses

2017-12-13 Thread Jeff Law
On 11/17/2017 03:04 PM, Richard Sandiford wrote:
> Following on from the previous patch for strided accesses, this patch
> allows gather loads to be used with grouped accesses, if we otherwise
> would need to fall back to VMAT_ELEMENTWISE.  However, as the comment
> says, this is restricted to single-element groups for now:
> 
>??? Although the code can handle all group sizes correctly,
>it probably isn't a win to use separate strided accesses based
>on nearby locations.  Or, even if it's a win over scalar code,
>it might not be a win over vectorizing at a lower VF, if that
>allows us to use contiguous accesses.
> 
> Single-element groups are an important special case though,
> and this means that code is less sensitive to GCC's classification
> of single accesses with constant steps as "grouped" and ones with
> variable steps as "strided".
> 
> 2017-11-17  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
> 
> gcc/
>   * tree-vectorizer.h (vect_gather_scatter_fn_p): Declare.
>   * tree-vect-data-refs.c (vect_gather_scatter_fn_p): Make public.
>   * tree-vect-stmts.c (vect_truncate_gather_scatter_offset): New
>   function.
>   (vect_use_strided_gather_scatters_p): Take a masked_p argument.
>   Use vect_truncate_gather_scatter_offset if we can't treat the
>   operation as a normal gather load or scatter store.
>   (get_group_load_store_type): Take the gather_scatter_info
>   as argument.  Try using a gather load or scatter store for
>   single-element groups.
>   (get_load_store_type): Update calls to get_group_load_store_type
>   and vect_use_strided_gather_scatters_p.
> 
> gcc/testsuite/
>   * gcc.target/aarch64/sve_strided_load_4.c: New test.
>   * gcc.target/aarch64/sve_strided_load_5.c: Likewise.
>   * gcc.target/aarch64/sve_strided_load_6.c: Likewise.
>   * gcc.target/aarch64/sve_strided_load_7.c: Likewise.
OK.
jeff


Re: Use gather loads for strided accesses

2017-12-13 Thread Jeff Law
On 11/17/2017 03:02 PM, Richard Sandiford wrote:
> This patch tries to use gather loads for strided accesses,
> rather than falling back to VMAT_ELEMENTWISE.
> 
> Tested on aarch64-linux-gnu (with and without SVE), x86_64-linux-gnu
> and powerpc64le-linux-gnu.  OK to install?
> 
> Richard
> 
> 
> 2017-11-17  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
> 
> gcc/
>   * tree-vectorizer.h (vect_create_data_ref_ptr): Take an extra
>   optional tree argument.
>   * tree-vect-data-refs.c (vect_create_data_ref_ptr): Take the
>   iv_step as an optional argument, but continue to use the current
>   value as a fallback.
>   (bump_vector_ptr): Use operand_equal_p rather than tree_int_cst_compare
>   to compare the updates.
>   * tree-vect-stmts.c (vect_use_strided_gather_scatters_p): New function.
>   (get_load_store_type): Use it when handling a strided access.
>   (vect_get_strided_load_store_ops): New function.
>   (vect_get_data_ptr_increment): Likewise.
>   (vectorizable_load): Handle strided gather loads.  Always pass
>   a step to vect_create_data_ref_ptr and bump_vector_ptr.
> 
> gcc/testsuite/
>   * gcc.target/aarch64/sve_strided_load_1.c: New test.
>   * gcc.target/aarch64/sve_strided_load_2.c: Likewise.
>   * gcc.target/aarch64/sve_strided_load_3.c: Likewise.
OK.
jeff


Re: [PATCH PR81228][AARCH64] Fix ICE by adding LTGT in vec_cmp

2017-12-13 Thread Sudakshina Das

On 13/12/17 16:42, Sudakshina Das wrote:

Hi

This patch is a follow up to the existing discussions on 
https://gcc.gnu.org/ml/gcc-patches/2017-07/msg01904.html
Bin had earlier submitted a patch to fix the ICE that occurs because of 
the missing LTGT in aarch64-simd.md.
That discussion opened up a new bug report PR81647 for an inconsistent 
behavior.


As discussed earlier on the gcc-patches discussion and on the bug 
report, PR81647 was occurring because of how UNEQ was handled in 
aarch64-simd.md rather than LTGT. Since __builtin_islessgreater is 
guaranteed to not give an FP exception but LTGT might, 
__builtin_islessgreater gets converted to ~UNEQ very early on in 
fold_builtin_unordered_cmp. Thus I will post a separate patch for 
correcting how UNEQ and other unordered comparisons are handled in 
aarch64-simd.md.


This patch is only adding the missing LTGT to plug the ICE.

Testing done: Checked for regressions on bootstrapped 
aarch64-none-linux-gnu and added a new compile time test case that gives 
out LTGT to make sure it doesn't ICE.


Is this ok for trunk?

Thanks
Sudi

ChangeLog Entries:

*** gcc/ChangeLog ***

2017-12-13  Sudakshina Das  
     Bin Cheng  

     PR target/81228
     * config/aarch64/aarch64.c (aarch64_select_cc_mode): Move LTGT
     to CCFPEmode.
     * config/aarch64/aarch64-simd.md (vec_cmp): Add
     LTGT.

*** gcc/testsuite/ChangeLog ***

2017-12-13  Sudakshina Das  

     PR target/81228
     * gcc.dg/pr81228.c: New.


Sorry
Forgot to attach the patch!

Sudi
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index ae71af8334343a749f11db1801554eac01a33cac..f90f74fe7fd5990a97b9f4eb68f5735b7d4fb9aa 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -2759,6 +2759,7 @@
 case UNEQ:
 case ORDERED:
 case UNORDERED:
+case LTGT:
   break;
 default:
   gcc_unreachable ();
@@ -2813,6 +2814,15 @@
   emit_insn (gen_one_cmpl2 (operands[0], operands[0]));
   break;
 
+case LTGT:
+  /* LTGT is not guaranteed to not generate a FP exception.  So let's
+	 go the faster way : ((a > b) || (b > a)).  */
+  emit_insn (gen_aarch64_cmgt (operands[0],
+	 operands[2], operands[3]));
+  emit_insn (gen_aarch64_cmgt (tmp, operands[3], operands[2]));
+  emit_insn (gen_ior3 (operands[0], operands[0], tmp));
+  break;
+
 case UNORDERED:
   /* Operands are ORDERED iff (a > b || b >= a), so we can compute
 	 UNORDERED as !ORDERED.  */
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 75a6c0d0421354d7c0759292947eb5d407f5b703..3efb1b7548ea9b0ea5644d99a0677dbe5baba2ef 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -4962,13 +4962,13 @@ aarch64_select_cc_mode (RTX_CODE code, rtx x, rtx y)
 	case UNGT:
 	case UNGE:
 	case UNEQ:
-	case LTGT:
 	  return CCFPmode;
 
 	case LT:
 	case LE:
 	case GT:
 	case GE:
+	case LTGT:
 	  return CCFPEmode;
 
 	default:
diff --git a/gcc/testsuite/gcc.dg/pr81228.c b/gcc/testsuite/gcc.dg/pr81228.c
new file mode 100644
index ..f7eecc510ad2acaa656a1ce5df0aafffa56b3bd9
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr81228.c
@@ -0,0 +1,21 @@
+/* PR target/81228.  */
+/* { dg-do compile } */
+/* { dg-options "-O3 -fdump-tree-ssa" } */
+
+void *a;
+
+void b ()
+{
+  char c;
+  long d;
+  char *e = a;
+  for (; d; d++)
+  {
+double f, g;
+c = g < f || g > f;
+e[d] = c;
+  }
+}
+
+/* Let's make sure we do have a LTGT.  */
+/* { dg-final { scan-tree-dump "<>" "ssa" } } */


[PATCH PR81228][AARCH64] Fix ICE by adding LTGT in vec_cmp

2017-12-13 Thread Sudakshina Das

Hi

This patch is a follow up to the existing discussions on 
https://gcc.gnu.org/ml/gcc-patches/2017-07/msg01904.html
Bin had earlier submitted a patch to fix the ICE that occurs because of 
the missing LTGT in aarch64-simd.md.
That discussion opened up a new bug report PR81647 for an inconsistent 
behavior.


As discussed earlier on the gcc-patches discussion and on the bug 
report, PR81647 was occurring because of how UNEQ was handled in 
aarch64-simd.md rather than LTGT. Since __builtin_islessgreater is 
guaranteed to not give an FP exception but LTGT might, 
__builtin_islessgreater gets converted to ~UNEQ very early on in 
fold_builtin_unordered_cmp. Thus I will post a separate patch for 
correcting how UNEQ and other unordered comparisons are handled in 
aarch64-simd.md.
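
At the source level, the two spellings discussed above give the same truth value. A minimal sketch (function names are illustrative only): both are false when either operand is a NaN and otherwise agree with a != b. The difference discussed in the thread is only whether a floating-point exception may be raised, which this sketch does not try to observe.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// A quiet NaN for exercising the unordered case.
const double kQuietNaN = std::numeric_limits<double>::quiet_NaN();

// LTGT written as two ordered comparisons, as in the new test case.
bool ltgt_by_comparisons(double a, double b) { return a < b || a > b; }

// The same predicate via the <cmath> function.
bool ltgt_by_islessgreater(double a, double b) {
  return std::islessgreater(a, b);
}

// True when both spellings give the same answer for A and B.
bool spellings_agree(double a, double b) {
  return ltgt_by_comparisons(a, b) == ltgt_by_islessgreater(a, b);
}
```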


This patch is only adding the missing LTGT to plug the ICE.

Testing done: Checked for regressions on bootstrapped 
aarch64-none-linux-gnu and added a new compile time test case that gives 
out LTGT to make sure it doesn't ICE.


Is this ok for trunk?

Thanks
Sudi

ChangeLog Entries:

*** gcc/ChangeLog ***

2017-12-13  Sudakshina Das  
Bin Cheng  

PR target/81228
* config/aarch64/aarch64.c (aarch64_select_cc_mode): Move LTGT
to CCFPEmode.
* config/aarch64/aarch64-simd.md (vec_cmp): Add
LTGT.

*** gcc/testsuite/ChangeLog ***

2017-12-13  Sudakshina Das  

PR target/81228
* gcc.dg/pr81228.c: New.


Re: Use single-iteration epilogues when peeling for gaps

2017-12-13 Thread Jeff Law
On 11/17/2017 08:38 AM, Richard Sandiford wrote:
> This patch adds support for fully-masking loops that require peeling
> for gaps.  It peels exactly one scalar iteration and uses the masked
> loop to handle the rest.  Previously we would fall back on using a
> standard unmasked loop instead.
> 
> Tested on aarch64-linux-gnu (with and without SVE), x86_64-linux-gnu
> and powerpc64le-linux-gnu.  OK to install?
> 
> Richard
> 
> 
> 2017-11-17  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
> 
> gcc/
>   * tree-vect-loop-manip.c (vect_gen_scalar_loop_niters): Replace
>   vfm1 with a bound_epilog parameter.
>   (vect_do_peeling): Update calls accordingly, and move the prologue
>   call earlier in the function.  Treat the base bound_epilog as 0 for
>   fully-masked loops and retain vf - 1 for other loops.  Add 1 to
>   this base when peeling for gaps.
>   * tree-vect-loop.c (vect_analyze_loop_2): Allow peeling for gaps
>   with fully-masked loops.
>   (vect_estimate_min_profitable_iters): Handle the single peeled
>   iteration in that case.
> 
> gcc/testsuite/
>   * gcc.target/aarch64/sve_struct_vect_18.c: Check the number
>   of branches.
>   * gcc.target/aarch64/sve_struct_vect_19.c: Likewise.
>   * gcc.target/aarch64/sve_struct_vect_20.c: New test.
>   * gcc.target/aarch64/sve_struct_vect_20_run.c: Likewise.
>   * gcc.target/aarch64/sve_struct_vect_21.c: Likewise.
>   * gcc.target/aarch64/sve_struct_vect_21_run.c: Likewise.
>   * gcc.target/aarch64/sve_struct_vect_22.c: Likewise.
>   * gcc.target/aarch64/sve_struct_vect_22_run.c: Likewise.
>   * gcc.target/aarch64/sve_struct_vect_23.c: Likewise.
>   * gcc.target/aarch64/sve_struct_vect_23_run.c: Likewise.
OK.
jeff


[committed] sel-sched: fix sel_rank_for_schedule for qsort (PR 82398)

2017-12-13 Thread Alexander Monakov
Hello,

I have applied the following patch (ack'ed by Andrey) to fix PR 82398.

The patch pasted in the Bugzilla also had a gcc_checking_assert for
VINSN_UNIQUE_P vs. SCHED_GROUP_P consistency verification earlier in that
comparator, but it's not directly related to the problem at hand, and Andrey
said he'd prefer to handle that differently.

Alexander

PR rtl-optimization/82398
* sel-sched.c (sel_rank_for_schedule): Fix check for zero
EXPR_USEFULNESS in priority comparison.

--- sel-sched.c (revision 255606)
+++ sel-sched.c (working copy)
@@ -3397,7 +3397,7 @@
 return 1;
   /* Prefer an expr with greater priority.  */
-  if (EXPR_USEFULNESS (tmp) != 0 && EXPR_USEFULNESS (tmp2) != 0)
+  if (EXPR_USEFULNESS (tmp) != 0 || EXPR_USEFULNESS (tmp2) != 0)
 {
   int p2 = EXPR_PRIORITY (tmp2) + EXPR_PRIORITY_ADJ (tmp2),
   p1 = EXPR_PRIORITY (tmp) + EXPR_PRIORITY_ADJ (tmp);
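
Why a guard of this kind matters for qsort can be seen in a toy model. The sketch below is not the sel-sched code. ToyExpr, toy_rank and the scoring rule (priority weighted by usefulness when the guard holds, raw priority otherwise) are assumptions made for illustration. When the guard requires both operands to have non-zero usefulness, three elements can rank in a cycle, which a qsort comparator must never do. With the either-operand guard, the same three elements rank consistently.

```cpp
#include <cassert>

// Hypothetical stand-in for a scheduler expression; not GCC's expr_t.
struct ToyExpr {
  int usefulness;
  int priority;
};

// Negative result: A sorts before B.  Positive result: B sorts before A.
// BOTH_REQUIRED selects the guard: true models `&&`, false models `||`.
int toy_rank(const ToyExpr &a, const ToyExpr &b, bool both_required) {
  bool a_useful = a.usefulness != 0;
  bool b_useful = b.usefulness != 0;
  bool weighted = both_required ? (a_useful && b_useful)
                                : (a_useful || b_useful);
  if (weighted)
    return b.priority * b.usefulness - a.priority * a.usefulness;
  return b.priority - a.priority;
}

// True when X sorts before Y, Y before Z, and Z before X.
bool is_cycle(const ToyExpr &x, const ToyExpr &y, const ToyExpr &z,
              bool both_required) {
  return toy_rank(x, y, both_required) < 0
         && toy_rank(y, z, both_required) < 0
         && toy_rank(z, x, both_required) < 0;
}

// Check one fixed triple for a ranking cycle under the chosen guard.
bool demo_has_cycle(bool both_required) {
  const ToyExpr a{2, 3};  // weighted score 6, raw priority 3
  const ToyExpr b{0, 4};  // weighted score 0, raw priority 4
  const ToyExpr c{1, 5};  // weighted score 5, raw priority 5
  return is_cycle(c, b, a, both_required);
}
```

The sketch deliberately never calls qsort on the cyclic ordering, because sorting with an inconsistent comparator has undefined results.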



Re: Add support for vectorising live-out values using SVE LASTB

2017-12-13 Thread Jeff Law
On 11/17/2017 08:24 AM, Richard Sandiford wrote:
> This patch uses the SVE LASTB instruction to optimise cases in which
> a value produced by the final scalar iteration of a vectorised loop is
> live outside the loop.  Previously this situation would stop us from
> using a fully-masked loop.
> 
> Tested on aarch64-linux-gnu (with and without SVE), x86_64-linux-gnu
> and powerpc64le-linux-gnu.  OK to install?
> 
> Richard
> 
> 
> 2017-11-17  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
> 
> gcc/
>   * doc/md.texi (extract_last_@var{m}): Document.
>   * optabs.def (extract_last_optab): New optab.
>   * internal-fn.def (EXTRACT_LAST): New internal function.
>   * internal-fn.c (cond_unary_direct): New macro.
>   (expand_cond_unary_optab_fn): Likewise.
>   (direct_cond_unary_optab_supported_p): Likewise.
>   * tree-vect-loop.c (vectorizable_live_operation): Allow fully-masked
>   loops using EXTRACT_LAST.
>   * config/aarch64/aarch64-sve.md (aarch64_sve_lastb): Rename to...
>   (extract_last_): ...this optab.
>   (vec_extract): Update accordingly.
> 
> gcc/testsuite/
>   * gcc.target/aarch64/sve_live_1.c: New test.
>   * gcc.target/aarch64/sve_live_1_run.c: Likewise.
Like the last patch, I didn't look at the aarch64 bits.  The generic
bits are OK.

jeff
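
A scalar sketch of the idea described in the quoted patch, under stated assumptions: it models "extract last" as returning the element in the highest-numbered active lane, and it models a fully-masked loop as fixed-size chunks with a per-lane mask. The names and the 4-lane width are hypothetical, and the fallback to the final lane when no lane is active is an assumption about the LASTB instruction rather than something stated in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_LANES 4  /* hypothetical fixed vector length */

/* Return the element of VEC in the highest-numbered lane whose MASK entry
   is set.  If no lane is active, fall back to the final lane (assumed).  */
int
toy_extract_last (const bool *mask, const int *vec, size_t lanes)
{
  assert (lanes > 0);
  for (size_t i = lanes; i-- > 0; )
    if (mask[i])
      return vec[i];
  return vec[lanes - 1];
}

/* Model of a fully-masked loop computing t = x[i] * 2 where t is live
   after the loop.  The mask marks lanes that map to real iterations; the
   value from the final scalar iteration is recovered from the last chunk.  */
int
toy_live_out (const int *x, size_t n)
{
  assert (n > 0);
  bool mask[TOY_LANES] = { false };
  int vec[TOY_LANES] = { 0 };
  for (size_t base = 0; base < n; base += TOY_LANES)
    for (size_t l = 0; l < TOY_LANES; l++)
      {
        mask[l] = base + l < n;
        vec[l] = mask[l] ? x[base + l] * 2 : 0;
      }
  return toy_extract_last (mask, vec, TOY_LANES);
}
```

The point of the sketch is that the final chunk may be only partly active, so taking the last lane of the vector would be wrong; the mask decides which lane holds the live-out value.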


Re: Add support for reductions in fully-masked loops

2017-12-13 Thread Jeff Law
On 11/17/2017 07:59 AM, Richard Sandiford wrote:
> This patch removes the restriction that fully-masked loops cannot
> have reductions.  The key thing here is to make sure that the
> reduction accumulator doesn't include any values associated with
> inactive lanes; the patch adds a bunch of conditional binary
> operations for doing that.
> 
> Tested on aarch64-linux-gnu (with and without SVE), x86_64-linux-gnu
> and powerpc64le-linux-gnu.
> 
> Richard
> 
> 
> 2017-11-17  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
> 
> gcc/
>   * doc/md.texi (cond_add@var{mode}, cond_sub@var{mode})
>   (cond_and@var{mode}, cond_ior@var{mode}, cond_xor@var{mode})
>   (cond_smin@var{mode}, cond_smax@var{mode}, cond_umin@var{mode})
>   (cond_umax@var{mode}): Document.
>   * optabs.def (cond_add_optab, cond_sub_optab, cond_and_optab)
>   (cond_ior_optab, cond_xor_optab, cond_smin_optab, cond_smax_optab)
>   (cond_umin_optab, cond_umax_optab): New optabs.
>   * internal-fn.def (COND_ADD, COND_SUB, COND_SMIN, COND_SMAX)
>   (COND_UMIN, COND_UMAX, COND_AND, COND_IOR, COND_XOR): New internal
>   functions.
>   * internal-fn.h (get_conditional_internal_fn): Declare.
>   * internal-fn.c (cond_binary_direct): New macro.
>   (expand_cond_binary_optab_fn): Likewise.
>   (direct_cond_binary_optab_supported_p): Likewise.
>   (get_conditional_internal_fn): New function.
>   * tree-vect-loop.c (vectorizable_reduction): Handle fully-masked loops.
>   Cope with reduction statements that are vectorized as calls rather
>   than assignments.
>   * config/aarch64/aarch64-sve.md (cond_): New insns.
>   * config/aarch64/iterators.md (UNSPEC_COND_ADD, UNSPEC_COND_SUB)
>   (UNSPEC_COND_SMAX, UNSPEC_COND_UMAX, UNSPEC_COND_SMIN)
>   (UNSPEC_COND_UMIN, UNSPEC_COND_AND, UNSPEC_COND_ORR)
>   (UNSPEC_COND_EOR): New unspecs.
>   (optab): Add mappings for them.
>   (SVE_COND_INT_OP, SVE_COND_FP_OP): New int iterators.
>   (sve_int_op, sve_fp_op): New int attributes.
> 
> gcc/testsuite/
>   * gcc.dg/vect/pr60482.c: Remove XFAIL for variable-length vectors.
>   * gcc.target/aarch64/sve_reduc_1.c: Expect the loop operations
>   to be predicated.
>   * gcc.target/aarch64/sve_slp_5.c: Check for a fully-masked loop.
>   * gcc.target/aarch64/sve_slp_7.c: Likewise.
>   * gcc.target/aarch64/sve_reduc_5.c: New test.
>   * gcc.target/aarch64/sve_slp_13.c: Likewise.
>   * gcc.target/aarch64/sve_slp_13_run.c: Likewise.
I didn't walk through the aarch64 specific bits here.  The generic bits
are OK.

jeff
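
A scalar sketch of the reduction scheme described in the quoted patch, assuming a conditional add that behaves like `cond ? acc + val : acc`. The function names and the 4-lane width are hypothetical. Inactive lanes are deliberately fed a junk value to show that it never reaches the accumulator.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_LANES 4  /* hypothetical fixed vector length */

/* Assumed semantics of a conditional add: inactive lanes keep ACC.  */
int
toy_cond_add (bool cond, int acc, int val)
{
  return cond ? acc + val : acc;
}

/* Sum X[0..N-1] with a per-lane accumulator, as a fully-masked loop would.
   Lanes past the end of the array are inactive and hold a junk value.  */
int
toy_masked_sum (const int *x, size_t n)
{
  int acc[TOY_LANES] = { 0 };
  for (size_t base = 0; base < n; base += TOY_LANES)
    for (size_t l = 0; l < TOY_LANES; l++)
      {
        bool active = base + l < n;
        int val = active ? x[base + l] : 12345;  /* junk for inactive lanes */
        acc[l] = toy_cond_add (active, acc[l], val);
      }

  /* Final cross-lane reduction of the accumulator.  */
  int sum = 0;
  for (size_t l = 0; l < TOY_LANES; l++)
    sum += acc[l];
  return sum;
}
```

With an unconditional add in place of `toy_cond_add`, the junk value from the partly active final chunk would leak into the result whenever the element count is not a multiple of the lane count.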


Re: [SFN] Bootstrap broken

2017-12-13 Thread Christophe Lyon
On 13 December 2017 at 11:45, Jakub Jelinek  wrote:
> On Wed, Dec 13, 2017 at 10:28:22AM +0100, Jakub Jelinek wrote:
>> Formatting, this should be
>>   bool can_move_early_debug_stmts
>> = ...
>> and the line is too long, so needs to be wrapped.
>>
>> Furthermore, I must say I don't understand why
>> can_move_early_debug_stmts should care whether there are any labels in
>> dest bb or not.  That sounds very risky for introducing non-# DEBUG 
>> BEGIN_STMT
>> debug insns before labels if it could happen.  Though, if
>> gsi_stmt (gsi) == gsi_stmt (gsie), then the loop right below it will not
>> do anything and nothing cares about can_move_early_debug_stmts afterwards.
>> So, in short, can_move_early_debug_stmts is used only if
>> gsi_stmt (gsi) != gsi_stmt (gsie), and therefore
>> can_move_early_debug_stmts if it is used is can_move_debug_stmts && (1 || 
>> ...);
>>
>> So, can we get rid of can_move_early_debug_stmts altogether and just use
>> can_move_debug_stmts in there instead?
>>
>> Another thing I find risky is that you compute gsie_to so early and don't
>> update it.  If you don't need it above for can_move_early_debug_stmts, can
>> you just do it back where it used to be done,
>
> Here it is everything in patch form, in case some volunteers are willing to
> test it on their targets, because we need faster turn-arounds for this.
>

Thanks for that, it certainly helps.

So, this version does restore a successful build on arm --with-mode=thumb,
but pr69102 still fails in arm mode.

As I mentioned in PR83396, the 4 patches attached there fix all the
problems I noticed.

HTH.

Christophe

> 2017-12-13  Alexandre Oliva 
> Jakub Jelinek  
>
> PR bootstrap/83396
> PR debug/83391
> * tree-cfgcleanup.c (remove_forwarder_block): Keep after
> labels debug stmts that can only appear after labels.
>
> * gcc.dg/torture/pr83396.c: New test.
> * g++.dg/torture/pr83391.C: New test.
>
> --- gcc/tree-cfgcleanup.c.jj  2017-12-12 09:48:26.813393301 +0100
> +++ gcc/tree-cfgcleanup.c   2017-12-13 11:39:03.373065381 +0100
> @@ -536,9 +536,14 @@ remove_forwarder_block (basic_block bb)
>   defined labels and labels with an EH landing pad number to the
>   new block, so that the redirection of the abnormal edges works,
>   jump targets end up in a sane place and debug information for
> - labels is retained.  */
> + labels is retained.
> +
> + While at that, move any debug stmts that appear before or in between
> + labels, but not those that can only appear after labels.  */
>gsi_to = gsi_start_bb (dest);
> -  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); )
> +  gsi = gsi_start_bb (bb);
> +  gimple_stmt_iterator gsie = gsi_after_labels (bb);
> +  while (gsi_stmt (gsi) != gsi_stmt (gsie))
>  {
>tree decl;
>label = gsi_stmt (gsi);
> @@ -557,6 +562,21 @@ remove_forwarder_block (basic_block bb)
> gsi_next (&gsi);
>  }
>
> +  /* Move debug statements if the destination has a single predecessor.  */
> +  if (can_move_debug_stmts && !gsi_end_p (gsi))
> +{
> +  gcc_assert (gsi_stmt (gsi) == gsi_stmt (gsie));
> +  gimple_stmt_iterator gsie_to = gsi_after_labels (dest);
> +  do
> +   {
> + gimple *debug = gsi_stmt (gsi);
> + gcc_assert (is_gimple_debug (debug));
> + gsi_remove (&gsi, false);
> + gsi_insert_before (&gsie_to, debug, GSI_SAME_STMT);
> +   }
> +  while (!gsi_end_p (gsi));
> +}
> +
>bitmap_set_bit (cfgcleanup_altered_bbs, dest->index);
>
>/* Update the dominators.  */
> --- gcc/testsuite/g++.dg/torture/pr83391.C.jj
> +++ gcc/testsuite/g++.dg/torture/pr83391.C
> @@ -0,0 +1,36 @@
> +// PR debug/83391
> +// { dg-do compile }
> +// { dg-options "-g" }
> +// { dg-additional-options "-mbranch-cost=1" { target { i?86-*-* x86_64-*-* 
> mips*-*-* s390*-*-* avr*-*-* } } }
> +
> +unsigned char a;
> +enum E { F, G, H } b;
> +int c, d;
> +
> +void
> +foo ()
> +{
> +  int e;
> +  bool f;
> +  E g = b;
> +  while (1)
> +{
> +  unsigned char h = a ? d : 0;
> +  switch (g)
> +   {
> +   case 0:
> + f = h <= 'Z' || h >= 'a' && h <= 'z';
> + break;
> +   case 1:
> + {
> +   unsigned char i = h;
> +   e = 0;
> + }
> + if (e || h)
> +   g = H;
> + /* FALLTHRU */
> +   default:
> + c = 0;
> +   }
> +}
> +}
> --- gcc/testsuite/gcc.dg/torture/pr83396.c.jj
> +++ gcc/testsuite/gcc.dg/torture/pr83396.c
> @@ -0,0 +1,38 @@
> +/* PR bootstrap/83396 */
> +/* { dg-do compile } */
> +/* { dg-options "-g" } */
> +
> +int fn1 (void);
> +void fn2 (void *, const char *);
> +void fn3 (void);
> +
> +void
> +fn4 (long long x)
> +{
> +  fn3 ();
> +}
> +
> +void
> +fn5 (long long x)
> +{
> +  if (x)
> +fn3();
> +}
> +
> +void
> +fn6 (long long x)
> +{
> +  switch (fn1 ())
> +{
> +case 0:
> +  fn5 (x);
> +case 2:
> +  fn2 (0, ""

Re: [PATCH] vrp_prop: Use dom_walker for -Warray-bounds (PR tree-optimization/83312)

2017-12-13 Thread Richard Biener
On December 13, 2017 5:18:16 PM GMT+01:00, Jeff Law  wrote:
>On 12/13/2017 09:02 AM, David Malcolm wrote:
>> On Wed, 2017-12-13 at 08:46 -0700, Jeff Law wrote:
>>> On 12/13/2017 03:06 AM, Richard Biener wrote:
 On December 12, 2017 9:50:38 PM GMT+01:00, David Malcolm >>> redhat.com> wrote:
> PR tree-optimization/83312 reports a false positive from
> -Warray-bounds.
> The root cause is that VRP both:
>
> (a) updates a GIMPLE_COND to be always false, and
>
> (b) updates an ARRAY_REF in the now-unreachable other path to use
> an
>ASSERT_EXPR with a negative index:
>  def_stmt j_6 = ASSERT_EXPR ;
>
> When vrp_prop::check_all_array_refs () runs, the CFG hasn't yet
> been
> updated to take account of (a), and so a false positive is
> emitted
> when (b) is encountered.
>
> This patch fixes the false warning by converting
>  vrp_prop::check_all_array_refs
> from a simple iteration over all BBs to use a new dom_walker
> subclass,
> using the "skip_unreachable_blocks = true" mechanism to avoid
> analyzing
> (b).
>
> There didn't seem to be a pre-existing way to determine the
> unique
> out-edge after a GIMPLE_COND (if it has a constant cond), so I
> added
> a new gimple_cond_get_unique_successor_edge function.  Similarly,
> something similar may apply for switches, so I put in a
> gimple_get_unique_successor_edge (though I wasn't able to create
> a
> reproducer that used a switch).
>
> Successfully bootstrapped&regtested on x86_64-pc-linux-gnu.
>
> OK for trunk?

 I don't like the GIMPLE.c bits a lot. Can't you use the existing
 known taken edge helper (too lazy to look up from my phone...)
 basically splitting out some code from cond processing in evrp for
 example? 
>>>
>>> I'm not sure those bits are needed at all. I think the right things
>>> will
>>> happen if we clear EDGE_EXECUTABLE on the appropriate edge in
>>> vrp_folder::fold_predicate_in.
>>>
>>> That should cause the block in question to become unreachable (zero
>>> preds).  So later when we do the domwalk dom_walker::walk will see
>the
>>> block as unreachable -- which avoids walking into it and also
>>> triggers
>>> the call to propagate_unreachable_to_edges which marks the outgoing
>>> edges as not executable.
>> 
>> AIUI, dom_walker::bb_reachable only honors the EDGE_EXECUTABLE flags
>if
>> m_skip_unreachable_blocks is set on the dom_walker.
>> 
>> However, dom_walker's ctor clears all of the EDGE_EXECUTABLE if that
>> bool is set.
>Ugh.  How unfortunate, though I understand why it would be written that
>way.
>
>> 
>> So, as written any edge flags that are touched in
>> vrp_folder::fold_predicate_in will get reset when the dom walker is
>> created.
>> 
>> So should the dom walker get created before the vrp_folder runs?
>That would probably work, but something doesn't feel right about it.
>
>Alternately we could indicate to the dom_walker ctor that an initial state of
>EDGE_EXECUTABLE is already set.

I'm quite sure that wouldn't help for VRP. 
I think David's approach is fine just we don't need any other API to get at a 
known executable outgoing edge. We can improve the existing one or just add the 
trivial folding required. 

Richard. 

>Jeff



Re: [PATCH] vrp_prop: Use dom_walker for -Warray-bounds (PR tree-optimization/83312)

2017-12-13 Thread Jeff Law
On 12/13/2017 09:02 AM, David Malcolm wrote:
> On Wed, 2017-12-13 at 08:46 -0700, Jeff Law wrote:
>> On 12/13/2017 03:06 AM, Richard Biener wrote:
>>> On December 12, 2017 9:50:38 PM GMT+01:00, David Malcolm >> redhat.com> wrote:
 PR tree-optimization/83312 reports a false positive from
 -Warray-bounds.
 The root cause is that VRP both:

 (a) updates a GIMPLE_COND to be always false, and

 (b) updates an ARRAY_REF in the now-unreachable other path to use
 an
ASSERT_EXPR with a negative index:
  def_stmt j_6 = ASSERT_EXPR ;

 When vrp_prop::check_all_array_refs () runs, the CFG hasn't yet
 been
 updated to take account of (a), and so a false positive is
 emitted
 when (b) is encountered.

 This patch fixes the false warning by converting
  vrp_prop::check_all_array_refs
 from a simple iteration over all BBs to use a new dom_walker
 subclass,
 using the "skip_unreachable_blocks = true" mechanism to avoid
 analyzing
 (b).

 There didn't seem to be a pre-existing way to determine the
 unique
 out-edge after a GIMPLE_COND (if it has a constant cond), so I
 added
 a new gimple_cond_get_unique_successor_edge function.  Similarly,
 something similar may apply for switches, so I put in a
 gimple_get_unique_successor_edge (though I wasn't able to create
 a
 reproducer that used a switch).

 Successfully bootstrapped&regtested on x86_64-pc-linux-gnu.

 OK for trunk?
>>>
>>> I don't like the GIMPLE.c bits a lot. Can't you use the existing
>>> known taken edge helper (too lazy to look up from my phone...)
>>> basically splitting out some code from cond processing in evrp for
>>> example? 
>>
>> I'm not sure those bits are needed at all. I think the right things
>> will
>> happen if we clear EDGE_EXECUTABLE on the appropriate edge in
>> vrp_folder::fold_predicate_in.
>>
>> That should cause the block in question to become unreachable (zero
>> preds).  So later when we do the domwalk dom_walker::walk will see the
>> block as unreachable -- which avoids walking into it and also
>> triggers
>> the call to propagate_unreachable_to_edges which marks the outgoing
>> edges as not executable.
> 
> AIUI, dom_walker::bb_reachable only honors the EDGE_EXECUTABLE flags if
> m_skip_unreachable_blocks is set on the dom_walker.
> 
> However, dom_walker's ctor clears all of the EDGE_EXECUTABLE if that
> bool is set.
Ugh.  How unfortunate, though I understand why it would be written that way.

> 
> So, as written any edge flags that are touched in
> vrp_folder::fold_predicate_in will get reset when the dom walker is
> created.
> 
> So should the dom walker get created before the vrp_folder runs?
That would probably work, but something doesn't feel right about it.

Alternately we could indicate to the dom_walker ctor that an initial state of
EDGE_EXECUTABLE is already set.

Jeff
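
A toy model of the ordering problem discussed above, not the real dom_walker. It assumes that the constructor's "reset" means marking every edge executable again, and it adds a hypothetical `preserve_flags` switch to illustrate the alternative of telling the constructor that the flags are already initialized. Reachability is simplified to "has at least one executable incoming edge".

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_EDGE_EXECUTABLE 1

struct toy_edge
{
  int src;
  int dest;
  int flags;
};

/* Model of the walker constructor.  Unless PRESERVE_FLAGS is set, every
   edge is reset to executable, discarding whatever an earlier pass
   recorded (assumed behaviour; PRESERVE_FLAGS is hypothetical).  */
void
toy_walker_init (struct toy_edge *edges, size_t n_edges, bool preserve_flags)
{
  if (preserve_flags)
    return;
  for (size_t i = 0; i < n_edges; i++)
    edges[i].flags |= TOY_EDGE_EXECUTABLE;
}

/* A block is treated as reachable if it is the entry block (0) or has at
   least one executable incoming edge.  */
bool
toy_block_reachable (const struct toy_edge *edges, size_t n_edges, int block)
{
  if (block == 0)
    return true;
  for (size_t i = 0; i < n_edges; i++)
    if (edges[i].dest == block && (edges[i].flags & TOY_EDGE_EXECUTABLE))
      return true;
  return false;
}
```

If a folder clears the flag on the edge into block 2 and the walker is then initialized without preserving flags, block 2 looks reachable again, which is the situation described in the quoted text.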


Re: [PATCH] set range for strlen(array) to avoid spurious -Wstringop-overflow (PR 83373 , PR 78450)

2017-12-13 Thread Martin Sebor

On 12/13/2017 12:25 AM, Bernhard Reutner-Fischer wrote:

On 12 December 2017 21:15:25 CET, Martin Sebor  wrote:



Tested on x86_64-linux.


I assume this test worked even before this patch.


Of the tests added by the patch, strlenopt-37.c passes without
the compiler changes and strlenopt-36.c fails.  Both are expected.
-37.c verifies the optimization doesn't happen when it shouldn't
happen and -36.c verifies it does take place when it's safe.

The regression test for the warning fails without the patch and
passes with the patch applied.


Thus:

s/oveflow/overflow/


Thanks.  I'll fix that if/when the patch is approved.

Martin



thanks,





Re: [PATCH] vrp_prop: Use dom_walker for -Warray-bounds (PR tree-optimization/83312)

2017-12-13 Thread David Malcolm
On Wed, 2017-12-13 at 08:46 -0700, Jeff Law wrote:
> On 12/13/2017 03:06 AM, Richard Biener wrote:
> > On December 12, 2017 9:50:38 PM GMT+01:00, David Malcolm  > redhat.com> wrote:
> > > PR tree-optimization/83312 reports a false positive from
> > > -Warray-bounds.
> > > The root cause is that VRP both:
> > > 
> > > (a) updates a GIMPLE_COND to be always false, and
> > > 
> > > (b) updates an ARRAY_REF in the now-unreachable other path to use
> > > an
> > >ASSERT_EXPR with a negative index:
> > >  def_stmt j_6 = ASSERT_EXPR ;
> > > 
> > > When vrp_prop::check_all_array_refs () runs, the CFG hasn't yet
> > > been
> > > updated to take account of (a), and so a false positive is
> > > emitted
> > > when (b) is encountered.
> > > 
> > > This patch fixes the false warning by converting
> > >  vrp_prop::check_all_array_refs
> > > from a simple iteration over all BBs to use a new dom_walker
> > > subclass,
> > > using the "skip_unreachable_blocks = true" mechanism to avoid
> > > analyzing
> > > (b).
> > > 
> > > There didn't seem to be a pre-existing way to determine the
> > > unique
> > > out-edge after a GIMPLE_COND (if it has a constant cond), so I
> > > added
> > > a new gimple_cond_get_unique_successor_edge function.  Similarly,
> > > something similar may apply for switches, so I put in a
> > > gimple_get_unique_successor_edge (though I wasn't able to create
> > > a
> > > reproducer that used a switch).
> > > 
> > > Successfully bootstrapped&regtested on x86_64-pc-linux-gnu.
> > > 
> > > OK for trunk?
> > 
> > I don't like the GIMPLE.c bits a lot. Can't you use the existing
> > known taken edge helper (too lazy to look up from my phone...)
> > basically splitting out some code from cond processing in evrp for
> > example? 
> 
> I'm not sure those bits are needed at all. I think the right things
> will
> happen if we clear EDGE_EXECUTABLE on the appropriate edge in
> vrp_folder::fold_predicate_in.
> 
> That should cause the block in question to become unreachable (zero
> preds).  So later when we do the domwalk dom_walker::walk will see the
> block as unreachable -- which avoids walking into it and also
> triggers
> the call to propagate_unreachable_to_edges which marks the outgoing
> edges as not executable.

AIUI, dom_walker::bb_reachable only honors the EDGE_EXECUTABLE flags if
m_skip_unreachable_blocks is set on the dom_walker.

However, dom_walker's ctor clears all of the EDGE_EXECUTABLE if that
bool is set.

So, as written any edge flags that are touched in
vrp_folder::fold_predicate_in will get reset when the dom walker is
created.

So should the dom walker get created before the vrp_folder runs?

> I think David's got to go back to some diagnostics/FE stuff, so I'll
> probably be wrapping this up.
> 
> Thanks David!
> 
> jeff


Re: [PATCH] Fix https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83396#c28 on ia64 (PR bootstrap/83396)

2017-12-13 Thread Andreas Schwab
On Dez 13 2017, Jakub Jelinek  wrote:

> I think there are 2 issues.  One is that the ia64 backend emits
> the group barrier insns before BB_HEAD label, so it isn't part of a bb,
> but has BLOCK_FOR_INSN of the following block, that looks invalid to me
> and the ia64.c hunk ought to fix that, except that I don't have access to
> ia64 anymore and so can't test it.  Andreas, could you try that?

That doesn't bootstrap, details in the PR.

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


Re: [PATCH] vrp_prop: Use dom_walker for -Warray-bounds (PR tree-optimization/83312)

2017-12-13 Thread Jeff Law
On 12/13/2017 03:06 AM, Richard Biener wrote:
> On December 12, 2017 9:50:38 PM GMT+01:00, David Malcolm 
>  wrote:
>> PR tree-optimization/83312 reports a false positive from
>> -Warray-bounds.
>> The root cause is that VRP both:
>>
>> (a) updates a GIMPLE_COND to be always false, and
>>
>> (b) updates an ARRAY_REF in the now-unreachable other path to use an
>>ASSERT_EXPR with a negative index:
>>  def_stmt j_6 = ASSERT_EXPR ;
>>
>> When vrp_prop::check_all_array_refs () runs, the CFG hasn't yet been
>> updated to take account of (a), and so a false positive is emitted
>> when (b) is encountered.
>>
>> This patch fixes the false warning by converting
>>  vrp_prop::check_all_array_refs
>>from a simple iteration over all BBs to use a new dom_walker subclass,
>> using the "skip_unreachable_blocks = true" mechanism to avoid analyzing
>> (b).
>>
>> There didn't seem to be a pre-existing way to determine the unique
>> out-edge after a GIMPLE_COND (if it has a constant cond), so I added
>> a new gimple_cond_get_unique_successor_edge function.  Similarly,
>> something similar may apply for switches, so I put in a
>> gimple_get_unique_successor_edge (though I wasn't able to create a
>> reproducer that used a switch).
>>
>> Successfully bootstrapped&regtested on x86_64-pc-linux-gnu.
>>
>> OK for trunk?
> 
> I don't like the GIMPLE.c bits a lot. Can't you use the existing known taken 
> edge helper (too lazy to look up from my phone...) basically splitting out 
> some code from cond processing in evrp for example? 
I'm not sure those bits are needed at all. I think the right things will
happen if we clear EDGE_EXECUTABLE on the appropriate edge in
vrp_folder::fold_predicate_in.

That should cause the block in question to become unreachable (zero
preds).  So later when we do the domwalk dom_walker::walk will see the
block as unreachable -- which avoids walking into it and also triggers
the call to propagate_unreachable_to_edges which marks the outgoing
edges as not executable.

I think David's got to go back to some diagnostics/FE stuff, so I'll
probably be wrapping this up.

Thanks David!

jeff


Re: [SFN] Bootstrap broken

2017-12-13 Thread Richard Biener
On Wed, 13 Dec 2017, Jakub Jelinek wrote:

> On Wed, Dec 13, 2017 at 11:45:51AM +0100, Jakub Jelinek wrote:
> > On Wed, Dec 13, 2017 at 10:28:22AM +0100, Jakub Jelinek wrote:
> > > Formatting, this should be
> > >   bool can_move_early_debug_stmts
> > > = ...
> > > and the line is too long, so needs to be wrapped.
> > > 
> > > Furthermore, I must say I don't understand why
> > > can_move_early_debug_stmts should care whether there are any labels in
> > > dest bb or not.  That sounds very risky for introducing non-# DEBUG 
> > > BEGIN_STMT
> > > debug insns before labels if it could happen.  Though, if
> > > gsi_stmt (gsi) == gsi_stmt (gsie), then the loop right below it will not
> > > do anything and nothing cares about can_move_early_debug_stmts afterwards.
> > > So, in short, can_move_early_debug_stmts is used only if
> > > gsi_stmt (gsi) != gsi_stmt (gsie), and therefore
> > > can_move_early_debug_stmts if it is used is can_move_debug_stmts && (1 || 
> > > ...);
> > > 
> > > So, can we get rid of can_move_early_debug_stmts altogether and just use
> > > can_move_debug_stmts in there instead?
> > > 
> > > Another thing I find risky is that you compute gsie_to so early and don't
> > > update it.  If you don't need it above for can_move_early_debug_stmts, can
> > > you just do it back where it used to be done,
> > 
> > Here it is everything in patch form, in case some volunteers are willing to
> > test it on their targets, because we need faster turn-arounds for this.
> > 
> > 2017-12-13  Alexandre Oliva 
> > Jakub Jelinek  
> > 
> > PR bootstrap/83396
> > PR debug/83391
> > * tree-cfgcleanup.c (remove_forwarder_block): Keep after
> > labels debug stmts that can only appear after labels.
> > 
> > * gcc.dg/torture/pr83396.c: New test.
> > * g++.dg/torture/pr83391.C: New test.
> 
> Bootstrapped/regtested on x86_64-linux, i686-linux and powerpc64le-linux,
> powerpc64-linux regtest still pending, ok for trunk?

Ok.

Richard.

> > --- gcc/tree-cfgcleanup.c.jj  2017-12-12 09:48:26.813393301 +0100
> > +++ gcc/tree-cfgcleanup.c   2017-12-13 11:39:03.373065381 +0100
> > @@ -536,9 +536,14 @@ remove_forwarder_block (basic_block bb)
> >   defined labels and labels with an EH landing pad number to the
> >   new block, so that the redirection of the abnormal edges works,
> >   jump targets end up in a sane place and debug information for
> > - labels is retained.  */
> > + labels is retained.
> > +
> > + While at that, move any debug stmts that appear before or in between
> > + labels, but not those that can only appear after labels.  */
> >gsi_to = gsi_start_bb (dest);
> > -  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); )
> > +  gsi = gsi_start_bb (bb);
> > +  gimple_stmt_iterator gsie = gsi_after_labels (bb);
> > +  while (gsi_stmt (gsi) != gsi_stmt (gsie))
> >  {
> >tree decl;
> >label = gsi_stmt (gsi);
> > @@ -557,6 +562,21 @@ remove_forwarder_block (basic_block bb)
> > gsi_next (&gsi);
> >  }
> >  
> > +  /* Move debug statements if the destination has a single predecessor.  */
> > +  if (can_move_debug_stmts && !gsi_end_p (gsi))
> > +{
> > +  gcc_assert (gsi_stmt (gsi) == gsi_stmt (gsie));
> > +  gimple_stmt_iterator gsie_to = gsi_after_labels (dest);
> > +  do
> > +   {
> > + gimple *debug = gsi_stmt (gsi);
> > + gcc_assert (is_gimple_debug (debug));
> > + gsi_remove (&gsi, false);
> > + gsi_insert_before (&gsie_to, debug, GSI_SAME_STMT);
> > +   }
> > +  while (!gsi_end_p (gsi));
> > +}
> > +
> >bitmap_set_bit (cfgcleanup_altered_bbs, dest->index);
> >  
> >/* Update the dominators.  */
> > --- gcc/testsuite/g++.dg/torture/pr83391.C.jj
> > +++ gcc/testsuite/g++.dg/torture/pr83391.C
> > @@ -0,0 +1,36 @@
> > +// PR debug/83391
> > +// { dg-do compile }
> > +// { dg-options "-g" }
> > +// { dg-additional-options "-mbranch-cost=1" { target { i?86-*-* 
> > x86_64-*-* mips*-*-* s390*-*-* avr*-*-* } } }
> > +
> > +unsigned char a;
> > +enum E { F, G, H } b;
> > +int c, d;
> > +
> > +void
> > +foo ()
> > +{
> > +  int e;
> > +  bool f;
> > +  E g = b;
> > +  while (1)
> > +{
> > +  unsigned char h = a ? d : 0;
> > +  switch (g)
> > +   {
> > +   case 0:
> > + f = h <= 'Z' || h >= 'a' && h <= 'z';
> > + break;
> > +   case 1:
> > + {
> > +   unsigned char i = h;
> > +   e = 0;
> > + }
> > + if (e || h)
> > +   g = H;
> > + /* FALLTHRU */
> > +   default:
> > + c = 0;
> > +   }
> > +}
> > +}
> > --- gcc/testsuite/gcc.dg/torture/pr83396.c.jj
> > +++ gcc/testsuite/gcc.dg/torture/pr83396.c
> > @@ -0,0 +1,38 @@
> > +/* PR bootstrap/83396 */
> > +/* { dg-do compile } */
> > +/* { dg-options "-g" } */
> > +
> > +int fn1 (void);
> > +void fn2 (void *, const char *);
> > +void fn3 (void);
> > +
> > +void
> > +fn4 (long long x)
> > +{
> > +  fn3 ();
> > +}
> > +
> > +void
> > +fn5 (long long x)
> > +{
> > +  if (x)
> > +fn

Re: [SFN] Bootstrap broken

2017-12-13 Thread Richard Biener
On Wed, 13 Dec 2017, Jakub Jelinek wrote:

> On Wed, Dec 13, 2017 at 11:34:04AM +0100, Jakub Jelinek wrote:
> > 2017-12-13  Jakub Jelinek  
> > 
> > PR bootstrap/83396
> > * final.c (rest_of_handle_final): Call variable_tracking_main only
> > if !flag_var_tracking.
> 
> Bootstrapped/regtested on x86_64-linux, i686-linux and powerpc64le-linux,
> powerpc64-linux regtest still pending, ok for trunk?

Ok.

Richard.

> > --- gcc/final.c.jj  2017-12-12 09:48:15.0 +0100
> > +++ gcc/final.c 2017-12-13 11:29:12.284676265 +0100
> > @@ -4541,8 +4541,9 @@ rest_of_handle_final (void)
> >  {
> >const char *fnname = get_fnname_from_decl (current_function_decl);
> >  
> > -  /* Turn debug markers into notes.  */
> > -  if (!MAY_HAVE_DEBUG_BIND_INSNS && MAY_HAVE_DEBUG_MARKER_INSNS)
> > +  /* Turn debug markers into notes if the var-tracking pass has not
> > + been invoked.  */
> > +  if (!flag_var_tracking && MAY_HAVE_DEBUG_MARKER_INSNS)
> >  variable_tracking_main ();
> >  
> >assemble_start_function (current_function_decl, fnname);
> > 
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)


Re: [PATCH] Verify allowed stmts before labels

2017-12-13 Thread Richard Biener
On Wed, 13 Dec 2017, Jakub Jelinek wrote:

> Hi!
> 
> PR83391/PR83396 failed because debug bind stmts were put before labels.
> Alex said that is undesirable, and that right now we want to allow
> just debug begin stmt markers before or intermixed with labels.
> 
> This patch ensures that through verification, which is what defines
> what is and isn't valid GIMPLE.  If we ever reconsider it, either allow
> further stmts or disallow even debug begin stmt markers, we can easily also
> tweak the verifier.  The patch has been successfully bootstrapped/regtested as
> part of the:
> 
> https://gcc.gnu.org/ml/gcc-patches/2017-12/msg00811.html  
>  
> https://gcc.gnu.org/ml/gcc-patches/2017-12/msg00808.html  
>  
> https://gcc.gnu.org/bugzilla/attachment.cgi?id=42861  
>  
> https://gcc.gnu.org/bugzilla/attachment.cgi?id=42866  
>  
> https://gcc.gnu.org/ml/gcc-patches/2017-12/msg00794.html  
>  
> 
> patchset, without the msg00811.html patch it of course doesn't survive
> bootstrap, as we insert debug bind stmts before labels in those cases.
> 
> Ok for trunk once the msg00811.html patch or something similar is committed?

Ok.

Richard.

> 2017-12-13  Jakub Jelinek  
> 
>   * tree-cfg.c (verify_gimple_in_cfg): Verify no non-label stmts
>   with the exception of debug begin stmt markers appear before
>   labels.
> 
> --- gcc/tree-cfg.c.jj 2017-12-12 21:24:23.0 +0100
> +++ gcc/tree-cfg.c  2017-12-13 07:44:15.622790922 +0100
> @@ -5380,6 +5380,7 @@ verify_gimple_in_cfg (struct function *f
> err |= err2;
>   }
>  
> +  bool label_allowed = true;
>for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
>   {
> gimple *stmt = gsi_stmt (gsi);
> @@ -5396,6 +5397,19 @@ verify_gimple_in_cfg (struct function *f
> err2 = true;
>   }
>  
> +   /* Labels may be preceded only by debug markers, not debug bind
> +  or source bind or any other statements.  */
> +   if (gimple_code (stmt) == GIMPLE_LABEL)
> + {
> +   if (!label_allowed)
> + {
> +   error ("gimple label in the middle of a basic block");
> +   err2 = true;
> + }
> + }
> +   else if (!gimple_debug_begin_stmt_p (stmt))
> + label_allowed = false;
> +
> err2 |= verify_gimple_stmt (stmt);
> err2 |= verify_location (&blocks, gimple_location (stmt));
>  
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)
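
The rule that the verifier patch above enforces can be sketched on a flat list of statement kinds. This is an illustration only: the enum and function name are hypothetical, and the real check runs per basic block inside verify_gimple_in_cfg. The single `label_allowed` flag mirrors the one in the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical statement kinds standing in for GIMPLE statement codes.  */
enum toy_stmt_kind
{
  TOY_LABEL,
  TOY_DEBUG_BEGIN_STMT,
  TOY_DEBUG_BIND,
  TOY_OTHER
};

/* Return true if every label in the block is preceded only by other
   labels and debug begin stmt markers.  */
bool
toy_labels_ok (const enum toy_stmt_kind *stmts, size_t n)
{
  bool label_allowed = true;
  for (size_t i = 0; i < n; i++)
    {
      if (stmts[i] == TOY_LABEL)
        {
          if (!label_allowed)
            return false;  /* label in the middle of a basic block */
        }
      else if (stmts[i] != TOY_DEBUG_BEGIN_STMT)
        label_allowed = false;
    }
  return true;
}
```

A debug bind ahead of a label is rejected, while debug begin stmt markers before or between labels are accepted.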


[RFC PATCH] Avoid PRED_NEGATIVE_RETURN prediction on likely -1/0/1 comparison functions (PR middle-end/81914)

2017-12-13 Thread Jakub Jelinek
Hi!

While the PRED_NEGATIVE_RETURN heuristics generally works quite well, for
qsort comparison functions and similar, including the planned C++
spaceship operator <=> where typically negative and positive are
approximately even it doesn't work that well.  This patch is an attempt
to at least detect some of these cases.  It won't catch functions where
also other values are returned (e.g. a - b or similar), but then it would be
even harder to make a distinction.

Bootstrapped/regtested on {x86_64,i686,powerpc64le}-linux, regtest on
powerpc64-linux pending.  Honza, if it doesn't look completely bogus to you,
could you give it a spin on SPEC (which I don't have easy access to)?

2017-12-13  Jakub Jelinek  

PR middle-end/81914
* predict.c (zero_one_minusone): New function.
(apply_return_prediction): Avoid return prediction for functions
returning only -1, 0 and 1 values, unless they only return -1 and 0
or 0 and 1.

--- gcc/predict.c.jj  2017-12-12 19:52:04.950182338 +0100
+++ gcc/predict.c   2017-12-13 11:54:10.139409006 +0100
@@ -2639,6 +2639,64 @@ return_prediction (tree val, enum predic
   return PRED_NO_PREDICTION;
 }
 
+/* Return zero if phi result could have values other than -1, 0 or 1,
+   otherwise return a bitmask, with bits 0, 1 and 2 set if -1, 0 and 1
+   values are used or likely.  */
+
+static int
+zero_one_minusone (gphi *phi, int limit)
+{
+  int phi_num_args = gimple_phi_num_args (phi);
+  int ret = 0;
+  for (int i = 0; i < phi_num_args; i++)
+{
+  tree t = PHI_ARG_DEF (phi, i);
+  if (TREE_CODE (t) != INTEGER_CST)
+   continue;
+  wide_int w = wi::to_wide (t);
+  if (w == -1)
+   ret |= 1;
+  else if (w == 0)
+   ret |= 2;
+  else if (w == 1)
+   ret |= 4;
+  else
+   return 0;
+}
+  for (int i = 0; i < phi_num_args; i++)
+{
+  tree t = PHI_ARG_DEF (phi, i);
+  if (TREE_CODE (t) == INTEGER_CST)
+   continue;
+  if (TREE_CODE (t) != SSA_NAME)
+   return 0;
+  gimple *g = SSA_NAME_DEF_STMT (t);
+  if (gimple_code (g) == GIMPLE_PHI && limit > 0)
+   if (int r = zero_one_minusone (as_a <gphi *> (g), limit - 1))
+ {
+   ret |= r;
+   continue;
+ }
+  if (!is_gimple_assign (g))
+   return 0;
+  if (gimple_assign_cast_p (g))
+   {
+ tree rhs1 = gimple_assign_rhs1 (g);
+ if (TREE_CODE (rhs1) != SSA_NAME
+ || !INTEGRAL_TYPE_P (TREE_TYPE (rhs1))
+ || TYPE_PRECISION (TREE_TYPE (rhs1)) != 1
+ || !TYPE_UNSIGNED (TREE_TYPE (rhs1)))
+   return 0;
+ ret |= (2 | 4);
+ continue;
+   }
+  if (TREE_CODE_CLASS (gimple_assign_rhs_code (g)) != tcc_comparison)
+   return 0;
+  ret |= (2 | 4);
+}
+  return ret;
+}
+
 /* Find the basic block with return expression and look up for possible
return value trying to apply RETURN_PREDICTION heuristics.  */
 static void
@@ -2676,6 +2734,19 @@ apply_return_prediction (void)
   phi_num_args = gimple_phi_num_args (phi);
   pred = return_prediction (PHI_ARG_DEF (phi, 0), &direction);
 
+  /* Avoid the case where the function returns -1, 0 and 1 values and
+ nothing else.  Those could be qsort etc. comparison functions
+ where the negative return isn't less probable than positive.
+ For this require that the function returns at least -1 or 1
+ or -1 and a boolean value or comparison result, so that functions
+ returning just -1 and 0 are treated as if -1 represents error value.  */
+  if (INTEGRAL_TYPE_P (TREE_TYPE (return_val))
+  && !TYPE_UNSIGNED (TREE_TYPE (return_val))
+  && TYPE_PRECISION (TREE_TYPE (return_val)) > 1)
+if (int r = zero_one_minusone (phi, 3))
+  if ((r & (1 | 4)) == (1 | 4))
+   return;
+
   /* Avoid the degenerate case where all return values form the function
  belongs to same category (ie they are all positive constants)
  so we can hardly say something about them.  */

Jakub
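
For readers following along, the bitmask test in the patch can be sketched
outside of GCC.  This is only a toy model under stated assumptions: it looks
at plain integer constants instead of PHI arguments, and the names
classify_returns and looks_like_comparator are hypothetical, not part of the
patch.

```c
#include <assert.h>
#include <stddef.h>

/* Bit assignments follow the comment in the patch:
   bit 0 = -1 seen, bit 1 = 0 seen, bit 2 = 1 seen.  */
enum { SEEN_MINUS_ONE = 1, SEEN_ZERO = 2, SEEN_ONE = 4 };

/* Hypothetical helper: classify a list of constant return values.
   Returns 0 as soon as a value outside {-1, 0, 1} is found, otherwise
   the mask of values that were seen.  */
static int
classify_returns (const int *vals, size_t n)
{
  int mask = 0;
  assert (vals != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    switch (vals[i])
      {
      case -1: mask |= SEEN_MINUS_ONE; break;
      case 0: mask |= SEEN_ZERO; break;
      case 1: mask |= SEEN_ONE; break;
      default: return 0;
      }
  return mask;
}

/* The return-value heuristic is skipped only when both -1 and 1 can be
   returned; a function returning just -1 and 0 still looks like one
   that uses -1 as an error value.  */
static int
looks_like_comparator (int mask)
{
  return (mask & (SEEN_MINUS_ONE | SEEN_ONE))
	 == (SEEN_MINUS_ONE | SEEN_ONE);
}
```

A qsort-style comparator returning -1, 0 and 1 yields mask 7 and is treated
as a comparator, while a function returning only -1 and 0 is not.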



[PATCH] Fix https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83396#c28 on ia64 (PR bootstrap/83396)

2017-12-13 Thread Jakub Jelinek
Hi!

I think there are 2 issues.  One is that the ia64 backend emits
the group barrier insns before BB_HEAD label, so it isn't part of a bb,
but has BLOCK_FOR_INSN of the following block, that looks invalid to me
and the ia64.c hunk ought to fix that, except that I don't have access to
ia64 anymore and so can't test it.  Andreas, could you try that?

Another thing is that if, because of this, we end up with insns outside of
basic blocks, the vt_initialize asserts will fire again.  Here, first of
all, IMNSHO we should assert that debug bind insns aren't outside of basic
blocks; the other patches and checking should ensure that (and if any slips
in, we want to fix that too rather than work around it).
Another is that while walking from get_first_insn to one before
BB_HEAD (bb->next_bb), we can encounter insns outside of a bb not just
before BB_HEAD (bb), but also after BB_END (bb); both cases are outside of
a bb and thus we can expect BLOCK_FOR_INSN to be NULL.
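
A rough sketch of the outside_bb bookkeeping, as a toy model rather than the
actual var-tracking.c code: insns are array elements, HEAD and END are
indices standing in for BB_HEAD and BB_END, and struct toy_insn and
ownerless_only_outside are made-up names for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for an insn.  OWNER is 0 when the insn has no block
   (where BLOCK_FOR_INSN would be NULL), nonzero otherwise.  */
struct toy_insn { int owner; };

/* Walk insns [0, n).  HEAD and END are the indices of the block's first
   and last insn.  Returns 1 if every insn without an owner lies outside
   [HEAD, END], and 0 if one is found inside the block.  */
static int
ownerless_only_outside (const struct toy_insn *insns, size_t n,
			size_t head, size_t end)
{
  int outside_bb = 1;
  assert (head <= end && end < n);
  for (size_t i = 0; i < n; i++)
    {
      if (i == head)
	outside_bb = 0;		/* Entering the block proper.  */
      if (insns[i].owner == 0 && !outside_bb)
	return 0;		/* Ownerless insn inside the block.  */
      if (i == end)
	outside_bb = 1;		/* Anything after END is outside again.  */
    }
  return 1;
}
```

The flag flips to false at the head and back to true after the end, so
ownerless insns are accepted both before and after the block.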

Bootstrapped/regtested on x86_64-linux, i686-linux, powerpc64le-linux,
regtest on powerpc64-linux pending.  Ok for trunk perhaps without the
ia64.c bits until that gets tested?

Or, in the PR there is a variant patch which just doesn't do the asserts and
doesn't have to track outside_bb.

2017-12-13  Jakub Jelinek  

PR bootstrap/83396
* config/ia64/ia64.c (emit_insn_group_barriers): If emitting a group
barrier before BB_HEAD, clear BLOCK_FOR_INSN on the group barrier.
* var-tracking.c (vt_initialize): Don't assert blocks without
BLOCK_FOR_INSN are only debug insns, instead assert they are outside of
basic blocks.  Don't reset debug bind stmts outside of basic blocks,
instead assert they aren't present outside of basic blocks.  Simplify.

* gcc.dg/pr83396.c: New test.

--- gcc/config/ia64/ia64.c.jj   2017-12-07 18:05:02.0 +0100
+++ gcc/config/ia64/ia64.c  2017-12-13 12:29:07.598684661 +0100
@@ -7098,7 +7098,13 @@ emit_insn_group_barriers (FILE *dump)
  if (dump)
fprintf (dump, "Emitting stop before label %d\n",
 INSN_UID (last_label));
- emit_insn_before (gen_insn_group_barrier (GEN_INT (3)), last_label);
+ insn
+   = emit_insn_before (gen_insn_group_barrier (GEN_INT (3)),
+   last_label);
+ /* If we emit the group barrier before BB_HEAD, it should
+be outside of any bb.  */
+ if (BB_HEAD (BLOCK_FOR_INSN (last_label)) == last_label)
+   BLOCK_FOR_INSN (insn) = NULL;
  insn = last_label;
 
  init_insn_group_barriers ();
--- gcc/var-tracking.c.jj   2017-12-12 09:48:26.0 +0100
+++ gcc/var-tracking.c  2017-12-13 12:38:51.856261497 +0100
@@ -10157,28 +10157,28 @@ vt_initialize (void)
 insns that might be before it too.  Unfortunately,
 BB_HEADER and BB_FOOTER are not set while we run this
 pass.  */
- insn = get_first_insn (bb);
- for (rtx_insn *next;
-  insn != BB_HEAD (bb->next_bb)
-? next = NEXT_INSN (insn), true : false;
+ rtx_insn *next;
+ bool outside_bb = true;
+ for (insn = get_first_insn (bb); insn != BB_HEAD (bb->next_bb);
   insn = next)
{
+ next = NEXT_INSN (insn);
+ if (insn == BB_HEAD (bb))
+   outside_bb = false;
  if (INSN_P (insn))
{
  basic_block save_bb = BLOCK_FOR_INSN (insn);
  if (!BLOCK_FOR_INSN (insn))
{
+ gcc_assert (outside_bb);
  BLOCK_FOR_INSN (insn) = bb;
- gcc_assert (DEBUG_INSN_P (insn));
- /* Reset debug insns between basic blocks.
-Their location is not reliable, because they
-were probably not maintained up to date.  */
- if (DEBUG_BIND_INSN_P (insn))
-   INSN_VAR_LOCATION_LOC (insn)
- = gen_rtx_UNKNOWN_VAR_LOC ();
}
  else
gcc_assert (BLOCK_FOR_INSN (insn) == bb);
+ /* Verify debug bind insns don't occur outside of bbs.  */
+ gcc_assert (!DEBUG_BIND_INSN_P (insn) || !outside_bb);
+ if (insn == BB_END (bb))
+   outside_bb = true;
 
  if (!frame_pointer_needed)
{
@@ -10255,6 +10255,8 @@ vt_initialize (void)
}
  BLOCK_FOR_INSN (insn) = save_bb;
}
+ else if (insn == BB_END (bb))
+   outside_bb = true;
}
  gcc_assert (offset == VTI (bb)->out.stack_adjust);
}
--- gcc/testsuite/gcc.dg/pr83396.c.jj   2017-12-13 12:41:05.1

Re: [SFN] Bootstrap broken

2017-12-13 Thread Rainer Orth
Hi Jakub,

> On Wed, Dec 13, 2017 at 02:31:07PM +0100, Rainer Orth wrote:
>> Hi Jakub,
>> 
>> > Here it is everything in patch form, in case some volunteers are willing to
>> > test it on their targets, because we need faster turn-arounds for this.
>> 
>> thanks for that: it's easy to lose track in this maze ;-)
>
> True.  What I'm regtesting (bootstraps already done) on
> {x86_64,i686,powerpc64{,le}}-linux now is:
> https://gcc.gnu.org/ml/gcc-patches/2017-12/msg00811.html
> https://gcc.gnu.org/ml/gcc-patches/2017-12/msg00808.html
> https://gcc.gnu.org/bugzilla/attachment.cgi?id=42861
> https://gcc.gnu.org/bugzilla/attachment.cgi?id=42866
> https://gcc.gnu.org/ml/gcc-patches/2017-12/msg00794.html
> set.  Does pr69102.c FAIL with that set?

thanks for the list.  A sparc-sun-solaris2.11 bootstrap with the whole
set is now running; expect results in about two hours.

>> I've just bootstrapped sparc-sun-solaris2.11 with your patch and this one:
>> 
>>  https://gcc.gnu.org/ml/gcc-patches/2017-12/msg00794.html
>> 
>> The bootstrap succeeds, but the gcc.c-torture/compile/pr69102.c
>> regression persists.  Besides, I see
>> 
>> +FAIL: libgomp.graphite/force-parallel-4.c scan-tree-dump-times graphite
>> "2 loops carried no dependency" 1 (found 0 times)
>> +FAIL: libgomp.graphite/force-parallel-4.c scan-tree-dump-times optimized
>> "loopfn.1" 4 (found 0 times)
>> +FAIL: libgomp.graphite/force-parallel-8.c scan-tree-dump-times graphite
>> "5 loops carried no dependency" 1 (found 0 times)
>> 
>> which is most likely unrelated (I upgraded the tree from r255584 to
>> r255603).
>
> Yeah, these are almost certainly unrelated.

It certainly is: I've filed PR tree-optimization/83410.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


[wwwdocs] mention AVR additions

2017-12-13 Thread Georg-Johann Lay

This adds AVR improvements to v8 Release Notes.

Ok?

Johann

Index: changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-8/changes.html,v
retrieving revision 1.22
diff -r1.22 changes.html
179c179,240
< 
---
> AVR
> 
>   
> The avr port now supports the following XMEGA-like devices:
> 
>   ATtiny212, ATtiny214, ATtiny412, ATtiny414, ATtiny416, ATtiny417,
>   ATtiny814, ATtiny816, ATtiny817, ATtiny1614, ATtiny1616, ATtiny1617,
>   ATtiny3214, ATtiny3216, ATtiny3217
> 
> The new devices are filed under
> <a href="https://gcc.gnu.org/onlinedocs/gcc/AVR-Options.html">-mmcu=avrxmega3</a>.
> 
>   These devices see flash memory in the RAM address space, so that
> 	features like PROGMEM and __flash
> 	are no longer needed (as opposed to other AVR families for which
> 	read-only data will be located in RAM unless special, non-standard
> 	features are used to locate and access such data). This requires
> 	that the compiler is used with Binutils 2.29 or newer so that
> 	read-only data will be located in flash memory, see feature
> 	<a href="https://sourceware.org/PR21472">PR21472</a>.
>   A new command line option -mshort-calls is supported.
> 	This option is used internally for multilib selection of the
> 	avrxmega3 variants.
> 	It is not an optimization option, and you don't need to set
> 	it by hand.
> 
>   
>   
> The compiler now implements feature <a href="http://gcc.gnu.org/PR20296">PR20296</a>
> and will generate more efficient interrupt service routine (ISR)
> prologues and epilogues.  This is achieved by using the new pseudo
> instruction __gcc_isr which is supported and resolved by
> the GNU assembler.
> 
>   As the __gcc_isr pseudo-instruction will be resolved by
> 	the assembler, inline assembly is transparent to the process.
> 	This means that when inline assembly uses an instruction like
> 	INC that clobbers the condition code,
> 	then the assembler will detect this and generate an appropriate
> 	ISR prologue / epilogue chunk to save / restore SREG as needed.
>   A new command line option -mno-gas-isr-prologues
> 	has been added.  It disables the generation of the
> 	__gcc_isr pseudo instruction.
> 	Any non-naked ISR will save and restore SREG, tmp_reg and zero_reg,
> 	no matter whether the respective register is clobbered or used.
>   The feature is turned on by default for all optimization levels
> 	except for -O0 and -Og. It can still be
>   enabled by means of option -mgas-isr-prologues.
>   Support has been added for a new
> 	<a href="https://gcc.gnu.org/onlinedocs/gcc/AVR-Function-Attributes.html">AVR function attribute</a>
> 	no_gccisr. It can be used to disable
> 	__gcc_isr pseudo instruction generation for
> 	individual ISRs.
>   This optimization is only available if GCC is configured with
> 	GNU Binutils 2.29; or at least with a version of Binutils that
> 	implements feature
> 	<a href="https://sourceware.org/PR21683">PR21683</a>.
> 	For technical details and an example, see the
> 	<a href="https://sourceware.org/binutils/docs-2.29/as/AVR-Pseudo-Instructions.html">GNU AVR assembler manual</a>.
> 
>   
> 


Re: [SFN] Bootstrap broken

2017-12-13 Thread Jakub Jelinek
On Wed, Dec 13, 2017 at 11:34:04AM +0100, Jakub Jelinek wrote:
> 2017-12-13  Jakub Jelinek  
> 
>   PR bootstrap/83396
>   * final.c (rest_of_handle_final): Call variable_tracking_main only
>   if !flag_var_tracking.

Bootstrapped/regtested on x86_64-linux, i686-linux and powerpc64le-linux,
powerpc64-linux regtest still pending, ok for trunk?

> --- gcc/final.c.jj   2017-12-12 09:48:15.0 +0100
> +++ gcc/final.c   2017-12-13 11:29:12.284676265 +0100
> @@ -4541,8 +4541,9 @@ rest_of_handle_final (void)
>  {
>const char *fnname = get_fnname_from_decl (current_function_decl);
>  
> -  /* Turn debug markers into notes.  */
> -  if (!MAY_HAVE_DEBUG_BIND_INSNS && MAY_HAVE_DEBUG_MARKER_INSNS)
> +  /* Turn debug markers into notes if the var-tracking pass has not
> + been invoked.  */
> +  if (!flag_var_tracking && MAY_HAVE_DEBUG_MARKER_INSNS)
>  variable_tracking_main ();
>  
>assemble_start_function (current_function_decl, fnname);
> 

Jakub


Re: [SFN] Bootstrap broken

2017-12-13 Thread Jakub Jelinek
On Wed, Dec 13, 2017 at 11:45:51AM +0100, Jakub Jelinek wrote:
> On Wed, Dec 13, 2017 at 10:28:22AM +0100, Jakub Jelinek wrote:
> > Formatting, this should be
> >   bool can_move_early_debug_stmts
> > = ...
> > and the line is too long, so needs to be wrapped.
> > 
> > Furthermore, I must say I don't understand why
> > can_move_early_debug_stmts should care whether there are any labels in
> > dest bb or not.  That sounds very risky for introducing non-# DEBUG BEGIN_STMT
> > debug insns before labels if it could happen.  Though, if
> > gsi_stmt (gsi) == gsi_stmt (gsie), then the loop right below it will not
> > do anything and nothing cares about can_move_early_debug_stmts afterwards.
> > So, in short, can_move_early_debug_stmts is used only if
> > gsi_stmt (gsi) != gsi_stmt (gsie), and therefore
> > can_move_early_debug_stmts if it is used is can_move_debug_stmts && (1 || ...);
> > 
> > So, can we get rid of can_move_early_debug_stmts altogether and just use
> > can_move_debug_stmts in there instead?
> > 
> > Another thing I find risky is that you compute gsie_to so early and don't
> > update it.  If you don't need it above for can_move_early_debug_stmts, can
> > you just do it back where it used to be done,
> 
> Here it is everything in patch form, in case some volunteers are willing to
> test it on their targets, because we need faster turn-arounds for this.
> 
> 2017-12-13  Alexandre Oliva 
>   Jakub Jelinek  
> 
>   PR bootstrap/83396
>   PR debug/83391
>   * tree-cfgcleanup.c (remove_forwarder_block): Keep after
>   labels debug stmts that can only appear after labels.
> 
>   * gcc.dg/torture/pr83396.c: New test.
>   * g++.dg/torture/pr83391.C: New test.

Bootstrapped/regtested on x86_64-linux, i686-linux and powerpc64le-linux,
powerpc64-linux regtest still pending, ok for trunk?

> --- gcc/tree-cfgcleanup.c.jj  2017-12-12 09:48:26.813393301 +0100
> +++ gcc/tree-cfgcleanup.c 2017-12-13 11:39:03.373065381 +0100
> @@ -536,9 +536,14 @@ remove_forwarder_block (basic_block bb)
>   defined labels and labels with an EH landing pad number to the
>   new block, so that the redirection of the abnormal edges works,
>   jump targets end up in a sane place and debug information for
> - labels is retained.  */
> + labels is retained.
> +
> + While at that, move any debug stmts that appear before or in between
> + labels, but not those that can only appear after labels.  */
>gsi_to = gsi_start_bb (dest);
> -  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); )
> +  gsi = gsi_start_bb (bb);
> +  gimple_stmt_iterator gsie = gsi_after_labels (bb);
> +  while (gsi_stmt (gsi) != gsi_stmt (gsie))
>  {
>tree decl;
>label = gsi_stmt (gsi);
> @@ -557,6 +562,21 @@ remove_forwarder_block (basic_block bb)
>   gsi_next (&gsi);
>  }
>  
> +  /* Move debug statements if the destination has a single predecessor.  */
> +  if (can_move_debug_stmts && !gsi_end_p (gsi))
> +{
> +  gcc_assert (gsi_stmt (gsi) == gsi_stmt (gsie));
> +  gimple_stmt_iterator gsie_to = gsi_after_labels (dest);
> +  do
> + {
> +   gimple *debug = gsi_stmt (gsi);
> +   gcc_assert (is_gimple_debug (debug));
> +   gsi_remove (&gsi, false);
> +   gsi_insert_before (&gsie_to, debug, GSI_SAME_STMT);
> + }
> +  while (!gsi_end_p (gsi));
> +}
> +
>bitmap_set_bit (cfgcleanup_altered_bbs, dest->index);
>  
>/* Update the dominators.  */
> --- gcc/testsuite/g++.dg/torture/pr83391.C.jj
> +++ gcc/testsuite/g++.dg/torture/pr83391.C
> @@ -0,0 +1,36 @@
> +// PR debug/83391
> +// { dg-do compile }
> +// { dg-options "-g" }
> +// { dg-additional-options "-mbranch-cost=1" { target { i?86-*-* x86_64-*-* mips*-*-* s390*-*-* avr*-*-* } } }
> +
> +unsigned char a;
> +enum E { F, G, H } b;
> +int c, d;
> +
> +void
> +foo ()
> +{
> +  int e;
> +  bool f;
> +  E g = b;
> +  while (1)
> +{
> +  unsigned char h = a ? d : 0;
> +  switch (g)
> + {
> + case 0:
> +   f = h <= 'Z' || h >= 'a' && h <= 'z';
> +   break;
> + case 1:
> +   {
> + unsigned char i = h;
> + e = 0;
> +   }
> +   if (e || h)
> + g = H;
> +   /* FALLTHRU */
> + default:
> +   c = 0;
> + }
> +}
> +}
> --- gcc/testsuite/gcc.dg/torture/pr83396.c.jj
> +++ gcc/testsuite/gcc.dg/torture/pr83396.c
> @@ -0,0 +1,38 @@
> +/* PR bootstrap/83396 */
> +/* { dg-do compile } */
> +/* { dg-options "-g" } */
> +
> +int fn1 (void);
> +void fn2 (void *, const char *);
> +void fn3 (void);
> +
> +void
> +fn4 (long long x)
> +{
> +  fn3 ();
> +}
> +
> +void
> +fn5 (long long x)
> +{
> +  if (x)
> +fn3();
> +}
> +
> +void
> +fn6 (long long x)
> +{
> +  switch (fn1 ())
> +{
> +case 0:
> +  fn5 (x);
> +case 2:
> +  fn2 (0, "");
> +  break;
> +case 1:
> +case 3:
> +  fn4(x);
> +case 5:
> +  fn2 (0, "");
> +}
> +}
> 
> 
>   Jakub

Jakub

[PATCH] Verify allowed stmts before labels

2017-12-13 Thread Jakub Jelinek
Hi!

PR83391/PR83396 failed because debug bind stmts were put before labels.
Alex said that is undesirable, and that right now we want to allow
just debug begin stmt markers before or intermixed with labels.

This patch ensures that through verification, which is what defines
what is and isn't valid GIMPLE.  If we ever reconsider it, either allow
further stmts or disallow even debug begin stmt markers, we can easily also
tweak the verifier.  The patch has been successfully bootstrapped/regtested as
part of the:

https://gcc.gnu.org/ml/gcc-patches/2017-12/msg00811.html
https://gcc.gnu.org/ml/gcc-patches/2017-12/msg00808.html
https://gcc.gnu.org/bugzilla/attachment.cgi?id=42861
https://gcc.gnu.org/bugzilla/attachment.cgi?id=42866
https://gcc.gnu.org/ml/gcc-patches/2017-12/msg00794.html

patchset, without the msg00811.html patch it of course doesn't survive
bootstrap, as we insert debug bind stmts before labels in those cases.

Ok for trunk once the msg00811.html patch or something similar is committed?
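
As a rough illustration of the rule being verified, here is a toy model, not
the tree-cfg.c code itself: statements are reduced to a made-up enum
toy_stmt, and labels_well_placed is a hypothetical name.  It mirrors the
label_allowed flag from the patch below.

```c
#include <assert.h>
#include <stddef.h>

/* Toy statement kinds; only the distinction between labels, debug begin
   stmt markers and everything else matters here.  */
enum toy_stmt
{
  TOY_LABEL,
  TOY_DEBUG_BEGIN_STMT,
  TOY_DEBUG_BIND,
  TOY_OTHER
};

/* Returns 1 if every label in the block is preceded only by other
   labels or debug begin stmt markers, and 0 otherwise.  */
static int
labels_well_placed (const enum toy_stmt *stmts, size_t n)
{
  int label_allowed = 1;
  assert (stmts != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    {
      if (stmts[i] == TOY_LABEL)
	{
	  if (!label_allowed)
	    return 0;	/* Label in the middle of a basic block.  */
	}
      else if (stmts[i] != TOY_DEBUG_BEGIN_STMT)
	label_allowed = 0;	/* No labels may follow from here on.  */
    }
  return 1;
}
```

A debug bind before a label is rejected, while a debug begin stmt marker
before a label is accepted.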

2017-12-13  Jakub Jelinek  

* tree-cfg.c (verify_gimple_in_cfg): Verify no non-label stmts
with the exception of debug begin stmt markers appear before
labels.

--- gcc/tree-cfg.c.jj   2017-12-12 21:24:23.0 +0100
+++ gcc/tree-cfg.c  2017-12-13 07:44:15.622790922 +0100
@@ -5380,6 +5380,7 @@ verify_gimple_in_cfg (struct function *f
  err |= err2;
}
 
+  bool label_allowed = true;
   for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
{
  gimple *stmt = gsi_stmt (gsi);
@@ -5396,6 +5397,19 @@ verify_gimple_in_cfg (struct function *f
  err2 = true;
}
 
+ /* Labels may be preceded only by debug markers, not debug bind
+or source bind or any other statements.  */
+ if (gimple_code (stmt) == GIMPLE_LABEL)
+   {
+ if (!label_allowed)
+   {
+ error ("gimple label in the middle of a basic block");
+ err2 = true;
+   }
+   }
+ else if (!gimple_debug_begin_stmt_p (stmt))
+   label_allowed = false;
+
  err2 |= verify_gimple_stmt (stmt);
  err2 |= verify_location (&blocks, gimple_location (stmt));
 

Jakub


Re: [SFN] Bootstrap broken

2017-12-13 Thread Jakub Jelinek
On Wed, Dec 13, 2017 at 02:31:07PM +0100, Rainer Orth wrote:
> Hi Jakub,
> 
> > Here it is everything in patch form, in case some volunteers are willing to
> > test it on their targets, because we need faster turn-arounds for this.
> 
> thanks for that: it's easy to lose track in this maze ;-)

True.  What I'm regtesting (bootstraps already done) on
{x86_64,i686,powerpc64{,le}}-linux now is:
https://gcc.gnu.org/ml/gcc-patches/2017-12/msg00811.html
https://gcc.gnu.org/ml/gcc-patches/2017-12/msg00808.html
https://gcc.gnu.org/bugzilla/attachment.cgi?id=42861
https://gcc.gnu.org/bugzilla/attachment.cgi?id=42866
https://gcc.gnu.org/ml/gcc-patches/2017-12/msg00794.html
set.  Does pr69102.c FAIL with that set?

> I've just bootstrapped sparc-sun-solaris2.11 with your patch and this one:
> 
>   https://gcc.gnu.org/ml/gcc-patches/2017-12/msg00794.html
> 
> The bootstrap succeeds, but the gcc.c-torture/compile/pr69102.c
> regression persists.  Besides, I see
> 
> +FAIL: libgomp.graphite/force-parallel-4.c scan-tree-dump-times graphite "2 loops carried no dependency" 1 (found 0 times)
> +FAIL: libgomp.graphite/force-parallel-4.c scan-tree-dump-times optimized "loopfn.1" 4 (found 0 times)
> +FAIL: libgomp.graphite/force-parallel-8.c scan-tree-dump-times graphite "5 loops carried no dependency" 1 (found 0 times)
> 
> which is most likely unrelated (I upgraded the tree from r255584 to
> r255603).

Yeah, these are almost certainly unrelated.

Jakub


Re: [SFN] Bootstrap broken

2017-12-13 Thread Rainer Orth
Hi Jakub,

> Here it is everything in patch form, in case some volunteers are willing to
> test it on their targets, because we need faster turn-arounds for this.

thanks for that: it's easy to lose track in this maze ;-)

I've just bootstrapped sparc-sun-solaris2.11 with your patch and this one:

https://gcc.gnu.org/ml/gcc-patches/2017-12/msg00794.html

The bootstrap succeeds, but the gcc.c-torture/compile/pr69102.c
regression persists.  Besides, I see

+FAIL: libgomp.graphite/force-parallel-4.c scan-tree-dump-times graphite "2 loops carried no dependency" 1 (found 0 times)
+FAIL: libgomp.graphite/force-parallel-4.c scan-tree-dump-times optimized "loopfn.1" 4 (found 0 times)
+FAIL: libgomp.graphite/force-parallel-8.c scan-tree-dump-times graphite "5 loops carried no dependency" 1 (found 0 times)

which is most likely unrelated (I upgraded the tree from r255584 to
r255603).

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


[PATCH] combine: Fix PR83393

2017-12-13 Thread Segher Boessenkool
In move_deaths we move a REG_DEAD note if the instruction combination
has extended the lifetime of a register so that the existing note is
no longer valid.  We find that note using reg_stat, but what that finds
can refer to a later insn.  If so, we cannot use the cached value.  This
patch implements that.
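
The idea can be sketched as follows.  This is a simplified model and not
the combine.c code: insns are identified by their LUID only, dies_at[] marks
the LUIDs that carry a death note for the register in question, and
find_death_luid is a hypothetical name.

```c
#include <assert.h>

/* DIES_AT[luid] is nonzero if the insn with that LUID carries a death
   note for the register.  CACHED_LUID is the remembered last death
   (0 meaning unknown).  Returns the LUID of the death to use, or 0 if
   there is none in [FROM_LUID, TO_LUID).  */
static int
find_death_luid (const unsigned char *dies_at, int cached_luid,
		 int from_luid, int to_luid)
{
  assert (from_luid >= 0 && from_luid <= to_luid);

  /* The cached value is only usable if it refers to an insn before
     TO_LUID; a later insn says nothing about this range.  */
  if (cached_luid > 0 && cached_luid < to_luid)
    return cached_luid;

  /* Otherwise search backwards from just before TO_LUID.  */
  for (int luid = to_luid - 1; luid >= from_luid; luid--)
    if (dies_at[luid])
      return luid;
  return 0;
}
```

With deaths recorded at LUIDs 3 and 6 and to_luid 5, a cached value of 6 is
too new, so the backward scan is used and finds 3.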

Tested on powerpc64-linux {-m32,-m64}; test is running on x86_64-linux
{-m32,-m64} (the new testcase tested fine already).  I'll commit it in
a bit if that test succeeds.


Segher


2017-12-13  Segher Boessenkool  

PR rtl-optimization/83393
* combine.c (move_deaths): If reg_stat points to a too new insn in
last_death, do not use it: find the proper insn instead.

gcc/testsuite/
PR rtl-optimization/83393
* gcc.dg/pr83393.c: New testcase.

---
 gcc/combine.c  |  2 +-
 gcc/testsuite/gcc.dg/pr83393.c | 38 ++
 2 files changed, 39 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr83393.c

diff --git a/gcc/combine.c b/gcc/combine.c
index b12484a..f96e08e 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -13912,7 +13912,7 @@ move_deaths (rtx x, rtx maybe_kill_insn, int from_luid, rtx_insn *to_insn,
 
   /* If we do not know where the register died, it may still die between
 FROM_LUID and TO_INSN.  If so, find it.  This is PR83304.  */
-  if (!where_dead)
+  if (!where_dead || DF_INSN_LUID (where_dead) >= DF_INSN_LUID (to_insn))
{
  rtx_insn *insn = prev_real_insn (to_insn);
  while (insn
diff --git a/gcc/testsuite/gcc.dg/pr83393.c b/gcc/testsuite/gcc.dg/pr83393.c
new file mode 100644
index 0000000..a9a6b33
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr83393.c
@@ -0,0 +1,38 @@
+/* PR rtl-optimization/83393 */
+/* { dg-do run } */
+/* { dg-options "-O2 -fno-forward-propagate -fno-tree-bit-ccp" } */
+
+typedef unsigned char u8;
+typedef unsigned short u16;
+typedef unsigned int u32;
+typedef unsigned long long u64;
+
+u32 a, d;
+u64 b;
+u8 c;
+
+static u64 __attribute__ ((noinline, noclone))
+foo (u16 f, u64 g)
+{
+  f <<= 15;
+  f *= d;
+  f -= g %= 44;
+  f <<= f <= g;
+  c = 255;
+  c >>= (u8) f == 0;
+  f *= g;
+  c -= ~c;
+  return f + a + b + f;
+}
+
+int
+main (void)
+{
+#if (__SIZEOF_LONG_LONG__ == 8 && __SIZEOF_INT__ == 4 \
+ && __SIZEOF_SHORT__ == 2 && __CHAR_BIT__ == 8)
+  u64 x = foo (3, 0xE6C0011BBA6DBD7LL);
+  if (x != 0x1f66e)
+__builtin_abort ();
+#endif
+  return 0;
+}
-- 
1.8.3.1



Re: patch to fix PR82353

2017-12-13 Thread Tom de Vries

On 10/16/2017 10:38 PM, Vladimir Makarov wrote:

> This is another version of the patch to fix
>
>     https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82353
>
> The patch was successfully bootstrapped on x86-64 with Go and Ada.
>
> Committed as rev. 253796.


Hi Vladimir,

AFAIU this bit of the patch makes sure that the flags register shows up
in the bb_livein of the bb in which it's used (and not defined before
the use), but not in the bb_liveout of the predecessors of that bb.


I wonder if that's a compile-speed optimization, or an oversight.

[ I ran into a similar problem for target gcn here (
  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83327 ) with the
  spill_class hook. I've posted a tentative fix in that PR, which
  piggybacks on this fix, but needed a few extra bits to make sure that
  inter-bb propagation was done:
- a bit at the end of process_bb_lives to detect the liveness change and
  then set live_change_p, which makes sure the propagation is run.
- a bit in lra_create_live_ranges_1 to unmask the registers we want to
  propagate in all_hard_regs_bitmap.
]
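
To make the livein/liveout point concrete, here is a minimal textbook
backward liveness solver.  It is a sketch under stated assumptions, not
lra-lives.c: registers are bits in an unsigned mask, each block has at most
two successors, and struct toy_bb and solve_liveness are invented names.

```c
#include <assert.h>
#include <stddef.h>

/* GEN holds registers used before being defined in the block, KILL
   holds registers defined in it.  SUCC entries of -1 mean "none".  */
struct toy_bb
{
  unsigned gen, kill, live_in, live_out;
  int succ[2];
};

/* Iterate to a fixed point:
     live_out (bb) = union of live_in (succ) over all successors
     live_in (bb)  = gen (bb) | (live_out (bb) & ~kill (bb))  */
static void
solve_liveness (struct toy_bb *bbs, size_t n)
{
  int changed = 1;
  assert (bbs != NULL || n == 0);
  while (changed)
    {
      changed = 0;
      for (size_t i = n; i-- > 0; )
	{
	  unsigned out = 0;
	  for (int s = 0; s < 2; s++)
	    if (bbs[i].succ[s] >= 0)
	      out |= bbs[bbs[i].succ[s]].live_in;
	  unsigned in = bbs[i].gen | (out & ~bbs[i].kill);
	  if (out != bbs[i].live_out || in != bbs[i].live_in)
	    {
	      bbs[i].live_out = out;
	      bbs[i].live_in = in;
	      changed = 1;
	    }
	}
    }
}
```

If a register appears in the live_in of a block only, without this
propagation step being run, the live_out of its predecessors stays empty,
which is the situation described above.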

Thanks,
- Tom


Index: lra-lives.c
===
--- lra-lives.c (revision 253685)
+++ lra-lives.c (working copy)
@@ -220,6 +220,9 @@ lra_intersected_live_ranges_p (lra_live_
return false;
  }
  
+/* The corresponding bitmaps of BB currently being processed.  */

+static bitmap bb_killed_pseudos, bb_gen_pseudos;
+
  /* The function processing birth of hard register REGNO.  It updates
 living hard regs, START_LIVING, and conflict hard regs for living
 pseudos.  Conflict hard regs for the pic pseudo is not updated if
@@ -243,6 +246,8 @@ make_hard_regno_born (int regno, bool ch
|| i != REGNO (pic_offset_table_rtx))
  #endif
SET_HARD_REG_BIT (lra_reg_info[i].conflict_hard_regs, regno);
+  if (fixed_regs[regno])
+bitmap_set_bit (bb_gen_pseudos, regno);
  }
  
  /* Process the death of hard register REGNO.  This updates

@@ -255,6 +260,11 @@ make_hard_regno_dead (int regno)
  return;
sparseset_set_bit (start_dying, regno);
CLEAR_HARD_REG_BIT (hard_regs_live, regno);
+  if (fixed_regs[regno])
+{
+  bitmap_clear_bit (bb_gen_pseudos, regno);
+  bitmap_set_bit (bb_killed_pseudos, regno);
+}
  }
  
  /* Mark pseudo REGNO as living at program point POINT, update conflicting

@@ -299,9 +309,6 @@ mark_pseudo_dead (int regno, int point)
  }
  }
  
-/* The corresponding bitmaps of BB currently being processed.  */

-static bitmap bb_killed_pseudos, bb_gen_pseudos;
-
  /* Mark register REGNO (pseudo or hard register) in MODE as live at
 program point POINT.  Update BB_GEN_PSEUDOS.
 Return TRUE if the liveness tracking sets were modified, or FALSE





Re: [PATCH] Fix Bug 83237 - Values returned by std::poisson_distribution are not distributed correctly

2017-12-13 Thread Jonathan Wakely

On 12/12/17 21:37 +0100, Paolo Carlini wrote:
> Hi,
>
> On 12/12/2017 19:42, Michele Pezzutti wrote:
>> Hi.
>>
>> Yes, I looked at the text before submitting the patch.
>>
>> I contacted Devroye and he confirmed that another reader had also
>> pointed out this bug but not the solution. I sent him my proposed
>> patch, he will look into it (no idea when though).
>
> Nice.
>
>> I would state that "comparison function for x = 1 is e^(1/78)"
>> (which becomes 1/78 as the algorithm uses log-probabilities).
>>
>> I think the change is needed because otherwise, for that particular
>> bin, the rejection probability is lower than it should be, resulting
>> in a higher number of samples.
>
> Ok. Ideally I would be much less nervous about committing the patch if
> we either 1- Had Luc's explicit green light; 2- Were able to
> *rigorously deduce* within the framework of the book why the change is
> needed. That said, the patch makes sense to me and so far holds up
> well in all my tests (I'm currently running a full make check). I
> would say, let's wait a week or so and then make the final decision.
> Jon, do you agree? Ideas about further testing? (eg, some code you are
> aware of stressing Poisson?)

No, I have nothing useful to add here, but I CC'd Ed on the PR as I'd
like his input.




Re: [SFN] Bootstrap broken

2017-12-13 Thread Jakub Jelinek
On Wed, Dec 13, 2017 at 10:28:22AM +0100, Jakub Jelinek wrote:
> Formatting, this should be
>   bool can_move_early_debug_stmts
> = ...
> and the line is too long, so needs to be wrapped.
> 
> Furthermore, I must say I don't understand why
> can_move_early_debug_stmts should care whether there are any labels in
> dest bb or not.  That sounds very risky for introducing non-# DEBUG BEGIN_STMT
> debug insns before labels if it could happen.  Though, if
> gsi_stmt (gsi) == gsi_stmt (gsie), then the loop right below it will not
> do anything and nothing cares about can_move_early_debug_stmts afterwards.
> So, in short, can_move_early_debug_stmts is used only if
> gsi_stmt (gsi) != gsi_stmt (gsie), and therefore
> can_move_early_debug_stmts if it is used is can_move_debug_stmts && (1 || ...);
> 
> So, can we get rid of can_move_early_debug_stmts altogether and just use
> can_move_debug_stmts in there instead?
> 
> Another thing I find risky is that you compute gsie_to so early and don't
> update it.  If you don't need it above for can_move_early_debug_stmts, can
> you just do it back where it used to be done,

Here it is everything in patch form, in case some volunteers are willing to
test it on their targets, because we need faster turn-arounds for this.

2017-12-13  Alexandre Oliva 
Jakub Jelinek  

PR bootstrap/83396
PR debug/83391
* tree-cfgcleanup.c (remove_forwarder_block): Keep after
labels debug stmts that can only appear after labels.

* gcc.dg/torture/pr83396.c: New test.
* g++.dg/torture/pr83391.C: New test.

--- gcc/tree-cfgcleanup.c.jj   2017-12-12 09:48:26.813393301 +0100
+++ gcc/tree-cfgcleanup.c   2017-12-13 11:39:03.373065381 +0100
@@ -536,9 +536,14 @@ remove_forwarder_block (basic_block bb)
  defined labels and labels with an EH landing pad number to the
  new block, so that the redirection of the abnormal edges works,
  jump targets end up in a sane place and debug information for
- labels is retained.  */
+ labels is retained.
+
+ While at that, move any debug stmts that appear before or in between
+ labels, but not those that can only appear after labels.  */
   gsi_to = gsi_start_bb (dest);
-  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); )
+  gsi = gsi_start_bb (bb);
+  gimple_stmt_iterator gsie = gsi_after_labels (bb);
+  while (gsi_stmt (gsi) != gsi_stmt (gsie))
 {
   tree decl;
   label = gsi_stmt (gsi);
@@ -557,6 +562,21 @@ remove_forwarder_block (basic_block bb)
gsi_next (&gsi);
 }
 
+  /* Move debug statements if the destination has a single predecessor.  */
+  if (can_move_debug_stmts && !gsi_end_p (gsi))
+{
+  gcc_assert (gsi_stmt (gsi) == gsi_stmt (gsie));
+  gimple_stmt_iterator gsie_to = gsi_after_labels (dest);
+  do
+   {
+ gimple *debug = gsi_stmt (gsi);
+ gcc_assert (is_gimple_debug (debug));
+ gsi_remove (&gsi, false);
+ gsi_insert_before (&gsie_to, debug, GSI_SAME_STMT);
+   }
+  while (!gsi_end_p (gsi));
+}
+
   bitmap_set_bit (cfgcleanup_altered_bbs, dest->index);
 
   /* Update the dominators.  */
--- gcc/testsuite/g++.dg/torture/pr83391.C.jj
+++ gcc/testsuite/g++.dg/torture/pr83391.C
@@ -0,0 +1,36 @@
+// PR debug/83391
+// { dg-do compile }
+// { dg-options "-g" }
+// { dg-additional-options "-mbranch-cost=1" { target { i?86-*-* x86_64-*-* mips*-*-* s390*-*-* avr*-*-* } } }
+
+unsigned char a;
+enum E { F, G, H } b;
+int c, d;
+
+void
+foo ()
+{
+  int e;
+  bool f;
+  E g = b;
+  while (1)
+{
+  unsigned char h = a ? d : 0;
+  switch (g)
+   {
+   case 0:
+ f = h <= 'Z' || h >= 'a' && h <= 'z';
+ break;
+   case 1:
+ {
+   unsigned char i = h;
+   e = 0;
+ }
+ if (e || h)
+   g = H;
+ /* FALLTHRU */
+   default:
+ c = 0;
+   }
+}
+}
--- gcc/testsuite/gcc.dg/torture/pr83396.c.jj
+++ gcc/testsuite/gcc.dg/torture/pr83396.c
@@ -0,0 +1,38 @@
+/* PR bootstrap/83396 */
+/* { dg-do compile } */
+/* { dg-options "-g" } */
+
+int fn1 (void);
+void fn2 (void *, const char *);
+void fn3 (void);
+
+void
+fn4 (long long x)
+{
+  fn3 ();
+}
+
+void
+fn5 (long long x)
+{
+  if (x)
+fn3();
+}
+
+void
+fn6 (long long x)
+{
+  switch (fn1 ())
+{
+case 0:
+  fn5 (x);
+case 2:
+  fn2 (0, "");
+  break;
+case 1:
+case 3:
+  fn4(x);
+case 5:
+  fn2 (0, "");
+}
+}


Jakub


[PATCH] RL78 bswaphi improvement

2017-12-13 Thread Sebastian Perta
Hello,

The following patch helps GCC generate the xch instruction.

The patch is useful in many test cases from c-torture.  For example, in
gcc.c-torture/execute/pr52760.c, xch is generated 4 times in foo and the
code size of foo is reduced from 94 to 58 bytes.
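
For reference, the operation that a bswaphi2 pattern implements is a 16-bit
byte swap.  A portable C equivalent might look like the sketch below;
swap_bytes16 is a made-up name, and whether GCC maps this particular source
form onto the new pattern depends on the optimization passes involved.

```c
#include <assert.h>
#include <stdint.h>

/* Swap the two bytes of a 16-bit value.  The cast to unsigned avoids
   relying on the width of int when shifting left.  */
static uint16_t
swap_bytes16 (uint16_t x)
{
  return (uint16_t) (((unsigned) x << 8) | ((unsigned) x >> 8));
}

/* Small self-check: swapping twice gives back the original value.  */
static void
swap_bytes16_selfcheck (void)
{
  assert (swap_bytes16 (0x1234) == 0x3412);
  assert (swap_bytes16 (swap_bytes16 (0xBEEF)) == 0xBEEF);
}
```

On RL78 with the value in AX, exchanging A and X performs exactly this
swap, which is why a single xch suffices in the first alternative of the
real insn.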

Regression test is OK, tested with the following command:
make -k check-gcc RUNTESTFLAGS=--target_board=rl78-sim

Please let me know if this is OK, Thank you!
Sebastian

Index: ChangeLog
===
--- ChangeLog   (revision 255581)
+++ ChangeLog   (working copy)
@@ -1,3 +1,9 @@
+2017-12-12  Sebastian Perta  
+
+   * config/rl78/rl78-expand.md: New define_expand "bswaphi2"
+   * config/rl78/rl78-virt.md: New define_insn "*bswaphi2_virt"
+   * config/rl78/rl78-real.md: New define_insn "*bswaphi2_real"
+   
 2017-12-12  Richard Biener  
 
PR tree-optimization/83385
Index: config/rl78/rl78-expand.md
===
--- config/rl78/rl78-expand.md  (revision 255581)
+++ config/rl78/rl78-expand.md  (working copy)
@@ -105,6 +105,14 @@
   [(set_attr "valloc" "op1")]
 )
 
+(define_expand "bswaphi2"
+  [(set (match_operand:HI   0 "nonimmediate_operand")
+(bswap:HI (match_operand:HI 1 "general_operand")))]
+  ""
+  "if (rl78_force_nonfar_2 (operands, gen_bswaphi2))
+ DONE;"
+)
+
 ;;-- Conversions 
 
 (define_expand "zero_extendqihi2"
Index: config/rl78/rl78-real.md
===
--- config/rl78/rl78-real.md(revision 255581)
+++ config/rl78/rl78-real.md(working copy)
@@ -90,6 +90,15 @@
movw\t%0, %1"
 )
 
+(define_insn "*bswaphi2_real"
+  [(set (match_operand:HI   0 "rl78_nonfar_nonimm_operand" "=A,A")
+(bswap:HI (match_operand:HI 1 "general_operand"  "0,viU")))]
+  "rl78_real_insns_ok ()"
+  "@
+   xch\ta, x
+   movw\tax, %1\n\txch\ta, x"
+)
+
 ;;-- Conversions 
 
 (define_insn "*zero_extendqihi2_real"
Index: config/rl78/rl78-virt.md
===
--- config/rl78/rl78-virt.md(revision 255581)
+++ config/rl78/rl78-virt.md(working copy)
@@ -65,6 +65,14 @@
   [(set_attr "valloc" "op1")]
 )
 
+(define_insn "*bswaphi2_virt"
+  [(set (match_operand:HI   0 "rl78_nonfar_nonimm_operand" "=vm")
+(bswap:HI (match_operand:HI 1 "general_operand"  "vim")))]
+  "rl78_virt_insns_ok ()"
+  "v.bswaphi\t%0, %1"
+  [(set_attr "valloc" "op1")]
+)
+
 ;;-- Conversions 
 
 (define_insn "*zero_extendqihi2_virt"



[C++ Patch] PR 81061 ("[7/8 Regression] ICE modifying read-only variable")

2017-12-13 Thread Paolo Carlini

Hi,

in this simple error recovery regression we ICE during gimplification 
after a sensible diagnostic about assigning to a read-only location. The 
problem can be avoided by simply returning error_mark_node immediately 
upon cxx_readonly_error - the rest of the function does the same, i.e., 
it doesn't try to proceed when complain & tf_error. I also noticed that 
clang appears to behave in the same way for this error. Tested x86_64-linux.
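
As a rough sketch of the control-flow change (a toy model, not the real cp_build_modify_expr; the names assign_checked, RESULT_OK and RESULT_ERROR are hypothetical), the fix amounts to returning the error value whether or not a diagnostic was requested:

```c
#include <stdbool.h>

enum result { RESULT_OK, RESULT_ERROR };

/* Hypothetical stand-in for the read-only check.  COMPLAIN models
   "complain & tf_error"; DIAGNOSTICS counts emitted errors in place
   of calling cxx_readonly_error.  */
static enum result
assign_checked (bool lhs_readonly, bool complain, int *diagnostics)
{
  if (lhs_readonly)
    {
      if (complain)
        ++*diagnostics;
      /* After the patch: bail out in both cases instead of falling
         through and building the assignment anyway.  */
      return RESULT_ERROR;
    }
  return RESULT_OK;
}
```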


Thanks, Paolo



/cp
2017-12-13  Paolo Carlini  

PR c++/81061
* typeck.c (cp_build_modify_expr): Upon cxx_readonly_error
immediately return error_mark_node.

/testsuite
2017-12-13  Paolo Carlini  

PR c++/81061
* g++.dg/other/const5.C: New.
Index: cp/typeck.c
===
--- cp/typeck.c (revision 255602)
+++ cp/typeck.c (working copy)
@@ -8037,8 +8037,7 @@ cp_build_modify_expr (location_t loc, tree lhs, en
 {
   if (complain & tf_error)
cxx_readonly_error (lhs, lv_assign);
-  else
-   return error_mark_node;
+  return error_mark_node;
 }
 
   /* If storing into a structure or union member, it may have been given a
Index: testsuite/g++.dg/other/const5.C
===
--- testsuite/g++.dg/other/const5.C (nonexistent)
+++ testsuite/g++.dg/other/const5.C (working copy)
@@ -0,0 +1,8 @@
+// PR c++/81061
+
+const int i = 0;
+
+void foo()
+{
+  (0, i) = 1;  // { dg-error "read-only" }
+}


Re: [SFN] Bootstrap broken

2017-12-13 Thread Jakub Jelinek
On Wed, Dec 13, 2017 at 05:22:32AM -0200, Alexandre Oliva wrote:
> On Dec 12, 2017, Rainer Orth  wrote:
> 
> > Hi David,
> >> Something in this series broke bootstrap on AIX, probably Power in general.
> 
> > I'm seeing the same in a sparc-sun-solaris2.11 bootstrap.
> 
> The AIX patch, that I've just emailed out in this thread, should fix
> that as well.  As for the regression you reported, here's a fix.
> Regstrapping; ok to install?
> 
> 
> [SFN] don't assume BLOCK_FOR_INSN is set in var-tracking
> 
> There's no guarantee that BLOCK_FOR_INSN will be set before var-tracking.
> So, keep track of whether we're in the first block header or inside a BB
> explicitly, and apply the logic we meant to apply outside BBs only when
> we are indeed outside a BB.
> 
> for  gcc/ChangeLog
> 
>   PR bootstrap/83396
>   * var-tracking.c (vt_initialize): Keep track of BB boundaries.

This looks like a workaround for a bigger problem.

In particular, this testcase is using selective scheduling, therefore
we turn off -fvar-tracking-assignments, but the debug stmt markers are
emitted anyway.

-fvar-tracking is still true, so the var-tracking pass does everything it
normally does (successfully), then the free_cfg pass removes all
BLOCK_FOR_INSN notes, then some targets in their machine reorg recompute
those, but sparc apparently doesn't, and finally in final.c:

  /* Turn debug markers into notes.  */
  if (!MAY_HAVE_DEBUG_BIND_INSNS && MAY_HAVE_DEBUG_MARKER_INSNS)
variable_tracking_main ();

Eeek, this runs all of the var tracking again, even when it has been done
earlier, but this time without BLOCK_FOR_INSN.

This is just wrong.  So I think the right fix is the patch below instead (so far
tested only with a sparc cross-compiler on the single testcase).

Alternatively, export the delete_vta_debug_insns function and call it under that
condition instead of variable_tracking_main, which will do the same thing for
!flag_var_tracking.
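
To make the difference between the two conditions concrete, here is a toy model (the struct and function names are hypothetical; only the three flags mirror names from the discussion above). It sketches the scenario described: -fvar-tracking on, debug binds off, markers present.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the state consulted in rest_of_handle_final.  */
struct debug_state
{
  bool flag_var_tracking;   /* -fvar-tracking is enabled.  */
  bool have_bind_insns;     /* Models MAY_HAVE_DEBUG_BIND_INSNS.  */
  bool have_marker_insns;   /* Models MAY_HAVE_DEBUG_MARKER_INSNS.  */
};

/* Condition before the patch.  */
static bool
old_runs_var_tracking (const struct debug_state *s)
{
  return !s->have_bind_insns && s->have_marker_insns;
}

/* Condition after the patch.  */
static bool
new_runs_var_tracking (const struct debug_state *s)
{
  return !s->flag_var_tracking && s->have_marker_insns;
}
```

With flag_var_tracking set, no binds and markers present, the old predicate is true (var-tracking would run a second time) while the new one is false.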

2017-12-13  Jakub Jelinek  

PR bootstrap/83396
* final.c (rest_of_handle_final): Call variable_tracking_main only
if !flag_var_tracking.

--- gcc/final.c.jj  2017-12-12 09:48:15.0 +0100
+++ gcc/final.c 2017-12-13 11:29:12.284676265 +0100
@@ -4541,8 +4541,9 @@ rest_of_handle_final (void)
 {
   const char *fnname = get_fnname_from_decl (current_function_decl);
 
-  /* Turn debug markers into notes.  */
-  if (!MAY_HAVE_DEBUG_BIND_INSNS && MAY_HAVE_DEBUG_MARKER_INSNS)
+  /* Turn debug markers into notes if the var-tracking pass has not
+ been invoked.  */
+  if (!flag_var_tracking && MAY_HAVE_DEBUG_MARKER_INSNS)
 variable_tracking_main ();
 
   assemble_start_function (current_function_decl, fnname);


Jakub


Re: [patch, fortran] Implement maxval for characters

2017-12-13 Thread Christophe Lyon
On 11 December 2017 at 18:58, Thomas Koenig  wrote:
>
>> I have created https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83379
>> and assigned it to myself. This should be easy to fix.
>
>
> OK, I have updated the test cases in question.  They pass for
> me at least.
>
> I'll keep the PR open for a couple of days to make sure this
> is really fixed.
>

Thanks, it seems clean now.

Christophe

> Regards
>
> Thomas


Re: [PATCH] vrp_prop: Use dom_walker for -Warray-bounds (PR tree-optimization/83312)

2017-12-13 Thread Richard Biener
On December 12, 2017 9:50:38 PM GMT+01:00, David Malcolm wrote:
>PR tree-optimization/83312 reports a false positive from
>-Warray-bounds.
>The root cause is that VRP both:
>
>(a) updates a GIMPLE_COND to be always false, and
>
>(b) updates an ARRAY_REF in the now-unreachable other path to use an
>ASSERT_EXPR with a negative index:
>  def_stmt j_6 = ASSERT_EXPR ;
>
>When vrp_prop::check_all_array_refs () runs, the CFG hasn't yet been
>updated to take account of (a), and so a false positive is emitted
>when (b) is encountered.
>
>This patch fixes the false warning by converting
>  vrp_prop::check_all_array_refs
>from a simple iteration over all BBs to use a new dom_walker subclass,
>using the "skip_unreachable_blocks = true" mechanism to avoid analyzing
>(b).
>
>There didn't seem to be a pre-existing way to determine the unique
>out-edge after a GIMPLE_COND (if it has a constant cond), so I added
>a new gimple_cond_get_unique_successor_edge function.  Similarly,
>something similar may apply for switches, so I put in a
>gimple_get_unique_successor_edge (though I wasn't able to create a
>reproducer that used a switch).
>
>Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
>
>OK for trunk?

I don't like the gimple.c bits a lot. Can't you use the existing known-taken-edge
helper (too lazy to look it up from my phone...), basically splitting out some
code from the cond processing in evrp, for example?

The dom walk itself looks like a good solution to me.

Richard. 
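
For readers unfamiliar with the idea, the following is a simplified sketch of why skipping non-executable edges avoids the false positive. It is a plain reachability walk over a toy CFG, not GCC's dom_walker, and every name in it (toy_cfg, mark_reachable, MAX_BLOCKS) is hypothetical.

```c
#include <stdbool.h>

#define MAX_BLOCKS 8

/* Hypothetical toy CFG: edge[i][j] is true when block I can branch to
   block J; executable[i][j] is false once the condition in block I has
   been folded so that the edge can never be taken.  */
struct toy_cfg
{
  int n_blocks;
  bool edge[MAX_BLOCKS][MAX_BLOCKS];
  bool executable[MAX_BLOCKS][MAX_BLOCKS];
};

/* Mark every block reachable from BLOCK through executable edges.
   Blocks left unmarked would not be checked for array bounds.  */
static void
mark_reachable (const struct toy_cfg *cfg, int block, bool *visited)
{
  if (visited[block])
    return;
  visited[block] = true;
  for (int succ = 0; succ < cfg->n_blocks; succ++)
    if (cfg->edge[block][succ] && cfg->executable[block][succ])
      mark_reachable (cfg, succ, visited);
}
```

In a diamond where the condition in block 0 is known to be false for the edge to block 2, block 2 is never visited, so a bogus negative index there would never be diagnosed.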

>gcc/ChangeLog:
>   PR tree-optimization/83312
>   * domwalk.h (dom_walker::dom_walker): Fix typo in comment.
>   * gimple.c: Include "tree-cfg.h".
>   (gimple_get_unique_successor_edge): New function.
>   (gimple_cond_get_unique_successor_edge): New function.
>   * gimple.h (gimple_get_unique_successor_edge): New decl.
>   (gimple_cond_get_unique_successor_edge): New decl.
>   * tree-vrp.c (class check_array_bounds_dom_walker): New subclass
>   of dom_walker.
>   (vrp_prop::check_all_array_refs): Reimplement as...
>   (check_array_bounds_dom_walker::before_dom_children): ...this new
>   vfunc.  Replace linear search through BB block list, excluding
>   those with non-executable in-edges, with dominator walk.
>
>gcc/testsuite/ChangeLog:
>   PR tree-optimization/83312
>   * gcc.dg/pr83312.c: New test case.
>---
> gcc/domwalk.h                  |  2 +-
> gcc/gimple.c                   | 36 +++
> gcc/gimple.h                   |  2 ++
> gcc/testsuite/gcc.dg/pr83312.c | 30
> gcc/tree-vrp.c                 | 80 +++---
> 5 files changed, 120 insertions(+), 30 deletions(-)
> create mode 100644 gcc/testsuite/gcc.dg/pr83312.c
>
>diff --git a/gcc/domwalk.h b/gcc/domwalk.h
>index 6ac93eb..c7e3450 100644
>--- a/gcc/domwalk.h
>+++ b/gcc/domwalk.h
>@@ -32,7 +32,7 @@ class dom_walker
> public:
>   static const edge STOP;
> 
>-  /* Use SKIP_UNREACHBLE_BLOCKS = true when your client can discover
>+  /* Use SKIP_UNREACHABLE_BLOCKS = true when your client can discover
>  that some edges are not executable.
> 
>  If a client can discover that a COND, SWITCH or GOTO has a static
>diff --git a/gcc/gimple.c b/gcc/gimple.c
>index c986a73..e22fcda 100644
>--- a/gcc/gimple.c
>+++ b/gcc/gimple.c
>@@ -44,6 +44,7 @@ along with GCC; see the file COPYING3.  If not see
> #include "stringpool.h"
> #include "attribs.h"
> #include "asan.h"
>+#include "tree-cfg.h"
> 
> 
> /* All the tuples have their operand vector (if present) at the very bottom
>@@ -3087,6 +3088,41 @@ gimple_inexpensive_call_p (gcall *stmt)
>   return false;
> }
> 
>+/* If STMT terminates its basic block, determine if it has a uniquely
>+   valid successor edge and if so return it.
>+
>+   Otherwise, return NULL.  */
>+
>+edge
>+gimple_get_unique_successor_edge (const gimple *stmt)
>+{
>+  switch (gimple_code (stmt))
>+{
>+case GIMPLE_COND:
>+  return gimple_cond_get_unique_successor_edge
>+  (as_a <const gcond *> (stmt));
>+default:
>+  return NULL;
>+}
>+}
>+
>+/* Determine if COND has a uniquely valid successor edge and if so return it.
>+
>+   Otherwise, return NULL.  */
>+
>+edge
>+gimple_cond_get_unique_successor_edge (const gcond *cond)
>+{
>+  edge te, fe;
>+  extract_true_false_edges_from_block (gimple_bb (cond), &te, &fe);
>+  if (gimple_cond_true_p (cond))
>+return te;
>+  else if (gimple_cond_false_p (cond))
>+return fe;
>+  else
>+return NULL;
>+}
>+
> #if CHECKING_P
> 
> namespace selftest {
>diff --git a/gcc/gimple.h b/gcc/gimple.h
>index 0fcdd05..ab4cb8b 100644
>--- a/gcc/gimple.h
>+++ b/gcc/gimple.h
>@@ -1529,6 +1529,8 @@ extern void gimple_seq_discard (gimple_seq);
> extern void maybe_remove_unused_call_args (struct function *, gimple *);
> extern bool gimple_inexpensive_call_p (gcall *);
> extern bool stmt_can_terminate_bb_p (gimple *);
>+extern edge gimple_get_unique_successor_edge (const gimple *stmt);
>+extern edge gimple_

Re: [C++] Add support for #pragma GCC unroll v4

2017-12-13 Thread Eric Botcazou
Ping for the last missing bits of the #pragma GCC unroll support:

> this is the (hopefully) final implementation of the support for the
> unrolling pragma in the C++ front-end.  

https://gcc.gnu.org/ml/gcc-patches/2017-12/msg00298.html

Thanks in advance.

-- 
Eric Botcazou


[PATCH] x86: don't use AVX512BW vmovdqu variants without -mavx512bw

2017-12-13 Thread Jan Beulich
Simply mirror the MODE_XI logic of handling unaligned operands in
mov_internal into MODE_TI / MODE_OI handling.
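
As a rough illustration of the tightened condition (a toy model, not the actual sse.md code; the function name and its parameters are hypothetical): the byte and word forms vmovdqu8/vmovdqu16 require AVX512BW, while the dword and qword forms are available for 128-bit and 256-bit vectors with AVX512VL alone, so the element-sized EVEX mnemonic can only be chosen when one of those holds.

```c
#include <stdbool.h>

/* Hypothetical predicate: may the element-sized EVEX form of vmovdqu
   be used for an unaligned 128-bit or 256-bit integer vector move?
   ELEM_BITS is the vector element width (8, 16, 32 or 64).  */
static bool
use_sized_vmovdqu (bool avx512vl, bool avx512bw, int elem_bits)
{
  return avx512vl
         && (elem_bits == 32 || elem_bits == 64 || avx512bw);
}
```

When the predicate is false, the sketch assumes the plain vmovdqu form is used instead, as in the fallback arm of the patch below.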

gcc/
2017-12-13  Jan Beulich  

* sse.md (mov_internal): Tighten condition for when to use
vmovdqu for TI and OI modes.

gcc/testsuite/
2017-12-13  Jan Beulich  

* gcc.target/i386/avx512vl-no-vmovdqu8.c,
gcc.target/i386/avx512vl-no-vmovdqu16.c: New.

---
I'm also puzzled by the code generated for the 256-bit cases
(which shouldn't differ much from the 128-bit ones).

--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -1005,8 +1005,14 @@
case MODE_TI:
  if (misaligned_operand (operands[0], mode)
  || misaligned_operand (operands[1], mode))
-   return TARGET_AVX512VL ? "vmovdqu\t{%1, %0|%0, %1}"
-  : "%vmovdqu\t{%1, %0|%0, %1}";
+   return TARGET_AVX512VL
+  && (mode == V4SImode
+  || mode == V2DImode
+  || mode == V8SImode
+  || mode == V4DImode
+  || TARGET_AVX512BW)
+  ? "vmovdqu\t{%1, %0|%0, %1}"
+  : "%vmovdqu\t{%1, %0|%0, %1}";
  else
return TARGET_AVX512VL ? "vmovdqa64\t{%1, %0|%0, %1}"
   : "%vmovdqa\t{%1, %0|%0, %1}";
--- a/gcc/testsuite/gcc.target/i386/avx512vl-no-vmovdqu16.c
+++ b/gcc/testsuite/gcc.target/i386/avx512vl-no-vmovdqu16.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mavx512vl -mno-avx512bw" } */
+
+typedef unsigned int __attribute__((mode(HI), vector_size(16))) v8hi_t;
+typedef unsigned int __attribute__((mode(HI), vector_size(32))) v16hi_t;
+
+struct s8hi {
+   int i;
+   v8hi_t __attribute__((packed)) v;
+};
+struct s16hi {
+   int i;
+   v16hi_t __attribute__((packed)) v;
+};
+
+void f8hi(struct s8hi*p1, const struct s8hi*p2) {
+   p1->v += p2->v;
+}
+
+void f16hi(struct s16hi*p1, const struct s16hi*p2) {
+   p1->v += p2->v;
+}
+
+/* { dg-final { scan-assembler-not "^\[ \t\]*vmovdq\[au\](8|16)" } } */
--- a/gcc/testsuite/gcc.target/i386/avx512vl-no-vmovdqu8.c
+++ b/gcc/testsuite/gcc.target/i386/avx512vl-no-vmovdqu8.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mavx512vl -mno-avx512bw" } */
+
+typedef unsigned int __attribute__((mode(QI), vector_size(16))) v16qi_t;
+typedef unsigned int __attribute__((mode(QI), vector_size(32))) v32qi_t;
+
+struct s16qi {
+   int i;
+   v16qi_t __attribute__((packed)) v;
+};
+struct s32qi {
+   int i;
+   v32qi_t __attribute__((packed)) v;
+};
+
+void f16qi(struct s16qi*p1, const struct s16qi*p2) {
+   p1->v += p2->v;
+}
+
+void f32qi(struct s32qi*p1, const struct s32qi*p2) {
+   p1->v += p2->v;
+}
+
+/* { dg-final { scan-assembler-not "^\[ \t\]*vmovdq\[au\](8|16)" } } */





[Ada] Make sure subprogram locus is initialized

2017-12-13 Thread Eric Botcazou
This is another fixlet aimed at making gigi more robust in the presence of 
unexpected nodes in the expanded code.

Tested on x86_64-suse-linux, applied on the mainline and 7 branch.


2017-12-13  Eric Botcazou  

* gcc-interface/trans.c (Subprogram_Body_to_gnu): Initialize locus.

-- 
Eric Botcazou
Index: gcc-interface/trans.c
===
--- gcc-interface/trans.c	(revision 255601)
+++ gcc-interface/trans.c	(working copy)
@@ -3773,7 +3773,8 @@ Subprogram_Body_to_gnu (Node_Id gnat_nod
 }
 
   /* Set the line number in the decl to correspond to that of the body.  */
-  Sloc_to_locus (Sloc (gnat_node), &locus);
+  if (!Sloc_to_locus (Sloc (gnat_node), &locus))
+locus = input_location;
   DECL_SOURCE_LOCATION (gnu_subprog_decl) = locus;
 
   /* If the body comes from an expression function, arrange it to be inlined


Re: [SFN] Bootstrap broken

2017-12-13 Thread Jakub Jelinek
On Wed, Dec 13, 2017 at 05:21:01AM -0200, Alexandre Oliva wrote:
> index ..098a1101a3f4
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/torture/pr83391.C
> @@ -0,0 +1,34 @@
> +/* PR debug/83391 */
> +/* { dg-do compile } */

If you put this into dg-torture.exp, please add:
/* { dg-options "-g" } */
and readd the needed:
/* { dg-additional-options "-mbranch-cost=1" { target { i?86-*-* x86_64-*-* mips*-*-* s390*-*-* avr*-*-* } } } */

> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/torture/pr83396.c
> @@ -0,0 +1,37 @@
> +/* PR bootstrap/83396 */
> +/* { dg-do compile } */

Please add -g.

> --- a/gcc/tree-cfgcleanup.c
> +++ b/gcc/tree-cfgcleanup.c
> @@ -536,14 +536,23 @@ remove_forwarder_block (basic_block bb)
>   defined labels and labels with an EH landing pad number to the
>   new block, so that the redirection of the abnormal edges works,
>   jump targets end up in a sane place and debug information for
> - labels is retained.  */
> + labels is retained.
> +
> + While at that, move any debug stmts that appear before or among
> + labels, but not those that can only appear after labels, unless
> + the destination block didn't have labels of its own.  */
>gsi_to = gsi_start_bb (dest);
> -  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); )
> +  gsi = gsi_start_bb (bb);
> +  gimple_stmt_iterator gsie = gsi_after_labels (bb);
> +  gimple_stmt_iterator gsie_to = gsi_after_labels (dest);
> +  bool can_move_early_debug_stmts = can_move_debug_stmts
> +&& (gsi_stmt (gsi) != gsi_stmt (gsie) || gsi_stmt (gsi_to) != gsi_stmt (gsie_to));

Formatting, this should be
  bool can_move_early_debug_stmts
= ...
and the line is too long, so needs to be wrapped.

Furthermore, I must say I don't understand why
can_move_early_debug_stmts should care whether there are any labels in
dest bb or not.  That sounds very risky for introducing non-# DEBUG BEGIN_STMT
debug insns before labels if it could happen.  Though, if
gsi_stmt (gsi) == gsi_stmt (gsie), then the loop right below it will not
do anything and nothing cares about can_move_early_debug_stmts afterwards.
So, in short, can_move_early_debug_stmts is used only if
gsi_stmt (gsi) != gsi_stmt (gsie), and therefore, whenever it is used, its
value is can_move_debug_stmts && (1 || ...), i.e. just can_move_debug_stmts.

So, can we get rid of can_move_early_debug_stmts altogether and just use
can_move_debug_stmts in there instead?
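
The simplification argued above can be checked exhaustively with a small sketch. The function and parameter names are hypothetical; src_differs stands for gsi_stmt (gsi) != gsi_stmt (gsie) and dest_differs for gsi_stmt (gsi_to) != gsi_stmt (gsie_to).

```c
#include <stdbool.h>

/* Value the quoted patch assigns to can_move_early_debug_stmts.  */
static bool
early_flag (bool can_move_debug_stmts, bool src_differs, bool dest_differs)
{
  return can_move_debug_stmts && (src_differs || dest_differs);
}

/* Return true if, in every case where the loop body can run at all
   (src_differs is true), the flag equals can_move_debug_stmts.  */
static bool
flag_is_redundant (void)
{
  for (int can_move = 0; can_move <= 1; can_move++)
    for (int dest = 0; dest <= 1; dest++)
      if (early_flag (can_move, true, dest) != (bool) can_move)
        return false;
  return true;
}
```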

Another thing I find risky is that you compute gsie_to so early and don't
update it.  If you don't need it above for can_move_early_debug_stmts, can
you just do it back where it used to be done,

> +  while (gsi_stmt (gsi) != gsi_stmt (gsie))
>  {
>tree decl;
>label = gsi_stmt (gsi);
>if (is_gimple_debug (label)
> -   ? can_move_debug_stmts
> +   ? can_move_early_debug_stmts
> : ((decl = gimple_label_label (as_a <glabel *> (label))),
>EH_LANDING_PAD_NR (decl) != 0
>|| DECL_NONLOCAL (decl)
> @@ -557,6 +566,20 @@ remove_forwarder_block (basic_block bb)
>   gsi_next (&gsi);
>  }
>  
> +  /* Move debug statements if the destination has a single predecessor.  */
> +  if (can_move_debug_stmts && !gsi_end_p (gsi))
> +{
> +  gcc_assert (gsi_stmt (gsi) == gsi_stmt (gsie));

i.e. here?

> +  do
> + {
> +   gimple *debug = gsi_stmt (gsi);
> +   gcc_assert (is_gimple_debug (debug));
> +   gsi_remove (&gsi, false);
> +   gsi_insert_before (&gsie_to, debug, GSI_SAME_STMT);
> + }
> +  while (!gsi_end_p (gsi));
> +}
> +
>bitmap_set_bit (cfgcleanup_altered_bbs, dest->index);
>  
>/* Update the dominators.  */

Jakub


Re: [SFN] Bootstrap broken

2017-12-13 Thread Rainer Orth
Hi Alexandre,

Alexandre Oliva writes:

> On Dec 12, 2017, Rainer Orth  wrote:
>
>> Hi David,
>>> Something in this series broke bootstrap on AIX, probably Power in general.
>
>> I'm seeing the same in a sparc-sun-solaris2.11 bootstrap.
>
> The AIX patch, that I've just emailed out in this thread, should fix
> that as well.  As for the regression you reported, here's a fix.

it did indeed, and this one fixed the testsuite regression, as just
confirmed in a sparc-sun-solaris2.11 bootstrap.

Thanks.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


[Ada] Add small sanity check for package freezing

2017-12-13 Thread Eric Botcazou
In order to implement the complex elaboration rules of the language, the 
front-end inserts freeze nodes in the expanded code to mark spots from where 
package bodies can be translated by gigi.  It can now do so for packages with 
or without bodies, so a small sanity check is necessary.

Tested on x86_64-suse-linux, applied on the mainline and 7 branch.


2017-12-13  Eric Botcazou  

* gcc-interface/trans.c (process_freeze_entity): Be prepared for a
package without body.

-- 
Eric Botcazou
Index: gcc-interface/trans.c
===
--- gcc-interface/trans.c	(revision 255578)
+++ gcc-interface/trans.c	(working copy)
@@ -8718,12 +8718,12 @@ process_freeze_entity (Node_Id gnat_node
   const Entity_Kind kind = Ekind (gnat_entity);
   tree gnu_old, gnu_new;
 
-  /* If this is a package, we need to generate code for the package.  */
+  /* If this is a package, generate code for the package body, if any.  */
   if (kind == E_Package)
 {
-  insert_code_for
-	(Parent (Corresponding_Body
-		 (Parent (Declaration_Node (gnat_entity);
+  const Node_Id gnat_decl = Parent (Declaration_Node (gnat_entity));
+  if (Present (Corresponding_Body (gnat_decl)))
+	insert_code_for (Parent (Corresponding_Body (gnat_decl)));
   return;
 }
 


[Ada] Fix spurious warning on function imported from C in LTO mode

2017-12-13 Thread Eric Botcazou
We treat System.Address as equivalent to void* for functions imported from C 
and other languages, but the existing implementation was not very robust.

Tested on x86_64-suse-linux, applied on the mainline and 7 branch.


2017-12-13  Eric Botcazou  

* gcc-interface/decl.c (gnat_to_gnu_entity): Robustify test for types
descendant of System.Address.
(gnat_to_gnu_subprog_type): Likewise.

-- 
Eric Botcazou
Index: gcc-interface/decl.c
===
--- gcc-interface/decl.c	(revision 255578)
+++ gcc-interface/decl.c	(working copy)
@@ -659,7 +659,7 @@ gnat_to_gnu_entity (Entity_Id gnat_entit
 
 	/* Get the type after elaborating the renamed object.  */
 	if (Has_Foreign_Convention (gnat_entity)
-	&& Is_Descendant_Of_Address (gnat_type))
+	&& Is_Descendant_Of_Address (Underlying_Type (gnat_type)))
 	  gnu_type = ptr_type_node;
 	else
 	  {
@@ -5594,7 +5594,7 @@ gnat_to_gnu_subprog_type (Entity_Id gnat
   /* For foreign convention subprograms, return System.Address as void *
 	 or equivalent.  Note that this comprises GCC builtins.  */
   if (Has_Foreign_Convention (gnat_subprog)
-	  && Is_Descendant_Of_Address (gnat_return_type))
+	  && Is_Descendant_Of_Address (Underlying_Type (gnat_return_type)))
 	gnu_return_type = ptr_type_node;
   else
 	gnu_return_type = gnat_to_gnu_profile_type (gnat_return_type);
@@ -5761,7 +5761,7 @@ gnat_to_gnu_subprog_type (Entity_Id gnat
 	  /* For foreign convention subprograms, pass System.Address as void *
 	 or equivalent.  Note that this comprises GCC builtins.  */
 	  if (Has_Foreign_Convention (gnat_subprog)
-	  && Is_Descendant_Of_Address (gnat_param_type))
+	  && Is_Descendant_Of_Address (Underlying_Type (gnat_param_type)))
 	gnu_param_type = ptr_type_node;
 	  else
 	gnu_param_type = gnat_to_gnu_profile_type (gnat_param_type);


Re: [PATCH][GCC][ARM] Fix fragile arm fpu attribute tests.

2017-12-13 Thread Christophe Lyon
On 12 December 2017 at 18:29, Tamar Christina  wrote:
> Hi All,
>
> The previous test made use of arm_neon.h which made the whole test
> rather fragile and only applicable to some of the arm targets.
>
> So instead I make use of different fpus now to test the generation of
> fmla instructions. The actual instruction itself is not tested as all
> we care about is that the proper .fpu directives are generated.
>
> Regtested on arm-none-eabi and arm-none-linux-gnueabihf
> with no regressions.
>
> Ok for trunk?
>
>
> gcc/testsuite/
> 2017-12-12  Tamar Christina  
>
> PR target/82641
> * gcc.target/arm/pragma_fpu_attribute.c: New.
> * gcc.target/arm/pragma_fpu_attribute_2.c: New.
>
Sorry, it seems your patch does not apply against ToT, and
the ChangeLog looks incorrect (these are not new files)

Christophe


Re: [compare-debug] use call loc for nop_endbr

2017-12-13 Thread Jakub Jelinek
On Wed, Dec 13, 2017 at 05:34:22AM -0200, Alexandre Oliva wrote:
> We skip debug insns and notes after a call that needs a nop_endbr, but
> since a debug insn could be the last in a block, it may affect the loc
> in the emitted nop_endbr insn.  Although this has no effect on
> codegen, it does mess with debug info a bit, and it causes
> -fcompare-debug to fail for e.g. libsanitizer's
> tsan/tsan_platform_linux.cc on x86_64.
> 
> So, pick the location of the call insn for the nop_endbr insn, to
> avoid the line number differences in dumps, including -fcompare-debug
> ones.
> 
> Also, we don't need to determine what the insert point would be unless
> we're actually emitting the nop_endbr insn after the call, so
> rearrange the code to avoid wasting cycles.
> 
> Finally, it seems like testing for barriers is a mistake.  We probably
> never actually pass that test, for the barriers would hit BB_END
> first.  If we did, we'd end up emitting the nop_endbr outside any BB,
> even after the end of the function!  That would be Very Bad (TM).
> Now, since the test as it is can't hurt, I figured I wouldn't change
> the logic right now, just add a comment so that someone involved in
> endbr stuff can have a second look and hopefully fix it.
> 
> I'd appreciate if you'd try to drop the BARRIER_P from the loop test,
> Igor, so as to address the final ??? in the comment I add.  Narrowing
> the skipped notes to only the relevant post-call ones might make sense
> as well, but it's not quite as important IMHO.

I believe the only insn that needs to be skipped is
NOTE_P (insn) && NOTE_KIND (insn) == NOTE_INSN_CALL_ARG_LOCATION
and there should be at most one of these after the call.
Anything else I believe can be separated from the call without problems.

Jakub


Re: [SFN] Bootstrap broken

2017-12-13 Thread Jakub Jelinek
On Wed, Dec 13, 2017 at 05:29:28AM -0200, Alexandre Oliva wrote:
> Regstrapping; I suppose I could install it as obvious, but...  Ok to install?
> 
> 
> [SFN] don't eliminate regs in markers
> 
> Eliminate regs in debug bind insns, but not in markers.
> 
> for  gcc/ChangeLog
> 
>   PR bootstrap/83396
>   * reload1.c (eliminate_regs_in_insn): Skip debug markers.

Ok.

> --- a/gcc/reload1.c
> +++ b/gcc/reload1.c
> @@ -3202,7 +3202,7 @@ eliminate_regs_in_insn (rtx_insn *insn, int replace)
> || GET_CODE (PATTERN (insn)) == USE
> || GET_CODE (PATTERN (insn)) == CLOBBER
> || GET_CODE (PATTERN (insn)) == ASM_INPUT);
> -  if (DEBUG_INSN_P (insn))
> +  if (DEBUG_BIND_INSN_P (insn))
>   INSN_VAR_LOCATION_LOC (insn)
> = eliminate_regs (INSN_VAR_LOCATION_LOC (insn), VOIDmode, insn);
>return 0;

Jakub