date:20200525

[PATCH 6/7] ivopts: Add handlings for vector with length IFNs

2020-05-25 Thread Kewen.Lin via Gcc-patches

gcc/ChangeLog

2020-MM-DD  Kewen Lin  

* tree-ssa-loop-ivopts.c (get_mem_type_for_internal_fn): Handle
IFN_LEN_LOAD and IFN_LEN_STORE.
(get_alias_ptr_type_for_ptr_address): Likewise.


---
 gcc/tree-ssa-loop-ivopts.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/gcc/tree-ssa-loop-ivopts.c b/gcc/tree-ssa-loop-ivopts.c
index 1d2697ae1ba..45b31640e75 100644
--- a/gcc/tree-ssa-loop-ivopts.c
+++ b/gcc/tree-ssa-loop-ivopts.c
@@ -2436,12 +2436,14 @@ get_mem_type_for_internal_fn (gcall *call, tree *op_p)
 {
 case IFN_MASK_LOAD:
 case IFN_MASK_LOAD_LANES:
+case IFN_LEN_LOAD:
   if (op_p == gimple_call_arg_ptr (call, 0))
return TREE_TYPE (gimple_call_lhs (call));
   return NULL_TREE;
 
 case IFN_MASK_STORE:
 case IFN_MASK_STORE_LANES:
+case IFN_LEN_STORE:
   if (op_p == gimple_call_arg_ptr (call, 0))
return TREE_TYPE (gimple_call_arg (call, 3));
   return NULL_TREE;
@@ -7415,6 +7417,8 @@ get_alias_ptr_type_for_ptr_address (iv_use *use)
 case IFN_MASK_STORE:
 case IFN_MASK_LOAD_LANES:
 case IFN_MASK_STORE_LANES:
+case IFN_LEN_LOAD:
+case IFN_LEN_STORE:
   /* The second argument contains the correct alias type.  */
   gcc_assert (use->op_p = gimple_call_arg_ptr (call, 0));
   return TREE_TYPE (gimple_call_arg (call, 1));
--

[PATCH 5/7] vect: Support vector load/store with length in vectorizer

2020-05-25 Thread Kewen.Lin via Gcc-patches

gcc/ChangeLog

2020-MM-DD  Kewen Lin  

* doc/invoke.texi (vect-with-length-scope): Document new option.
* params.opt (vect-with-length-scope): New.
* tree-vect-loop-manip.c (vect_set_loop_lens_directly): New function.
(vect_set_loop_condition_len): Likewise.
(vect_set_loop_condition): Call vect_set_loop_condition_len for loop 
with
length.
(vect_gen_vector_loop_niters): Use VF as the step for loop with length.
(vect_do_peeling): Adjust for loop with length.
* tree-vect-loop.c (_loop_vec_info::_loop_vec_info): Initialize
can_with_length_p and fully_with_length_p.
(release_vec_loop_lens): New function.
(_loop_vec_info::~_loop_vec_info): Use it to free the loop lens.
(vect_verify_loop_lens): New function.
(vect_analyze_loop_costing): Adjust for loop fully with length.
(determine_peel_for_niter): Don't peel if loop fully with length.
(vect_analyze_loop_2): Save LOOP_VINFO_CAN_WITH_LENGTH_P around retries,
and free the length rgroups before retrying.  Check loop-wide reasons 
for
disabling loops with length.  Make the final decision about use vector
access with length or not.
(vect_analyze_loop): Add handlings for epilogue of loop that can use 
vector
with length but not.
(vect_estimate_min_profitable_iters): Adjust for loop with length.
(vectorizable_reduction): Disable loop with length.
(vectorizable_live_operation): Likewise.
(vect_record_loop_len): New function.
(vect_get_loop_len): Likewise.
(vect_transform_loop): Flag final loop iteration could be partial vector
for loop with length.
* tree-vect-stmts.c (check_load_store_with_len): New function.
(vectorizable_store): Handle vector loop with length.
(vectorizable_load): Likewise.
(vect_gen_len): New function.
* tree-vectorizer.h (struct rgroup_lens): New structure.
(vec_loop_lens): New typedef.
(_loop_vec_info): Add lens, can_with_length_p and fully_with_length_p.
(LOOP_VINFO_CAN_WITH_LENGTH_P): New macro.
(LOOP_VINFO_FULLY_WITH_LENGTH_P): Likewise.
(LOOP_VINFO_LENS): Likewise.
(vect_record_loop_len): New declare.
(vect_get_loop_len): Likewise.
(vect_gen_len): Likewise.


---
 gcc/doc/invoke.texi|   7 +
 gcc/params.opt |   4 +
 gcc/tree-vect-loop-manip.c | 268 -
 gcc/tree-vect-loop.c   | 241 -
 gcc/tree-vect-stmts.c  | 152 +
 gcc/tree-vectorizer.h  |  32 +
 6 files changed, 697 insertions(+), 7 deletions(-)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 8b9935dfe65..ac765feab13 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -13079,6 +13079,13 @@ by the copy loop headers pass.
 @item vect-epilogues-nomask
 Enable loop epilogue vectorization using smaller vector size.
 
+@item vect-with-length-scope
+Control the scope of vector memory access with length exploitation.  0 means we
+don't expliot any vector memory access with length, 1 means we only exploit
+vector memory access with length for those loops whose iteration number are
+less than VF, such as very small loop or epilogue, 2 means we want to exploit
+vector memory access with length for any loops if possible.
+
 @item slp-max-insns-in-bb
 Maximum number of instructions in basic block to be
 considered for SLP vectorization.
diff --git a/gcc/params.opt b/gcc/params.opt
index 4aec480798b..d4309101067 100644
--- a/gcc/params.opt
+++ b/gcc/params.opt
@@ -964,4 +964,8 @@ Bound on number of runtime checks inserted by the 
vectorizer's loop versioning f
 Common Joined UInteger Var(param_vect_max_version_for_alignment_checks) 
Init(6) Param Optimization
 Bound on number of runtime checks inserted by the vectorizer's loop versioning 
for alignment check.
 
+-param=vect-with-length-scope=
+Common Joined UInteger Var(param_vect_with_length_scope) Init(0) 
IntegerRange(0, 2) Param Optimization
+Control the vector with length exploitation scope.
+
 ; This comment is to ensure we retain the blank line above.
diff --git a/gcc/tree-vect-loop-manip.c b/gcc/tree-vect-loop-manip.c
index 8c5e696b995..3d5dec6f65c 100644
--- a/gcc/tree-vect-loop-manip.c
+++ b/gcc/tree-vect-loop-manip.c
@@ -747,6 +747,263 @@ vect_set_loop_condition_masked (class loop *loop, 
loop_vec_info loop_vinfo,
   return cond_stmt;
 }
 
+/* Helper for vect_set_loop_condition_len.  Like vect_set_loop_masks_directly,
+   generate definitions for all the lengths in RGL and return a length that is
+   nonzero when the loop needs to iterate.  Add any new preheader statements to
+   PREHEADER_SEQ.  Use LOOP_COND_GSI to insert code before the exit gcond.
+
+   RGL belongs to loop LOOP.  The loop originally iterated NITERS
+   times and has been vectorized according to LOOP_VINFO.

[PATCH 4/7] hook/rs6000: Add vectorize length mode for vector with length

2020-05-25 Thread Kewen.Lin via Gcc-patches

gcc/ChangeLog:

2020-MM-DD  Kewen Lin  

* config/rs6000/rs6000.c (TARGET_VECTORIZE_LENGTH_MODE): New macro.
* doc/tm.texi: Regenerate.
* doc/tm.texi.in: New hook.
* target.def: Likewise.


---
 gcc/config/rs6000/rs6000.c | 3 +++
 gcc/doc/tm.texi| 6 ++
 gcc/doc/tm.texi.in | 2 ++
 gcc/target.def | 7 +++
 4 files changed, 18 insertions(+)

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 8435bc15d72..c4d9d558b2f 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -1659,6 +1659,9 @@ static const struct attribute_spec 
rs6000_attribute_table[] =
 #undef TARGET_HAVE_COUNT_REG_DECR_P
 #define TARGET_HAVE_COUNT_REG_DECR_P true
 
+#undef TARGET_VECTORIZE_LENGTH_MODE
+#define TARGET_VECTORIZE_LENGTH_MODE DImode
+
 /* 10 is infinite cost in IVOPTs.  */
 #undef TARGET_DOLOOP_COST_FOR_GENERIC
 #define TARGET_DOLOOP_COST_FOR_GENERIC 10
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 6e7d9dc54a9..5ea8734a191 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -6084,6 +6084,12 @@ The default implementation returns a 
@code{MODE_VECTOR_INT} with the
 same size and number of elements as @var{mode}, if such a mode exists.
 @end deftypefn
 
+@deftypevr {Target Hook} scalar_int_mode TARGET_VECTORIZE_LENGTH_MODE
+For the targets which support vector memory access with length, return
+the scalar int mode to use for the length in bytes.
+The default is to use @code{word_mode}.
+@end deftypevr
+
 @deftypefn {Target Hook} bool TARGET_VECTORIZE_EMPTY_MASK_IS_EXPENSIVE 
(unsigned @var{ifn})
 This hook returns true if masked internal function @var{ifn} (really of
 type @code{internal_fn}) should be considered expensive when the mask is
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 3be984bbd5c..83034176b56 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -4181,6 +4181,8 @@ address;  but often a machine-dependent strategy can 
generate better code.
 
 @hook TARGET_VECTORIZE_GET_MASK_MODE
 
+@hook TARGET_VECTORIZE_LENGTH_MODE
+
 @hook TARGET_VECTORIZE_EMPTY_MASK_IS_EXPENSIVE
 
 @hook TARGET_VECTORIZE_INIT_COST
diff --git a/gcc/target.def b/gcc/target.def
index 07059a87caf..b58d87e1496 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -1969,6 +1969,13 @@ same size and number of elements as @var{mode}, if such 
a mode exists.",
  (machine_mode mode),
  default_get_mask_mode)
 
+DEFHOOKPOD
+(length_mode,
+ "For the targets which support vector memory access with length, return\n\
+the scalar int mode to use for the length in bytes.\n\
+The default is to use @code{word_mode}.",
+ scalar_int_mode, word_mode)
+
 /* Function to say whether a masked operation is expensive when the
mask is all zeros.  */
 DEFHOOK
--

[PATCH 3/7] vect: Factor out codes for niters smaller than vf check

2020-05-25 Thread Kewen.Lin via Gcc-patches

gcc/ChangeLog:

2020-MM-DD  Kewen Lin  

* tree-vect-loop.c (known_niters_smaller_than_vf): New function, 
factored out from ...
(vect_analyze_loop_costing): ... here.
---
 gcc/tree-vect-loop.c | 31 ++-
 1 file changed, 22 insertions(+), 9 deletions(-)

diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index 4f94b4baad9..80e33b61be7 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -155,6 +155,7 @@ along with GCC; see the file COPYING3.  If not see
 static void vect_estimate_min_profitable_iters (loop_vec_info, int *, int *);
 static stmt_vec_info vect_is_simple_reduction (loop_vec_info, stmt_vec_info,
   bool *, bool *);
+static bool known_niters_smaller_than_vf (loop_vec_info);
 
 /* Subroutine of vect_determine_vf_for_stmt that handles only one
statement.  VECTYPE_MAYBE_SET_P is true if STMT_VINFO_VECTYPE
@@ -1631,15 +1632,7 @@ vect_analyze_loop_costing (loop_vec_info loop_vinfo)
  vectorization factor.  */
   if (!LOOP_VINFO_FULLY_MASKED_P (loop_vinfo))
 {
-  HOST_WIDE_INT max_niter;
-
-  if (LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo))
-   max_niter = LOOP_VINFO_INT_NITERS (loop_vinfo);
-  else
-   max_niter = max_stmt_executions_int (loop);
-
-  if (max_niter != -1
- && (unsigned HOST_WIDE_INT) max_niter < assumed_vf)
+  if (known_niters_smaller_than_vf (loop_vinfo))
{
  if (dump_enabled_p ())
dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
@@ -9231,3 +9224,23 @@ vect_iv_limit_for_full_masking (loop_vec_info loop_vinfo)
   return iv_limit;
 }
 
+/* If we know the iteration count is smaller than vectorization factor, return
+   true, otherwise return false.  */
+
+static bool
+known_niters_smaller_than_vf (loop_vec_info loop_vinfo)
+{
+  unsigned int assumed_vf = vect_vf_for_cost (loop_vinfo);
+
+  HOST_WIDE_INT max_niter;
+  if (LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo))
+max_niter = LOOP_VINFO_INT_NITERS (loop_vinfo);
+  else
+max_niter = max_stmt_executions_int (LOOP_VINFO_LOOP (loop_vinfo));
+
+  if (max_niter != -1 && (unsigned HOST_WIDE_INT) max_niter < assumed_vf)
+return true;
+
+  return false;
+}
+
--

[PATCH 2/7] rs6000: lenload/lenstore optab support

2020-05-25 Thread Kewen.Lin via Gcc-patches

gcc/ChangeLog:

2020-MM-DD  Kewen Lin  

* config/rs6000/vsx.md (lenloaddi): New define_expand.
(lenstoredi): Likewise.


---
 gcc/config/rs6000/vsx.md | 30 ++
 1 file changed, 30 insertions(+)

diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index 2a28215ac5b..cc098d3ccb5 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -5082,6 +5082,36 @@
   operands[3] = gen_reg_rtx (DImode);
 })
 
+;; Define optab for vector access with length vectorization exploitation.
+(define_expand "lenloaddi"
+  [(match_operand:VEC_A 0 "vlogical_operand")
+   (match_operand:VEC_A 1 "memory_operand")
+   (match_operand:DI 2 "int_reg_operand")]
+  "TARGET_P9_VECTOR && TARGET_64BIT"
+{
+  rtx mem = XEXP (operands[1], 0);
+  mem = force_reg (DImode, mem);
+  rtx res = gen_reg_rtx (V16QImode);
+  emit_insn (gen_lxvl (res, mem, operands[2]));
+  emit_move_insn (operands[0], gen_lowpart (mode, res));
+  DONE;
+})
+
+(define_expand "lenstoredi"
+  [(match_operand:VEC_A 0 "memory_operand")
+   (match_operand:VEC_A 1 "vlogical_operand")
+   (match_operand:DI 2 "int_reg_operand")
+  ]
+  "TARGET_P9_VECTOR && TARGET_64BIT"
+{
+  rtx val = gen_reg_rtx (V16QImode);
+  emit_move_insn (val, gen_lowpart (V16QImode, operands[1]));
+  rtx mem = XEXP (operands[0], 0);
+  mem = force_reg (DImode, mem);
+  emit_insn (gen_stxvl (val, mem, operands[2]));
+  DONE;
+})
+
 (define_insn "*stxvl"
   [(set (mem:V16QI (match_operand:DI 1 "gpc_reg_operand" "b"))
(unspec:V16QI
--

[PATCH 1/7] ifn/optabs: Support vector load/store with length

2020-05-25 Thread Kewen.Lin via Gcc-patches

gcc/ChangeLog:

2020-MM-DD  Kewen Lin  

* doc/md.texi (lenload@var{m}@var{n}): Document.
(lenstore@var{m}@var{n}): Likewise.
* internal-fn.c (len_load_direct): New macro.
(len_store_direct): Likewise.
(expand_len_load_optab_fn): Likewise.
(expand_len_store_optab_fn): Likewise.
(direct_len_load_optab_supported_p): Likewise.
(direct_len_store_optab_supported_p): Likewise.
(internal_load_fn_p): Handle IFN_LEN_LOAD.
(internal_store_fn_p): Handle IFN_LEN_STORE.
(internal_fn_stored_value_index): Handle IFN_LEN_STORE.
* internal-fn.def (LEN_LOAD): New internal function.
(LEN_STORE): Likewise.
* optabs.def (lenload_optab, lenstore_optab): New optab.


---
 gcc/doc/md.texi | 16 
 gcc/internal-fn.c   | 13 +++--
 gcc/internal-fn.def |  6 ++
 gcc/optabs.def  |  2 ++
 4 files changed, 35 insertions(+), 2 deletions(-)

diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 2c67c818da5..b0c19cd3b81 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -5167,6 +5167,22 @@ mode @var{n}.
 
 This pattern is not allowed to @code{FAIL}.
 
+@cindex @code{lenload@var{m}@var{n}} instruction pattern
+@item @samp{lenload@var{m}@var{n}}
+Perform a vector load with length from memory operand 1 of mode @var{m}
+into register operand 0.  Length is provided in register operand 2 of
+mode @var{n}.
+
+This pattern is not allowed to @code{FAIL}.
+
+@cindex @code{lenstore@var{m}@var{n}} instruction pattern
+@item @samp{lenstore@var{m}@var{n}}
+Perform a vector store with length from register operand 1 of mode @var{m}
+into memory operand 0.  Length is provided in register operand 2 of
+mode @var{n}.
+
+This pattern is not allowed to @code{FAIL}.
+
 @cindex @code{vec_perm@var{m}} instruction pattern
 @item @samp{vec_perm@var{m}}
 Output a (variable) vector permutation.  Operand 0 is the destination
diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c
index 5e9aa60721e..be64cd86c07 100644
--- a/gcc/internal-fn.c
+++ b/gcc/internal-fn.c
@@ -104,10 +104,12 @@ init_internal_fns ()
 #define load_lanes_direct { -1, -1, false }
 #define mask_load_lanes_direct { -1, -1, false }
 #define gather_load_direct { 3, 1, false }
+#define len_load_direct { -1, 2, false }
 #define mask_store_direct { 3, 2, false }
 #define store_lanes_direct { 0, 0, false }
 #define mask_store_lanes_direct { 0, 0, false }
 #define scatter_store_direct { 3, 1, false }
+#define len_store_direct { 3, 2, false }
 #define unary_direct { 0, 0, true }
 #define binary_direct { 0, 0, true }
 #define ternary_direct { 0, 0, true }
@@ -2478,7 +2480,7 @@ expand_call_mem_ref (tree type, gcall *stmt, int index)
   return fold_build2 (MEM_REF, type, addr, build_int_cst (alias_ptr_type, 0));
 }
 
-/* Expand MASK_LOAD{,_LANES} call STMT using optab OPTAB.  */
+/* Expand MASK_LOAD{,_LANES} and LEN_LOAD call STMT using optab OPTAB.  */
 
 static void
 expand_mask_load_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
@@ -2514,8 +2516,9 @@ expand_mask_load_optab_fn (internal_fn, gcall *stmt, 
convert_optab optab)
 }
 
 #define expand_mask_load_lanes_optab_fn expand_mask_load_optab_fn
+#define expand_len_load_optab_fn expand_mask_load_optab_fn
 
-/* Expand MASK_STORE{,_LANES} call STMT using optab OPTAB.  */
+/* Expand MASK_STORE{,_LANES} and LEN_STORE call STMT using optab OPTAB.  */
 
 static void
 expand_mask_store_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
@@ -2547,6 +2550,7 @@ expand_mask_store_optab_fn (internal_fn, gcall *stmt, 
convert_optab optab)
 }
 
 #define expand_mask_store_lanes_optab_fn expand_mask_store_optab_fn
+#define expand_len_store_optab_fn expand_mask_store_optab_fn
 
 static void
 expand_ABNORMAL_DISPATCHER (internal_fn, gcall *)
@@ -3128,10 +3132,12 @@ multi_vector_optab_supported_p (convert_optab optab, 
tree_pair types,
 #define direct_load_lanes_optab_supported_p multi_vector_optab_supported_p
 #define direct_mask_load_lanes_optab_supported_p multi_vector_optab_supported_p
 #define direct_gather_load_optab_supported_p convert_optab_supported_p
+#define direct_len_load_optab_supported_p direct_optab_supported_p
 #define direct_mask_store_optab_supported_p direct_optab_supported_p
 #define direct_store_lanes_optab_supported_p multi_vector_optab_supported_p
 #define direct_mask_store_lanes_optab_supported_p 
multi_vector_optab_supported_p
 #define direct_scatter_store_optab_supported_p convert_optab_supported_p
+#define direct_len_store_optab_supported_p direct_optab_supported_p
 #define direct_while_optab_supported_p convert_optab_supported_p
 #define direct_fold_extract_optab_supported_p direct_optab_supported_p
 #define direct_fold_left_optab_supported_p direct_optab_supported_p
@@ -3498,6 +3504,7 @@ internal_load_fn_p (internal_fn fn)
 case IFN_MASK_LOAD_LANES:
 case IFN_GATHER_LOAD:
 case IFN_MASK_GATHER_LOAD:
+case IFN_LEN_LOAD:
   return true;
 
 default:
@@ -3517,6 +3524,7 @@

[PATCH 0/7] Support vector load/store with length

2020-05-25 Thread Kewen.Lin via Gcc-patches

Hi all,

This patch set adds support for vector load/store with length, Power 
ISA 3.0 brings instructions lxvl/stxvl to perform vector load/store with
length, it's good to be exploited for those cases we don't have enough
stuffs to fill in the whole vector like epilogues.

This support mainly refers to the handlings for fully-predicated loop
but it also covers the epilogue usage.  Now it supports two modes
controlled by parameter vect-with-length-scope, it can support any
loops fully with length or just for those cases with small iteration
counts less than VF like epilogue, for now I don't have ready env to
benchmark it, but based on the current inefficient length generation,
I don't think it's a good idea to adopt vector with length for any loops.
For the main loop which used to be vectorized, it increases register
pressure and introduces extra computation for length, the pro for icache
seems not comparable.  But I think it might be a good idea to keep this
parameter there for functionality testing, further benchmarking and other
ports' potential future supports.

As we don't have any benchmarking, this support isn't enabled by default
for any particular cpus, all testings are with explicit parameter setting.

Bootstrapped on powerpc64le-linux-gnu P9 with all vect-with-length-scope
settings (0/1/2).  Regress-test passed with vector-with-length-scope 0,
for the other twos, several vector related cases need to be updated, no
remarkable failures found.  BTW, P9 is the one which supports the
functionality but not ready to evaluate the performance.

Here still are many things to be supported or improved, not limited to:
  - reduction/live-out support
  - Cost model adding/tweaking
  - IFN gimple folding
  - Some unnecessary ops improvements eg: vector_size check
  - Some possible refactoring
I'll support/post the patches gradually.

Any comments are highly appreciated.

BR,
Kewen
-

Patch set outline:
  [PATCH 1/7] ifn/optabs: Support vector load/store with length
  [PATCH 2/7] rs6000: lenload/lenstore optab support
  [PATCH 3/7] vect: Factor out codes for niters smaller than vf check
  [PATCH 4/7] hook/rs6000: Add vectorize length mode for vector with length
  [PATCH 5/7] vect: Support vector load/store with length in vectorizer
  [PATCH 6/7] ivopts: Add handlings for vector with length IFNs
  [PATCH 7/7] rs6000/testsuite: Vector with length test cases

 gcc/config/rs6000/rs6000.c  |   3 +
 gcc/config/rs6000/vsx.md|  30 ++
 gcc/doc/invoke.texi |   7 +++
 gcc/doc/md.texi |  16 ++
 gcc/doc/tm.texi |   6 ++
 gcc/doc/tm.texi.in  |   2 +
 gcc/internal-fn.c   |  13 -
 gcc/internal-fn.def |   6 ++
 gcc/optabs.def  |   2 +
 gcc/params.opt  |   4 ++
 gcc/target.def  |   7 +++
 gcc/testsuite/gcc.target/powerpc/p9-vec-length-1.h  |  18 ++
 gcc/testsuite/gcc.target/powerpc/p9-vec-length-2.h  |  17 ++
 gcc/testsuite/gcc.target/powerpc/p9-vec-length-3.h  |  31 +++
 gcc/testsuite/gcc.target/powerpc/p9-vec-length-4.h  |  24 
 gcc/testsuite/gcc.target/powerpc/p9-vec-length-5.h  |  29 ++
 gcc/testsuite/gcc.target/powerpc/p9-vec-length-6.h  |  32 +++
 gcc/testsuite/gcc.target/powerpc/p9-vec-length-epil-1.c |  15 +
 gcc/testsuite/gcc.target/powerpc/p9-vec-length-epil-2.c |  15 +
 gcc/testsuite/gcc.target/powerpc/p9-vec-length-epil-3.c |  18 ++
 gcc/testsuite/gcc.target/powerpc/p9-vec-length-epil-4.c |  15 +
 gcc/testsuite/gcc.target/powerpc/p9-vec-length-epil-5.c |  15 +
 gcc/testsuite/gcc.target/powerpc/p9-vec-length-epil-6.c |  16 ++
 gcc/testsuite/gcc.target/powerpc/p9-vec-length-epil-run-1.c |  10 
 gcc/testsuite/gcc.target/powerpc/p9-vec-length-epil-run-2.c |  10 
 gcc/testsuite/gcc.target/powerpc/p9-vec-length-epil-run-3.c |  10 
 gcc/testsuite/gcc.target/powerpc/p9-vec-length-epil-run-4.c |  10 
 gcc/testsuite/gcc.target/powerpc/p9-vec-length-epil-run-5.c |  10 
 gcc/testsuite/gcc.target/powerpc/p9-vec-length-epil-run-6.c |  10 
 gcc/testsuite/gcc.target/powerpc/p9-vec-length-full-1.c |  16 ++
 gcc/testsuite/gcc.target/powerpc/p9-vec-length-full-2.c |  16 ++
 gcc/testsuite/gcc.target/powerpc/p9-vec-length-full-3.c |  17 ++
 gcc/testsuite/gcc.target/powerpc/p9-vec-length-full-4.c |  16 ++
 gcc/testsuite/gcc.target/powerpc/p9-vec-length-full-5.c |  16 ++
 gcc/testsuite/gcc.target/powerpc/p9-vec-length-full-6.c |  16 ++

Re: ChangeLog files - server and client scripts

2020-05-25 Thread Alexandre Oliva

On May 25, 2020, Martin Liška  wrote:

> On 5/21/20 5:14 PM, Rainer Orth wrote:
>> * In changelog_location, you allow only (among others) "a/b/c/" and
>> "\ta/b/c/".  Please also accept the "a/b/c:" and "\ta/b/c:" forms
>> here: especially the second seems quite common.

> Ok, I believe these formats are supported as well. Feel free to mention
> some git revisions that are not recognized.

I've long used the following syntax to start ChangeLog entries:

for  /ChangeLog

It was introduced over 20 years ago, with the (so far never formally
released) GNU CVS-Utilities.  Among other goodies, there were scripts to
turn diffs for ChangeLog files into the above format, and vice-versa,
that I've used to this day.  It went through cvs, svn and git.  It would
be quite nice if I could keep on using it with GCC.

The patch below seems to be enough to pass gcc-verify, and to recognize
and print the expected ChangeLog files.  I suppose I'll have to adjust
the formatting to be able to push it, but, aside from that, is it ok to
install?

Do any hooks need to be adjusted to match?

I'm also a little concerned about '*/ChangeLog.*' files.  Are we no
longer supposed to introduce them, or new ChangeLog entries to them?  Or
should the scripts be extended to cover them?

for  contrib/ChangeLog

* gcc-changelog/git_commit.py (changelog_regex): Accept optional
'for' prefix.

diff --git a/contrib/gcc-changelog/git_commit.py 
b/contrib/gcc-changelog/git_commit.py
index 2cfdbc8..b8362c1 100755
--- a/contrib/gcc-changelog/git_commit.py
+++ b/contrib/gcc-changelog/git_commit.py
@@ -144,7 +144,7 @@ misc_files = [
 author_line_regex = \
 re.compile(r'^(?P\d{4}-\d{2}-\d{2})\ {2}(?P.*  <.*>)')
 additional_author_regex = re.compile(r'^\t(?P\ *)?(?P.*  <.*>)')
-changelog_regex = re.compile(r'^([a-z0-9+-/]*)/ChangeLog:?')
+changelog_regex = re.compile(r'^(?:[fF]or +)([a-z0-9+-/]*)/ChangeLog:?')
 pr_regex = re.compile(r'\tPR (?P[a-z+-]+\/)?([0-9]+)$')
 dr_regex = re.compile(r'\tDR ([0-9]+)$')
 star_prefix_regex = re.compile(r'\t\*(?P\ *)(?P.*)')

-- 
Alexandre Oliva, freedom fighterhe/himhttps://FSFLA.org/blogs/lxo/
Free Software Evangelist  Stallman was right, but he's left :(
GNU Toolchain Engineer   Live long and free, and prosper ethically

Re: [IMPORTANT] ChangeLog related changes

2020-05-25 Thread Hongtao Liu via Gcc-patches

On Tue, May 26, 2020 at 6:49 AM Jakub Jelinek via Gcc-patches
 wrote:
>
> Hi!
>
> I've turned the strict mode of Martin Liška's hook changes,
> which means that from now on no commits to the trunk or release branches
> should be changing any ChangeLog files together with the other files,
> ChangeLog entry should be solely in the commit message.
> The DATESTAMP bumping script will be updating the ChangeLog files for you.
Oh, no wonder my patch was rejected by git hook with error message
---
ChangeLog files, DATESTAMP, BASE-VER and DEV-PHASE can be modified
only separately from other files
---
> If somebody makes a mistake in that, please wait 24 hours (at least until
i commit a separate patch alone only for ChangeLog files, should i revert it?
> after 00:16 UTC after your commit) so that the script will create the
> ChangeLog entries, and afterwards it can be fixed by adjusting the ChangeLog
> files.  But you can only touch the ChangeLog files in that case (and
> shouldn't write a ChangeLog entry for that in the commit message).
>
> If anything goes wrong, please let me, other RMs and Martin Liška know.
>
> Jakub
>

-- 
BR,
Hongtao

Re: [PATCH 1/2] Seperate -funroll-loops for GIMPLE unroller and RTL unroller

2020-05-25 Thread Jiufu Guo via Gcc-patches

David Edelsohn  writes:

> On Mon, May 25, 2020 at 1:58 PM Richard Biener
>  wrote:
>>
>> On May 25, 2020 7:40:00 PM GMT+02:00, Segher Boessenkool 
>>  wrote:
>> >On Mon, May 25, 2020 at 02:14:02PM +0200, Richard Biener wrote:
>> >> On Mon, May 25, 2020 at 1:10 PM guojiufu 
>> >wrote:
>> >> Since a new flag is not needed to fix the regression please avoid
>> >> adding -fcomplete-unroll-loops.
>> >>
>> >> For -frtl-unroll-loops you should be able to use
>> >
>> >Erm.  That *is* a new command-line option (the internal flags I do not
>> >care about so much: new implementation details are *good*).  And a new
>> >name that is a mistake in my opinion, for many reasons (users do not
>> >know and should not have to care about "rtl"; the name is not
>> >descriptive; it is useless churn, it is not the same name as we have
>> >had for decades now; it is adding a new option for a future where we
>> >will do most unrolling at gimple level, a future we do not know will
>> >ever exist, and we do not know what that will look like anyway; it is
>> >an extra level of indirection (in the name)).
>> >
>> >We should not have an -frtl-unroll-loops if we do not have a
>> >-ftree-unroll-loops (or whatever).
>> >
>> >Unrolling early is not a good idea in general (the problems with the
>> >very trivial complete unrolling case just underline that).  But we
>> >*should* know which code we expect to unroll later, for better costing.
>> >Adding names like "rtl-unroll-loops" only stands in the way of getting
>> >a better design here.
>>
>> You folks made ppc specific hacks instead of a better design. Those
>> now stand in the way as well. But sure, simply do not expose the
>> flag to the users, use
>> Var(flag_rtl_unroll_loops). My other points still stand.
>>
>> Feel free to ignore the regression part on the branch and come up
>> with a great design. But don't expect to backport that then.
>
> I completely agree.

Thanks a lot for all your comments, suggestions, and tips in the
discussion.  Thank Richar, Segher, David, Hanza, and all!

I may have an explanation about the intention of this work.

We know that loop unrolling is a complex and tickly thing.  It could
help some kinds of code in a great manner.  Sometimes there are side
effects.  For different types of loop and different platforms, it may
result in different effects.
It would makes sense to tune the loop unrolling accordingly.  And so, to
help and tune loop unrolling is what we want to do.

Currently, we have loop unroller at GIMPLE part (cunroll/cunrolli) and
RTL part.  There are some options (like -funroll-loops) and --params to
control unrollers.

Through target hook, it would be helpful for different platforms to tune
unroller: checking the type of loops, check optimization level.
Existing hooks may help with something, like turn --params.

Adding separate flags(or options) may be helpful to control different
behaviors independently.  This is one reason for the patch which
introduces internal undocumented options.

One previous patch, r10-4525, is tunning for ppc at -O2. Which
implements an existing hook for rs6000 to check simple loops for RTL
unroller. For cunroll, it just enables it even increasing size at -O2
directly, without check the type of the loops.  And then the
side/negative effects of cunroll are also visible at -O2 besides
positive effects.  In PR95018, the side effect is shown on complex loop
(early exit, and more peeling).
One idea is for cunroll to tune it to avoid side effects. And if the
heuristic is suitable, it would be helpful for other usage, like -O3 and
-funroll-loops.

Thanks for any comments!

Jiufu

>
> This path is digging a deeper and deeper hole.
>
> - David

Re: collect2.exe errors not pruned

2020-05-25 Thread Alexandre Oliva

On May 19, 2020, Joseph Myers  wrote:

> On Tue, 19 May 2020, Alexandre Oliva wrote:
>> > I don't think the error should mention .exe, but I also don't think the 
>> > error should mention collect2 (see what I said in 
>> > , the existence 
>> > of collect2 is an implementation detail).
>> 
>> I'm not changing collect2, at least not in this patch, but I wouldn't
>> mind making the match for the internal executable name optional.  WDYT?
>> Untested patch follows.

> Allowing a missing executable name is reasonable enough, but I was 
> actually thinking that the messages should print "gcc" or whatever command 
> the user ran in place of "collect2".

Uhh, that would require matching pretty much anything, considering all
the potential per-language driver names, for pre-install testing, and
the arbitrary transforms, most commonly s,^,$target-, for post-install
testing.

Should we make the regexps '\[^\n\]*', as in so many other pruned
messages?

-- 
Alexandre Oliva, freedom fighterhe/himhttps://FSFLA.org/blogs/lxo/
Free Software Evangelist  Stallman was right, but he's left :(
GNU Toolchain Engineer   Live long and free, and prosper ethically

RE: [PATCH PR94026] combine missed opportunity to simplify comparisons with zero

2020-05-25 Thread Yangfei (Felix)

Hi,

> -Original Message-
> From: Segher Boessenkool [mailto:seg...@kernel.crashing.org]
> Sent: Tuesday, May 26, 2020 12:27 AM
> To: Yangfei (Felix) 
> Cc: gcc-patches@gcc.gnu.org; Zhanghaijian (A) 
> Subject: Re: [PATCH PR94026] combine missed opportunity to simplify
> comparisons with zero

Snip...

> > I am using Outlook and I didn't find the place to change the MIME type
> > : - (
> 
> The simplest option is to use a different email client, one that plays nicely
> with others.  You use git, maybe you could even use git-send-email?

The bad news is that it would be hard to switch to a different email client 
with my company's IT policy  :-( 
But I think I can ask IT if that is possible. Sorry for the trouble.

> I'll paste things manually...
> 
> > From a19238c02c1e6ab9593a14a13e1e3dff90ed Mon Sep 17 00:00:00
> 2001
> > From: Fei Yang 
> > Date: Mon, 25 May 2020 10:19:30 +0800
> > Subject: [PATCH] combine: missed opportunity to simplify comparisons
> > with zero  [PR94026]
> 
> (Capital "M" on "Missed" please)
> 
> But, the subject should say what the patch *does*.  So maybe
>   combine: Simplify more comparisons with zero (PR94026)

OK. 

> > If we have (and (lshiftrt X C) M) and M is a constant that would
> > select a field of bits within an item, but not the entire word, fold
> > this into a simple AND if we are in an equality comparison against zero.
> 
> But that subject doesn't really describe what the patch does, anyway?

OK.  Modified in the v4 patch.  Does it look better?

> > gcc/
> > PR rtl-optimization/94026
> > * combine.c (make_compound_operation_int): If we have (and
> > (lshiftrt X C) M) and M is a constant that would select a field
> > of bits within an item, but not the entire word, fold this into
> > a simple AND if we are in an equality comparison.
> >
> > gcc/testsuite/
> > PR rtl-optimization/94026
> > * gcc.dg/pr94026.c: New test.
> 
> > --- a/gcc/ChangeLog
> > +++ b/gcc/ChangeLog
> > @@ -1,3 +1,11 @@
> > +2020-05-25  Felix Yang  
> > +
> > +   PR rtl-optimization/94026
> > +   * combine.c (make_compound_operation_int): If we have (and
> > +   (lshiftrt X C) M) and M is a constant that would select a field
> > +   of bits within an item, but not the entire word, fold this into
> > +   a simple AND if we are in an equality comparison.
> 
> Don't put the changelog in the patch.

OK.  I paste it here:

gcc/ChangeLog

+2020-05-26  Felix Yang  
+
+   PR rtl-optimization/94026
+   * combine.c (make_compound_operation_int): If we have (and
+   (lshiftrt X C) M) and M is a constant that would select a field
+   of bits within an item, but not the entire word, fold this into
+   a simple AND if we are in an equality comparison.

gcc/testsuite/ChangeLog

+2020-05-26  Felix Yang  
+
+   PR rtl-optimization/94026
+   * gcc.dg/pr94026.c: New test.

> > diff --git a/gcc/combine.c b/gcc/combine.c index
> > b044f29fd36..76d62b0bd17 100644
> > --- a/gcc/combine.c
> > +++ b/gcc/combine.c
> > @@ -8178,6 +8178,10 @@ make_compound_operation_int
> (scalar_int_mode mode, rtx *x_ptr,
> >if (!CONST_INT_P (XEXP (x, 1)))
> > break;
> >
> > +  HOST_WIDE_INT pos;
> > +  unsigned HOST_WIDE_INT len;
> > +  pos = get_pos_from_mask (UINTVAL (XEXP (x, 1)), );
> 
>   unsigned HOST_WIDE_INT len;
>   HOST_WIDE_INT pos = get_pos_from_mask (UINTVAL (XEXP (x, 1)), );
> 
> > @@ -8231,6 +8235,22 @@ make_compound_operation_int
> (scalar_int_mode mode, rtx *x_ptr,
> >   new_rtx = make_compound_operation (new_rtx, in_code);
> > }
> >
> > +  /* If we have (and (lshiftrt X C) M) and M is a constant that would
> select
> > +a field of bits within an item, but not the entire word, this might be
> > +representable by a simple AND if we are in an equality comparison.
> */
> > +  else if (pos > 0 && equality_comparison
> 
> That "&& equality_comparison" should be on a separate line as well.

OK.

> > +  && GET_CODE (XEXP (x, 0)) == LSHIFTRT
> > +  && CONST_INT_P (XEXP (XEXP (x, 0), 1))
> > +  && pos + UINTVAL (XEXP (XEXP (x, 0), 1))
> > + <= GET_MODE_BITSIZE (mode))
> > +   {
> > + new_rtx = make_compound_operation (XEXP (XEXP (x, 0), 0),
> next_code);
> > + HOST_WIDE_INT real_pos = pos + UINTVAL (XEXP (XEXP (x, 0), 1));
> > + unsigned HOST_WIDE_INT mask = ((unsigned HOST_WIDE_INT)1 <<
> len) -
> > +1;
> 
> Space after cast.

OK.

> > + new_rtx = gen_rtx_AND (mode, new_rtx,
> > +gen_int_mode (mask << real_pos, mode));
> > +   }
> 
> So this changes
>   ((X >> C) & M) == ...
> to
>   (X & (M << C)) == ...
> ?
> 
> Where then does it check what ... is?  This is only valid like this if that 
> is zero.
> 
> Why should this go in combine and not in simplify-rtx instead?

True.  This is only valid when ... is zero.
That's why we need the "&& equality_comparison " condition here.

> > --- /dev/null
> > +++

[IMPORTANT] ChangeLog related changes

2020-05-25 Thread Jakub Jelinek via Gcc-patches

Hi!

I've turned the strict mode of Martin Liška's hook changes,
which means that from now on no commits to the trunk or release branches
should be changing any ChangeLog files together with the other files,
ChangeLog entry should be solely in the commit message.
The DATESTAMP bumping script will be updating the ChangeLog files for you.
If somebody makes a mistake in that, please wait 24 hours (at least until
after 00:16 UTC after your commit) so that the script will create the
ChangeLog entries, and afterwards it can be fixed by adjusting the ChangeLog
files.  But you can only touch the ChangeLog files in that case (and
shouldn't write a ChangeLog entry for that in the commit message).

If anything goes wrong, please let me, other RMs and Martin Liška know.

Jakub

Re: [PATCH RFC] gcc-git: Add prepare-commit-msg hook.

2020-05-25 Thread Jason Merrill via Gcc-patches


On 5/25/20 4:15 PM, Martin Liška wrote:

On 5/25/20 10:08 PM, Jason Merrill wrote:

Like so:


I like the patch, thanks!

The last question I have: Do we want to make installation of the hook
optional in gcc-git-customization.sh?


Sure:

>From 04429ee04769e15842093f154e8c7887f75d9476 Mon Sep 17 00:00:00 2001
From: Jason Merrill 
Date: Fri, 22 May 2020 18:40:35 -0400
Subject: [PATCH] gcc-git: Add prepare-commit-msg hook.
To: gcc-patches@gcc.gnu.org

This patch introduces a prepare-commit-msg hook that appends a ChangeLog
skeleton to a commit message when the GCC_FORCE_MKLOG environment variable
is set, and a 'git commit-mklog' command set that variable while running
'git commit'.

contrib/ChangeLog:

	* prepare-commit-msg: New file.
	* gcc-git-customization.sh: Install it.  Add commit-mklog alias.
	* mklog.py: Add new option -c which appends
	to a ChangeLog file.
---
 contrib/gcc-git-customization.sh | 13 
 contrib/mklog.py | 23 -
 contrib/prepare-commit-msg   | 57 
 3 files changed, 92 insertions(+), 1 deletion(-)
 create mode 100755 contrib/prepare-commit-msg

diff --git a/contrib/gcc-git-customization.sh b/contrib/gcc-git-customization.sh
index 7a950ae5f38..dcc42683fa6 100755
--- a/contrib/gcc-git-customization.sh
+++ b/contrib/gcc-git-customization.sh
@@ -30,6 +30,8 @@ git config alias.gcc-backport '!f() { rev=$1; git cherry-pick -x $@; } ; f'
 
 git config alias.gcc-mklog '!f() { "`git rev-parse --show-toplevel`/contrib/mklog.py" $@; } ; f'
 
+git config alias.commit-mklog '!f() { GCC_FORCE_MKLOG=1 git commit "$@"; }; f'
+
 # Make diff on MD files use "(define" as a function marker.
 # Use this in conjunction with a .gitattributes file containing
 # *.mddiff=md
@@ -127,6 +129,17 @@ echo "(local branches starting / can be pushed directly to your"
 ask "personal area on the gcc server)" $old_pfx new_pfx
 git config "gcc-config.userpfx" "$new_pfx"
 
+echo
+ask "Install prepare-commit-msg git hook for 'git commit-mklog' alias" yes dohook
+if [ "x$dohook" = xyes ]; then
+hookdir=`git rev-parse --git-path hooks`
+if [ -f "$hookdir/prepare-commit-msg" ]; then
+	echo " Moving existing prepare-commit-msg hook to prepare-commit-msg.bak"
+	mv "$hookdir/prepare-commit-msg" "$hookdir/prepare-commit-msg.bak"
+fi
+install -c "`git rev-parse --show-toplevel`/contrib/prepare-commit-msg" "$hookdir"
+fi
+
 # Scan the existing settings to see if there are any we need to rewrite.
 vendors=$(git config --get-all "remote.${upstream}.fetch" "refs/vendors/" | sed -r "s:.*refs/vendors/([^/]+)/.*:\1:" | sort | uniq)
 url=$(git config --get "remote.${upstream}.url")
diff --git a/contrib/mklog.py b/contrib/mklog.py
index 7a19b5d0949..fb58661b5eb 100755
--- a/contrib/mklog.py
+++ b/contrib/mklog.py
@@ -30,6 +30,7 @@ import argparse
 import os
 import re
 import sys
+from itertools import takewhile
 
 import requests
 
@@ -221,6 +222,9 @@ if __name__ == '__main__':
 help='Do not generate function names in ChangeLogs')
 parser.add_argument('-p', '--fill-up-bug-titles', action='store_true',
 help='Download title of mentioned PRs')
+parser.add_argument('-c', '--changelog',
+help='Append the ChangeLog to a git commit message '
+ 'file')
 args = parser.parse_args()
 if args.input == '-':
 args.input = None
@@ -229,4 +233,21 @@ if __name__ == '__main__':
 data = input.read()
 output = generate_changelog(data, args.no_functions,
 args.fill_up_bug_titles)
-print(output, end='')
+if args.changelog:
+lines = open(args.changelog).read().split('\n')
+start = list(takewhile(lambda l: not l.startswith('#'), lines))
+end = lines[len(start):]
+with open(args.changelog, 'w') as f:
+if start:
+# appent empty line
+if start[-1] != '':
+start.append('')
+else:
+# append 2 empty lines
+start = 2 * ['']
+f.write('\n'.join(start))
+f.write('\n')
+f.write(output)
+f.write('\n'.join(end))
+else:
+print(output, end='')
diff --git a/contrib/prepare-commit-msg b/contrib/prepare-commit-msg
new file mode 100755
index 000..820dacc7e42
--- /dev/null
+++ b/contrib/prepare-commit-msg
@@ -0,0 +1,57 @@
+#!/bin/sh
+
+# Copyright (C) 2020 Free Software Foundation, Inc.
+#
+# This file is part of GCC.
+#
+# GCC is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3, or (at your option)
+# any later version.
+#
+# GCC is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR

[PATCH] PR fortran/95089 - ICE in gfc_get_derived_type, at fortran/trans-types.c:2843

2020-05-25 Thread Harald Anlauf

Another rather obvious case of a buffer too small to hold a name-mangled symbol,
this time with coarrays enabled.

OK for master?

Thanks,
Harald


PR fortran/95089 - ICE in gfc_get_derived_type, at fortran/trans-types.c:2843

For long module name, derive type and component name, the
generated name-mangled symbol did not fit into a buffer when
coarrays were enabled.  Provide sufficiently large temporary.

gcc/fortran/

2020-05-25  Harald Anlauf  

PR fortran/95089
* trans-types.c (gfc_get_derived_type): Enlarge temporary to hold
mangled name "_caf_symbol".

gcc/testsuite/

2020-05-25  Harald Anlauf  

PR fortran/95089
* gfortran.dg/pr95089.f90: New test.
diff --git a/gcc/fortran/trans-types.c b/gcc/fortran/trans-types.c
index b7712dc74d1..99844812505 100644
--- a/gcc/fortran/trans-types.c
+++ b/gcc/fortran/trans-types.c
@@ -2836,9 +2836,10 @@ copy_derived_types:
 	  && (c->attr.allocatable || c->attr.pointer)
 	  && !derived->attr.is_class)
 	{
-	  char caf_name[GFC_MAX_SYMBOL_LEN];
+	  /* Provide sufficient space to hold "_caf_symbol".  */
+	  char caf_name[GFC_MAX_SYMBOL_LEN + 6];
 	  gfc_component *token;
-	  snprintf (caf_name, GFC_MAX_SYMBOL_LEN, "_caf_%s", c->name);
+	  snprintf (caf_name, sizeof (caf_name), "_caf_%s", c->name);
 	  token = gfc_find_component (derived, caf_name, true, true, NULL);
 	  gcc_assert (token);
 	  c->caf_token = token->backend_decl;
diff --git a/gcc/testsuite/gfortran.dg/pr95089.f90 b/gcc/testsuite/gfortran.dg/pr95089.f90
new file mode 100644
index 000..1cd20f0ccc5
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr95089.f90
@@ -0,0 +1,11 @@
+! { dg-do compile }
+! { dg-options "-fcoarray=lib" }
+!
+! PR fortran/95089 - ICE in gfc_get_derived_type, at fortran/trans-types.c:2843
+
+module m23456789012345678901234567890123456789012345678901234567890123
+  type t23456789012345678901234567890123456789012345678901234567890123
+ type (t23456789012345678901234567890123456789012345678901234567890123), &
+  pointer :: z23456789012345678901234567890123456789012345678901234567890123
+  end type t23456789012345678901234567890123456789012345678901234567890123
+end

Re: ChangeLog files - server and client scripts

2020-05-25 Thread Ian Lance Taylor via Gcc-patches

On Mon, May 25, 2020 at 12:48 AM Martin Liška  wrote:
>
> On 5/23/20 12:14 AM, Ian Lance Taylor wrote:
> > Sure, I can wait.  Thanks for looking at it.
>
> Hello.
>
> Thank you for patience. There's a patch that fixes that,
> I'm going to install it.

Thanks.  I was able to push my change to master.

Ian

libgo patch committed: update x/sys/cpu after gccgo support added

2020-05-25 Thread Ian Lance Taylor via Gcc-patches

This libgo patch updates the x/sys/cpu directory with better support
for gccgo on AIX.  Bootstrapped and ran Go tests on
x86_64-pc-linux-gnu.  Committed to master.

Ian
e479602af14fc5e76c9040846b2f5e85e126a472
diff --git a/gcc/go/gofrontend/MERGE b/gcc/go/gofrontend/MERGE
index bc9c1f07eda..284374820b0 100644
--- a/gcc/go/gofrontend/MERGE
+++ b/gcc/go/gofrontend/MERGE
@@ -1,4 +1,4 @@
-bc27341f245a5cc54ac7530d037a609db72b677c
+ea58b8491064fbed18a220571a3043c38dccf7c7
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
diff --git a/libgo/go/golang.org/x/sys/cpu/cpu_aix.go 
b/libgo/go/golang.org/x/sys/cpu/cpu_aix.go
new file mode 100644
index 000..02d03129e50
--- /dev/null
+++ b/libgo/go/golang.org/x/sys/cpu/cpu_aix.go
@@ -0,0 +1,32 @@
+// Copyright 2019 The Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+// +build aix
+
+package cpu
+
+const (
+   // getsystemcfg constants
+   _SC_IMPL = 2
+   _IMPL_POWER8 = 0x1
+   _IMPL_POWER9 = 0x2
+)
+
+func init() {
+   impl := getsystemcfg(_SC_IMPL)
+   if impl&_IMPL_POWER8 != 0 {
+   PPC64.IsPOWER8 = true
+   }
+   if impl&_IMPL_POWER9 != 0 {
+   PPC64.IsPOWER9 = true
+   }
+
+   Initialized = true
+}
+
+func getsystemcfg(label int) (n uint64) {
+   r0, _ := callgetsystemcfg(label)
+   n = uint64(r0)
+   return
+}
diff --git a/libgo/go/golang.org/x/sys/cpu/cpu_aix_ppc64.go 
b/libgo/go/golang.org/x/sys/cpu/cpu_aix_ppc64.go
deleted file mode 100644
index b0ede112d4e..000
--- a/libgo/go/golang.org/x/sys/cpu/cpu_aix_ppc64.go
+++ /dev/null
@@ -1,32 +0,0 @@
-// Copyright 2019 The Go Authors. All rights reserved.
-// Use of this source code is governed by a BSD-style
-// license that can be found in the LICENSE file.
-
-// +build aix,ppc64
-
-package cpu
-
-const (
-   // getsystemcfg constants
-   _SC_IMPL = 2
-   _IMPL_POWER8 = 0x1
-   _IMPL_POWER9 = 0x2
-)
-
-func init() {
-   impl := getsystemcfg(_SC_IMPL)
-   if impl&_IMPL_POWER8 != 0 {
-   PPC64.IsPOWER8 = true
-   }
-   if impl&_IMPL_POWER9 != 0 {
-   PPC64.IsPOWER9 = true
-   }
-
-   Initialized = true
-}
-
-func getsystemcfg(label int) (n uint64) {
-   r0, _ := callgetsystemcfg(label)
-   n = uint64(r0)
-   return
-}
diff --git a/libgo/go/golang.org/x/sys/cpu/syscall_aix_gccgo.go 
b/libgo/go/golang.org/x/sys/cpu/syscall_aix_gccgo.go
new file mode 100644
index 000..2609cc49ae7
--- /dev/null
+++ b/libgo/go/golang.org/x/sys/cpu/syscall_aix_gccgo.go
@@ -0,0 +1,27 @@
+// Copyright 2020 The Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+// Recreate a getsystemcfg syscall handler instead of
+// using the one provided by x/sys/unix to avoid having
+// the dependency between them. (See golang.org/issue/32102)
+// Morover, this file will be used during the building of
+// gccgo's libgo and thus must not use a CGo method.
+
+// +build aix
+// +build gccgo
+
+package cpu
+
+import (
+   "syscall"
+)
+
+//extern getsystemcfg
+func gccgoGetsystemcfg(label uint32) (r uint64)
+
+func callgetsystemcfg(label int) (r1 uintptr, e1 syscall.Errno) {
+   r1 = uintptr(gccgoGetsystemcfg(uint32(label)))
+   e1 = syscall.GetErrno()
+   return
+}

Re: [PATCH RFC] gcc-git: Add prepare-commit-msg hook.

2020-05-25 Thread Martin Liška


On 5/25/20 10:08 PM, Jason Merrill wrote:

Like so:


I like the patch, thanks!

The last question I have: Do we want to make installation of the hook
optional in gcc-git-customization.sh?

Martin

New Swedish PO file for 'gcc' (version 10.1.0)

2020-05-25 Thread Translation Project Robot

Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'gcc' has been submitted
by the Swedish team of translators.  The file is available at:

https://translationproject.org/latest/gcc/sv.po

(This file, 'gcc-10.1.0.sv.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

https://translationproject.org/latest/gcc/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

https://translationproject.org/domain/gcc.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.

Re: [PATCH RFC] gcc-git: Add prepare-commit-msg hook.

2020-05-25 Thread Jason Merrill via Gcc-patches

On Mon, May 25, 2020 at 3:50 PM Jason Merrill  wrote:
> On Mon, May 25, 2020 at 2:14 PM Martin Liška  wrote:
> >
> > On 5/23/20 11:39 PM, Jason Merrill via Gcc-patches wrote:
> > > This patch introduces a prepare-commit-msg hook that appends a ChangeLog
> > > skeleton to a commit message that doesn't already have one, and a 'git
> > > amend-mklog' command to amend and append a new ChangeLog skeleton (to be
> > > edited together with existing entries by hand).
> >
> > As mentioned in the previous email, I prefer to make it opt-in, not opt-out.
> >
> > So I'm suggesting the 'git commit-mklog' alias that add the skeleton to
> > a git commit message.
> >
> > I also changed the hook to use named variables. And I would like to preserve
> > the ending '# Please enter the commit message for your changes. Lines 
> > starting'
> > section, so I use the mklog.py to put the skeleton to a proper place.
> >
> > Example:
> >
> > gcc/ChangeLog:
> >
> > * ipa-icf.c (make_pass_ipa_icf):
> >
> > # Please enter the commit message for your changes. Lines starting
> > # with '#' will be ignored, and an empty message aborts the commit.
> > #
> > # On branch me/add-prepare-commit-msg-hook
> > # Changes to be committed:
> > #   modified:   gcc/ipa-icf.c
> > #
> >
> > Thoughts?
>
> I'll integrate your changes with mine.

Like so:
From 47424e02f3d7c5e01e292c43e0d5f5d932178d31 Mon Sep 17 00:00:00 2001
From: Jason Merrill 
Date: Fri, 22 May 2020 18:40:35 -0400
Subject: [PATCH] gcc-git: Add prepare-commit-msg hook.
To: gcc-patches@gcc.gnu.org

This patch introduces a prepare-commit-msg hook that appends a ChangeLog
skeleton to a commit message when the GCC_FORCE_MKLOG environment variable
is set, and a 'git commit-mklog' command set that variable while running
'git commit'.

contrib/ChangeLog:

	* prepare-commit-msg: New file.
	* gcc-git-customization.sh: Install it.  Add commit-mklog alias.
	* mklog.py: Add new option -c which appends
	to a ChangeLog file.
---
 contrib/gcc-git-customization.sh |  5 +++
 contrib/mklog.py | 23 -
 contrib/prepare-commit-msg   | 57 
 3 files changed, 84 insertions(+), 1 deletion(-)
 create mode 100755 contrib/prepare-commit-msg

diff --git a/contrib/gcc-git-customization.sh b/contrib/gcc-git-customization.sh
index 7a950ae5f38..a3f7da8d20b 100755
--- a/contrib/gcc-git-customization.sh
+++ b/contrib/gcc-git-customization.sh
@@ -30,6 +30,11 @@ git config alias.gcc-backport '!f() { rev=$1; git cherry-pick -x $@; } ; f'
 
 git config alias.gcc-mklog '!f() { "`git rev-parse --show-toplevel`/contrib/mklog.py" $@; } ; f'
 
+hookdir=`git rev-parse --git-path hooks`
+install "`git rev-parse --show-toplevel`/contrib/prepare-commit-msg" "$hookdir"
+
+git config alias.commit-mklog '!f() { GCC_FORCE_MKLOG=1 git commit "$@"; }; f'
+
 # Make diff on MD files use "(define" as a function marker.
 # Use this in conjunction with a .gitattributes file containing
 # *.mddiff=md
diff --git a/contrib/mklog.py b/contrib/mklog.py
index 7a19b5d0949..fb58661b5eb 100755
--- a/contrib/mklog.py
+++ b/contrib/mklog.py
@@ -30,6 +30,7 @@ import argparse
 import os
 import re
 import sys
+from itertools import takewhile
 
 import requests
 
@@ -221,6 +222,9 @@ if __name__ == '__main__':
 help='Do not generate function names in ChangeLogs')
 parser.add_argument('-p', '--fill-up-bug-titles', action='store_true',
 help='Download title of mentioned PRs')
+parser.add_argument('-c', '--changelog',
+help='Append the ChangeLog to a git commit message '
+ 'file')
 args = parser.parse_args()
 if args.input == '-':
 args.input = None
@@ -229,4 +233,21 @@ if __name__ == '__main__':
 data = input.read()
 output = generate_changelog(data, args.no_functions,
 args.fill_up_bug_titles)
-print(output, end='')
+if args.changelog:
+lines = open(args.changelog).read().split('\n')
+start = list(takewhile(lambda l: not l.startswith('#'), lines))
+end = lines[len(start):]
+with open(args.changelog, 'w') as f:
+if start:
+# appent empty line
+if start[-1] != '':
+start.append('')
+else:
+# append 2 empty lines
+start = 2 * ['']
+f.write('\n'.join(start))
+f.write('\n')
+f.write(output)
+f.write('\n'.join(end))
+else:
+print(output, end='')
diff --git a/contrib/prepare-commit-msg b/contrib/prepare-commit-msg
new file mode 100755
index 000..820dacc7e42
--- /dev/null
+++ b/contrib/prepare-commit-msg
@@ -0,0 +1,57 @@
+#!/bin/sh
+
+# Copyright (C) 2020 Free Software Foundation, Inc.
+#
+# This file is part of GCC.
+#
+# GCC is free software; you can redistribute it and/or modify
+# it under the terms of the GNU

Re: [PATCH RFC] gcc-git: Add prepare-commit-msg hook.

2020-05-25 Thread Jason Merrill via Gcc-patches

I'll integrate your changes with mine.

On Mon, May 25, 2020 at 2:14 PM Martin Liška  wrote:
>
> On 5/23/20 11:39 PM, Jason Merrill via Gcc-patches wrote:
> > This patch introduces a prepare-commit-msg hook that appends a ChangeLog
> > skeleton to a commit message that doesn't already have one, and a 'git
> > amend-mklog' command to amend and append a new ChangeLog skeleton (to be
> > edited together with existing entries by hand).
>
> As mentioned in the previous email, I prefer to make it opt-in, not opt-out.
>
> So I'm suggesting the 'git commit-mklog' alias that add the skeleton to
> a git commit message.
>
> I also changed the hook to use named variables. And I would like to preserve
> the ending '# Please enter the commit message for your changes. Lines 
> starting'
> section, so I use the mklog.py to put the skeleton to a proper place.
>
> Example:
>
> gcc/ChangeLog:
>
> * ipa-icf.c (make_pass_ipa_icf):
>
> # Please enter the commit message for your changes. Lines starting
> # with '#' will be ignored, and an empty message aborts the commit.
> #
> # On branch me/add-prepare-commit-msg-hook
> # Changes to be committed:
> #   modified:   gcc/ipa-icf.c
> #
>
> Thoughts?
> Martin

Re: [PATCH] Port libgccjit to Windows.

2020-05-25 Thread Nicolas Bértolo via Gcc-patches

Hi Dave,

Thanks for your feedback.

> Do you have copyright assignment paperwork on file?
> https://gcc.gnu.org/contribute.html#legal

My paperwork is done.

> The autotools are not my strongest suit.

> In a previous life I was a Windows developer, but I think it's been
> about 18 years since I've done any coding on Windows, so I'm going to
> have to trust your Windows expertise.

I guess the best solution would be to use libtool and Automake, but
that require major changes in the GCC build system. I came up with a
solution that uses an if statement to choose the way to build the
dynamic library.

> > It is not necessary to use --enable-host-shared in Windows (I tested
> it),
> > but I
> > don't know the proper way to disable that check.

I used case statement to check the value of $(target).

> What's the issue with fchmod on Windows?  Does it not exist, or does it
> not work?
> Please add a comment explaining why.

fchmod does not exist in Windows.

> Please can you fix the indentation here to stay within 80 columns.
> It looks like there's very similar (identical?) code to this below in
> several places.
> Can it be put in a subroutine to avoid repetition?

I created a new file "jit-w32.{h,c}" to store the definitions of
win_mkdtemp() and print_last_error().

> Is there a cast to HMODULE everywhere that m_dso_handle is used?
> If so, does it make the code cleaner if the field becomes an HMODULE on
> Windows?

I added a new type "result::handle_t" that abstracts over this
difference.

> This seems like a very awkward way to get a temporary directory, but
> perhaps it's the best that can be done on Windows?

We could copy an implementation of mkdtemp () for Windows or write
something using a random number generator. I did it this way because
it was the fastest.

> Sorry if this seems excessively like nitpicking - thanks again for the
> patch.

No problem.

Here is a new version.

Nico.


0001-Port-libgccjit-to-Windows.patch
Description: Binary data

Re: New mklog script

2020-05-25 Thread Jason Merrill via Gcc-patches

On Mon, May 25, 2020 at 5:23 AM Martin Liška  wrote:
>
> On 5/22/20 11:01 PM, Jason Merrill wrote:
> > On Thu, May 21, 2020 at 6:03 PM Jason Merrill  wrote:
> >>
> >> On Fri, May 15, 2020 at 11:39 AM Martin Liška  wrote:
> >>>
> >>> On 5/15/20 3:22 PM, Marek Polacek wrote:
>  On Fri, May 15, 2020 at 03:12:27PM +0200, Martin Liška wrote:
> > On 5/15/20 2:42 PM, Marek Polacek wrote:
> >> I actually use mklog -i all the time.  But I can work around it if it
> >> disappears.
> >
> > Ah, I can see a consumer.
> > There's an updated version that supports that.
> >
> > For the future, will you still use the option? Wouldn't be better
> > to put the ChangeLog content directly to commit message? Note
> > that you won't have to copy the entries to a particular ChangeLog file.
> 
>  The way I do it is to generate a patch using format-patch, use mklog -i
>  on it, then add the ChangeLog entry to the commit message via commit 
>  --amend.
> >>>
> >>> Hmm, you can do much better with:
> >>>
> >>> $ git diff | ./contrib/mklog > changelog && git commit -a -t changelog
> >>>
> >>> Or for an already created commit you can do:
> >>>
> >>> $ git diff HEAD~ | ./contrib/mklog > changelog && git commit -a --amend 
> >>> -e -F changelog
> >>
> >> With these git aliases:
> >>
> >>  mklog-editor = "!f() { git show | git gcc-mklog >> $1; }; f"
> >>  addlog = "!f() { GIT_EDITOR='git mklog-editor' git commit 
> >> --amend; }; f"
> >>
> >> I can 'git addlog' to append the output of mklog to the current
> >> commit.  Probably better would be to do something with
> >> prepare-commit-msg.
> >
> > This is pretty rudimentary, but good enough as a start:
>
> I like the idea of usage of the prepare commit hook.
>
> >
> > #!/bin/sh
> >
> > #COMMIT_MSG_FILE=$1
> > #COMMIT_SOURCE=$2
> > #SHA1=$3
>
> It's better to use the named arguments.
>
> >
> > if ! [ -f "$1" ]; then exit 0; fi
> >
> > #echo "# $0 $1 $2 $3" >> $1
> >
> > if fgrep 'ChangeLog:' $1 > /dev/null 2>&1; then exit 0; fi
> >
> > if [ -z "$2" ]; then
> >  cmd="diff --cached"
> > elif [ $2 == commit ]; then
> >  cmd="show $3"
> > else
> >  exit 0
> > fi
> >
> > git $cmd | git gcc-mklog >> $1
> >
>
> Well, that will generate changelog entry for each commit.
> For a user branch development, it's not desirable.

It isn't that useful for intermediate commits, but I was thinking it
wasn't harmful either.  But I can remove the on-by-default aspects.

> What about more explicit approach:
>
> 1) making an alias for: git diff | git gcc-mklog > commit.msg
> 2) hook:
>
> if test -f commit.msg; then
>cat commit.msg >> "$COMMIT_MSG_FILE"
>rm commit.msg
> fi
>
> So the changelog is created explicitly and included implicitly.

How about this?
commit 787893dc41fb8288994a2350943c22e2388476e7
Author: Jason Merrill 
Date:   Fri May 22 18:40:35 2020 -0400

gcc-git: Add prepare-commit-msg.

This patch introduces a prepare-commit-msg hook that appends a ChangeLog
skeleton to a commit message when the GCC_FORCE_MKLOG environment variable
is set, and a 'git commit-mklog' command set that variable while running
'git commit'.

contrib/ChangeLog:

* prepare-commit-msg: New file.
* gcc-git-customization.sh: Install it.  Add commit-mklog alias.

diff --git a/contrib/gcc-git-customization.sh b/contrib/gcc-git-customization.sh
index 7a950ae5f38..a3f7da8d20b 100755
--- a/contrib/gcc-git-customization.sh
+++ b/contrib/gcc-git-customization.sh
@@ -30,6 +30,11 @@ git config alias.gcc-backport '!f() { rev=$1; git cherry-pick -x $@; } ; f'
 
 git config alias.gcc-mklog '!f() { "`git rev-parse --show-toplevel`/contrib/mklog.py" $@; } ; f'
 
+hookdir=`git rev-parse --git-path hooks`
+install "`git rev-parse --show-toplevel`/contrib/prepare-commit-msg" "$hookdir"
+
+git config alias.commit-mklog '!f() { GCC_FORCE_MKLOG=1 git commit "$@"; }; f'
+
 # Make diff on MD files use "(define" as a function marker.
 # Use this in conjunction with a .gitattributes file containing
 # *.mddiff=md
diff --git a/contrib/prepare-commit-msg b/contrib/prepare-commit-msg
new file mode 100644
index 000..b06080b927c
--- /dev/null
+++ b/contrib/prepare-commit-msg
@@ -0,0 +1,43 @@
+#!/bin/sh
+
+COMMIT_MSG_FILE=$1
+COMMIT_SOURCE=$2
+SHA1=$3
+
+#echo "# $*" > $HOME/prepare-commit-msg-args
+
+# Can't do anything if $COMMIT_MSG_FILE isn't a file.
+if ! [ -f "$COMMIT_MSG_FILE" ]; then exit 0; fi
+
+# Don't do anything unless requested to.
+if [ -z "$GCC_FORCE_MKLOG" ]; then exit 0; fi
+
+if [ -z "$COMMIT_SOURCE" ] || [ $COMMIT_SOURCE = template ]; then
+# No source or "template" means new commit.
+cmd="diff --cached"
+
+elif [ $COMMIT_SOURCE = message ]; then
+# "message" means -m, assume a new commit if there are any changes staged.
+if ! git diff --cached --quiet; then
+	cmd="diff --cached"
+else
+	cmd="diff --cached HEAD^"
+fi
+
+# Add a blank line before

Re: [PATCH RFC] gcc-git: Add prepare-commit-msg hook.

2020-05-25 Thread Martin Liška


On 5/23/20 11:39 PM, Jason Merrill via Gcc-patches wrote:

This patch introduces a prepare-commit-msg hook that appends a ChangeLog
skeleton to a commit message that doesn't already have one, and a 'git
amend-mklog' command to amend and append a new ChangeLog skeleton (to be
edited together with existing entries by hand).


As mentioned in the previous email, I prefer to make it opt-in, not opt-out.

So I'm suggesting the 'git commit-mklog' alias that add the skeleton to
a git commit message.

I also changed the hook to use named variables. And I would like to preserve
the ending '# Please enter the commit message for your changes. Lines starting'
section, so I use the mklog.py to put the skeleton to a proper place.

Example:

gcc/ChangeLog:

* ipa-icf.c (make_pass_ipa_icf):

# Please enter the commit message for your changes. Lines starting
# with '#' will be ignored, and an empty message aborts the commit.
#
# On branch me/add-prepare-commit-msg-hook
# Changes to be committed:
#   modified:   gcc/ipa-icf.c
#

Thoughts?
Martin
>From bbceb70bcf56d6feac9e7f90eefa06147417e4d0 Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Mon, 25 May 2020 19:38:00 +0200
Subject: [PATCH] Add prepare-commit-msg hook/

contrib/ChangeLog:

	* gcc-git-customization.sh: Install new hook.
	* mklog.py: Add new option -c which append
	to a ChangeLog file.
	* prepare-commit-msg: New file.
---
 contrib/gcc-git-customization.sh |  5 +
 contrib/mklog.py | 23 ++-
 contrib/prepare-commit-msg   | 32 
 3 files changed, 59 insertions(+), 1 deletion(-)
 create mode 100644 contrib/prepare-commit-msg

diff --git a/contrib/gcc-git-customization.sh b/contrib/gcc-git-customization.sh
index 7a950ae5f38..110f661f5e5 100755
--- a/contrib/gcc-git-customization.sh
+++ b/contrib/gcc-git-customization.sh
@@ -30,6 +30,11 @@ git config alias.gcc-backport '!f() { rev=$1; git cherry-pick -x $@; } ; f'
 
 git config alias.gcc-mklog '!f() { "`git rev-parse --show-toplevel`/contrib/mklog.py" $@; } ; f'
 
+hookdir=`git rev-parse --git-path hooks`
+install "`git rev-parse --show-toplevel`/contrib/prepare-commit-msg" "$hookdir"
+
+git config alias.commit-mklog '!f() { GCC_PREPARE_COMMIT=1 git commit "$@"; }; f'
+
 # Make diff on MD files use "(define" as a function marker.
 # Use this in conjunction with a .gitattributes file containing
 # *.mddiff=md
diff --git a/contrib/mklog.py b/contrib/mklog.py
index 7a19b5d0949..fb58661b5eb 100755
--- a/contrib/mklog.py
+++ b/contrib/mklog.py
@@ -30,6 +30,7 @@ import argparse
 import os
 import re
 import sys
+from itertools import takewhile
 
 import requests
 
@@ -221,6 +222,9 @@ if __name__ == '__main__':
 help='Do not generate function names in ChangeLogs')
 parser.add_argument('-p', '--fill-up-bug-titles', action='store_true',
 help='Download title of mentioned PRs')
+parser.add_argument('-c', '--changelog',
+help='Append the ChangeLog to a git commit message '
+ 'file')
 args = parser.parse_args()
 if args.input == '-':
 args.input = None
@@ -229,4 +233,21 @@ if __name__ == '__main__':
 data = input.read()
 output = generate_changelog(data, args.no_functions,
 args.fill_up_bug_titles)
-print(output, end='')
+if args.changelog:
+lines = open(args.changelog).read().split('\n')
+start = list(takewhile(lambda l: not l.startswith('#'), lines))
+end = lines[len(start):]
+with open(args.changelog, 'w') as f:
+if start:
+# appent empty line
+if start[-1] != '':
+start.append('')
+else:
+# append 2 empty lines
+start = 2 * ['']
+f.write('\n'.join(start))
+f.write('\n')
+f.write(output)
+f.write('\n'.join(end))
+else:
+print(output, end='')
diff --git a/contrib/prepare-commit-msg b/contrib/prepare-commit-msg
new file mode 100644
index 000..a14b2ece042
--- /dev/null
+++ b/contrib/prepare-commit-msg
@@ -0,0 +1,32 @@
+#!/bin/sh
+
+COMMIT_MSG_FILE=$1
+COMMIT_SOURCE=$2
+SHA1=$3
+
+# Can't do anything if $COMMIT_MSG_FILE isn't a file.
+if ! [ -f "$COMMIT_MSG_FILE" ]; then exit 0; fi
+
+# Don't mess with existing entries unless requested to.
+if [ -z "$GCC_PREPARE_COMMIT" ]; then exit 0; fi
+
+if [ -z "$COMMIT_SOURCE" ] || [ $COMMIT_SOURCE == template ]; then
+# No source or "template" means new commit.
+cmd="diff --cached"
+elif [ $COMMIT_SOURCE == message ]; then
+# "message" means -m, but could be either a new commit or --amend.
+# Guess which based on whether there are any changes staged.
+if git diff --cached --quiet; then
+	cmd="show"
+else
+	cmd="diff --cached"
+fi
+elif [ $COMMIT_SOURCE == commit ]; then
+# The message of an

Re: [PATCH 1/2] Seperate -funroll-loops for GIMPLE unroller and RTL unroller

2020-05-25 Thread Segher Boessenkool

On Mon, May 25, 2020 at 07:58:09PM +0200, Richard Biener wrote:
> You folks made ppc specific hacks instead of a better design. Those now stand 
> in the way as well. But sure, simply do not expose the flag to the users, use
> Var(flag_rtl_unroll_loops). My other points still stand. 
> 
> Feel free to ignore the regression part on the branch and come up with a 
> great design. But don't expect to backport that then. 

I just do not think fixing the regression should make things worse for
a long time, if that can be avoided.


Segher

Re: [PATCH 1/2] Seperate -funroll-loops for GIMPLE unroller and RTL unroller

2020-05-25 Thread David Edelsohn via Gcc-patches

On Mon, May 25, 2020 at 1:58 PM Richard Biener
 wrote:
>
> On May 25, 2020 7:40:00 PM GMT+02:00, Segher Boessenkool 
>  wrote:
> >On Mon, May 25, 2020 at 02:14:02PM +0200, Richard Biener wrote:
> >> On Mon, May 25, 2020 at 1:10 PM guojiufu 
> >wrote:
> >> Since a new flag is not needed to fix the regression please avoid
> >> adding -fcomplete-unroll-loops.
> >>
> >> For -frtl-unroll-loops you should be able to use
> >
> >Erm.  That *is* a new command-line option (the internal flags I do not
> >care about so much: new implementation details are *good*).  And a new
> >name that is a mistake in my opinion, for many reasons (users do not
> >know and should not have to care about "rtl"; the name is not
> >descriptive; it is useless churn, it is not the same name as we have
> >had for decades now; it is adding a new option for a future where we
> >will do most unrolling at gimple level, a future we do not know will
> >ever exist, and we do not know what that will look like anyway; it is
> >an extra level of indirection (in the name)).
> >
> >We should not have an -frtl-unroll-loops if we do not have a
> >-ftree-unroll-loops (or whatever).
> >
> >Unrolling early is not a good idea in general (the problems with the
> >very trivial complete unrolling case just underline that).  But we
> >*should* know which code we expect to unroll later, for better costing.
> >Adding names like "rtl-unroll-loops" only stands in the way of getting
> >a better design here.
>
> You folks made ppc specific hacks instead of a better design. Those now stand 
> in the way as well. But sure, simply do not expose the flag to the users, use
> Var(flag_rtl_unroll_loops). My other points still stand.
>
> Feel free to ignore the regression part on the branch and come up with a 
> great design. But don't expect to backport that then.

I completely agree.

This path is digging a deeper and deeper hole.

- David

Re: [PATCH 1/2] Seperate -funroll-loops for GIMPLE unroller and RTL unroller

2020-05-25 Thread Richard Biener via Gcc-patches

On May 25, 2020 7:40:00 PM GMT+02:00, Segher Boessenkool 
 wrote:
>On Mon, May 25, 2020 at 02:14:02PM +0200, Richard Biener wrote:
>> On Mon, May 25, 2020 at 1:10 PM guojiufu 
>wrote:
>> Since a new flag is not needed to fix the regression please avoid
>> adding -fcomplete-unroll-loops.
>> 
>> For -frtl-unroll-loops you should be able to use
>
>Erm.  That *is* a new command-line option (the internal flags I do not
>care about so much: new implementation details are *good*).  And a new
>name that is a mistake in my opinion, for many reasons (users do not
>know and should not have to care about "rtl"; the name is not
>descriptive; it is useless churn, it is not the same name as we have
>had for decades now; it is adding a new option for a future where we
>will do most unrolling at gimple level, a future we do not know will
>ever exist, and we do not know what that will look like anyway; it is
>an extra level of indirection (in the name)).
>
>We should not have an -frtl-unroll-loops if we do not have a
>-ftree-unroll-loops (or whatever).
>
>Unrolling early is not a good idea in general (the problems with the
>very trivial complete unrolling case just underline that).  But we
>*should* know which code we expect to unroll later, for better costing.
>Adding names like "rtl-unroll-loops" only stands in the way of getting
>a better design here.

You folks made ppc specific hacks instead of a better design. Those now stand 
in the way as well. But sure, simply do not expose the flag to the users, use
Var(flag_rtl_unroll_loops). My other points still stand. 

Feel free to ignore the regression part on the branch and come up with a great 
design. But don't expect to backport that then. 

Richard. 

>
>Segher

Re: [PATCH 1/2] rs6000: tune cunroll for simple loops at O2

2020-05-25 Thread Segher Boessenkool

On Mon, May 25, 2020 at 02:39:54PM +0200, Richard Biener wrote:
> On Fri, May 22, 2020 at 6:54 PM Segher Boessenkool
>  wrote:
> > > The split above allows the "bug" to be fixed (even on the branch)
> > > without introducing even more target specialities.
> >
> > So does any split.  Or I don't see what you mean?
> 
> Well, a split that does not affect behavior for non-ppc architectures
> when the flags by users are unchanged.  Because that allows
> the ppc regression to be fixed on the branch.
> 
> Then, on trunk, we can think of a better overall flag design.  Note

Oh, as just a (very) temporary thing, it is fine of course (it should
say it is then though).

> that cunroll/cunrolli are not controlled by a flag currently, they
> are gated on optimize >= [2|3] - it's just that either -funroll-loops
> or -fpeel-loops causes its heuristics to allow code-size growth
> by its own metrics according to the unroll --params.
> 
> So it's a bit difficult to retrofit the heuristic behavior onto new
> flags unless we want to completely move that over to a --param
> that may be gets adjusted by -funroll-loops.

Yes, cunroll does not have its own option, and that is a problem.  But
that is easy to fix!  Either with an option, or just with params (the
option wouldn't do more than set params anyway?)


Segher

V4 [PATCH] x86: Move cpuinfo.h from libgcc to common/config/i386

2020-05-25 Thread H.J. Lu via Gcc-patches

On Mon, May 25, 2020 at 2:52 AM Martin Liška  wrote:
>
> Hello.
>
> I really welcome the unification patch and I have some comments (ideas):
>
> 1)
>
> > +static inline int
> > +has_cpu_feature (struct __processor_model *cpu_model,
> > +  unsigned int *cpu_features2,
> > +  enum processor_features f)
> > +{
> > +  unsigned int i;
> > +  if (f < 32)
> > +return cpu_model->__cpu_features[0] & (1U << (f & 31));
> > +  for (i = 0; i < SIZE_OF_CPU_FEATURES; i++)
> > +if (f < (32 + 32 + i * 32))
> > +return cpu_features2[i] & (1U << ((f - (32 + i * 32)) & 31));
> > +  gcc_unreachable ();
> > +}
> > +
> > +static inline void
> > +set_cpu_feature (struct __processor_model *cpu_model,
> > +  unsigned int *cpu_features2,
> > +  enum processor_features f)
> > +{
> > +  unsigned int i;
> > +  if (f < 32)
> > +{
> > +  cpu_model->__cpu_features[0] |= (1U << (f & 31));
> > +  return;
> > +}
> > +  for (i = 0; i < SIZE_OF_CPU_FEATURES; i++)
> > +if (f < (32 + 32 + i * 32))
> > +  {
> > + cpu_features2[i] |= (1U << ((f - (32 + i * 32)) & 31));
> > + return;
> > +  }
> > +  gcc_unreachable ();
> > +}
>
> Can you please add comment about the '32 + 32' and '32 + i * 32', it's 
> unclear from the patch.

Done.

> 2) Can we remove all the:
>
> > +  unsigned int has_lahf_lm, has_sse4a;
> > +  unsigned int has_longmode, has_3dnowp, has_3dnow;
> > +  unsigned int has_movbe, has_sse4_1, has_sse4_2;
> > +  unsigned int has_popcnt, has_aes, has_avx, has_avx2;
> > +  unsigned int has_pclmul, has_abm, has_lwp;
> > +  unsigned int has_fma, has_fma4, has_xop;
> > +  unsigned int has_bmi, has_bmi2, has_tbm, has_lzcnt;
>
> ...
>
> > +  has_sse4a = has_feature (FEATURE_SSE4_A);
> > +  has_fma4 = has_feature (FEATURE_FMA4);
> > +  has_xop = has_feature (FEATURE_XOP);
> > +  has_fma = has_feature (FEATURE_FMA);
>
> ...
>
>if (arch)
>  {
>const char *mmx = has_mmx ? " -mmmx" : " -mno-mmx";
>const char *mmx3dnow = has_3dnow ? " -m3dnow" : " -mno-3dnow";
>const char *sse = has_sse ? " -msse" : " -mno-sse";
> ...
>
>options = concat (options, mmx, mmx3dnow, sse, sse2, sse3, ssse3,
> sse4a, cx16, sahf, movbe, aes, sha, pclmul,
> popcnt, abm, lwp, fma, fma4, xop, bmi, sgx, bmi2,
> pconfig, wbnoinvd,
> tbm, avx, avx2, sse4_2, sse4_1, lzcnt, rtm,
> hle, rdrnd, f16c, fsgsbase, rdseed, prfchw, adx,
> fxsr, xsave, xsaveopt, avx512f, avx512er,
> avx512cd, avx512pf, prefetchwt1, clflushopt,
> xsavec, xsaves, avx512dq, avx512bw, avx512vl,
> avx512ifma, avx512vbmi, avx5124fmaps, avx5124vnniw,
> clwb, mwaitx, clzero, pku, rdpid, gfni, shstk,
> avx512vbmi2, avx512vnni, vaes, vpclmulqdq,
> avx512bitalg, avx512vpopcntdq, movdiri, movdir64b,
> waitpkg, cldemote, ptwrite, avx512bf16, enqcmd,
> avx512vp2intersect, serialize, tsxldtrk, NULL);
>
> and instead mark flags in 'isa_names_table' that should be used for 
> -match=native option emission.
> We know names of the flags, their value (has_feature) and so we can generate 
> that automatically?

Done.

> 3) Similarly we can we do automatically all the:
>
> +  if (has_feature (FEATURE_AVX512VBMI2))
> +assert (__builtin_cpu_supports ("avx512vbmi2"));
>
> We should have both information in 'isa_names_table'.
>

Done.

Here is the updated patch.  OK for master?

Thanks,

-- 
H.J.
From 481d178dfd97334cf5d0ab0565833524e3579c07 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Mon, 18 May 2020 05:58:41 -0700
Subject: [PATCH] x86: Move cpuinfo.h from libgcc to common/config/i386

Move cpuinfo.h from libgcc to common/config/i386 and move isa_names_table
to common/config/i386 so that get_intel_cpu can be shared by libgcc, GCC
driver and gcc.target/i386/builtin_target.c to detect the specific type
of Intel and AMD CPUs:

1. Use the same enum processor_features in libgcc and x86 backend.
2. Add more processor features to enum processor_features.
3. Use the same isa_names_table in i386-builtins.c, driver-i386.c and
gcc.target/i386/builtin_target.c.
4. Use isa_names_table to generate ISA command-line options.
5. Use isa_names_table to generate __builtin_cpu_supports tests.
6. Add M_VENDOR, M_CPU_TYPE and M_CPU_SUBTYPE in i386-builtins.c to
avoid duplication in.
7. Use cpu_indicator_init, has_cpu_feature, get_amd_cpu and get_intel_cpu
in driver-i386.c and builtin_target.c.

gcc/

	PR target/95259
	* common/config/i386/cpuinfo-builtins.h: Moved from
	libgcc/config/i386/cpuinfo.h.
	(processor_vendor): Add VENDOR_CENTAUR, VENDOR_CYRIX, VENDOR_NSC
	and BUILTIN_VENDOR_MAX.
	(processor_types): Add BUILTIN_CPU_TYPE_MAX.
	(processor_features): Add FEATURE_3DNOW,

Re: [PATCH 1/2] Seperate -funroll-loops for GIMPLE unroller and RTL unroller

2020-05-25 Thread Segher Boessenkool

On Mon, May 25, 2020 at 02:14:02PM +0200, Richard Biener wrote:
> On Mon, May 25, 2020 at 1:10 PM guojiufu  wrote:
> Since a new flag is not needed to fix the regression please avoid
> adding -fcomplete-unroll-loops.
> 
> For -frtl-unroll-loops you should be able to use

Erm.  That *is* a new command-line option (the internal flags I do not
care about so much: new implementation details are *good*).  And a new
name that is a mistake in my opinion, for many reasons (users do not
know and should not have to care about "rtl"; the name is not
descriptive; it is useless churn, it is not the same name as we have
had for decades now; it is adding a new option for a future where we
will do most unrolling at gimple level, a future we do not know will
ever exist, and we do not know what that will look like anyway; it is
an extra level of indirection (in the name)).

We should not have an -frtl-unroll-loops if we do not have a
-ftree-unroll-loops (or whatever).

Unrolling early is not a good idea in general (the problems with the
very trivial complete unrolling case just underline that).  But we
*should* know which code we expect to unroll later, for better costing.
Adding names like "rtl-unroll-loops" only stands in the way of getting
a better design here.

Segher

Re: [Patch][OpenMP] Fix mapping of artificial variables (PR94874)

2020-05-25 Thread Tobias Burnus


On 5/12/20 1:02 PM, Jakub Jelinek wrote:


I think we want a new hook, the clear cases (mostly DECL_ARTIFICIAL ones,
if it is really something compiler created and not something under user's
control) …


Attached is one version, which is somewhat minimalist; I did not check
what happens with __FUNCTION__ and also the Fortran part is not very extended
but it is a start.

Additionally, I fixed the comments for the existing c_omp_predetermined_sharing.

OK for the trunk?

Tobias

PS: Not tested with an actually offloading compiler due to PR ipa/95320.

-
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander 
Walter
gcc/c-family/ChangeLog:

	* c-common.h (c_omp_predetermined_mapping): Declare.
	* c-omp.c (c_omp_predetermined_mapping): New.

gcc/c/ChangeLog:

	* c-objc-common.h (LANG_HOOKS_OMP_PREDETERMINED_MAPPING): Redefine.

gcc/cp/ChangeLog:

	* cp-gimplify.c (cxx_omp_predetermined_mapping): New.
	* cp-objcp-common.h (LANG_HOOKS_OMP_PREDETERMINED_MAPPING): Redfine.
	* cp-tree.h (cxx_omp_predetermined_mapping): Declare.

gcc/fortran/ChangeLog:

	* f95-lang.c (LANG_HOOKS_OMP_PREDETERMINED_MAPPING): Redefine.
	* trans-openmp.c (gfc_omp_predetermined_mapping): New.
	* trans.h (gfc_omp_predetermined_mapping): Declare.

gcc/ChangeLog:

	* gimplify.c (omp_notice_variable): Use new hook.
	* langhooks-def.h (lhd_omp_predetermined_mapping): Declare.
	(LANG_HOOKS_OMP_PREDETERMINED_MAPPING): Define
	(LANG_HOOKS_DECLS): Add it.
	* langhooks.c (lhd_omp_predetermined_sharing): Remove bogus unused attr.
	(lhd_omp_predetermined_mapping): New.
	* langhooks.h (struct lang_hooks_for_decls): Add new hook.

gcc/testsuite/ChangeLog
2020-05-25  Thomas Schwinge  
Tobias Burnus  

PR middle-end/94874
* c-c++-common/gomp/pr94874.c: New.

 gcc/c-family/c-common.h   |  1 +
 gcc/c-family/c-omp.c  | 24 +++-
 gcc/c/c-objc-common.h |  3 +++
 gcc/cp/cp-gimplify.c  | 27 ++-
 gcc/cp/cp-objcp-common.h  |  2 ++
 gcc/cp/cp-tree.h  |  1 +
 gcc/fortran/f95-lang.c|  2 ++
 gcc/fortran/trans-openmp.c| 25 -
 gcc/fortran/trans.h   |  1 +
 gcc/gimplify.c|  6 +-
 gcc/langhooks-def.h   |  3 +++
 gcc/langhooks.c   | 13 -
 gcc/langhooks.h   |  4 
 gcc/testsuite/c-c++-common/gomp/pr94874.c | 27 +++
 14 files changed, 134 insertions(+), 5 deletions(-)

diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index 7c1a6370aae..c74b23db05c 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -1206,6 +1206,7 @@ extern tree c_omp_declare_simd_clauses_to_numbers (tree, tree);
 extern void c_omp_declare_simd_clauses_to_decls (tree, tree);
 extern bool c_omp_predefined_variable (tree);
 extern enum omp_clause_default_kind c_omp_predetermined_sharing (tree);
+extern enum omp_clause_defaultmap_kind c_omp_predetermined_mapping (tree);
 extern tree c_omp_check_context_selector (location_t, tree);
 extern void c_omp_mark_declare_variant (location_t, tree, tree);
 extern const char *c_omp_map_clause_name (tree, bool);
diff --git a/gcc/c-family/c-omp.c b/gcc/c-family/c-omp.c
index 51c18a4ba08..6f8fba350ed 100644
--- a/gcc/c-family/c-omp.c
+++ b/gcc/c-family/c-omp.c
@@ -2104,7 +2104,8 @@ c_omp_predefined_variable (tree decl)
   return false;
 }
 
-/* True if OpenMP sharing attribute of DECL is predetermined.  */
+/* OMP_CLAUSE_DEFAULT_UNSPECIFIED unless OpenMP sharing attribute of DECL
+   is predetermined.  */
 
 enum omp_clause_default_kind
 c_omp_predetermined_sharing (tree decl)
@@ -2123,6 +2124,27 @@ c_omp_predetermined_sharing (tree decl)
   return OMP_CLAUSE_DEFAULT_UNSPECIFIED;
 }
 
+/* OMP_CLAUSE_DEFAULTMAP_CATEGORY_UNSPECIFIED unless OpenMP mapping attribute
+   of DECL is predetermined.  */
+
+enum omp_clause_defaultmap_kind
+c_omp_predetermined_mapping (tree decl)
+{
+  /* Predetermine artificial variables holding integral values, those
+ are usually result of gimplify_one_sizepos or SAVE_EXPR
+ gimplification.  */
+  if (VAR_P (decl)
+  && DECL_ARTIFICIAL (decl)
+  && INTEGRAL_TYPE_P (TREE_TYPE (decl)))
+return OMP_CLAUSE_DEFAULTMAP_FIRSTPRIVATE;
+
+  if (c_omp_predefined_variable (decl))
+return OMP_CLAUSE_DEFAULTMAP_TO;
+
+  return OMP_CLAUSE_DEFAULTMAP_CATEGORY_UNSPECIFIED;
+}
+
+
 /* Diagnose errors in an OpenMP context selector, return CTX if
it is correct or error_mark_node otherwise.  */
 
diff --git a/gcc/c/c-objc-common.h b/gcc/c/c-objc-common.h
index bfdb2797ff4..5471fc7e355 100644
--- a/gcc/c/c-objc-common.h
+++ b/gcc/c/c-objc-common.h
@@ -107,6 +107,9 @@ along with GCC; see the

Re: [PATCH v1 1/2][PPC64] [PR88877]

2020-05-25 Thread Iain Sandoe via Gcc-patches


Jakub Jelinek via Gcc-patches  wrote:


On Mon, May 25, 2020 at 06:37:57PM +0200, Richard Biener wrote:

I thought of using std::tuple  but it requires c++11 support.
I am not sure we always build gcc with c++11?


https://gcc.gnu.org/install/prerequisites.html

We do for GCC 11 :-)  Since we pay the price for progress, let's reap
the
benefits as well :-)


Not sure if the benefit is enough to warrant an extra (complex?)  
standard header in almost every TU.


Yeah, especially when one can just define the 3 fields of the small struct
to be descriptive, rather than being first/second/third.

+1,
tuple (and pair) have the property that the use of the fields might be  
obvious to the code author, but probably will not be to the code reader.   
IMO most of the time as Jakub says something descriptive is better - and I  
don’t believe it’s a performance penalty.


Iain

Re: [PATCH v1 1/2][PPC64] [PR88877]

2020-05-25 Thread Jakub Jelinek via Gcc-patches

On Mon, May 25, 2020 at 06:37:57PM +0200, Richard Biener wrote:
> >> I thought of using std::tuple  but it requires c++11 support.
> >> I am not sure we always build gcc with c++11?
> >
> >https://gcc.gnu.org/install/prerequisites.html
> >
> >We do for GCC 11 :-)  Since we pay the price for progress, let's reap
> >the
> >benefits as well :-)
> 
> Not sure if the benefit is enough to warrant an extra (complex?) standard 
> header in almost every TU. 

Yeah, especially when one can just define the 3 fields of the small struct
to be descriptive, rather than being first/second/third.

Jakub

Re: [PATCH v1 1/2][PPC64] [PR88877]

2020-05-25 Thread Richard Biener

On May 25, 2020 6:31:29 PM GMT+02:00, Segher Boessenkool 
 wrote:
>On Mon, May 25, 2020 at 12:46:02PM +0530, kamlesh kumar wrote:
>> > OTOH, you don't need to name Tuple at all...  It should not *have*
>a
>> > constructor, since you declared it as class...  But you can just
>use
>> > std::tuple here?
>> 
>> I thought of using std::tuple  but it requires c++11 support.
>> I am not sure we always build gcc with c++11?
>
>https://gcc.gnu.org/install/prerequisites.html
>
>We do for GCC 11 :-)  Since we pay the price for progress, let's reap
>the
>benefits as well :-)

Not sure if the benefit is enough to warrant an extra (complex?) standard 
header in almost every TU. 

>> > > (emit_library_call): Added default arg unsigned_p.
>> > > (emit_library_call_value): Added default arg unsigned_p.
>> >
>> > Yeah, eww.  Default arguments have all the problems you had before,
>> > except now it is hidden and much more surprising.
>> >
>> > Those functions really should take rtx_mode_t arguments?
>> 
>> I was thinking the same. will incorporate this.
>
>Thanks!
>
>
>Segher

Re: [PATCH v1 1/2][PPC64] [PR88877]

2020-05-25 Thread Segher Boessenkool

On Mon, May 25, 2020 at 12:46:02PM +0530, kamlesh kumar wrote:
> > OTOH, you don't need to name Tuple at all...  It should not *have* a
> > constructor, since you declared it as class...  But you can just use
> > std::tuple here?
> 
> I thought of using std::tuple  but it requires c++11 support.
> I am not sure we always build gcc with c++11?

https://gcc.gnu.org/install/prerequisites.html

We do for GCC 11 :-)  Since we pay the price for progress, let's reap the
benefits as well :-)

> > > (emit_library_call): Added default arg unsigned_p.
> > > (emit_library_call_value): Added default arg unsigned_p.
> >
> > Yeah, eww.  Default arguments have all the problems you had before,
> > except now it is hidden and much more surprising.
> >
> > Those functions really should take rtx_mode_t arguments?
> 
> I was thinking the same. will incorporate this.

Thanks!


Segher

Re: [PATCH 2/2] Provide diagnostic hints for missing C++ cinttypes string constants.

2020-05-25 Thread Jason Merrill via Gcc-patches


On 5/23/20 8:30 PM, Mark Wielaard wrote:

When reporting an error in cp_parser and we notice a string literal
followed by an unknown name check whether there is a known standard
header containing a string macro with the same name, then add a hint
to the error message to include that header.

gcc/c-family/ChangeLog:

* known-headers.cc (get_cp_stdlib_header_for_string_macro_name):
New function.
* known-headers.h (get_c_stdlib_header_for_string_macro_name):


Missing 'p'.


New function definition.


Declaration, not definition.

The C++ changes are OK with these ChangeLog corrections.


gcc/cp/ChangeLog:

* parser.c (cp_lexer_safe_previous_token): New function.
(cp_parser_error_1): Add name_hint if the previous token is
a string literal and next token is a CPP_NAME and we have a
missing header suggestion for the name.

gcc/testsuite/ChangeLog:

* g++.dg/spellcheck-inttypes.C: Add string-literal testcases.
---
  gcc/c-family/known-headers.cc  |  8 +
  gcc/c-family/known-headers.h   |  1 +
  gcc/cp/parser.c| 36 
  gcc/testsuite/g++.dg/spellcheck-inttypes.C | 39 ++
  4 files changed, 84 insertions(+)

diff --git a/gcc/c-family/known-headers.cc b/gcc/c-family/known-headers.cc
index c07cfd1db815..977230a586db 100644
--- a/gcc/c-family/known-headers.cc
+++ b/gcc/c-family/known-headers.cc
@@ -268,6 +268,14 @@ get_c_stdlib_header_for_string_macro_name (const char 
*name)
return get_string_macro_hint (name, STDLIB_C);
  }
  
+/* Given non-NULL NAME, return the header name defining a string macro

+   within the C++ standard library (with '<' and '>'), or NULL.  */
+const char *
+get_cp_stdlib_header_for_string_macro_name (const char *name)
+{
+  return get_string_macro_hint (name, STDLIB_CPLUSPLUS);
+}
+
  /* Implementation of class suggest_missing_header.  */
  
  /* suggest_missing_header's ctor.  */

diff --git a/gcc/c-family/known-headers.h b/gcc/c-family/known-headers.h
index a69bbbf28e76..f0c89dc9019d 100644
--- a/gcc/c-family/known-headers.h
+++ b/gcc/c-family/known-headers.h
@@ -24,6 +24,7 @@ extern const char *get_c_stdlib_header_for_name (const char 
*name);
  extern const char *get_cp_stdlib_header_for_name (const char *name);
  
  extern const char *get_c_stdlib_header_for_string_macro_name (const char *n);

+extern const char *get_cp_stdlib_header_for_string_macro_name (const char *n);
  
  /* Subclass of deferred_diagnostic for suggesting to the user

 that they have missed a #include.  */
diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 54ca875ce54c..95b8c635fc65 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -45,6 +45,7 @@ along with GCC; see the file COPYING3.  If not see
  #include "tree-iterator.h"
  #include "cp-name-hint.h"
  #include "memmodel.h"
+#include "c-family/known-headers.h"
  
  

  /* The lexer.  */
@@ -776,6 +777,20 @@ cp_lexer_previous_token (cp_lexer *lexer)
return cp_lexer_token_at (lexer, tp);
  }
  
+/* Same as above, but return NULL when the lexer doesn't own the token

+   buffer or if the next_token is at the start of the token
+   vector.  */
+
+static cp_token *
+cp_lexer_safe_previous_token (cp_lexer *lexer)
+{
+  if (lexer->buffer)
+if (lexer->next_token != lexer->buffer->address ())
+  return cp_lexer_previous_token (lexer);
+
+  return NULL;
+}
+
  /* Overload for make_location, taking the lexer to mean the location of the
 previous token.  */
  
@@ -2919,6 +2934,7 @@ cp_parser_error_1 (cp_parser* parser, const char* gmsgid,

}
  }
  
+  auto_diagnostic_group d;

gcc_rich_location richloc (input_location);
  
bool added_matching_location = false;

@@ -2941,6 +2957,26 @@ cp_parser_error_1 (cp_parser* parser, const char* gmsgid,
  = richloc.add_location_if_nearby (matching_location);
  }
  
+  /* If we were parsing a string-literal and there is an unknown name

+ token right after, then check to see if that could also have been
+ a literal string by checking the name against a list of known
+ standard string literal constants defined in header files. If
+ there is one, then add that as an hint to the error message. */
+  name_hint h;
+  cp_token *prev_token = cp_lexer_safe_previous_token (parser->lexer);
+  if (prev_token && cp_parser_is_string_literal (prev_token)
+  && token->type == CPP_NAME)
+{
+  tree name = token->u.value;
+  const char *token_name = IDENTIFIER_POINTER (name);
+  const char *header_hint
+   = get_cp_stdlib_header_for_string_macro_name (token_name);
+  if (header_hint != NULL)
+   h = name_hint (NULL, new suggest_missing_header (token->location,
+token_name,
+header_hint));
+}
+
/* Actually emit the error.  */
c_parse_error (gmsgid,

Re: [PATCH PR94026] combine missed opportunity to simplify comparisons with zero

2020-05-25 Thread Segher Boessenkool

On Mon, May 25, 2020 at 02:59:30AM +, Yangfei (Felix) wrote:
> > It creates better code on all targets :-)  A quite small improvement, but 
> > not
> > entirely trivial.
> 
> Thanks for the effort.  It's great to hear that :- )

Yes :-)

> > > > p.s.  Please use a correct mime type?  application/octet-stream
> > > > isn't something I can reply to.  Just text/plain is fine :-)
> > >
> > > I have using plain text now, hope that works for you.  :-)
> > 
> > Nope:
> > 
> > [-- Attachment #2: pr94026-v2.diff --]
> > [-- Type: application/octet-stream, Encoding: base64, Size: 5.9K --]
> 
> This time I switched to use UUEncode type for the attachment.  Does it work?

No:

[-- Attachment #2: pr94026-v3.diff --]
[-- Type: application/octet-stream, Encoding: base64, Size: 5.8K --]

> I am using Outlook and I didn't find the place to change the MIME type : - (

The simplest option is to use a different email client, one that plays
nicely with others.  You use git, maybe you could even use git-send-email?

I'll paste things manually...

> From a19238c02c1e6ab9593a14a13e1e3dff90ed Mon Sep 17 00:00:00 2001
> From: Fei Yang 
> Date: Mon, 25 May 2020 10:19:30 +0800
> Subject: [PATCH] combine: missed opportunity to simplify comparisons with zero
>  [PR94026]

(Capital "M" on "Missed" please)

But, the subject should say what the patch *does*.  So maybe
  combine: Simplify more comparisons with zero (PR94026)

> If we have (and (lshiftrt X C) M) and M is a constant that would select
> a field of bits within an item, but not the entire word, fold this into
> a simple AND if we are in an equality comparison against zero.

But that subject doesn't really describe what the patch does, anyway?

> gcc/
> PR rtl-optimization/94026
> * combine.c (make_compound_operation_int): If we have (and
> (lshiftrt X C) M) and M is a constant that would select a field
> of bits within an item, but not the entire word, fold this into
> a simple AND if we are in an equality comparison.
> 
> gcc/testsuite/
> PR rtl-optimization/94026
> * gcc.dg/pr94026.c: New test.

> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,11 @@
> +2020-05-25  Felix Yang  
> +
> + PR rtl-optimization/94026
> + * combine.c (make_compound_operation_int): If we have (and
> + (lshiftrt X C) M) and M is a constant that would select a field
> + of bits within an item, but not the entire word, fold this into
> + a simple AND if we are in an equality comparison.

Don't put the changelog in the patch.

> diff --git a/gcc/combine.c b/gcc/combine.c
> index b044f29fd36..76d62b0bd17 100644
> --- a/gcc/combine.c
> +++ b/gcc/combine.c
> @@ -8178,6 +8178,10 @@ make_compound_operation_int (scalar_int_mode mode, rtx 
> *x_ptr,
>if (!CONST_INT_P (XEXP (x, 1)))
>   break;
>  
> +  HOST_WIDE_INT pos;
> +  unsigned HOST_WIDE_INT len;
> +  pos = get_pos_from_mask (UINTVAL (XEXP (x, 1)), );

  unsigned HOST_WIDE_INT len;
  HOST_WIDE_INT pos = get_pos_from_mask (UINTVAL (XEXP (x, 1)), );

> @@ -8231,6 +8235,22 @@ make_compound_operation_int (scalar_int_mode mode, rtx 
> *x_ptr,
> new_rtx = make_compound_operation (new_rtx, in_code);
>   }
>  
> +  /* If we have (and (lshiftrt X C) M) and M is a constant that would 
> select
> +  a field of bits within an item, but not the entire word, this might be
> +  representable by a simple AND if we are in an equality comparison.  */
> +  else if (pos > 0 && equality_comparison

That "&& equality_comparison" should be on a separate line as well.

> +&& GET_CODE (XEXP (x, 0)) == LSHIFTRT
> +&& CONST_INT_P (XEXP (XEXP (x, 0), 1))
> +&& pos + UINTVAL (XEXP (XEXP (x, 0), 1))
> +   <= GET_MODE_BITSIZE (mode))
> + {
> +   new_rtx = make_compound_operation (XEXP (XEXP (x, 0), 0), next_code);
> +   HOST_WIDE_INT real_pos = pos + UINTVAL (XEXP (XEXP (x, 0), 1));
> +   unsigned HOST_WIDE_INT mask = ((unsigned HOST_WIDE_INT)1 << len) - 1;

Space after cast.

> +   new_rtx = gen_rtx_AND (mode, new_rtx,
> +  gen_int_mode (mask << real_pos, mode));
> + }

So this changes
  ((X >> C) & M) == ...
to
  (X & (M << C)) == ...
?

Where then does it check what ... is?  This is only valid like this if
that is zero.

Why should this go in combine and not in simplify-rtx instead?

> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/pr94026.c
> @@ -0,0 +1,21 @@
> +/* { dg-do compile { target aarch64*-*-* i?86-*-* x86_64-*-* } } */

Why restrict this to only some targets?

> +/* { dg-options "-O2 -fdump-rtl-combine" } */
> +
> +int
> +foo (int c)
> +{
> +  int a = (c >> 8) & 7;
> +
> +  if (a >= 2) {
> +return 1;
> +  }
> +
> +  return 0;
> +}
> +
> +/* The combine phase should transform (compare (and (lshiftrt x 8) 6) 0)
> +   to (compare (and (x 1536)) 0). We look for the *attempt* to match this
> +   RTL pattern, regardless of whether an actual insn may be found

Re: zstd not found if installed in non-system prefix

2020-05-25 Thread Matthias Klose

On 5/20/20 9:32 PM, Michael Kuhn wrote:
> Hi,
> 
> when specifying a non-system prefix with --with-zstd, the build fails
> because the header and library cannot be found (see 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95005).
> 
> The attached patch fixes the problem and is what we use in Spack to
> make GCC build with zstd support.

documentation for this is missing in gcc/doc/install.texi.  Also please don't
introduce configurations via environment variables, but add options like for
gmp. See

@item --with-gmp=@var{pathname}
@itemx --with-gmp-include=@var{pathname}
@itemx --with-gmp-lib=@var{pathname}

Matthias

[Ada] Spurious accessibility error on return aggregate in GNATprove mode

2020-05-25 Thread Pierre-Marie de Rodat

This patch fixes an issue whereby valid actuals within return aggregates
could trigger spurious accessibility errors in GNATprove mode.

Tested on x86_64-pc-linux-gnu, committed on trunk

2020-05-25  Justin Squirek  

gcc/ada/

* sem_ch6.adb (Check_Return_Obj_Accessibility): Use original
node to avoid looking at expansion done in GNATprove mode.--- gcc/ada/sem_ch6.adb
+++ gcc/ada/sem_ch6.adb
@@ -798,44 +798,44 @@ package body Sem_Ch6 is
   N_Discriminant_Association)
then
   Expr := Expression (Assoc);
+   else
+  Expr := Empty;
end if;
 
--  This anonymous access discriminant has an associated
--  expression which needs checking.
 
-   if Nkind (Expr) = N_Attribute_Reference
+   if Present (Expr)
+ and then Nkind (Expr) = N_Attribute_Reference
  and then Attribute_Name (Expr) /= Name_Unrestricted_Access
then
   --  Obtain the object to perform static checks on by moving
   --  up the prefixes in the expression taking into account
   --  named access types.
 
-  Obj := Prefix (Expr);
+  Obj := Original_Node (Prefix (Expr));
   while Nkind_In (Obj, N_Indexed_Component,
N_Selected_Component)
   loop
+ Obj := Original_Node (Prefix (Obj));
+
  --  When we encounter a named access type then we can
  --  ignore accessibility checks on the dereference.
 
- if Ekind (Etype (Prefix (Obj)))
+ if Ekind (Etype (Obj))
   in E_Access_Type ..
  E_Access_Protected_Subprogram_Type
  then
-if Nkind (Obj) = N_Selected_Component then
-   Obj := Selector_Name (Obj);
+if Nkind (Parent (Obj)) = N_Selected_Component then
+   Obj := Selector_Name (Parent (Obj));
 end if;
 exit;
  end if;
 
  --  Skip over the explicit dereference
 
- if Nkind (Prefix (Obj)) = N_Explicit_Dereference then
-Obj := Prefix (Prefix (Obj));
-
- --  Otherwise move up to the next prefix
-
- else
-Obj := Prefix (Obj);
+ if Nkind (Obj) = N_Explicit_Dereference then
+Obj := Original_Node (Prefix (Obj));
  end if;
   end loop;

[Ada] Fix spurious error on checking of null Abstract_State

2020-05-25 Thread Pierre-Marie de Rodat

A declaration of an object inside a declare block of elaboration code
should not count as hidden state of the package. Nor should declarations
inside a task or an entry. Now fixed.

Tested on x86_64-pc-linux-gnu, committed on trunk

2020-05-25  Yannick Moy  

gcc/ada/

* sem_util.adb (Check_No_Hidden_State): Stop propagation at
first block/task/entry.--- gcc/ada/sem_util.adb
+++ gcc/ada/sem_util.adb
@@ -3387,10 +3387,14 @@ package body Sem_Util is
 return;
 
  --  Objects and states that appear immediately within a subprogram or
- --  inside a construct nested within a subprogram do not introduce a
- --  hidden state. They behave as local variable declarations.
+ --  entry inside a construct nested within a subprogram do not
+ --  introduce a hidden state. They behave as local variable
+ --  declarations. The same is true for elaboration code inside a block
+ --  or a task.
 
- elsif Is_Subprogram (Context) then
+ elsif Is_Subprogram_Or_Entry (Context)
+   or else Ekind_In (Context, E_Block, E_Task_Type)
+ then
 return;
 
  --  When examining a package body, use the entity of the spec as it

[Ada] Change pragma Compile_Time_Error to force compile-time evaluation

2020-05-25 Thread Pierre-Marie de Rodat

Issue an error now when the expression in pragma Compile_Time_Error
cannot be evaluated at compile time. This allows static analyzers like
GNATprove to rely on such expressions always being False.

Tested on x86_64-pc-linux-gnu, committed on trunk

2020-05-25  Yannick Moy  

gcc/ada/

* doc/gnat_rm/implementation_defined_pragmas.rst: Document
changes to pragmas Compile_Time_Error/Compile_Time_Warning.
* gnat_rm.texi: Regenerate.
* libgnat/g-bytswa.adb: Change uses of Compile_Time_Error to
Compile_Time_Warning, as the actual expression may not always be
known statically.
* sem_prag.adb (Analyze_Pragma): Handle differently pragma
Compile_Time_Error in both compilation and in GNATprove mode.
(Validate_Compile_Time_Warning_Or_Error): Issue an error or
warning when the expression is not known at compile time.
* usage.adb: Add missing documentation for warning switches _c
and _r.
* warnsw.ads: Update comment.--- gcc/ada/doc/gnat_rm/implementation_defined_pragmas.rst
+++ gcc/ada/doc/gnat_rm/implementation_defined_pragmas.rst
@@ -1094,14 +1094,14 @@ This pragma can be used to generate additional compile time
 error messages. It
 is particularly useful in generics, where errors can be issued for
 specific problematic instantiations. The first parameter is a boolean
-expression. The pragma is effective only if the value of this expression
-is known at compile time, and has the value True. The set of expressions
+expression. The pragma ensures that the value of an expression
+is known at compile time, and has the value False. The set of expressions
 whose values are known at compile time includes all static boolean
 expressions, and also other values which the compiler can determine
 at compile time (e.g., the size of a record type set by an explicit
 size representation clause, or the value of a variable which was
 initialized to a constant and is known not to have been modified).
-If these conditions are met, an error message is generated using
+If these conditions are not met, an error message is generated using
 the value given as the second argument. This string value may contain
 embedded ASCII.LF characters to break the message into multiple lines.
 
@@ -1118,7 +1118,10 @@ Syntax:
 
 
 Same as pragma Compile_Time_Error, except a warning is issued instead
-of an error message. Note that if this pragma is used in a package that
+of an error message. If switch *-gnatw_C* is used, a warning is only issued
+if the value of the expression is known to be True at compile time, not when
+the value of the expression is not known at compile time.
+Note that if this pragma is used in a package that
 is with'ed by a client, the client will get the warning even though it
 is issued by a with'ed package (normally warnings in with'ed units are
 suppressed, but this is a special exception to that rule).

--- gcc/ada/gnat_rm.texi
+++ gcc/ada/gnat_rm.texi
@@ -21,7 +21,7 @@
 
 @copying
 @quotation
-GNAT Reference Manual , Dec 10, 2019
+GNAT Reference Manual , May 04, 2020
 
 AdaCore
 
@@ -2492,14 +2492,14 @@ This pragma can be used to generate additional compile time
 error messages. It
 is particularly useful in generics, where errors can be issued for
 specific problematic instantiations. The first parameter is a boolean
-expression. The pragma is effective only if the value of this expression
-is known at compile time, and has the value True. The set of expressions
+expression. The pragma ensures that the value of an expression
+is known at compile time, and has the value False. The set of expressions
 whose values are known at compile time includes all static boolean
 expressions, and also other values which the compiler can determine
 at compile time (e.g., the size of a record type set by an explicit
 size representation clause, or the value of a variable which was
 initialized to a constant and is known not to have been modified).
-If these conditions are met, an error message is generated using
+If these conditions are not met, an error message is generated using
 the value given as the second argument. This string value may contain
 embedded ASCII.LF characters to break the message into multiple lines.
 
@@ -2516,7 +2516,10 @@ pragma Compile_Time_Warning
 @end example
 
 Same as pragma Compile_Time_Error, except a warning is issued instead
-of an error message. Note that if this pragma is used in a package that
+of an error message. If switch @emph{-gnatw_C} is used, a warning is only issued
+if the value of the expression is known to be True at compile time, not when
+the value of the expression is not known at compile time.
+Note that if this pragma is used in a package that
 is with'ed by a client, the client will get the warning even though it
 is issued by a with'ed package (normally warnings in with'ed units are
 suppressed, but this is a special exception to that rule).

--- gcc/ada/libgnat/g-bytswa.adb
+++

[PATCH] Add debug (slp_tree) and dump infrastructure for this

2020-05-25 Thread Richard Biener

This adds an alternate debug_dump_context similar to the one for
selftests but for interactive debugging routines.  This allows
to share code between user-visible dumping via the dump_* API
and those debugging routines.  The primary driver was SLP node
dumping which wasn't accessible from inside a gdb session up to
now.

Bootstrap & regtest running on x86_64-unknown-linux-gnu.

David, does this look OK?

Thanks,
Richard.

2020-05-25  Richard Biener  

* dump-context.h (debug_dump_context): New class.
(dump_context): Make it friend.
* dumpfile.c (debug_dump_context::debug_dump_context):
Implement.
(debug_dump_context::~debug_dump_context): Likewise.
* tree-vect-slp.c: Include dump-context.h.
(vect_print_slp_tree): Dump a single SLP node.
(debug): New overload for slp_tree.
(vect_print_slp_graph): Rename from vect_print_slp_tree and
use that.
(vect_analyze_slp_instance): Adjust.
---
 gcc/dump-context.h  | 19 ++
 gcc/dumpfile.c  | 26 +
 gcc/tree-vect-slp.c | 47 +
 3 files changed, 80 insertions(+), 12 deletions(-)

diff --git a/gcc/dump-context.h b/gcc/dump-context.h
index 347477f331e..6d51eaf31ad 100644
--- a/gcc/dump-context.h
+++ b/gcc/dump-context.h
@@ -29,6 +29,7 @@ along with GCC; see the file COPYING3.  If not see
 
 class optrecord_json_writer;
 namespace selftest { class temp_dump_context; }
+class debug_dump_context;
 
 /* A class for handling the various dump_* calls.
 
@@ -42,6 +43,7 @@ namespace selftest { class temp_dump_context; }
 class dump_context
 {
   friend class selftest::temp_dump_context;
+  friend class debug_dump_context;
 
  public:
   static dump_context  () { return *s_current; }
@@ -195,6 +197,23 @@ private:
   auto_vec m_stashed_items;
 };
 
+/* An RAII-style class for use in debug dumpers for temporarily using a
+   different dump_context.  */
+
+class debug_dump_context
+{
+ public:
+  debug_dump_context ();
+  ~debug_dump_context ();
+
+ private:
+  dump_context m_context;
+  dump_context *m_saved;
+  dump_flags_t m_saved_flags;
+  FILE *m_saved_file;
+};
+
+
 #if CHECKING_P
 
 namespace selftest {
diff --git a/gcc/dumpfile.c b/gcc/dumpfile.c
index 54718784fd4..0c0c076d890 100644
--- a/gcc/dumpfile.c
+++ b/gcc/dumpfile.c
@@ -2078,6 +2078,32 @@ enable_rtl_dump_file (void)
   return num_enabled > 0;
 }
 
+/* debug_dump_context's ctor.  Temporarily override the dump_context
+   (to forcibly enable output to stderr).  */
+
+debug_dump_context::debug_dump_context ()
+: m_context (),
+  m_saved (_context::get ()),
+  m_saved_flags (dump_flags),
+  m_saved_file (dump_file)
+{
+  set_dump_file (stderr);
+  dump_context::s_current = _context;
+  pflags = dump_flags = MSG_ALL_KINDS | MSG_ALL_PRIORITIES;
+  dump_context::get ().refresh_dumps_are_enabled ();
+}
+
+/* debug_dump_context's dtor.  Restore the saved dump_context.  */
+
+debug_dump_context::~debug_dump_context ()
+{
+  set_dump_file (m_saved_file);
+  dump_context::s_current = m_saved;
+  pflags = dump_flags = m_saved_flags;
+  dump_context::get ().refresh_dumps_are_enabled ();
+}
+
+
 #if CHECKING_P
 
 namespace selftest {
diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index c4fd045e9be..c0c9afd0bd2 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -44,6 +44,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "vec-perm-indices.h"
 #include "gimple-fold.h"
 #include "internal-fn.h"
+#include "dump-context.h"
 
 
 /* Initialize a SLP node.  */
@@ -1579,20 +1580,17 @@ fail:
   return node;
 }
 
-/* Dump a slp tree NODE using flags specified in DUMP_KIND.  */
+/* Dump a single SLP tree NODE.  */
 
 static void
 vect_print_slp_tree (dump_flags_t dump_kind, dump_location_t loc,
-slp_tree node, hash_set )
+slp_tree node)
 {
   unsigned i, j;
-  stmt_vec_info stmt_info;
   slp_tree child;
+  stmt_vec_info stmt_info;
   tree op;
 
-  if (visited.add (node))
-return;
-
   dump_metadata_t metadata (dump_kind, loc.get_impl_location ());
   dump_user_location_t user_loc = loc.get_user_location ();
   dump_printf_loc (metadata, user_loc, "node%s %p (max_nunits=%u, 
refcnt=%u)\n",
@@ -1626,16 +1624,41 @@ vect_print_slp_tree (dump_flags_t dump_kind, 
dump_location_t loc,
   FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (node), i, child)
 dump_printf (dump_kind, " %p", (void *)child);
   dump_printf (dump_kind, "\n");
+}
+
+DEBUG_FUNCTION void
+debug (slp_tree node)
+{
+  debug_dump_context ctx;
+  vect_print_slp_tree (MSG_NOTE,
+  dump_location_t::from_location_t (UNKNOWN_LOCATION),
+  node);
+}
+
+/* Dump a slp tree NODE using flags specified in DUMP_KIND.  */
+
+static void
+vect_print_slp_graph (dump_flags_t dump_kind, dump_location_t loc,
+ slp_tree node, hash_set )
+{
+  unsigned i;
+  slp_tree child;
+
+  if (visited.add (node))
+

BRIG FE testsuite: Fix dump scan patterns in packed.hsail test

2020-05-25 Thread Martin Jambor

Hi,

Starting with r11-165-eb72dc663e9 which converted DECL_GIMPLE_REG_P to
DECL_NOT_GIMPLE_REG_P we have failing BRIG testcase:

-PASS: packed.hsail.brig scan-tree-dump gimple "_[0-9]+ = q2 \\+ q3;"
-PASS: packed.hsail.brig scan-tree-dump gimple "= VEC_PERM_EXPR 
;"
+FAIL: packed.hsail.brig scan-tree-dump gimple "_[0-9]+ = q2 \\+ q3;"
+FAIL: packed.hsail.brig scan-tree-dump gimple "= VEC_PERM_EXPR 
;"

because the gimplifier is now smarter and generates nicer code, which
however, does not match the regexp in the testsuite:

--- before/packed.hsail.brig.005t.gimple2020-05-12 17:59:26.434305513 
+0200
+++ after/packed.hsail.brig.005t.gimple 2020-05-12 17:52:34.477055987 +0200
@@ -109,277 +109,267 @@
   q2 = q1 + _24;
   _25 = VEC_PERM_EXPR ;
   q3 = q2 + _25;
-  _26 = q2 + q3;
-  new_output.11 = _26;
-  new_output.21_27 = new_output.11;
-  _28 = VEC_PERM_EXPR ;
-  s_output.12 = _28;
+  new_output.11 = q2 + q3;
+  s_output.12 = VEC_PERM_EXPR ;
   q4 = s_output.12;

I have looked at the SSA dump and verified that the variable in
question is a gimple register because it gets its SSA name.  I have
not looked into why the gimplifier previously thought it had to go
through the additional temporaries though.

Tested with make -k check-brig.  Pushed to master.

* brig.dg/test/gimple/packed.hsail: Fix scan dump patterns.
---
 gcc/testsuite/ChangeLog| 4 
 gcc/testsuite/brig.dg/test/gimple/packed.hsail | 4 ++--
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index dfb92ec5748..58521437e2d 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,7 @@
+2020-05-25  Martin Jambor  
+
+   * brig.dg/test/gimple/packed.hsail: Fix scan dump patterns.
+
 2020-05-25  Richard Biener  
 
PR tree-optimization/95308
diff --git a/gcc/testsuite/brig.dg/test/gimple/packed.hsail 
b/gcc/testsuite/brig.dg/test/gimple/packed.hsail
index 9137490b2ab..1e2bb53de0d 100644
--- a/gcc/testsuite/brig.dg/test/gimple/packed.hsail
+++ b/gcc/testsuite/brig.dg/test/gimple/packed.hsail
@@ -61,10 +61,10 @@ prog kernel (kernarg_u64 %input_ptr, kernarg_u64 
%output_ptr)
 /* For the add_ss we assume performing the computation over the whole vector 
is cheaper than */
 /* extracting the scalar and performing a scalar operation. This aims to stay 
in the vector
 /* datapath as long as possible. */
-/* { dg-final { scan-tree-dump "_\[0-9\]+ = q2 \\\+ q3;" "gimple" } } */
+/* { dg-final { scan-tree-dump "new_output.\[0-9\]+ = q2 \\\+ q3;" "gimple" } 
} */
 
 /* Insert the lowest element of the result to the lowest element of the result 
register. */
-/* { dg-final { scan-tree-dump "= VEC_PERM_EXPR ;" "gimple" } } */
+/* { dg-final { scan-tree-dump "= VEC_PERM_EXPR ;" "gimple" } } */
 
 /* FIXME */
 /* { dg-final { scan-tree-dump "q4 = 
\(VIEW_CONVERT_EXPR\\\()?s_output.\[0-9\]+\(_\[0-9\]+\)*\\\)?;" 
"gimple" } } */
-- 
2.26.2

[PATCH] tree-optimization/95309 - fix invariant SLP node costing

2020-05-25 Thread Richard Biener

This makes sure to compute SLP_TREE_NUMBER_OF_VEC_STMTS during SLP
analysis even for invariant / external nodes so costing properly
knows what to cost.

Bootstrapped and tested on x86_64-unknown-linux-gnu.  This might
also fix a few other cost issues reported.

2020-05-25  Richard Biener  

PR tree-optimization/95309
* tree-vect-slp.c (vect_get_constant_vectors): Move number
of vector computation ...
(vect_slp_analyze_node_operations): ... to analysis phase.
---
 gcc/tree-vect-slp.c | 41 -
 1 file changed, 28 insertions(+), 13 deletions(-)

diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index ec3675e7070..c4fd045e9be 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -2856,17 +2856,37 @@ vect_slp_analyze_node_operations (vec_info *vinfo, 
slp_tree node,
  other referrers.  */
   if (res)
 FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (node), j, child)
-  if (SLP_TREE_DEF_TYPE (child) != vect_internal_def)
+  if ((SLP_TREE_DEF_TYPE (child) == vect_constant_def
+  || SLP_TREE_DEF_TYPE (child) == vect_external_def)
+ /* Perform usual caching, note code-generation still
+code-gens these nodes multiple times but we expect
+to CSE them later.  */
+ && !visited.contains (child)
+ && !lvisited.add (child))
{
  /* ???  After auditing more code paths make a "default"
 and push the vector type from NODE to all children
 if it is not already set.  */
- /* Perform usual caching, note code-generation still
-code-gens these nodes multiple times but we expect
-to CSE them later.  */
- if (!visited.contains (child)
- && !lvisited.add (child))
-   vect_prologue_cost_for_slp (vinfo, child, cost_vec);
+ /* Compute the number of vectors to be generated.  */
+ tree vector_type = SLP_TREE_VECTYPE (child);
+ if (!vector_type)
+   {
+ /* For shifts with a scalar argument we don't need
+to cost or code-generate anything.
+???  Represent this more explicitely.  */
+ gcc_assert ((STMT_VINFO_TYPE (SLP_TREE_SCALAR_STMTS (node)[0])
+  == shift_vec_info_type)
+ && j == 1);
+ continue;
+   }
+ unsigned group_size = SLP_TREE_SCALAR_OPS (child).length ();
+ poly_uint64 vf = 1;
+ if (loop_vec_info loop_vinfo = dyn_cast  (vinfo))
+   vf = loop_vinfo->vectorization_factor;
+ SLP_TREE_NUMBER_OF_VEC_STMTS (child)
+   = vect_get_num_vectors (vf * group_size, vector_type);
+ /* And cost them.  */
+ vect_prologue_cost_for_slp (vinfo, child, cost_vec);
}
 
   /* If this node can't be vectorized, try pruning the tree here rather
@@ -3638,12 +3658,7 @@ vect_get_constant_vectors (vec_info *vinfo,
 (vinfo, TREE_TYPE (op), op_node)));
 }
 
-  poly_uint64 vf = 1;
-  if (loop_vec_info loop_vinfo = dyn_cast  (vinfo))
-vf = loop_vinfo->vectorization_factor;
-  unsigned int number_of_vectors
-= vect_get_num_vectors (vf * group_size, vector_type);
-
+  unsigned int number_of_vectors = SLP_TREE_NUMBER_OF_VEC_STMTS (op_node);
   vec_oprnds->create (number_of_vectors);
   auto_vec voprnds (number_of_vectors);
 
-- 
2.25.1

Re: [PATCH] Implement no_stack_protect attribute.

2020-05-25 Thread Martin Liška


On 5/21/20 4:53 PM, Martin Sebor wrote:

On 5/21/20 5:28 AM, Martin Liška wrote:

On 5/18/20 10:37 PM, Martin Sebor wrote:

I know there are some somewhat complex cases the attribute exclusion
mechanism isn't general enough to handle but this seems simple enough
that it should work.  Unless I'm missing something that makes it not
feasible I would suggest to use it.


Hi Martin.

Do we have a better place where we check for attribute collision?


If by collision you mean the same thing as the mutual exclusion I was
talking about then that's done by creating an attribute_spec::exclusions
array like for instance attr_cold_hot_exclusions in c-attribs.c and
pointing to it from the attribute_spec entries for each of
the mutually exclusive attributes in the attribute table.  Everything
else is handled automatically by decl_attributes.

Martin


Thanks, I'm sending updated version of the patch that utilizes the conflict
detection.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin
>From b026a8684a0a82394ce2faf010257fa3f0978ca0 Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Fri, 15 May 2020 14:42:12 +0200
Subject: [PATCH] Implement no_stack_protect attribute.

gcc/ChangeLog:

2020-05-18  Martin Liska  

	PR c/94722
	* cfgexpand.c (stack_protect_decl_phase):
	Guard with lookup_attribute("no_stack_protect") at
	various places.
	(expand_used_vars): Likewise here.
	* doc/extend.texi: Document no_stack_protect attribute.

gcc/ada/ChangeLog:

2020-05-18  Martin Liska  

	PR c/94722
	* gcc-interface/utils.c (handle_no_stack_protect_attribute):
	New.
	(handle_stack_protect_attribute): Add error message for a
	no_stack_protect function.

gcc/c-family/ChangeLog:

2020-05-18  Martin Liska  

	PR c/94722
	* c-attribs.c (handle_no_stack_protect_function_attribute): New.
	(handle_stack_protect_attribute): Add error message for a
	no_stack_protect function.

gcc/testsuite/ChangeLog:

2020-05-18  Martin Liska  

	PR c/94722
	* g++.dg/no-stack-protect-attr-2.C: New test.
	* g++.dg/no-stack-protect-attr-3.C: New test.
	* g++.dg/no-stack-protect-attr.C: New test.
---
 gcc/ada/gcc-interface/utils.c | 31 ++-
 gcc/c-family/c-attribs.c  | 32 +++-
 gcc/cfgexpand.c   | 82 ++-
 gcc/doc/extend.texi   |  4 +
 .../g++.dg/no-stack-protect-attr-2.C  |  7 ++
 .../g++.dg/no-stack-protect-attr-3.C  | 23 ++
 gcc/testsuite/g++.dg/no-stack-protect-attr.C  | 16 
 7 files changed, 154 insertions(+), 41 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/no-stack-protect-attr-2.C
 create mode 100644 gcc/testsuite/g++.dg/no-stack-protect-attr-3.C
 create mode 100644 gcc/testsuite/g++.dg/no-stack-protect-attr.C

diff --git a/gcc/ada/gcc-interface/utils.c b/gcc/ada/gcc-interface/utils.c
index fb08b6c90ed..b6a60eb1b11 100644
--- a/gcc/ada/gcc-interface/utils.c
+++ b/gcc/ada/gcc-interface/utils.c
@@ -91,6 +91,7 @@ static tree handle_nonnull_attribute (tree *, tree, tree, int, bool *);
 static tree handle_sentinel_attribute (tree *, tree, tree, int, bool *);
 static tree handle_noreturn_attribute (tree *, tree, tree, int, bool *);
 static tree handle_stack_protect_attribute (tree *, tree, tree, int, bool *);
+static tree handle_no_stack_protect_attribute (tree *, tree, tree, int, bool *);
 static tree handle_noinline_attribute (tree *, tree, tree, int, bool *);
 static tree handle_noclone_attribute (tree *, tree, tree, int, bool *);
 static tree handle_noicf_attribute (tree *, tree, tree, int, bool *);
@@ -115,6 +116,13 @@ static const struct attribute_spec::exclusions attr_cold_hot_exclusions[] =
   { NULL  , false, false, false }
 };
 
+static const struct attribute_spec::exclusions attr_stack_protect_exclusions[] =
+{
+  { "stack_protect", true, false, false },
+  { "no_stack_protect", true, false, false },
+  { NULL, false, false, false },
+};
+
 /* Fake handler for attributes we don't properly support, typically because
they'd require dragging a lot of the common-c front-end circuitry.  */
 static tree fake_attribute_handler (tree *, tree, tree, int, bool *);
@@ -140,7 +148,11 @@ const struct attribute_spec gnat_internal_attribute_table[] =
   { "noreturn", 0, 0,  true,  false, false, false,
 handle_noreturn_attribute, NULL },
   { "stack_protect",0, 0, true,  false, false, false,
-handle_stack_protect_attribute, NULL },
+handle_stack_protect_attribute,
+attr_stack_protect_exclusions },
+  { "no_stack_protect",0, 0, true,  false, false, false,
+handle_no_stack_protect_attribute,
+attr_stack_protect_exclusions },
   { "noinline", 0, 0,  true,  false, false, false,
 handle_noinline_attribute, NULL },
   { "noclone",  0, 0,  true,  false, false, false,
@@ -6524,6 +6536,23 @@ handle_stack_protect_attribute (tree *node, tree name, tree, int,
   return NULL_TREE;
 }
 
+/* Handle a "no_stack_protect" attribute;

Re: [PATCH] Fix nonconforming memory_operand for vpmov instructions which has memory operand narrow than 128 bits [avx512f]

2020-05-25 Thread Uros Bizjak via Gcc-patches

On Mon, May 25, 2020 at 2:21 PM Hongtao Liu  wrote:
>
>   According to Intel SDM, VPMOVQB xmm1/m16 {k1}{z}, xmm2 has 16-bit
> memory_operand instead of 128-bit one which exists in current
> implementation. Also for other vpmov instructions which have
> memory_operand narrower than 128bits.
>
>   Bootstrap is ok, regression test for i386/x86-64 backend is ok.


+  [(set (match_operand:HI 0 "memory_operand" "=m")
+(subreg:HI (any_truncate:V2QI
+ (match_operand:V2DI 1 "register_operand" "v")) 0))]

This should store in V2QImode, subregs are not allowed in insn patterns.

You need a pre-reload splitter to split from register_operand to a
memory_operand, Jakub fixed a bunch of pmov patterns a while ago, so
perhaps he can give some additional advice here.

Uros.


> gcc/ChangeLog
>
> * config/i386/sse.md (*avx512vl_v2div2qi2_store): Refine
> size of memory_operand according to Intel SDM.
> (avx512vl_v2div2qi2_mask_store): Ditto.
> (*avx512vl_v4qi2_store): Ditto.
> (avx512vl_v4qi2_mask_store): Ditto.
> (*avx512vl_v8qi2_store): Ditto.
> (avx512vl_v8qi2_mask_store): Ditto.
> (*avx512vl_v4hi2_store): Ditto.
> (avx512vl_v4hi2_mask_store): Ditto.
> (*avx512vl_v2div2hi2_store): Ditto.
> (avx512vl_v2div2hi2_mask_store): Ditto.
> (*avx512vl_v2div2si2_store): Ditto.
> (avx512vl_v2div2si2_mask_store): Ditto.
> (*avx512f_v8div16qi2_store): Ditto.
> (avx512f_v8div16qi2_mask_store): Ditto.
> * config/i386/i386-builtin-types.def: Adjust builtin type.
> * config/i386/i386-expand.c: Ditto.
> * config/i386/i386-builtin.def: Adjust builtin.
> * config/i386/avx512fintrin.h: Ditto.
> * config/i386/avx512vlbwintrin.h: Ditto.
> * config/i386/avx512vlintrin.h: Ditto.
>
>   I think the code i changed is already covered by existed intrinsics
> tests, so i didn't add any new tests.
> --
> BR,
> Hongtao

Re: [PATCH 1/2] rs6000: tune cunroll for simple loops at O2

2020-05-25 Thread Richard Biener via Gcc-patches

On Fri, May 22, 2020 at 6:54 PM Segher Boessenkool
 wrote:
>
> On Fri, May 22, 2020 at 01:22:10PM +0200, Richard Biener wrote:
> > On Wed, May 20, 2020 at 10:37 PM Segher Boessenkool
> >  wrote:
> > >
> > > On Wed, May 20, 2020 at 12:30:30PM +0200, Richard Biener wrote:
> > > > I think this is the wrong way to approach this.  You're doing too many
> > > > things at once.  Try to fix the powerpc regression with the extra
> > > > flag_rtl_unroll_loops, that could be backported.  Then you can
> > > > independently see whether enabling more unrolling at -O2 makes
> > > > sense.  Because currently we _do_ unroll at -O2 when it does
> > > > not increase size.  Its just your patches make this as aggressive
> > > > as -O3.
> > >
> > > Just do a separate flag (and option) for cunroll, instead?
> > >
> > > The RTL unroller is *the* unroller, and has been since forever.
> >
> > Sorry but that ship has sailed - I don't think we should make
> > -O2 -funroll-loops no longer affect GIMPLE.
>
> Of course, and it would certainly be good if we could do some non-
> trivial unrolling earlier.  My point is, why rename all RTL stuff
> now?  That is just churn.
>
> (And we will need to keep unrolling in RTL for a long time still, if
> not forever).
>
> > But sure, what I am proposing is have
> >
> > -frtl-unroll-loops
> > -fgimple-unroll-loops
> >
> > with obvious semantics and -funroll-loops enabling both.
> > Users should not really care about what is RTL or GIMPLE.
>
> Yeah -- so this shouldn't be part of the name at all.  What you care
> about are higher level characteristics of the unrolling, like
> "complete" or "early" or whatever.
>
> > The split above allows the "bug" to be fixed (even on the branch)
> > without introducing even more target specialities.
>
> So does any split.  Or I don't see what you mean?

Well, a split that does not affect behavior for non-ppc architectures
when the flags by users are unchanged.  Because that allows
the ppc regression to be fixed on the branch.

Then, on trunk, we can think of a better overall flag design.  Note
that cunroll/cunrolli are not controlled by a flag currently, they
are gated on optimize >= [2|3] - it's just that either -funroll-loops
or -fpeel-loops causes its heuristics to allow code-size growth
by its own metrics according to the unroll --params.

So it's a bit difficult to retrofit the heuristic behavior onto new
flags unless we want to completely move that over to a --param
that may be gets adjusted by -funroll-loops.

Richard.

>
> Segher

Re: [patch] Fix internal error on store to FP component at -O2

2020-05-25 Thread Richard Biener via Gcc-patches

On Mon, May 25, 2020 at 2:14 PM Eric Botcazou  wrote:
>
> > Hmm, MAX_BITSIZE_MODE_ANY_INT looks a bit arbitrary since on
> > x86 it (IIRC) includes things like OImode. Maybe MOVE_MAX or
> > MAX_FIXED_MODE_SIZE are better suited here?
>
> I forgot to mention that I picked MAX_BITSIZE_MODE_ANY_INT because of:
>
>   bool invalid = (base_addr == NULL_TREE
>   || (maybe_gt (bitsize,
> (unsigned int) MAX_BITSIZE_MODE_ANY_INT)
>   && TREE_CODE (rhs) != INTEGER_CST
>   && (TREE_CODE (rhs) != CONSTRUCTOR
>   || CONSTRUCTOR_NELTS (rhs) != 0)));

That's likely because of the encoding limits of wide_ints I guess?  That is,
an implementation limit rather than one for code-quality.

Richard.

> --
> Eric Botcazou

[PATCH] Fix nonconforming memory_operand for vpmov instructions which has memory operand narrow than 128 bits [avx512f]

2020-05-25 Thread Hongtao Liu via Gcc-patches

  According to Intel SDM, VPMOVQB xmm1/m16 {k1}{z}, xmm2 has 16-bit
memory_operand instead of 128-bit one which exists in current
implementation. Also for other vpmov instructions which have
memory_operand narrower than 128bits.

  Bootstrap is ok, regression test for i386/x86-64 backend is ok.

gcc/ChangeLog

* config/i386/sse.md (*avx512vl_v2div2qi2_store): Refine
size of memory_operand according to Intel SDM.
(avx512vl_v2div2qi2_mask_store): Ditto.
(*avx512vl_v4qi2_store): Ditto.
(avx512vl_v4qi2_mask_store): Ditto.
(*avx512vl_v8qi2_store): Ditto.
(avx512vl_v8qi2_mask_store): Ditto.
(*avx512vl_v4hi2_store): Ditto.
(avx512vl_v4hi2_mask_store): Ditto.
(*avx512vl_v2div2hi2_store): Ditto.
(avx512vl_v2div2hi2_mask_store): Ditto.
(*avx512vl_v2div2si2_store): Ditto.
(avx512vl_v2div2si2_mask_store): Ditto.
(*avx512f_v8div16qi2_store): Ditto.
(avx512f_v8div16qi2_mask_store): Ditto.
* config/i386/i386-builtin-types.def: Adjust builtin type.
* config/i386/i386-expand.c: Ditto.
* config/i386/i386-builtin.def: Adjust builtin.
* config/i386/avx512fintrin.h: Ditto.
* config/i386/avx512vlbwintrin.h: Ditto.
* config/i386/avx512vlintrin.h: Ditto.

  I think the code i changed is already covered by existed intrinsics
tests, so i didn't add any new tests.
-- 
BR,
Hongtao
From b7bdb089b941a77da3ba0342d393a0cbfd4ac3aa Mon Sep 17 00:00:00 2001
From: liuhongt 
Date: Mon, 25 May 2020 16:10:06 +0800
Subject: [PATCH] Fix nonconforming memory_operand for
 vpmovq{d,w,b}/vpmovd{w,b}/vpmovwb.

According to Intel SDM, VPMOVQB xmm1/m16 {k1}{z}, xmm2 has 16-bit
memory_operand instead of 128-bit one which existed in current
implementation. Also for other vpmov instructions which have
memory_operand narrower than 128bits.

2020-05-25  Hongtao Liu  

gcc/ChangeLog

	* config/i386/sse.md (*avx512vl_v2div2qi2_store): Refine
	size of memory_operand according to Intel SDM.
	(avx512vl_v2div2qi2_mask_store): Ditto.
	(*avx512vl_v4qi2_store): Ditto.
	(avx512vl_v4qi2_mask_store): Ditto.
	(*avx512vl_v8qi2_store): Ditto.
	(avx512vl_v8qi2_mask_store): Ditto.
	(*avx512vl_v4hi2_store): Ditto.
	(avx512vl_v4hi2_mask_store): Ditto.
	(*avx512vl_v2div2hi2_store): Ditto.
	(avx512vl_v2div2hi2_mask_store): Ditto.
	(*avx512vl_v2div2si2_store): Ditto.
	(avx512vl_v2div2si2_mask_store): Ditto.
	(*avx512f_v8div16qi2_store): Ditto.
	(avx512f_v8div16qi2_mask_store): Ditto.
	* config/i386/i386-builtin-types.def: Adjust builtin type.
	* config/i386/i386-expand.c: Ditto.
	* config/i386/i386-builtin.def: Adjust builtin.
	* config/i386/avx512fintrin.h: Ditto.
	* config/i386/avx512vlbwintrin.h: Ditto.
	* config/i386/avx512vlintrin.h: Ditto.
---
 gcc/config/i386/avx512fintrin.h|   7 +-
 gcc/config/i386/avx512vlbwintrin.h |   6 +-
 gcc/config/i386/avx512vlintrin.h   |  49 ++--
 gcc/config/i386/i386-builtin-types.def |  20 +-
 gcc/config/i386/i386-builtin.def   |  60 ++---
 gcc/config/i386/i386-expand.c  |  20 +-
 gcc/config/i386/sse.md | 313 ++---
 7 files changed, 207 insertions(+), 268 deletions(-)

diff --git a/gcc/config/i386/avx512fintrin.h b/gcc/config/i386/avx512fintrin.h
index 012cf4eb31e..4bcd697387a 100644
--- a/gcc/config/i386/avx512fintrin.h
+++ b/gcc/config/i386/avx512fintrin.h
@@ -5613,7 +5613,8 @@ extern __inline void
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm512_mask_cvtepi64_storeu_epi8 (void * __P, __mmask8 __M, __m512i __A)
 {
-  __builtin_ia32_pmovqb512mem_mask ((__v16qi *) __P, (__v8di) __A, __M);
+  __builtin_ia32_pmovqb512mem_mask ((unsigned long long *) __P,
+(__v8di) __A, __M);
 }
 
 extern __inline __m128i
@@ -5648,7 +5649,7 @@ extern __inline void
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm512_mask_cvtsepi64_storeu_epi8 (void * __P, __mmask8 __M, __m512i __A)
 {
-  __builtin_ia32_pmovsqb512mem_mask ((__v16qi *) __P, (__v8di) __A, __M);
+  __builtin_ia32_pmovsqb512mem_mask ((unsigned long long *) __P, (__v8di) __A, __M);
 }
 
 extern __inline __m128i
@@ -5683,7 +5684,7 @@ extern __inline void
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm512_mask_cvtusepi64_storeu_epi8 (void * __P, __mmask8 __M, __m512i __A)
 {
-  __builtin_ia32_pmovusqb512mem_mask ((__v16qi *) __P, (__v8di) __A, __M);
+  __builtin_ia32_pmovusqb512mem_mask ((unsigned long long *) __P, (__v8di) __A, __M);
 }
 
 extern __inline __m128i
diff --git a/gcc/config/i386/avx512vlbwintrin.h b/gcc/config/i386/avx512vlbwintrin.h
index bee2639d60a..cd4275e0781 100644
--- a/gcc/config/i386/avx512vlbwintrin.h
+++ b/gcc/config/i386/avx512vlbwintrin.h
@@ -255,7 +255,7 @@ extern __inline void
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm_mask_cvtsepi16_storeu_epi8 (void * __P, __mmask8 __M,__m128i __A)
 {
-  __builtin_ia32_pmovswb128mem_mask ((__v8qi

Re: Do not stream redudnant stuff

2020-05-25 Thread Richard Biener

On Mon, 25 May 2020, Jan Hubicka wrote:

> Hi,
> as discussed on IRC this adds knob to disable stuff we stream "just for fun"
> (or to make it easier to debug streamer desychnonization).
> 
> Te size of .o files in gcc subdirectory is reduced form 506MB to 492MB
> 
> Bootstrapped/regtested x86_64-linux, OK?

OK.

> gcc/ChangeLog:
> 
> 2020-05-25  Jan Hubicka  
> 
>   * lto-streamer-out.c (lto_output_tree): Add streamer_debugging check.
>   * lto-streamer.h (streamer_debugging): New constant
>   * tree-streamer-in.c (streamer_read_tree_bitfields): Add
>   * streamer_debugging check.
>   (streamer_get_pickled_tree): Likewise.
>   * tree-streamer-out.c (pack_ts_base_value_fields): Likewise.
> 
> diff --git a/gcc/lto-streamer-out.c b/gcc/lto-streamer-out.c
> index 887a51d3eae..288e3c0f4c6 100644
> --- a/gcc/lto-streamer-out.c
> +++ b/gcc/lto-streamer-out.c
> @@ -1748,8 +1748,9 @@ lto_output_tree (struct output_block *ob, tree expr,
>will instantiate two different nodes for the same object.  */
>streamer_write_record_start (ob, LTO_tree_pickle_reference);
>streamer_write_uhwi (ob, ix);
> -  streamer_write_enum (ob->main_stream, LTO_tags, LTO_NUM_TAGS,
> -lto_tree_code_to_tag (TREE_CODE (expr)));
> +  if (streamer_debugging)
> + streamer_write_enum (ob->main_stream, LTO_tags, LTO_NUM_TAGS,
> +  lto_tree_code_to_tag (TREE_CODE (expr)));
>lto_stats.num_pickle_refs_output++;
>  }
>else
> diff --git a/gcc/lto-streamer.h b/gcc/lto-streamer.h
> index a466fb8b329..fbf1dd7162e 100644
> --- a/gcc/lto-streamer.h
> +++ b/gcc/lto-streamer.h
> @@ -125,6 +125,9 @@ along with GCC; see the file COPYING3.  If not see
>  
>  typedef unsigned charlto_decl_flags_t;
>  
> +/* Stream additional data to LTO object files to make it easier to debug
> +   streaming code.  This changes object files.  */
> +static const bool streamer_debugging = false;
>  
>  /* Tags representing the various IL objects written to the bytecode file
> (GIMPLE statements, basic blocks, EH regions, tree nodes, etc).
> diff --git a/gcc/tree-streamer-in.c b/gcc/tree-streamer-in.c
> index 450f40d58d3..d2e45e33554 100644
> --- a/gcc/tree-streamer-in.c
> +++ b/gcc/tree-streamer-in.c
> @@ -487,9 +487,13 @@ streamer_read_tree_bitfields (class lto_input_block *ib,
>  
>/* The first word in BP contains the code of the tree that we
>   are about to read.  */
> -  code = (enum tree_code) bp_unpack_value (, 16);
> -  lto_tag_check (lto_tree_code_to_tag (code),
> -  lto_tree_code_to_tag (TREE_CODE (expr)));
> +  if (streamer_debugging)
> +{
> +  code = (enum tree_code) bp_unpack_value (, 16);
> +  lto_tag_check (lto_tree_code_to_tag (code),
> +  lto_tree_code_to_tag (TREE_CODE (expr)));
> +}
> +  code = TREE_CODE (expr);
>  
>/* Note that all these functions are highly sensitive to changes in
>   the types and sizes of each of the fields being packed.  */
> @@ -1107,11 +,14 @@ streamer_get_pickled_tree (class lto_input_block *ib, 
> class data_in *data_in)
>enum LTO_tags expected_tag;
>  
>ix = streamer_read_uhwi (ib);
> -  expected_tag = streamer_read_enum (ib, LTO_tags, LTO_NUM_TAGS);
> -
>result = streamer_tree_cache_get_tree (data_in->reader_cache, ix);
> -  gcc_assert (result
> -  && TREE_CODE (result) == lto_tag_to_tree_code (expected_tag));
> +
> +  if (streamer_debugging)
> +{
> +  expected_tag = streamer_read_enum (ib, LTO_tags, LTO_NUM_TAGS);
> +  gcc_assert (result
> +   && TREE_CODE (result) == lto_tag_to_tree_code (expected_tag));
> +}
>  
>return result;
>  }
> diff --git a/gcc/tree-streamer-out.c b/gcc/tree-streamer-out.c
> index 4e8a12c71e6..94635c4a8ae 100644
> --- a/gcc/tree-streamer-out.c
> +++ b/gcc/tree-streamer-out.c
> @@ -71,7 +71,8 @@ write_identifier (struct output_block *ob,
>  static inline void
>  pack_ts_base_value_fields (struct bitpack_d *bp, tree expr)
>  {
> -  bp_pack_value (bp, TREE_CODE (expr), 16);
> +  if (streamer_debugging)
> +bp_pack_value (bp, TREE_CODE (expr), 16);
>if (!TYPE_P (expr))
>  {
>bp_pack_value (bp, TREE_SIDE_EFFECTS (expr), 1);
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)

Do not stream redudnant stuff

2020-05-25 Thread Jan Hubicka

Hi,
as discussed on IRC this adds knob to disable stuff we stream "just for fun"
(or to make it easier to debug streamer desychnonization).

Te size of .o files in gcc subdirectory is reduced form 506MB to 492MB

Bootstrapped/regtested x86_64-linux, OK?

gcc/ChangeLog:

2020-05-25  Jan Hubicka  

* lto-streamer-out.c (lto_output_tree): Add streamer_debugging check.
* lto-streamer.h (streamer_debugging): New constant
* tree-streamer-in.c (streamer_read_tree_bitfields): Add
* streamer_debugging check.
(streamer_get_pickled_tree): Likewise.
* tree-streamer-out.c (pack_ts_base_value_fields): Likewise.

diff --git a/gcc/lto-streamer-out.c b/gcc/lto-streamer-out.c
index 887a51d3eae..288e3c0f4c6 100644
--- a/gcc/lto-streamer-out.c
+++ b/gcc/lto-streamer-out.c
@@ -1748,8 +1748,9 @@ lto_output_tree (struct output_block *ob, tree expr,
 will instantiate two different nodes for the same object.  */
   streamer_write_record_start (ob, LTO_tree_pickle_reference);
   streamer_write_uhwi (ob, ix);
-  streamer_write_enum (ob->main_stream, LTO_tags, LTO_NUM_TAGS,
-  lto_tree_code_to_tag (TREE_CODE (expr)));
+  if (streamer_debugging)
+   streamer_write_enum (ob->main_stream, LTO_tags, LTO_NUM_TAGS,
+lto_tree_code_to_tag (TREE_CODE (expr)));
   lto_stats.num_pickle_refs_output++;
 }
   else
diff --git a/gcc/lto-streamer.h b/gcc/lto-streamer.h
index a466fb8b329..fbf1dd7162e 100644
--- a/gcc/lto-streamer.h
+++ b/gcc/lto-streamer.h
@@ -125,6 +125,9 @@ along with GCC; see the file COPYING3.  If not see
 
 typedef unsigned char  lto_decl_flags_t;
 
+/* Stream additional data to LTO object files to make it easier to debug
+   streaming code.  This changes object files.  */
+static const bool streamer_debugging = false;
 
 /* Tags representing the various IL objects written to the bytecode file
(GIMPLE statements, basic blocks, EH regions, tree nodes, etc).
diff --git a/gcc/tree-streamer-in.c b/gcc/tree-streamer-in.c
index 450f40d58d3..d2e45e33554 100644
--- a/gcc/tree-streamer-in.c
+++ b/gcc/tree-streamer-in.c
@@ -487,9 +487,13 @@ streamer_read_tree_bitfields (class lto_input_block *ib,
 
   /* The first word in BP contains the code of the tree that we
  are about to read.  */
-  code = (enum tree_code) bp_unpack_value (, 16);
-  lto_tag_check (lto_tree_code_to_tag (code),
-lto_tree_code_to_tag (TREE_CODE (expr)));
+  if (streamer_debugging)
+{
+  code = (enum tree_code) bp_unpack_value (, 16);
+  lto_tag_check (lto_tree_code_to_tag (code),
+lto_tree_code_to_tag (TREE_CODE (expr)));
+}
+  code = TREE_CODE (expr);
 
   /* Note that all these functions are highly sensitive to changes in
  the types and sizes of each of the fields being packed.  */
@@ -1107,11 +,14 @@ streamer_get_pickled_tree (class lto_input_block *ib, 
class data_in *data_in)
   enum LTO_tags expected_tag;
 
   ix = streamer_read_uhwi (ib);
-  expected_tag = streamer_read_enum (ib, LTO_tags, LTO_NUM_TAGS);
-
   result = streamer_tree_cache_get_tree (data_in->reader_cache, ix);
-  gcc_assert (result
-  && TREE_CODE (result) == lto_tag_to_tree_code (expected_tag));
+
+  if (streamer_debugging)
+{
+  expected_tag = streamer_read_enum (ib, LTO_tags, LTO_NUM_TAGS);
+  gcc_assert (result
+ && TREE_CODE (result) == lto_tag_to_tree_code (expected_tag));
+}
 
   return result;
 }
diff --git a/gcc/tree-streamer-out.c b/gcc/tree-streamer-out.c
index 4e8a12c71e6..94635c4a8ae 100644
--- a/gcc/tree-streamer-out.c
+++ b/gcc/tree-streamer-out.c
@@ -71,7 +71,8 @@ write_identifier (struct output_block *ob,
 static inline void
 pack_ts_base_value_fields (struct bitpack_d *bp, tree expr)
 {
-  bp_pack_value (bp, TREE_CODE (expr), 16);
+  if (streamer_debugging)
+bp_pack_value (bp, TREE_CODE (expr), 16);
   if (!TYPE_P (expr))
 {
   bp_pack_value (bp, TREE_SIDE_EFFECTS (expr), 1);

Re: [PATCH 1/2] Seperate -funroll-loops for GIMPLE unroller and RTL unroller

2020-05-25 Thread Richard Biener via Gcc-patches

On Mon, May 25, 2020 at 1:10 PM guojiufu  wrote:
>
> Currently option -funroll-loops controls both GIMPLE unroler and
> RTL unroller. It is not able to control GIMPLE cunroller and
> RTL unroller independently.  This patch introducing different flags
> to control them seperately, and this also provide more freedom to
> tune one of them without affecting another.
>
> This patch introduces two undocumented flags: -fcomplete-unroll-loops
> for GIMPLE cunroll, and -frtl-unroll-loops for RTL unroller.  And
> these two options are enabled by original -funroll-loops.

I don't like the new -fcomplete-unroll-loops name.  It does not
match the implementation either (the gate functions of cunroll
and cunrolli only check optimize against 2/3), given it just
controls whether unrolling can grow code size.

> -  SET_OPTION_IF_UNSET (opts, opts_set, flag_unroll_loops, value);
> +  SET_OPTION_IF_UNSET (opts, opts_set, flag_complete_unroll_loops, value);

This is also not a no-op change.

Since a new flag is not needed to fix the regression please avoid
adding -fcomplete-unroll-loops.

For -frtl-unroll-loops you should be able to use

EnabledBy(funroll-loops)

and thus you can simplify option handling accordingly.

Thanks,
Richard.

> Bootstrap and regtest pass on powerpc64le, is this ok for trunk?
>
> Jiufu
>
> ChangeLog:
> 2020-05-25  Jiufu Guo   
>
> * common.opt: Add -frtl-unroll-loops and -fcomplete-unroll-loops.
> * opts.c (enable_fdo_optimizations): Replace flag_unroll_loops
> with flag_complete_unroll_loops.
> * toplev.c (process_options): set flag_rtl_unroll_loops and
> flag_complete_unroll_loops if not explicitly set by user.
> * tree-ssa-loop-ivcanon.c (pass_complete_unroll::execute): Replace
> flag_unroll_loops with flag_complete_unroll_loops.
> * loop-init.c (pass_loop2::gate): Replace flag_unroll_loops with
> flag_rtl_unroll_loops.
> (pass_rtl_unroll_loops::gate): Replace flag_unroll_loops with
> flag_rtl_unroll_loops.
> ---
>  gcc/common.opt  | 8 
>  gcc/loop-init.c | 6 +++---
>  gcc/opts.c  | 2 +-
>  gcc/toplev.c| 6 ++
>  gcc/tree-ssa-loop-ivcanon.c | 2 +-
>  5 files changed, 19 insertions(+), 5 deletions(-)
>
> diff --git a/gcc/common.opt b/gcc/common.opt
> index 4464049fc1f..3b5ab52bb9d 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -2856,6 +2856,14 @@ funroll-all-loops
>  Common Report Var(flag_unroll_all_loops) Optimization
>  Perform loop unrolling for all loops.
>
> +frtl-unroll-loops
> +Common Undocumented Var(flag_rtl_unroll_loops) Init(2) Optimization
> +; Perform rtl loop unrolling when iteration count is known.
> +
> +fcomplete-unroll-loops
> +Common Undocumented Var(flag_complete_unroll_loops) Init(2) Optimization
> +; Perform GIMPLE loop complete unrolling.
> +
>  ; Nonzero means that loop optimizer may assume that the induction variables
>  ; that control loops do not overflow and that the loops with nontrivial
>  ; exit condition are not infinite
> diff --git a/gcc/loop-init.c b/gcc/loop-init.c
> index 401e5282907..e955068f36c 100644
> --- a/gcc/loop-init.c
> +++ b/gcc/loop-init.c
> @@ -360,7 +360,7 @@ pass_loop2::gate (function *fun)
>if (optimize > 0
>&& (flag_move_loop_invariants
>   || flag_unswitch_loops
> - || flag_unroll_loops
> + || flag_rtl_unroll_loops
>   || (flag_branch_on_count_reg && targetm.have_doloop_end ())
>   || cfun->has_unroll))
>  return true;
> @@ -560,7 +560,7 @@ public:
>/* opt_pass methods: */
>virtual bool gate (function *)
>  {
> -  return (flag_unroll_loops || flag_unroll_all_loops || 
> cfun->has_unroll);
> +  return (flag_rtl_unroll_loops || flag_unroll_all_loops || 
> cfun->has_unroll);
>  }
>
>virtual unsigned int execute (function *);
> @@ -576,7 +576,7 @@ pass_rtl_unroll_loops::execute (function *fun)
>if (dump_file)
> df_dump (dump_file);
>
> -  if (flag_unroll_loops)
> +  if (flag_rtl_unroll_loops)
> flags |= UAP_UNROLL;
>if (flag_unroll_all_loops)
> flags |= UAP_UNROLL_ALL;
> diff --git a/gcc/opts.c b/gcc/opts.c
> index ec3ca0720f9..64c35d8d7fc 100644
> --- a/gcc/opts.c
> +++ b/gcc/opts.c
> @@ -1702,7 +1702,7 @@ enable_fdo_optimizations (struct gcc_options *opts,
>  {
>SET_OPTION_IF_UNSET (opts, opts_set, flag_branch_probabilities, value);
>SET_OPTION_IF_UNSET (opts, opts_set, flag_profile_values, value);
> -  SET_OPTION_IF_UNSET (opts, opts_set, flag_unroll_loops, value);
> +  SET_OPTION_IF_UNSET (opts, opts_set, flag_complete_unroll_loops, value);
>SET_OPTION_IF_UNSET (opts, opts_set, flag_peel_loops, value);
>SET_OPTION_IF_UNSET (opts, opts_set, flag_tracer, value);
>SET_OPTION_IF_UNSET (opts, opts_set, flag_value_profile_transformations,
> diff --git a/gcc/toplev.c b/gcc/toplev.c
> index 96316fbd23b..c2b94d33464 100644
> ---

Re: [patch] Fix internal error on store to FP component at -O2

2020-05-25 Thread Eric Botcazou

> Hmm, MAX_BITSIZE_MODE_ANY_INT looks a bit arbitrary since on
> x86 it (IIRC) includes things like OImode. Maybe MOVE_MAX or
> MAX_FIXED_MODE_SIZE are better suited here?

I forgot to mention that I picked MAX_BITSIZE_MODE_ANY_INT because of:

  bool invalid = (base_addr == NULL_TREE
  || (maybe_gt (bitsize,
(unsigned int) MAX_BITSIZE_MODE_ANY_INT)
  && TREE_CODE (rhs) != INTEGER_CST
  && (TREE_CODE (rhs) != CONSTRUCTOR
  || CONSTRUCTOR_NELTS (rhs) != 0)));

-- 
Eric Botcazou

Re: [PATCH] Fix non-conforming expander [PR target/95211, PR target/95256]

2020-05-25 Thread Hongtao Liu via Gcc-patches

On Mon, May 25, 2020 at 8:00 PM Uros Bizjak  wrote:
>
> On Mon, May 25, 2020 at 1:56 PM Hongtao Liu  wrote:
> >
> > On Mon, May 25, 2020 at 7:36 PM Richard Biener  wrote:
> > >
> > > On Mon, 25 May 2020, Uros Bizjak wrote:
> > >
> > > > On Mon, May 25, 2020 at 8:27 AM Richard Biener  
> > > > wrote:
> > > > >
> > > > > On May 25, 2020 8:12:12 AM GMT+02:00, Uros Bizjak  
> > > > > wrote:
> > > > > >On Mon, May 25, 2020 at 7:53 AM Hongtao Liu  
> > > > > >wrote:
> > > > > >
> > > > > >> > We have to introduce a new expander, that will have conforming 
> > > > > >> > mode
> > > > > >of
> > > > > >> > output operand (V2SF) and will produce RTX that will match
> > > > > >> > *floatv2div2sf2. A paradoxical output subreg from
> > > > > >> > V2SFmode V4SFmode is needed, generated by simplify_gen_subreg as 
> > > > > >> > is
> > > > > >> > the case with paradoxical input subreg.
> > > > > >>
> > > > > >> Problem`is simplify_gen_subreg (V4SFmode, operands[0], V2SFmode, 0)
> > > > > >> will return NULL since
> > > > > >> 
> > > > > >> 948  /* Subregs involving floating point modes are not allowed to
> > > > > >> 949 change size.  Therefore (subreg:DI (reg:DF) 0) is fine, but
> > > > > >> 950 (subreg:SI (reg:DF) 0) isn't.  */
> > > > > >
> > > > > >But, we are not changing size, we are still operating with SFmode. It
> > > > > >looks to me that this limitation is too strict, the intention is to
> > > > > >not expand scalar SFmode to DFmode.
> > > > >
> > > > > I guess so. The test probably wants to tes the component mode.
> > > >
> > > > There is already some fishy x86 specific workaround in place, which I
> > > > took the liberty to extend to vector modes. The attached patch that
> > > > exercises this code works as expected, and produces expected code.
> > > >
> > > > Liu, can you please test the attached patch?
> > > >
> >
> > Bootstrap is ok, regression test for i386/x86-64 backend is ok.
>
> Nice. Will you ammend the ChangeLog and commit the patch?
>

Sure, i'll wait 24 hours in case there's other comments, then i'll
submit the patch.

> > > > Richi, is the middle end part OK?
> > >
> > > + i.e. (subreg:V4SF (reg:SF) 0) or (subreg:V4SF (reg:V2SF) 0).  This
> > > + surely isn't the cleanest way to represent this.  It's questionable
> > > + if this ought to be represented at all -- why can't this all be
> > > hidden
> > > + in post-reload splitters that make arbitrarily mode changes to the
> > > + registers themselves.  */
> > > +  else if (VECTOR_MODE_P (omode)
> > > +  && (GET_MODE_INNER (omode) == imode
> > > +  || (VECTOR_MODE_P (imode)
> > > +  && (GET_MODE_INNER (omode) == GET_MODE_INNER
> > > (imode)
> > >  ;
> > >
> > > I'd formulate this simpler, as
> > >
> > >   else if (VECTOR_MODE_P (omode)
> > >&& GET_MODE_INNER (omode) == GET_MODE_INNER (imode))
> > >
> >
> > I think they are functionally the same, so i won't bother to run
> > another round test.
>
> That is correct.
>
> Thanks,
> Uros.



-- 
BR,
Hongtao

Re: [patch] Fix internal error on store to FP component at -O2

2020-05-25 Thread Eric Botcazou

> Hmm, MAX_BITSIZE_MODE_ANY_INT looks a bit arbitrary since on
> x86 it (IIRC) includes things like OImode. Maybe MOVE_MAX or
> MAX_FIXED_MODE_SIZE are better suited here?

You're right, MAX_FIXED_MODE_SIZE is better, I'm going to change it.

> The IL correctness fix looks OK to me but it smells like we
> might have issues with inserting into x86 XFmode fields
> where ordinary XFmode stores store less bytes than
> an integer mode of the same size?  That issue should be
> latent of course (unless it always ICEd before).  Also
> it's likely reproducible only on Ada(?).

Yes, this can only happen in Ada because it effectively can have bit-fields of 
any type, and not only of integral types.

-- 
Eric Botcazou

Re: [patch] Fix internal error on store to FP component at -O2

2020-05-25 Thread Richard Biener via Gcc-patches

On Mon, May 25, 2020 at 12:33 PM Eric Botcazou  wrote:
>
> Hi,
>
> the attached Ada testcase triggers a GIMPLE verification failure at -O2 or
> above because the GIMPLE store merging pass generates a NOP_EXPR between a FP
> type and an integral type.  This happens when the bit-field insertion path is
> taken for a FP field, which can happen in Ada for bit-packed record types.
>
> It is fixed by generating an intermediate VIEW_CONVERT_EXPR.  The patch also
> tames a little the bit-field insertion path because, for bit-packed record
> types in Ada, you can end up with large bit-field regions, which results in a
> lot of mask-and-shifts instructions.
>
> Tested on x86-64/Linux, OK for the mainline?

Hmm, MAX_BITSIZE_MODE_ANY_INT looks a bit arbitrary since on
x86 it (IIRC) includes things like OImode. Maybe MOVE_MAX or
MAX_FIXED_MODE_SIZE are better suited here?

Of course the size of the whole region isn't what matters but how
many (if more than one) "region piece" (each word_mode size?)
a store crosses?  That is, do we want to prevent store-merging
across such boundaries?  Ah, that's the

@@ -4788,7 +4800,7 @@ pass_store_merging::process_store (gimple *stmt)
  && bitsize.is_constant (_bitsize)
  && ((const_bitsize % BITS_PER_UNIT) != 0
  || !multiple_p (bitpos, BITS_PER_UNIT))
- && const_bitsize <= 64)
+ && const_bitsize <= MAX_BITSIZE_MODE_ANY_INT)
{

hunk to which I have similar concerns.

The IL correctness fix looks OK to me but it smells like we
might have issues with inserting into x86 XFmode fields
where ordinary XFmode stores store less bytes than
an integer mode of the same size?  That issue should be
latent of course (unless it always ICEd before).  Also
it's likely reproducible only on Ada(?).

Thanks,
Richard.

>
> 2020-05-25  Eric Botcazou  
>
> * gimple-ssa-store-merging.c (merged_store_group::can_be_merged_into):
> Only turn MEM_REFs into bit-field stores for small bit-field regions.
> (imm_store_chain_info::output_merged_store): Be prepared for sources
> with non-integral type in the bit-field insertion case.
> (pass_store_merging::process_store): Use MAX_BITSIZE_MODE_ANY_INT as
> the largest size for the bit-field case.
>
>
> 2020-05-25  Eric Botcazou  
>
> * gnat.dg/opt84.adb: New test.
>
> --
> Eric Botcazou

Re: [PATCH] Fix non-conforming expander [PR target/95211, PR target/95256]

2020-05-25 Thread Uros Bizjak via Gcc-patches

On Mon, May 25, 2020 at 1:56 PM Hongtao Liu  wrote:
>
> On Mon, May 25, 2020 at 7:36 PM Richard Biener  wrote:
> >
> > On Mon, 25 May 2020, Uros Bizjak wrote:
> >
> > > On Mon, May 25, 2020 at 8:27 AM Richard Biener  wrote:
> > > >
> > > > On May 25, 2020 8:12:12 AM GMT+02:00, Uros Bizjak  
> > > > wrote:
> > > > >On Mon, May 25, 2020 at 7:53 AM Hongtao Liu  wrote:
> > > > >
> > > > >> > We have to introduce a new expander, that will have conforming mode
> > > > >of
> > > > >> > output operand (V2SF) and will produce RTX that will match
> > > > >> > *floatv2div2sf2. A paradoxical output subreg from
> > > > >> > V2SFmode V4SFmode is needed, generated by simplify_gen_subreg as is
> > > > >> > the case with paradoxical input subreg.
> > > > >>
> > > > >> Problem`is simplify_gen_subreg (V4SFmode, operands[0], V2SFmode, 0)
> > > > >> will return NULL since
> > > > >> 
> > > > >> 948  /* Subregs involving floating point modes are not allowed to
> > > > >> 949 change size.  Therefore (subreg:DI (reg:DF) 0) is fine, but
> > > > >> 950 (subreg:SI (reg:DF) 0) isn't.  */
> > > > >
> > > > >But, we are not changing size, we are still operating with SFmode. It
> > > > >looks to me that this limitation is too strict, the intention is to
> > > > >not expand scalar SFmode to DFmode.
> > > >
> > > > I guess so. The test probably wants to tes the component mode.
> > >
> > > There is already some fishy x86 specific workaround in place, which I
> > > took the liberty to extend to vector modes. The attached patch that
> > > exercises this code works as expected, and produces expected code.
> > >
> > > Liu, can you please test the attached patch?
> > >
>
> Bootstrap is ok, regression test for i386/x86-64 backend is ok.

Nice. Will you ammend the ChangeLog and commit the patch?

> > > Richi, is the middle end part OK?
> >
> > + i.e. (subreg:V4SF (reg:SF) 0) or (subreg:V4SF (reg:V2SF) 0).  This
> > + surely isn't the cleanest way to represent this.  It's questionable
> > + if this ought to be represented at all -- why can't this all be
> > hidden
> > + in post-reload splitters that make arbitrarily mode changes to the
> > + registers themselves.  */
> > +  else if (VECTOR_MODE_P (omode)
> > +  && (GET_MODE_INNER (omode) == imode
> > +  || (VECTOR_MODE_P (imode)
> > +  && (GET_MODE_INNER (omode) == GET_MODE_INNER
> > (imode)
> >  ;
> >
> > I'd formulate this simpler, as
> >
> >   else if (VECTOR_MODE_P (omode)
> >&& GET_MODE_INNER (omode) == GET_MODE_INNER (imode))
> >
>
> I think they are functionally the same, so i won't bother to run
> another round test.

That is correct.

Thanks,
Uros.

Re: [PATCH] Fix non-conforming expander [PR target/95211, PR target/95256]

2020-05-25 Thread Hongtao Liu via Gcc-patches

On Mon, May 25, 2020 at 7:36 PM Richard Biener  wrote:
>
> On Mon, 25 May 2020, Uros Bizjak wrote:
>
> > On Mon, May 25, 2020 at 8:27 AM Richard Biener  wrote:
> > >
> > > On May 25, 2020 8:12:12 AM GMT+02:00, Uros Bizjak  
> > > wrote:
> > > >On Mon, May 25, 2020 at 7:53 AM Hongtao Liu  wrote:
> > > >
> > > >> > We have to introduce a new expander, that will have conforming mode
> > > >of
> > > >> > output operand (V2SF) and will produce RTX that will match
> > > >> > *floatv2div2sf2. A paradoxical output subreg from
> > > >> > V2SFmode V4SFmode is needed, generated by simplify_gen_subreg as is
> > > >> > the case with paradoxical input subreg.
> > > >>
> > > >> Problem`is simplify_gen_subreg (V4SFmode, operands[0], V2SFmode, 0)
> > > >> will return NULL since
> > > >> 
> > > >> 948  /* Subregs involving floating point modes are not allowed to
> > > >> 949 change size.  Therefore (subreg:DI (reg:DF) 0) is fine, but
> > > >> 950 (subreg:SI (reg:DF) 0) isn't.  */
> > > >
> > > >But, we are not changing size, we are still operating with SFmode. It
> > > >looks to me that this limitation is too strict, the intention is to
> > > >not expand scalar SFmode to DFmode.
> > >
> > > I guess so. The test probably wants to tes the component mode.
> >
> > There is already some fishy x86 specific workaround in place, which I
> > took the liberty to extend to vector modes. The attached patch that
> > exercises this code works as expected, and produces expected code.
> >
> > Liu, can you please test the attached patch?
> >

Bootstrap is ok, regression test for i386/x86-64 backend is ok.

> > Richi, is the middle end part OK?
>
> + i.e. (subreg:V4SF (reg:SF) 0) or (subreg:V4SF (reg:V2SF) 0).  This
> + surely isn't the cleanest way to represent this.  It's questionable
> + if this ought to be represented at all -- why can't this all be
> hidden
> + in post-reload splitters that make arbitrarily mode changes to the
> + registers themselves.  */
> +  else if (VECTOR_MODE_P (omode)
> +  && (GET_MODE_INNER (omode) == imode
> +  || (VECTOR_MODE_P (imode)
> +  && (GET_MODE_INNER (omode) == GET_MODE_INNER
> (imode)
>  ;
>
> I'd formulate this simpler, as
>
>   else if (VECTOR_MODE_P (omode)
>&& GET_MODE_INNER (omode) == GET_MODE_INNER (imode))
>

I think they are functionally the same, so i won't bother to run
another round test.

> since GET_MODE_INNER (DFmode) == DFmode.
>
> We also allow (subreg:V4SF (reg:V2DF) 0) where we effectively
> subref DF with SF but the size check covers the whole vector, so
> the comment on the code with float modes is a bit misleading IMHO.
>
> It also leads to the strange situation that both
> (subreg:V4SF (reg:V2DF) 0) and (subreg:V8SF (reg:V4SF) 0) is valid
> but (subreg:V8SF (reg:V2DF) 0) is not.
>
> Anyway, I think generalizing the existing exception is obvious enough
> thus OK with my suggestion for the simplification if nobody else
> has differing comments.
>
> Thanks,
> Richard.



-- 
BR,
Hongtao

Re: [PATCH] Implement no_stack_protect attribute.

2020-05-25 Thread Richard Biener via Gcc-patches

On Mon, May 25, 2020 at 11:27 AM Martin Liška  wrote:
>
> On 5/22/20 12:51 PM, Richard Biener wrote:
> > On Thu, May 21, 2020 at 12:09 PM Martin Liška  wrote:
> >>
> >> On 5/18/20 1:49 PM, Richard Biener wrote:
> >>> On Mon, May 18, 2020 at 1:10 PM Jakub Jelinek via Gcc-patches
> >>>  wrote:
> 
>  On Mon, May 18, 2020 at 01:03:40PM +0200, Jakub Jelinek wrote:
> >> The optimize attribute is used to specify that a function is to be 
> >> compiled
> >> with different optimization options than specified on the command line.
> >> ```
> >>
> >> It's pretty clear about the command line arguments (that are ignored).
> >
> > That is not clear at all.  The difference is primarily in what the 
> > option
> > string says in there.
> 
>  And if we decide we want to keep existing behavior (haven't checked if we
>  actually behave that way yet), we should add some syntax that will allow
>  ammending command line options rather than replacing it.
> >>
> >> Hello.
> >>
> >> Back to this, I must say that option handling is a complex thing in the 
> >> GCC.
> >>
> >>>
> >>> We do behave this way - while we're running against the current
> >>> gcc_options we call decode_options which first does
> >>> default_options_optimization.  So it behaves inconsistently with
> >>> respect to options not explicitly listed in default_options_table.
> >>
> >> Yes, we basically build on top of the currently selection flags.
> >> We use default_options_table because any optimization level selection
> >> in an optimization attribute should efficiently enforce call of 
> >> default_options_table.
> >>
> >> What about using them only in case one changes optimization level (patch)?
> >
> > I'm not sure if that works, no -On means -O0, so one might interpret
> > optimize("omit-frame-pointer") as -O0 -fomit-frame-pointer.   Also -On
> > are not treated in position dependent way, thus -fno-tree-pre -O2 is
> > the same as -O2 -fno-tree-pre rather than re-enabling PRE over 
> > -fno-tree-pre.
> > So your patch would turn a -O2 -fno-tree-pre invocation, when
> > optimize("O3") is encountered, to one with PRE enabled, not matching
> > behavior of -O2 -fno-tree-pre -O3.
> >
> > I think the only sensible way is to save the original decoded options
> > from the gcc invocation (and have #pragma optimize push/pop append
> > accordingly) and append the current attribute/pragmas optimization
> > string to them and run that whole thing on a default option state.
>
> That should work. Anyway, that's a task with unclear finish.
> Would it be possible to add the no_stack_protect attribute for the time being?

I think it's a reasonable request, yes.

Richard.

> The limitation is there for ages and we have another "optimize" attributes
> that can be theoretically replaced with proper optimize handling.
>
> >
> > That makes semantics equivalent to appending more options to the
> > command-line.  Well, hopefully.  Interaction with the target attribute
> > might be interesting (that likely also needs to append to that
> > "current decoded options" set).
>
> I fear from the interaction of optimization and target attribute.
>
> >
> > As you say, option handling is difficult.
>
> Anyway, I'm planning to work on the options in stage1. It's quite close
> relative to PR92860.
>
> Martin
>
> >
> > Richard.
> >
> >> Martin
> >>
> >>>
> >>> But I also think doing the whole processing as in decode_options
> >>> is the only thing that's really tested.  And a fix would be to not
> >>> start from the current set of options but from a clean slate...
> >>>
> >>> The code clearly thinks we're doing incremental changes:
> >>>
> >>> /* Save current options.  */
> >>> cl_optimization_save (_opts, _options);
> >>>
> >>> /* If we previously had some optimization options, use them as the
> >>>default.  */
> >>> if (old_opts)
> >>>   cl_optimization_restore (_options,
> >>>TREE_OPTIMIZATION (old_opts));
> >>>
> >>> /* Parse options, and update the vector.  */
> >>> parse_optimize_options (args, true);
> >>> DECL_FUNCTION_SPECIFIC_OPTIMIZATION (*node)
> >>>   = build_optimization_node (_options);
> >>>
> >>> /* Restore current options.  */
> >>> cl_optimization_restore (_options, _opts);
> >>>
> >>> so for example you should see -fipa-pta preserved from the
> >>> command-line while -fno-tree-pta would be overridden even
> >>> if not mentioned explicitely in the optimize attribute.
> >>>
> >>> IMHO the current situation is far from useful...
> >>>
> >>> Richard.
> >>>
>  Could be say start the optimize attribute string with + or something
>  similar.
>  Does target attribute behave that way too?
> 
>    Jakub
> 
> >>
>

[PATCH] tree-optimization/95295 - fix wrong-code with SM

2020-05-25 Thread Richard Biener

We failed to compare the rematerialized store values when merging
paths after walking PHIs.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2020-05-25  Richard Biener  

PR tree-optimization/95295
* tree-ssa-loop-im.c (sm_seq_valid_bb): Compare remat stores
RHSes and drop to full sm_other if they are not equal.

* gcc.dg/torture/pr95295-1.c: New testcase.
* gcc.dg/torture/pr95295-2.c: Likewise.
* gcc.dg/torture/pr95283.c: Likewise.
---
 gcc/testsuite/gcc.dg/torture/pr95283.c   | 19 +++
 gcc/testsuite/gcc.dg/torture/pr95295-1.c | 15 +++
 gcc/testsuite/gcc.dg/torture/pr95295-2.c | 14 ++
 gcc/tree-ssa-loop-im.c   |  8 
 4 files changed, 56 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr95283.c
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr95295-1.c
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr95295-2.c

diff --git a/gcc/testsuite/gcc.dg/torture/pr95283.c 
b/gcc/testsuite/gcc.dg/torture/pr95283.c
new file mode 100644
index 000..950d3b07ead
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr95283.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+
+short c;
+_Bool d;
+unsigned e, f;
+char g, h;
+extern _Bool i[];
+void j()
+{
+  for (char a = 0; a < 100; a++)
+for (char b = 0; b < 20; b += 2)
+  {
+   if (e)
+ d = f = 0;
+   else
+ g = i[8] = 0;
+   h = c;
+  }
+}
diff --git a/gcc/testsuite/gcc.dg/torture/pr95295-1.c 
b/gcc/testsuite/gcc.dg/torture/pr95295-1.c
new file mode 100644
index 000..76a19f3d17b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr95295-1.c
@@ -0,0 +1,15 @@
+/* { dg-do run } */
+/* { dg-additional-sources "pr95295-2.c" } */
+
+extern int var_4, a;
+extern unsigned var_9;
+extern short arr_272[];
+void test()
+{
+  for (int b = 0; b < 9; b++)
+for (int c = 0; c < 9; c += 4)
+  {
+   arr_272[c] = var_9 ? var_4 : 0;
+   a = 0;
+  }
+}
diff --git a/gcc/testsuite/gcc.dg/torture/pr95295-2.c 
b/gcc/testsuite/gcc.dg/torture/pr95295-2.c
new file mode 100644
index 000..80caeb8740d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr95295-2.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+
+int a;
+int var_4 = 1;
+unsigned var_9 = 8;
+short arr_272[20];
+void test();
+int main()
+{
+  test();
+  if (arr_272[4] != 1)
+__builtin_abort ();
+  return 0;
+}
diff --git a/gcc/tree-ssa-loop-im.c b/gcc/tree-ssa-loop-im.c
index fcca099355a..b399bd0f729 100644
--- a/gcc/tree-ssa-loop-im.c
+++ b/gcc/tree-ssa-loop-im.c
@@ -2402,6 +2402,7 @@ sm_seq_valid_bb (class loop *loop, basic_block bb, tree 
vdef,
  if (edge_seq[i].second == sm_ord)
bitmap_set_bit (refs_not_supported, edge_seq[i].first);
  first_edge_seq[i].second = sm_other;
+ first_edge_seq[i].from = NULL_TREE;
}
  /* sm_other prevails.  */
  else if (first_edge_seq[i].second != edge_seq[i].second)
@@ -2410,7 +2411,14 @@ sm_seq_valid_bb (class loop *loop, basic_block bb, tree 
vdef,
  gcc_assert (bitmap_bit_p (refs_not_supported,
first_edge_seq[i].first));
  first_edge_seq[i].second = sm_other;
+ first_edge_seq[i].from = NULL_TREE;
}
+ else if (first_edge_seq[i].second == sm_other
+  && first_edge_seq[i].from != NULL_TREE
+  && (edge_seq[i].from == NULL_TREE
+  || !operand_equal_p (first_edge_seq[i].from,
+   edge_seq[i].from, 0)))
+   first_edge_seq[i].from = NULL_TREE;
}
  /* Any excess elements become sm_other since they are now
 coonditionally executed.  */
-- 
2.16.4

Re: [PATCH] Fix non-conforming expander [PR target/95211, PR target/95256]

2020-05-25 Thread Richard Biener

On Mon, 25 May 2020, Uros Bizjak wrote:

> On Mon, May 25, 2020 at 8:27 AM Richard Biener  wrote:
> >
> > On May 25, 2020 8:12:12 AM GMT+02:00, Uros Bizjak  wrote:
> > >On Mon, May 25, 2020 at 7:53 AM Hongtao Liu  wrote:
> > >
> > >> > We have to introduce a new expander, that will have conforming mode
> > >of
> > >> > output operand (V2SF) and will produce RTX that will match
> > >> > *floatv2div2sf2. A paradoxical output subreg from
> > >> > V2SFmode V4SFmode is needed, generated by simplify_gen_subreg as is
> > >> > the case with paradoxical input subreg.
> > >>
> > >> Problem`is simplify_gen_subreg (V4SFmode, operands[0], V2SFmode, 0)
> > >> will return NULL since
> > >> 
> > >> 948  /* Subregs involving floating point modes are not allowed to
> > >> 949 change size.  Therefore (subreg:DI (reg:DF) 0) is fine, but
> > >> 950 (subreg:SI (reg:DF) 0) isn't.  */
> > >
> > >But, we are not changing size, we are still operating with SFmode. It
> > >looks to me that this limitation is too strict, the intention is to
> > >not expand scalar SFmode to DFmode.
> >
> > I guess so. The test probably wants to tes the component mode.
> 
> There is already some fishy x86 specific workaround in place, which I
> took the liberty to extend to vector modes. The attached patch that
> exercises this code works as expected, and produces expected code.
> 
> Liu, can you please test the attached patch?
> 
> Richi, is the middle end part OK?

+ i.e. (subreg:V4SF (reg:SF) 0) or (subreg:V4SF (reg:V2SF) 0).  This
+ surely isn't the cleanest way to represent this.  It's questionable
+ if this ought to be represented at all -- why can't this all be 
hidden
+ in post-reload splitters that make arbitrarily mode changes to the
+ registers themselves.  */
+  else if (VECTOR_MODE_P (omode)
+  && (GET_MODE_INNER (omode) == imode
+  || (VECTOR_MODE_P (imode)
+  && (GET_MODE_INNER (omode) == GET_MODE_INNER 
(imode)
 ;

I'd formulate this simpler, as

  else if (VECTOR_MODE_P (omode)
   && GET_MODE_INNER (omode) == GET_MODE_INNER (imode))

since GET_MODE_INNER (DFmode) == DFmode.

We also allow (subreg:V4SF (reg:V2DF) 0) where we effectively
subref DF with SF but the size check covers the whole vector, so
the comment on the code with float modes is a bit misleading IMHO.

It also leads to the strange situation that both
(subreg:V4SF (reg:V2DF) 0) and (subreg:V8SF (reg:V4SF) 0) is valid
but (subreg:V8SF (reg:V2DF) 0) is not.

Anyway, I think generalizing the existing exception is obvious enough
thus OK with my suggestion for the simplification if nobody else
has differing comments.

Thanks,
Richard.

Re: testsuite: clarify scan-dump file globbing behavior

2020-05-25 Thread Frederik Harwath

Frederik Harwath  writes:

Hi Rainer, hi Mike,
ping: https://gcc.gnu.org/pipermail/gcc-patches/2020-May/545803.html

Best regards,
Frederik

> Hi Thomas,
>
> Thomas Schwinge  writes:
>
>> I can't formally approve testsuite patches, but did a review anyway:
>
> Thanks for the review!
>
>> On 2020-05-15T12:31:54+0200, Frederik Harwath  
>> wrote:
>
>>> The dump
>>> scanning procedures are changed to make the test unresolved
>>> if globbing matches more than one file.
>>
>> (The code changes look good, but I have not tested that specific aspect.)
>
> We do not have automated tests for the testsuite commands :-), but I
> have of course tested this manually.
>
>> As I said, not an approval, and minor comments (see below), but still:
>>
>> Reviewed-by: Thomas Schwinge 
>>
>> Do we have to similarly also audit/alter other testsuite infrastructure
>> files, anything that uses '[glob [...]]'?  (..., and then generalize
>> 'glob-dump-file' into 'glob-one-file', or similar.)  That can be done
>> incrementally, as far as I'm concerned.
>
> I also think it would make sense to adapt similar test commands as well.
>
>> May also make this more useful/explicit:
>>
>> This is useful if, for example, if a pass has several static
>> instances [correct terminology?], and depending on torture testing
>> command-line flags, a different instance executes and produces a dump
>> file, and so in the test case you can use a generic [put example
>> here] to scan the varying dump files names.
>>
>> (Or similar.)
>
> I have moved the explanation below the description of the individual
> commands and added an example. See the attached revised patch.
>
> Best regards,
> Frederik
>
> From 2a17749d6dbcac690d698323240438722d6119ef Mon Sep 17 00:00:00 2001
> From: Frederik Harwath 
> Date: Fri, 15 May 2020 10:35:48 +0200
> Subject: [PATCH] testsuite: clarify scan-dump file globbing behavior
>
> The test commands for scanning optimization dump files
> perform globbing on the argument that specifies the suffix
> of the dump files to be scanned.  This behavior is currently
> undocumented.  Furthermore, the current implementation of
> "scan-dump" and similar procedures yields an error whenever
> the file name globbing matches more than one file (due to an
> attempt to call "open" on multiple files) while a failure to
> match any file results in an unresolved test.
>
> This commit documents the globbing behavior.  The dump
> scanning procedures are changed to make the test unresolved
> if globbing matches more than one file.
>
> gcc/ChangeLog:
>
> 2020-05-19  Frederik Harwath  
>
>   * doc/sourcebuild.texi: Describe globbing of the
>   dump file scanning commands "suffix" argument.
>
> gcc/testsuite/ChangeLog:
>
> 2020-05-19  Frederik Harwath  
>
>   * lib/scandump.exp (glob-dump-file): New proc.
>   (scan-dump): Use glob-dump-file for file name expansion.
>   (scan-dump-times): Likewise.
>   (scan-dump-dem): Likewise.
>   (scan-dump-dem-not): Likewise.
>
> Reviewed-by: Thomas Schwinge 
> ---
>  gcc/doc/sourcebuild.texi   | 13 
>  gcc/testsuite/lib/scandump.exp | 54 +++---
>  2 files changed, 56 insertions(+), 11 deletions(-)
>
> diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
> index 240d6e4b08e..9df4b06d460 100644
> --- a/gcc/doc/sourcebuild.texi
> +++ b/gcc/doc/sourcebuild.texi
> @@ -2911,6 +2911,19 @@ Passes if @var{regex} does not match demangled text in 
> the dump file with
>  suffix @var{suffix}.
>  @end table
>
> +The @var{suffix} argument which describes the dump file to be scanned
> +may contain a glob pattern that must expand to exactly one file
> +name. This is useful if, e.g., different pass instances are executed
> +depending on torture testing command-line flags, producing dump files
> +whose names differ only in their pass instance number suffix.  For
> +example, to scan instances 1, 2, 3 of a tree pass ``mypass'' for
> +occurrences of the string ``code has been optimized'', use:
> +@smallexample
> +/* @{ dg-options "-fdump-tree-mypass" @} */
> +/* @{ dg-final @{ scan-tree-dump "code has been optimized" "mypass\[1-3\]" 
> @} @} */
> +@end smallexample
> +
> +
>  @subsubsection Check for output files
>
>  @table @code
> diff --git a/gcc/testsuite/lib/scandump.exp b/gcc/testsuite/lib/scandump.exp
> index d6ba350acc8..f3a991b590a 100644
> --- a/gcc/testsuite/lib/scandump.exp
> +++ b/gcc/testsuite/lib/scandump.exp
> @@ -39,6 +39,34 @@ proc dump-base { args } {
>  return $dumpbase
>  }
>
> +# Expand dump file name pattern to exactly one file.
> +# Return a single dump file name or an empty string
> +# if the pattern matches no file or more than one file.
> +#
> +# Argument 0 is the testcase name
> +# Argument 1 is the dump file glob pattern
> +proc glob-dump-file { args } {
> +
> +set pattern [lindex $args 1]
> +set dump_file "[glob -nocomplain $pattern]"
> +set num_files [llength $dump_file]
> +
> +if {

[PATCH 2/2] rs6000: Turn on -frtl-unroll-loops instead -funroll-loops at -O2

2020-05-25 Thread guojiufu via Gcc-patches

Previously, turning -funroll-loops on at -O2, which also turn on
GIMPLE cunroll fully.  While cunroll unrolls some complex loops.

This patch turn on -frtl-unroll-loops at -O2 only, and continue to
use previous tuned rs6000 heurisitics for small loops.  While this
patch does not turn on GIMPLE cunroll any more.  We may tune
cunroll in near future at -O2.

In this patch, it become simpler to check/set -fweb, -frename-register
and -munroll-only-small-loops.  Together with -frtl-unroll-loops, -fweb
is useful, then turn -fweb on;  and -frename-registers is no need to
be checked, because it is affected by -frtl-unroll-loops.


Bootstrap and regtest pass on powerpc64le, is this ok for trunk?
And backport to GCC10 together with the patch "Seperate -funroll-loops
 for GIMPLE unroller and RTL unroller"

Jiufu


gcc/ChangeLog
2020-05-25  Jiufu Guo   

PR target/95018
* common/config/rs6000/rs6000-common.c
(rs6000_option_optimization_table)
[OPT_LEVELS_2_PLUS_SPEED_ONLY]: Replace -funroll-loops
with -frtl-unroll-loops.  Remove -munroll-only-small-loops
and add -fweb.
[OPT_LEVELS_ALL]: Remove turn off -frename-registers.
* config/rs6000/rs6000.c (rs6000_option_override_internal):
-funroll-loops overrides -munroll-only-small-loops and
-frtl-unroll-loops.
---
 gcc/common/config/rs6000/rs6000-common.c | 11 +++
 gcc/config/rs6000/rs6000.c   | 21 ++---
 2 files changed, 13 insertions(+), 19 deletions(-)

diff --git a/gcc/common/config/rs6000/rs6000-common.c 
b/gcc/common/config/rs6000/rs6000-common.c
index ee37b9dc90b..c7388edb867 100644
--- a/gcc/common/config/rs6000/rs6000-common.c
+++ b/gcc/common/config/rs6000/rs6000-common.c
@@ -34,14 +34,9 @@ static const struct default_options 
rs6000_option_optimization_table[] =
 { OPT_LEVELS_ALL, OPT_fsplit_wide_types_early, NULL, 1 },
 /* Enable -fsched-pressure for first pass instruction scheduling.  */
 { OPT_LEVELS_1_PLUS, OPT_fsched_pressure, NULL, 1 },
-/* Enable -munroll-only-small-loops with -funroll-loops to unroll small
-   loops at -O2 and above by default.  */
-{ OPT_LEVELS_2_PLUS_SPEED_ONLY, OPT_funroll_loops, NULL, 1 },
-{ OPT_LEVELS_2_PLUS_SPEED_ONLY, OPT_munroll_only_small_loops, NULL, 1 },
-
-/* -frename-registers leads to non-optimal codegen and performance
-   on rs6000, turn it off by default.  */
-{ OPT_LEVELS_ALL, OPT_frename_registers, NULL, 0 },
+/* Enable -frtl-unroll-loops and -fweb at -O2 and above by default.  */
+{ OPT_LEVELS_2_PLUS_SPEED_ONLY, OPT_frtl_unroll_loops, NULL, 1 },
+{ OPT_LEVELS_2_PLUS_SPEED_ONLY, OPT_fweb, NULL, 1 },
 
 /* Double growth factor to counter reduced min jump length.  */
 { OPT_LEVELS_ALL, OPT__param_max_grow_copy_bb_insns_, NULL, 16 },
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 8435bc15d72..96620651a59 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -4557,17 +4557,16 @@ rs6000_option_override_internal (bool global_init_p)
   param_sched_pressure_algorithm,
   SCHED_PRESSURE_MODEL);
 
-  /* Explicit -funroll-loops turns -munroll-only-small-loops off, and
-turns -frename-registers on.  */
-  if ((global_options_set.x_flag_unroll_loops && flag_unroll_loops)
-  || (global_options_set.x_flag_unroll_all_loops
-  && flag_unroll_all_loops))
-   {
- if (!global_options_set.x_unroll_only_small_loops)
-   unroll_only_small_loops = 0;
- if (!global_options_set.x_flag_rename_registers)
-   flag_rename_registers = 1;
-   }
+  /* if -f[no-]unroll-loops is specified explicitly, turn [off/]on
+-frtl-unroll-loops.  */
+  if (global_options_set.x_flag_unroll_loops
+ && !global_options_set.x_flag_rtl_unroll_loops)
+   flag_rtl_unroll_loops = flag_unroll_loops;
+   
+  /* If flag_unroll_loops is effect, not _only_ small loops, but
+large loops are unrolled if possible.  */
+  if (!global_options_set.x_unroll_only_small_loops)
+   unroll_only_small_loops = flag_unroll_loops ? 0 : 1;
 
   /* If using typedef char *va_list, signal that
 __builtin_va_start (, 0) can be optimized to
-- 
2.17.1

[PATCH 1/2] Seperate -funroll-loops for GIMPLE unroller and RTL unroller

2020-05-25 Thread guojiufu via Gcc-patches

Currently option -funroll-loops controls both GIMPLE unroler and
RTL unroller. It is not able to control GIMPLE cunroller and
RTL unroller independently.  This patch introducing different flags
to control them seperately, and this also provide more freedom to
tune one of them without affecting another.

This patch introduces two undocumented flags: -fcomplete-unroll-loops
for GIMPLE cunroll, and -frtl-unroll-loops for RTL unroller.  And
these two options are enabled by original -funroll-loops.

Bootstrap and regtest pass on powerpc64le, is this ok for trunk?

Jiufu

ChangeLog:
2020-05-25  Jiufu Guo   

* common.opt: Add -frtl-unroll-loops and -fcomplete-unroll-loops.
* opts.c (enable_fdo_optimizations): Replace flag_unroll_loops
with flag_complete_unroll_loops.
* toplev.c (process_options): set flag_rtl_unroll_loops and
flag_complete_unroll_loops if not explicitly set by user.
* tree-ssa-loop-ivcanon.c (pass_complete_unroll::execute): Replace
flag_unroll_loops with flag_complete_unroll_loops.
* loop-init.c (pass_loop2::gate): Replace flag_unroll_loops with
flag_rtl_unroll_loops.
(pass_rtl_unroll_loops::gate): Replace flag_unroll_loops with
flag_rtl_unroll_loops.
---
 gcc/common.opt  | 8 
 gcc/loop-init.c | 6 +++---
 gcc/opts.c  | 2 +-
 gcc/toplev.c| 6 ++
 gcc/tree-ssa-loop-ivcanon.c | 2 +-
 5 files changed, 19 insertions(+), 5 deletions(-)

diff --git a/gcc/common.opt b/gcc/common.opt
index 4464049fc1f..3b5ab52bb9d 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2856,6 +2856,14 @@ funroll-all-loops
 Common Report Var(flag_unroll_all_loops) Optimization
 Perform loop unrolling for all loops.
 
+frtl-unroll-loops
+Common Undocumented Var(flag_rtl_unroll_loops) Init(2) Optimization
+; Perform rtl loop unrolling when iteration count is known.
+
+fcomplete-unroll-loops
+Common Undocumented Var(flag_complete_unroll_loops) Init(2) Optimization
+; Perform GIMPLE loop complete unrolling.
+
 ; Nonzero means that loop optimizer may assume that the induction variables
 ; that control loops do not overflow and that the loops with nontrivial
 ; exit condition are not infinite
diff --git a/gcc/loop-init.c b/gcc/loop-init.c
index 401e5282907..e955068f36c 100644
--- a/gcc/loop-init.c
+++ b/gcc/loop-init.c
@@ -360,7 +360,7 @@ pass_loop2::gate (function *fun)
   if (optimize > 0
   && (flag_move_loop_invariants
  || flag_unswitch_loops
- || flag_unroll_loops
+ || flag_rtl_unroll_loops
  || (flag_branch_on_count_reg && targetm.have_doloop_end ())
  || cfun->has_unroll))
 return true;
@@ -560,7 +560,7 @@ public:
   /* opt_pass methods: */
   virtual bool gate (function *)
 {
-  return (flag_unroll_loops || flag_unroll_all_loops || cfun->has_unroll);
+  return (flag_rtl_unroll_loops || flag_unroll_all_loops || 
cfun->has_unroll);
 }
 
   virtual unsigned int execute (function *);
@@ -576,7 +576,7 @@ pass_rtl_unroll_loops::execute (function *fun)
   if (dump_file)
df_dump (dump_file);
 
-  if (flag_unroll_loops)
+  if (flag_rtl_unroll_loops)
flags |= UAP_UNROLL;
   if (flag_unroll_all_loops)
flags |= UAP_UNROLL_ALL;
diff --git a/gcc/opts.c b/gcc/opts.c
index ec3ca0720f9..64c35d8d7fc 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -1702,7 +1702,7 @@ enable_fdo_optimizations (struct gcc_options *opts,
 {
   SET_OPTION_IF_UNSET (opts, opts_set, flag_branch_probabilities, value);
   SET_OPTION_IF_UNSET (opts, opts_set, flag_profile_values, value);
-  SET_OPTION_IF_UNSET (opts, opts_set, flag_unroll_loops, value);
+  SET_OPTION_IF_UNSET (opts, opts_set, flag_complete_unroll_loops, value);
   SET_OPTION_IF_UNSET (opts, opts_set, flag_peel_loops, value);
   SET_OPTION_IF_UNSET (opts, opts_set, flag_tracer, value);
   SET_OPTION_IF_UNSET (opts, opts_set, flag_value_profile_transformations,
diff --git a/gcc/toplev.c b/gcc/toplev.c
index 96316fbd23b..c2b94d33464 100644
--- a/gcc/toplev.c
+++ b/gcc/toplev.c
@@ -1474,6 +1474,12 @@ process_options (void)
   if (flag_unroll_all_loops)
 flag_unroll_loops = 1;
 
+  if (flag_rtl_unroll_loops == AUTODETECT_VALUE)
+flag_rtl_unroll_loops = flag_unroll_loops;
+
+  if (flag_complete_unroll_loops == AUTODETECT_VALUE)
+flag_complete_unroll_loops = flag_unroll_loops;
+
   /* web and rename-registers help when run after loop unrolling.  */
   if (flag_web == AUTODETECT_VALUE)
 flag_web = flag_unroll_loops;
diff --git a/gcc/tree-ssa-loop-ivcanon.c b/gcc/tree-ssa-loop-ivcanon.c
index 8ab6ab3330c..cd5df353df5 100644
--- a/gcc/tree-ssa-loop-ivcanon.c
+++ b/gcc/tree-ssa-loop-ivcanon.c
@@ -1603,7 +1603,7 @@ pass_complete_unroll::execute (function *fun)
  re-peeling the same loop multiple times.  */
   if (flag_peel_loops)
 peeled_loops = BITMAP_ALLOC (NULL);
-  unsigned int val = tree_unroll_loops_completely (flag_unroll_loops
+

[PATCH] tree-optimization/95271 - fix bswap vectorization invariant SLP type

2020-05-25 Thread Richard Biener

This properly updates invariant SLP nodes vector types for bswap
vectorization.

Bootstrapped / tested on x86_64-unknown-linux-gnu, applied.

2020-05-25  Richard Biener  

PR tree-optimization/95271
* tree-vect-stmts.c (vectorizable_bswap): Update invariant SLP
children vector type.
(vectorizable_call): Pass down slp ops.

* gcc.dg/vect/bb-slp-pr95271.c: New testcase.
---
 gcc/testsuite/gcc.dg/vect/bb-slp-pr95271.c | 19 +++
 gcc/tree-vect-stmts.c  | 12 +++-
 2 files changed, 30 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/bb-slp-pr95271.c

diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-pr95271.c 
b/gcc/testsuite/gcc.dg/vect/bb-slp-pr95271.c
new file mode 100644
index 000..2f235980405
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/bb-slp-pr95271.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=cooperlake" { target x86_64-*-* i?86-*-* } 
} */
+
+int a;
+struct b c;
+long d;
+struct b {
+  unsigned long address;
+  unsigned long e;
+};
+void f()
+{
+  d = (long)()[0] << 56 | (long)((unsigned char *))[1] << 48 |
+  (long)((unsigned char *))[2] << 40 |
+  (long)((unsigned char *))[3] << 32 |
+  (long)((unsigned char *))[4] << 24 | ((unsigned char *))[5] << 16 |
+  ((unsigned char *))[6] << 8 | ((unsigned char *))[7];
+  c.address = c.e = d;
+}
diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index e7822c44951..9023dd4c216 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -3009,6 +3009,7 @@ static bool
 vectorizable_bswap (vec_info *vinfo,
stmt_vec_info stmt_info, gimple_stmt_iterator *gsi,
stmt_vec_info *vec_stmt, slp_tree slp_node,
+   slp_tree *slp_op,
tree vectype_in, stmt_vector_for_cost *cost_vec)
 {
   tree op, vectype;
@@ -3051,6 +3052,15 @@ vectorizable_bswap (vec_info *vinfo,
 
   if (! vec_stmt)
 {
+  if (slp_node
+ && !vect_maybe_update_slp_op_vectype (slp_op[0], vectype_in))
+   {
+ if (dump_enabled_p ())
+   dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+"incompatible vector types for invariants\n");
+ return false;
+   }
+
   STMT_VINFO_TYPE (stmt_info) = call_vec_info_type;
   DUMP_VECT_SCOPE ("vectorizable_bswap");
   if (! slp_node)
@@ -3377,7 +3387,7 @@ vectorizable_call (vec_info *vinfo,
   || gimple_call_builtin_p (stmt, BUILT_IN_BSWAP32)
   || gimple_call_builtin_p (stmt, BUILT_IN_BSWAP64)))
return vectorizable_bswap (vinfo, stmt_info, gsi, vec_stmt, slp_node,
-  vectype_in, cost_vec);
+  slp_op, vectype_in, cost_vec);
   else
{
  if (dump_enabled_p ())
-- 
2.25.1

[PATCH] tree-optimization/95297 - handle scalar shift arg for SLP invariant vectype

2020-05-25 Thread Richard Biener

This skips invariant vector type setting for a scalar shift argument.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2020-05-25  Richard Biener  

PR tree-optimization/95297
* tree-vect-stmts.c (vectorizable_shift): For scalar_shift_arg
skip updating operand 1 vector type.

* g++.dg/vect/pr95297.cc: New testcase.
---
 gcc/testsuite/g++.dg/vect/pr95297.cc | 22 ++
 gcc/tree-vect-stmts.c|  3 ++-
 2 files changed, 24 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/vect/pr95297.cc

diff --git a/gcc/testsuite/g++.dg/vect/pr95297.cc 
b/gcc/testsuite/g++.dg/vect/pr95297.cc
new file mode 100644
index 000..6ffc92e5fd0
--- /dev/null
+++ b/gcc/testsuite/g++.dg/vect/pr95297.cc
@@ -0,0 +1,22 @@
+// { dg-do compile }
+// { dg-additional-options "-O3 -fvect-cost-model=dynamic" }
+
+extern bool var_10;
+extern int var_16;
+extern short var_17;
+extern long var_18;
+extern int arr_3[][13];
+
+int min(const int , const int )
+{
+  return a < b ? a : b;
+}
+
+void test() {
+for (short a = 0; a < 010; a++)
+  for (char b = 0; b < 012; b++)
+   arr_3[a][b] = min(-var_10, 0) + 2147483647 >> var_10;
+var_16 = (bool)4;
+var_17 = 0;
+var_18 = -1594153176;
+}
diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index 9023dd4c216..76c7b995817 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -5784,7 +5784,8 @@ vectorizable_shift (vec_info *vinfo,
 {
   if (slp_node
  && (!vect_maybe_update_slp_op_vectype (slp_op0, vectype)
- || !vect_maybe_update_slp_op_vectype (slp_op1, op1_vectype)))
+ || (!scalar_shift_arg
+ && !vect_maybe_update_slp_op_vectype (slp_op1, op1_vectype
{
  if (dump_enabled_p ())
dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
-- 
2.25.1

[WIP] Fold 'NON_LVALUE_EXPR' some more (was: C++ 'NON_LVALUE_EXPR' in OMP array section handling)

2020-05-25 Thread Thomas Schwinge

Hi!

Anyone have any input here, especially whether something like the WIP
patch attached to generally "Fold 'NON_LVALUE_EXPR' some more" is
preferable over local 'STRIP_NOPS'?

On 2020-03-26T20:53:19+0100, I wrote:
> On 2020-03-26T09:09:01-0600, Sandra Loosemore  wrote:
>> On 3/26/20 8:27 AM, Thomas Schwinge wrote:
>>> Note that as this code is shared between OpenACC/OpenMP, this might
>>> affect OpenMP, too, as far as I can tell.  (Subject updated.)  Jakub, can
>>> you please have a look, too?
>>>
>>> On 2020-03-25T23:02:38-0600, Sandra Loosemore  
>>> wrote:
 The attached patch fixes a bug I found in the C++ front end's handling
 of OpenACC data clauses.  The problem here is that the C++ front end
 wraps the array bounds expressions in NON_LVALUE_EXPR tree nodes, and
 the code here (which appears to have been copied from similar code in
 the C front end) was failing to strip those before checking to see if
 they were INTEGER_CST nodes, etc.
>>>
>>> So, I had a quick look.  I'm confirming the 'NON_LVALUE_EXPR' (C++ only,
>>> not seen for C) difference, and that 'STRIP_NOPS' gets rid of these.
>>> However, I also in some code paths see, for example, 'integer_nonzerop'
>>> calls, which internally do 'STRIP_ANY_LOCATION_WRAPPER'.
>>>
>>> I don't know if 'STRIP_NOPS' is the correct "hammer" to use here, I don't
>>> know what the usual convention is: explicitly remove (via 'STRIP_NOPS' as
>>> you suggested, or something else), or have the enquiry functions do it
>>> ('STRIP_ANY_LOCATION_WRAPPER' as 'integer_nonzerop' is doing, for
>>> example).
>>>
>>> Empirical data doesn't mean too much here, of course, I'm not seeing a
>>> lot of explicit 'STRIP_*' calls in 'gcc/cp/semantics.c'.  ;-)
>>
>> I saw that STRIP_NOPS seem to be the preferred way to do things in e.g.
>> fold-const.c.  I don't know if there's a reason to use some less general
>> form of STRIP_* here?

 Sadly, I have no test case for this because it was only triggering an
 error in conjunction with some other OpenACC patches that are not yet on
 trunk

>> In the current code [we have]
>> checks like
>>
>> TREE_CODE (low_bound) == INTEGER_CST
>>
>> etc.  So when they're wrapped in NON_LVALUE_EXPRs, it's basically
>> failing to detect constants in array dimension specifiers and falling
>> through to other code.
>
> Eh, indeed...  ;-\ (So we should be able to deduce some misbehavior from
> that, to construct a test case.  I'll have a look.)

(Have not yet been able to look into constructing any test cases.)


> So.  I'm not objecting to handling 'NON_LVALUE_EXPR's locally here via
> any kind of 'STRIP_*', but it somehow doesn't seem the general solution.
> Another option seems to be to teach 'fold_simple' to handle
> 'NON_LVALUE_EXPR's, so that the existing code:
>
> /* We need to reduce to real constant-values for checks below.  */
> if (length)
>   length = fold_simple (length);
> if (low_bound)
>   low_bound = fold_simple (low_bound);
> if (low_bound
> && TREE_CODE (low_bound) == INTEGER_CST
> && [...])
>   low_bound = fold_convert (sizetype, low_bound);
> if (length
> && TREE_CODE (length) == INTEGER_CST
> && [...])
>   length = fold_convert (sizetype, length);
>
> ... would then just work.  But: I don't know if 'fold_simple' (and
> others?) should handle 'NON_LVALUE_EXPR's, or if there's a reason why it
> doesn't.  (Have not yet tried to figure that out.)  What I can tell is
> that the attached patch does solve the issue in the C++ OMP array section
> handling, and survives a powerpc64le-unknown-linux-gnu
> '--enable-checking=yes' (will now re-run with 'fold' checking) bootstrap,
> with no testsuite regressions.
>
> Hmm.


Grüße
 Thomas


-
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander 
Walter
>From 611fbe24b7e459829c0a304a58963d4987c8de0a Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Thu, 26 Mar 2020 21:22:54 +0100
Subject: [PATCH] Fold 'NON_LVALUE_EXPR' some more

---
 gcc/cp/constexpr.c | 1 +
 gcc/fold-const.c   | 1 +
 2 files changed, 2 insertions(+)

diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index 192face9a3a..f31d61c1460 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -6650,6 +6650,7 @@ fold_simple_1 (tree t)
 case BIT_NOT_EXPR:
 case TRUTH_NOT_EXPR:
 case NOP_EXPR:
+case NON_LVALUE_EXPR:
 case VIEW_CONVERT_EXPR:
 case CONVERT_EXPR:
 case FLOAT_EXPR:
diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index 71a1d3eb735..b6bc5080ff3 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -1739,6 +1739,7 @@ const_unop (enum tree_code code, tree type, tree arg0)
   switch (code)
 {
 CASE_CONVERT:
+case NON_LVALUE_EXPR:
 case FLOAT_EXPR:
 case FIX_TRUNC_EXPR:
 case FIXED_CONVERT_EXPR:
-- 
2.17.1

[PATCH] tree-optimization/95308 - really avoid forward propagating of

2020-05-25 Thread Richard Biener

This fixes a hole that still allowed forwarding of TARGET_MEM_REF
addresses.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2020-05-25  Richard Biener  

PR tree-optimization/95308
* tree-ssa-forwprop.c (pass_forwprop::execute): Generalize
test for TARGET_MEM_REFs.

* g++.dg/torture/pr95308.C: New testcase.
---
 gcc/testsuite/g++.dg/torture/pr95308.C | 21 +
 gcc/tree-ssa-forwprop.c| 14 +++---
 2 files changed, 28 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/torture/pr95308.C

diff --git a/gcc/testsuite/g++.dg/torture/pr95308.C 
b/gcc/testsuite/g++.dg/torture/pr95308.C
new file mode 100644
index 000..01aa8ad17a6
--- /dev/null
+++ b/gcc/testsuite/g++.dg/torture/pr95308.C
@@ -0,0 +1,21 @@
+// { dg-do compile }
+// { dg-additional-options "-march=skylake-avx512" { target x86_64-*-* 
i?86-*-* } }
+
+extern int a[][18];
+extern short b[], c[];
+extern char d[][18];
+int e;
+void i(char f, long g[][100][100][100])
+{
+  for (int h = 0;; h += 2)
+for (char j = 0; j < 17; j++) {
+   if (e ? f : 0) {
+   a[h][j] = 5;
+   for (int k = 0; k < 12; k += 4)
+ for (short l = 0; l < 015; l += 2)
+   b[k * 3 + l] = bool(g[2][j][k][l]);
+   } else
+ d[h][j] = 0;
+   c[j] = 3;
+}
+}
diff --git a/gcc/tree-ssa-forwprop.c b/gcc/tree-ssa-forwprop.c
index b2ea3622d70..759baf56897 100644
--- a/gcc/tree-ssa-forwprop.c
+++ b/gcc/tree-ssa-forwprop.c
@@ -2763,18 +2763,18 @@ pass_forwprop::execute (function *fun)
 
  /* If this statement sets an SSA_NAME to an address,
 try to propagate the address into the uses of the SSA_NAME.  */
- if (code == ADDR_EXPR
- /* Handle pointer conversions on invariant addresses
-as well, as this is valid gimple.  */
- || (CONVERT_EXPR_CODE_P (code)
- && TREE_CODE (rhs) == ADDR_EXPR
- && POINTER_TYPE_P (TREE_TYPE (lhs
+ if ((code == ADDR_EXPR
+  /* Handle pointer conversions on invariant addresses
+ as well, as this is valid gimple.  */
+  || (CONVERT_EXPR_CODE_P (code)
+  && TREE_CODE (rhs) == ADDR_EXPR
+  && POINTER_TYPE_P (TREE_TYPE (lhs
+ && TREE_CODE (TREE_OPERAND (rhs, 0)) != TARGET_MEM_REF)
{
  tree base = get_base_address (TREE_OPERAND (rhs, 0));
  if ((!base
   || !DECL_P (base)
   || decl_address_invariant_p (base))
- && TREE_CODE (base) != TARGET_MEM_REF
  && !stmt_references_abnormal_ssa_name (stmt)
  && forward_propagate_addr_expr (lhs, rhs, true))
{
-- 
2.12.3

[patch] Fix internal error on store to FP component at -O2

2020-05-25 Thread Eric Botcazou

Hi,

the attached Ada testcase triggers a GIMPLE verification failure at -O2 or 
above because the GIMPLE store merging pass generates a NOP_EXPR between a FP 
type and an integral type.  This happens when the bit-field insertion path is 
taken for a FP field, which can happen in Ada for bit-packed record types.

It is fixed by generating an intermediate VIEW_CONVERT_EXPR.  The patch also 
tames a little the bit-field insertion path because, for bit-packed record 
types in Ada, you can end up with large bit-field regions, which results in a 
lot of mask-and-shifts instructions.

Tested on x86-64/Linux, OK for the mainline?


2020-05-25  Eric Botcazou  

* gimple-ssa-store-merging.c (merged_store_group::can_be_merged_into):
Only turn MEM_REFs into bit-field stores for small bit-field regions.
(imm_store_chain_info::output_merged_store): Be prepared for sources
with non-integral type in the bit-field insertion case.
(pass_store_merging::process_store): Use MAX_BITSIZE_MODE_ANY_INT as
the largest size for the bit-field case.


2020-05-25  Eric Botcazou  

* gnat.dg/opt84.adb: New test.

-- 
Eric Botcazou--  { dg-do compile }
--  { dg-options "-O2" }

with Ada.Text_IO;
with Interfaces;

procedure Opt84 is

   type Integer_8  is new Interfaces.Integer_8;
   type Integer_16 is new Interfaces.Integer_16;
   type Integer_32 is new Interfaces.Integer_32;

   type Float_32 is new Interfaces.IEEE_Float_32;

   type Natural_4 is range 0 .. 2 ** 4 - 1;
   for Natural_4'Size use 4;

   type Rec_Type is
  record
 Field_A_Int_8: Integer_8;
 Field_B_Nat_4: Natural_4;
 Field_C_Nat_4: Natural_4;
 Field_D_Int_32   : Integer_32;
 Field_E_Int_32   : Integer_32;
 Field_F_Float_32 : Float_32;
 Field_G_Float_32 : Float_32;
 Field_H_Float_32 : Float_32;
 Field_I_Float_32 : Float_32;
 Field_J_Int_16   : Integer_16;
 Field_K_Int_16   : Integer_16;
 Field_L_Int_16   : Integer_16;
 Field_M_Int_16   : Integer_16;
 Field_N_Float_32 : Float_32;
 Field_O_Float_32 : Float_32;
  end record;
   pragma Pack (Rec_Type);
   for Rec_Type'Alignment use 1;

   procedure Print
 (Item : in Rec_Type) is
   begin
  Ada.Text_IO.Put_Line (Item.Field_F_Float_32'Image);
  Ada.Text_IO.Put_Line (Item.Field_G_Float_32'Image);
  Ada.Text_IO.Put_Line (Item.Field_H_Float_32'Image);
  Ada.Text_IO.Put_Line (Item.Field_I_Float_32'Image);
   end Print;

   procedure Test_Foo is
  Source : Rec_Type;
  Dest   : Rec_Type;
   begin
  Source.Field_A_Int_8 := 0;
  Dest.Field_A_Int_8:= 1;
  Dest.Field_B_Nat_4:= Source.Field_B_Nat_4;
  Dest.Field_C_Nat_4:= Source.Field_C_Nat_4;
  Dest.Field_D_Int_32   := Source.Field_D_Int_32;
  Dest.Field_E_Int_32   := Source.Field_E_Int_32;
  Dest.Field_F_Float_32 := Source.Field_F_Float_32;
  Dest.Field_G_Float_32 := Source.Field_G_Float_32;
  Dest.Field_H_Float_32 := Source.Field_H_Float_32;
  Dest.Field_I_Float_32 := Source.Field_I_Float_32;
  Dest.Field_J_Int_16   := Source.Field_J_Int_16;
  Dest.Field_K_Int_16   := Source.Field_K_Int_16;
  Dest.Field_L_Int_16   := Source.Field_L_Int_16;
  Dest.Field_M_Int_16   := Source.Field_M_Int_16;
  Dest.Field_N_Float_32 := Source.Field_N_Float_32;
  Dest.Field_O_Float_32 := Source.Field_O_Float_32;
  Print (Source);
  Print (Dest);
   end Test_Foo;

begin
   Test_Foo;
end;
diff --git a/gcc/gimple-ssa-store-merging.c b/gcc/gimple-ssa-store-merging.c
index c8e1877f540..65e27f38fd4 100644
--- a/gcc/gimple-ssa-store-merging.c
+++ b/gcc/gimple-ssa-store-merging.c
@@ -1867,19 +1867,22 @@ merged_store_group::can_be_merged_into (store_immediate_info *info)
   if (stores[0]->rhs_code == BIT_INSERT_EXPR && info->rhs_code == INTEGER_CST)
 return true;
 
-  /* We can turn MEM_REF into BIT_INSERT_EXPR for bit-field stores.  */
+  /* We can turn MEM_REF into BIT_INSERT_EXPR for bit-field stores, but do it
+ only for small regions since this can generate a lot of instructions.  */
   if (info->rhs_code == MEM_REF
   && (stores[0]->rhs_code == INTEGER_CST
 	  || stores[0]->rhs_code == BIT_INSERT_EXPR)
   && info->bitregion_start == stores[0]->bitregion_start
-  && info->bitregion_end == stores[0]->bitregion_end)
+  && info->bitregion_end == stores[0]->bitregion_end
+  && info->bitregion_end - info->bitregion_start < MAX_BITSIZE_MODE_ANY_INT)
 return true;
 
   if (stores[0]->rhs_code == MEM_REF
   && (info->rhs_code == INTEGER_CST
 	  || info->rhs_code == BIT_INSERT_EXPR)
   && info->bitregion_start == stores[0]->bitregion_start
-  && info->bitregion_end == stores[0]->bitregion_end)
+  && info->bitregion_end == stores[0]->bitregion_end
+  && info->bitregion_end - info->bitregion_start < MAX_BITSIZE_MODE_ANY_INT)
 return true;
 
   return false;
@@

Re: [PATCH v3] Add -fuse-ld= to specify an arbitrary executable as the linker

2020-05-25 Thread Martin Liška


On 5/22/20 6:42 AM, Fangrui Song wrote:

but I can't fix this one because joining two lines will break the 80-column 
rule.


What about this:

diff --git a/gcc/collect2.c b/gcc/collect2.c
index cc57a20e08b..e5b54b080f7 100644
--- a/gcc/collect2.c
+++ b/gcc/collect2.c
@@ -1138,8 +1138,8 @@ main (int argc, char **argv)
   /* Search the ordinary system bin directories
 for `ld' (if native linking) or `TARGET-ld' (if cross).  */
   if (ld_file_name == 0)
-   ld_file_name =
- find_a_file (, full_ld_suffixes[selected_linker], X_OK);
+   ld_file_name
+ = find_a_file (, full_ld_suffixes[selected_linker], X_OK);
 }
 
 #ifdef REAL_NM_FILE_NAME


Apart from that, the patch is fine.

Martin

Re: V3 [PATCH] x86: Move cpuinfo.h from libgcc to common/config/i386

2020-05-25 Thread Martin Liška


Hello.

I really welcome the unification patch and I have some comments (ideas):

1)


+static inline int
+has_cpu_feature (struct __processor_model *cpu_model,
+  unsigned int *cpu_features2,
+  enum processor_features f)
+{
+  unsigned int i;
+  if (f < 32)
+return cpu_model->__cpu_features[0] & (1U << (f & 31));
+  for (i = 0; i < SIZE_OF_CPU_FEATURES; i++)
+if (f < (32 + 32 + i * 32))
+return cpu_features2[i] & (1U << ((f - (32 + i * 32)) & 31));
+  gcc_unreachable ();
+}
+
+static inline void
+set_cpu_feature (struct __processor_model *cpu_model,
+  unsigned int *cpu_features2,
+  enum processor_features f)
+{
+  unsigned int i;
+  if (f < 32)
+{
+  cpu_model->__cpu_features[0] |= (1U << (f & 31));
+  return;
+}
+  for (i = 0; i < SIZE_OF_CPU_FEATURES; i++)
+if (f < (32 + 32 + i * 32))
+  {
+ cpu_features2[i] |= (1U << ((f - (32 + i * 32)) & 31));
+ return;
+  }
+  gcc_unreachable ();
+}


Can you please add comment about the '32 + 32' and '32 + i * 32', it's unclear 
from the patch.

2) Can we remove all the:


+  unsigned int has_lahf_lm, has_sse4a;
+  unsigned int has_longmode, has_3dnowp, has_3dnow;
+  unsigned int has_movbe, has_sse4_1, has_sse4_2;
+  unsigned int has_popcnt, has_aes, has_avx, has_avx2;
+  unsigned int has_pclmul, has_abm, has_lwp;
+  unsigned int has_fma, has_fma4, has_xop;
+  unsigned int has_bmi, has_bmi2, has_tbm, has_lzcnt;


...


+  has_sse4a = has_feature (FEATURE_SSE4_A);
+  has_fma4 = has_feature (FEATURE_FMA4);
+  has_xop = has_feature (FEATURE_XOP);
+  has_fma = has_feature (FEATURE_FMA);


...

  if (arch)
{
  const char *mmx = has_mmx ? " -mmmx" : " -mno-mmx";
  const char *mmx3dnow = has_3dnow ? " -m3dnow" : " -mno-3dnow";
  const char *sse = has_sse ? " -msse" : " -mno-sse";
...

  options = concat (options, mmx, mmx3dnow, sse, sse2, sse3, ssse3,
sse4a, cx16, sahf, movbe, aes, sha, pclmul,
popcnt, abm, lwp, fma, fma4, xop, bmi, sgx, bmi2,
pconfig, wbnoinvd,
tbm, avx, avx2, sse4_2, sse4_1, lzcnt, rtm,
hle, rdrnd, f16c, fsgsbase, rdseed, prfchw, adx,
fxsr, xsave, xsaveopt, avx512f, avx512er,
avx512cd, avx512pf, prefetchwt1, clflushopt,
xsavec, xsaves, avx512dq, avx512bw, avx512vl,
avx512ifma, avx512vbmi, avx5124fmaps, avx5124vnniw,
clwb, mwaitx, clzero, pku, rdpid, gfni, shstk,
avx512vbmi2, avx512vnni, vaes, vpclmulqdq,
avx512bitalg, avx512vpopcntdq, movdiri, movdir64b,
waitpkg, cldemote, ptwrite, avx512bf16, enqcmd,
avx512vp2intersect, serialize, tsxldtrk, NULL);

and instead mark flags in 'isa_names_table' that should be used for 
-match=native option emission.
We know names of the flags, their value (has_feature) and so we can generate 
that automatically?

3) Similarly we can we do automatically all the:

+  if (has_feature (FEATURE_AVX512VBMI2))
+assert (__builtin_cpu_supports ("avx512vbmi2"));

We should have both information in 'isa_names_table'.

Thanks,
Martin

Re: [PATCH] Implement no_stack_protect attribute.

2020-05-25 Thread Martin Liška


On 5/22/20 12:51 PM, Richard Biener wrote:

On Thu, May 21, 2020 at 12:09 PM Martin Liška  wrote:


On 5/18/20 1:49 PM, Richard Biener wrote:

On Mon, May 18, 2020 at 1:10 PM Jakub Jelinek via Gcc-patches
 wrote:


On Mon, May 18, 2020 at 01:03:40PM +0200, Jakub Jelinek wrote:

The optimize attribute is used to specify that a function is to be compiled
with different optimization options than specified on the command line.
```

It's pretty clear about the command line arguments (that are ignored).


That is not clear at all.  The difference is primarily in what the option
string says in there.


And if we decide we want to keep existing behavior (haven't checked if we
actually behave that way yet), we should add some syntax that will allow
ammending command line options rather than replacing it.


Hello.

Back to this, I must say that option handling is a complex thing in the GCC.



We do behave this way - while we're running against the current
gcc_options we call decode_options which first does
default_options_optimization.  So it behaves inconsistently with
respect to options not explicitly listed in default_options_table.


Yes, we basically build on top of the currently selection flags.
We use default_options_table because any optimization level selection
in an optimization attribute should efficiently enforce call of 
default_options_table.

What about using them only in case one changes optimization level (patch)?


I'm not sure if that works, no -On means -O0, so one might interpret
optimize("omit-frame-pointer") as -O0 -fomit-frame-pointer.   Also -On
are not treated in position dependent way, thus -fno-tree-pre -O2 is
the same as -O2 -fno-tree-pre rather than re-enabling PRE over -fno-tree-pre.
So your patch would turn a -O2 -fno-tree-pre invocation, when
optimize("O3") is encountered, to one with PRE enabled, not matching
behavior of -O2 -fno-tree-pre -O3.

I think the only sensible way is to save the original decoded options
from the gcc invocation (and have #pragma optimize push/pop append
accordingly) and append the current attribute/pragmas optimization
string to them and run that whole thing on a default option state.


That should work. Anyway, that's a task with unclear finish.
Would it be possible to add the no_stack_protect attribute for the time being?
The limitation is there for ages and we have another "optimize" attributes
that can be theoretically replaced with proper optimize handling.



That makes semantics equivalent to appending more options to the
command-line.  Well, hopefully.  Interaction with the target attribute
might be interesting (that likely also needs to append to that
"current decoded options" set).


I fear from the interaction of optimization and target attribute.



As you say, option handling is difficult.


Anyway, I'm planning to work on the options in stage1. It's quite close
relative to PR92860.

Martin



Richard.


Martin



But I also think doing the whole processing as in decode_options
is the only thing that's really tested.  And a fix would be to not
start from the current set of options but from a clean slate...

The code clearly thinks we're doing incremental changes:

/* Save current options.  */
cl_optimization_save (_opts, _options);

/* If we previously had some optimization options, use them as the
   default.  */
if (old_opts)
  cl_optimization_restore (_options,
   TREE_OPTIMIZATION (old_opts));

/* Parse options, and update the vector.  */
parse_optimize_options (args, true);
DECL_FUNCTION_SPECIFIC_OPTIMIZATION (*node)
  = build_optimization_node (_options);

/* Restore current options.  */
cl_optimization_restore (_options, _opts);

so for example you should see -fipa-pta preserved from the
command-line while -fno-tree-pta would be overridden even
if not mentioned explicitely in the optimize attribute.

IMHO the current situation is far from useful...

Richard.


Could be say start the optimize attribute string with + or something
similar.
Does target attribute behave that way too?

  Jakub

Re: New mklog script

2020-05-25 Thread Martin Liška


On 5/22/20 11:01 PM, Jason Merrill wrote:

On Thu, May 21, 2020 at 6:03 PM Jason Merrill  wrote:


On Fri, May 15, 2020 at 11:39 AM Martin Liška  wrote:


On 5/15/20 3:22 PM, Marek Polacek wrote:

On Fri, May 15, 2020 at 03:12:27PM +0200, Martin Liška wrote:

On 5/15/20 2:42 PM, Marek Polacek wrote:

I actually use mklog -i all the time.  But I can work around it if it
disappears.


Ah, I can see a consumer.
There's an updated version that supports that.

For the future, will you still use the option? Wouldn't be better
to put the ChangeLog content directly to commit message? Note
that you won't have to copy the entries to a particular ChangeLog file.


The way I do it is to generate a patch using format-patch, use mklog -i
on it, then add the ChangeLog entry to the commit message via commit --amend.


Hmm, you can do much better with:

$ git diff | ./contrib/mklog > changelog && git commit -a -t changelog

Or for an already created commit you can do:

$ git diff HEAD~ | ./contrib/mklog > changelog && git commit -a --amend -e -F 
changelog


With these git aliases:

 mklog-editor = "!f() { git show | git gcc-mklog >> $1; }; f"
 addlog = "!f() { GIT_EDITOR='git mklog-editor' git commit --amend; }; 
f"

I can 'git addlog' to append the output of mklog to the current
commit.  Probably better would be to do something with
prepare-commit-msg.


This is pretty rudimentary, but good enough as a start:


I like the idea of usage of the prepare commit hook.



#!/bin/sh

#COMMIT_MSG_FILE=$1
#COMMIT_SOURCE=$2
#SHA1=$3


It's better to use the named arguments.



if ! [ -f "$1" ]; then exit 0; fi

#echo "# $0 $1 $2 $3" >> $1

if fgrep 'ChangeLog:' $1 > /dev/null 2>&1; then exit 0; fi

if [ -z "$2" ]; then
 cmd="diff --cached"
elif [ $2 == commit ]; then
 cmd="show $3"
else
 exit 0
fi

git $cmd | git gcc-mklog >> $1



Well, that will generate changelog entry for each commit.
For a user branch development, it's not desirable.

What about more explicit approach:

1) making an alias for: git diff | git gcc-mklog > commit.msg
2) hook:

if test -f commit.msg; then
  cat commit.msg >> "$COMMIT_MSG_FILE"
  rm commit.msg
fi

So the changelog is created explicitly and included implicitly.

Martin

[PATCH] tree-optimization/95284 - amend previous store commoning fix

2020-05-25 Thread Richard Biener

Generalize check for clobbers.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2020-05-25  Richard Biener  

PR tree-optimization/95284
* tree-ssa-sink.c (sink_common_stores_to_bb): Amend previous
fix.

* g++.dg/torture/pr95284.C: New testcase.
---
 gcc/testsuite/g++.dg/torture/pr95284.C | 16 
 gcc/tree-ssa-sink.c|  2 +-
 2 files changed, 17 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/torture/pr95284.C

diff --git a/gcc/testsuite/g++.dg/torture/pr95284.C 
b/gcc/testsuite/g++.dg/torture/pr95284.C
new file mode 100644
index 000..3c273ef8d55
--- /dev/null
+++ b/gcc/testsuite/g++.dg/torture/pr95284.C
@@ -0,0 +1,16 @@
+// { dg-do compile }
+// { dg-require-effective-target lp64 }
+
+#include 
+
+short a;
+unsigned long long c;
+char d;
+unsigned e;
+
+void f()
+{
+  for (;;)
+for (char b = 0; b < 19; b += 2)
+  a = std::min((1 ? d : 0) ? e : c, (unsigned long long)72252803048);
+}
diff --git a/gcc/tree-ssa-sink.c b/gcc/tree-ssa-sink.c
index b61ecf12d1f..962ad076968 100644
--- a/gcc/tree-ssa-sink.c
+++ b/gcc/tree-ssa-sink.c
@@ -536,7 +536,7 @@ sink_common_stores_to_bb (basic_block bb)
  else if (! operand_equal_p (gimple_assign_lhs (first_store),
  gimple_assign_lhs (def), 0)
   || (gimple_clobber_p (first_store)
-  && !gimple_clobber_p (def)))
+  != gimple_clobber_p (def)))
{
  first_store = NULL;
  break;
-- 
2.25.1

[Ada] Fix internal error on problematic renaming

2020-05-25 Thread Eric Botcazou

This is an internal renaming generated for a generalized loop iteration made 
on a tagged record type with predicate, and gigi cannot use the most efficient 
way of implementing renamings because the renamed object is an expression with 
a non-empty Actions list.

Tested on x86-64/Linux, applied on the mainline.


2020-05-25  Eric Botcazou  

* gcc-interface/decl.c (gnat_to_gnu_entity): Add new local variable
and use it throughout the function.
: Rename local variable and adjust accordingly.  In the
case of a renaming, materialize the entity if the renamed object is
an N_Expression_With_Actions node.
: Use Alias accessor function consistently.


2020-05-25  Eric Botcazou  

* gnat.dg/renaming16.adb: New test.
* gnat.dg/renaming16_pkg.ads: New helper.

-- 
Eric Botcazoudiff --git a/gcc/ada/gcc-interface/decl.c b/gcc/ada/gcc-interface/decl.c
index bd69c3ab306..94ea05de14f 100644
--- a/gcc/ada/gcc-interface/decl.c
+++ b/gcc/ada/gcc-interface/decl.c
@@ -280,6 +280,8 @@ gnat_to_gnu_entity (Entity_Id gnat_entity, tree gnu_expr, bool definition)
 {
   /* The construct that declared the entity.  */
   const Node_Id gnat_decl = Declaration_Node (gnat_entity);
+  /* The object that the entity renames, if any.  */
+  const Entity_Id gnat_renamed_obj = Renamed_Object (gnat_entity);
   /* The kind of the entity.  */
   const Entity_Kind kind = Ekind (gnat_entity);
   /* True if this is a type.  */
@@ -327,7 +329,7 @@ gnat_to_gnu_entity (Entity_Id gnat_entity, tree gnu_expr, bool definition)
   /* Contains the list of attributes directly attached to the entity.  */
   struct attrib *attr_list = NULL;
 
-  /* Since a use of an Itype is a definition, process it as such if it is in
+  /* Since a use of an itype is a definition, process it as such if it is in
  the main unit, except for E_Access_Subtype because it's actually a use
  of its base type, see below.  */
   if (!definition
@@ -375,7 +377,7 @@ gnat_to_gnu_entity (Entity_Id gnat_entity, tree gnu_expr, bool definition)
 	}
 	}
 
-  /* This abort means the Itype has an incorrect scope, i.e. that its
+  /* This abort means the itype has an incorrect scope, i.e. that its
 	 scope does not correspond to the subprogram it is first used in.  */
   gcc_unreachable ();
 }
@@ -448,6 +450,14 @@ gnat_to_gnu_entity (Entity_Id gnat_entity, tree gnu_expr, bool definition)
  If we are not defining it, it must be a type or an entity that is defined
  elsewhere or externally, otherwise we should have defined it already.
 
+ In other words, the failure of this assertion typically arises when a
+ reference to an entity (type or object) is made before its declaration,
+ either directly or by means of a freeze node which is incorrectly placed.
+ This can also happen for an entity referenced out of context, for example
+ a parameter outside of the subprogram where it is declared.  GNAT_ENTITY
+ is the N_Defining_Identifier of the entity, the problematic N_Identifier
+ being the argument passed to Identifier_to_gnu in the parent frame.
+
  One exception is for an entity, typically an inherited operation, which is
  a local alias for the parent's operation.  It is neither defined, since it
  is an inherited operation, nor public, since it is declared in the current
@@ -636,7 +646,7 @@ gnat_to_gnu_entity (Entity_Id gnat_entity, tree gnu_expr, bool definition)
 	  && !gnu_expr
 	  && No (Address_Clause (gnat_entity))
 	  && !No_Initialization (gnat_decl)
-	  && No (Renamed_Object (gnat_entity)))
+	  && No (gnat_renamed_obj))
 	{
 	  gnu_decl = error_mark_node;
 	  saved = true;
@@ -692,7 +702,7 @@ gnat_to_gnu_entity (Entity_Id gnat_entity, tree gnu_expr, bool definition)
 	 && !Treat_As_Volatile (gnat_entity)
 	 && (((Nkind (gnat_decl) == N_Object_Declaration)
 		  && Present (Expression (gnat_decl)))
-		 || Present (Renamed_Object (gnat_entity))
+		 || Present (gnat_renamed_obj)
 		 || imported_p));
 	bool inner_const_flag = const_flag;
 	bool static_flag = Is_Statically_Allocated (gnat_entity);
@@ -704,20 +714,20 @@ gnat_to_gnu_entity (Entity_Id gnat_entity, tree gnu_expr, bool definition)
 	bool mutable_p = false;
 	bool used_by_ref = false;
 	tree gnu_ext_name = NULL_TREE;
-	tree renamed_obj = NULL_TREE;
+	tree gnu_renamed_obj = NULL_TREE;
 	tree gnu_ada_size = NULL_TREE;
 
 	/* We need to translate the renamed object even though we are only
 	   referencing the renaming.  But it may contain a call for which
 	   we'll generate a temporary to hold the return value and which
 	   is part of the definition of the renaming, so discard it.  */
-	if (Present (Renamed_Object (gnat_entity)) && !definition)
+	if (Present (gnat_renamed_obj) && !definition)
 	  {
 	if (kind == E_Exception)
 	  gnu_expr = gnat_to_gnu_entity (Renamed_Entity (gnat_entity),
 	 NULL_TREE, false);
 	else
-	  gnu_expr = gnat_to_gnu_external

Re: [Ada] Fix small issues with -fgnat-encodings=minimal

2020-05-25 Thread Eric Botcazou

> This is the mode where the GNAT compiler does not use special encodings in
> the debug info to describe some Ada constructs, for example packed array
> types.

This fixes a small fallout of the patch.  Tested on x86-64/Linux, applied on 
the mainline.


2020-05-25  Eric Botcazou  

* gcc-interface/misc.c (get_array_bit_stride): Get to the debug type,
if any, before calling gnat_get_array_descr_info.

-- 
Eric Botcazoudiff --git a/gcc/ada/gcc-interface/misc.c b/gcc/ada/gcc-interface/misc.c
index 5a5850a85e0..f8fa8563161 100644
--- a/gcc/ada/gcc-interface/misc.c
+++ b/gcc/ada/gcc-interface/misc.c
@@ -1003,6 +1003,9 @@ get_array_bit_stride (tree comp_type)
   if (INTEGRAL_TYPE_P (comp_type))
 return TYPE_RM_SIZE (comp_type);
 
+  /* The gnat_get_array_descr_info debug hook expects a debug tyoe.  */
+  comp_type = maybe_debug_type (comp_type);
+
   /* Otherwise, see if this is an array we can analyze; if it's not, punt.  */
   memset (, 0, sizeof (info));
   if (!gnat_get_array_descr_info (comp_type, ) || !info.stride)

Re: [PATCH] Fix non-conforming expander [PR target/95211, PR target/95256]

2020-05-25 Thread Uros Bizjak via Gcc-patches

On Mon, May 25, 2020 at 8:27 AM Richard Biener  wrote:
>
> On May 25, 2020 8:12:12 AM GMT+02:00, Uros Bizjak  wrote:
> >On Mon, May 25, 2020 at 7:53 AM Hongtao Liu  wrote:
> >
> >> > We have to introduce a new expander, that will have conforming mode
> >of
> >> > output operand (V2SF) and will produce RTX that will match
> >> > *floatv2div2sf2. A paradoxical output subreg from
> >> > V2SFmode V4SFmode is needed, generated by simplify_gen_subreg as is
> >> > the case with paradoxical input subreg.
> >>
> >> Problem`is simplify_gen_subreg (V4SFmode, operands[0], V2SFmode, 0)
> >> will return NULL since
> >> 
> >> 948  /* Subregs involving floating point modes are not allowed to
> >> 949 change size.  Therefore (subreg:DI (reg:DF) 0) is fine, but
> >> 950 (subreg:SI (reg:DF) 0) isn't.  */
> >
> >But, we are not changing size, we are still operating with SFmode. It
> >looks to me that this limitation is too strict, the intention is to
> >not expand scalar SFmode to DFmode.
>
> I guess so. The test probably wants to tes the component mode.

There is already some fishy x86 specific workaround in place, which I
took the liberty to extend to vector modes. The attached patch that
exercises this code works as expected, and produces expected code.

Liu, can you please test the attached patch?

Richi, is the middle end part OK?

Uros.
diff --git a/gcc/config/i386/i386-builtin.def b/gcc/config/i386/i386-builtin.def
index fa123788a8e..b873498f3ab 100644
--- a/gcc/config/i386/i386-builtin.def
+++ b/gcc/config/i386/i386-builtin.def
@@ -1649,9 +1649,9 @@ BDESC (OPTION_MASK_ISA_AVX512DQ | 
OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_ufix_not
 BDESC (OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_ufix_notruncv4dfv4si2_mask, 
"__builtin_ia32_cvtpd2udq256_mask", IX86_BUILTIN_CVTPD2UDQ256_MASK, UNKNOWN, 
(int) V4SI_FTYPE_V4DF_V4SI_UQI)
 BDESC (OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_ufix_notruncv2dfv2si2_mask, 
"__builtin_ia32_cvtpd2udq128_mask", IX86_BUILTIN_CVTPD2UDQ128_MASK, UNKNOWN, 
(int) V4SI_FTYPE_V2DF_V4SI_UQI)
 BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, 0, 
CODE_FOR_fix_truncv4sfv4di2_mask, "__builtin_ia32_cvttps2qq256_mask", 
IX86_BUILTIN_CVTTPS2QQ256, UNKNOWN, (int) V4DI_FTYPE_V4SF_V4DI_UQI)
-BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, 0, 
CODE_FOR_fix_truncv2sfv2di2_mask, "__builtin_ia32_cvttps2qq128_mask", 
IX86_BUILTIN_CVTTPS2QQ128, UNKNOWN, (int) V2DI_FTYPE_V4SF_V2DI_UQI)
+BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, 0, 
CODE_FOR_avx512dq_fix_truncv2sfv2di2_mask, "__builtin_ia32_cvttps2qq128_mask", 
IX86_BUILTIN_CVTTPS2QQ128, UNKNOWN, (int) V2DI_FTYPE_V4SF_V2DI_UQI)
 BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, 0, 
CODE_FOR_fixuns_truncv4sfv4di2_mask, "__builtin_ia32_cvttps2uqq256_mask", 
IX86_BUILTIN_CVTTPS2UQQ256, UNKNOWN, (int) V4DI_FTYPE_V4SF_V4DI_UQI)
-BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, 0, 
CODE_FOR_fixuns_truncv2sfv2di2_mask, "__builtin_ia32_cvttps2uqq128_mask", 
IX86_BUILTIN_CVTTPS2UQQ128, UNKNOWN, (int) V2DI_FTYPE_V4SF_V2DI_UQI)
+BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, 0, 
CODE_FOR_avx512dq_fixuns_truncv2sfv2di2_mask, 
"__builtin_ia32_cvttps2uqq128_mask", IX86_BUILTIN_CVTTPS2UQQ128, UNKNOWN, (int) 
V2DI_FTYPE_V4SF_V2DI_UQI)
 BDESC (OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_fix_truncv8sfv8si2_mask, 
"__builtin_ia32_cvttps2dq256_mask", IX86_BUILTIN_CVTTPS2DQ256_MASK, UNKNOWN, 
(int) V8SI_FTYPE_V8SF_V8SI_UQI)
 BDESC (OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_fix_truncv4sfv4si2_mask, 
"__builtin_ia32_cvttps2dq128_mask", IX86_BUILTIN_CVTTPS2DQ128_MASK, UNKNOWN, 
(int) V4SI_FTYPE_V4SF_V4SI_UQI)
 BDESC (OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_ufix_truncv8sfv8si2_mask, 
"__builtin_ia32_cvttps2udq256_mask", IX86_BUILTIN_CVTTPS2UDQ256, UNKNOWN, (int) 
V8SI_FTYPE_V8SF_V8SI_UQI)
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index cee2b2890a5..fde65391d7d 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -5795,7 +5795,7 @@
(set_attr "prefix" "evex")
(set_attr "mode" "")])
 
-(define_expand "floatv2div2sf2"
+(define_expand "avx512dq_floatv2div2sf2"
   [(set (match_operand:V4SF 0 "register_operand" "=v")
(vec_concat:V4SF
(any_float:V2SF (match_operand:V2DI 1 "nonimmediate_operand" "vm"))
@@ -5803,7 +5803,7 @@
   "TARGET_AVX512DQ && TARGET_AVX512VL"
   "operands[2] = CONST0_RTX (V2SFmode);")
 
-(define_insn "*floatv2div2sf2"
+(define_insn "*avx512dq_floatv2div2sf2"
   [(set (match_operand:V4SF 0 "register_operand" "=v")
(vec_concat:V4SF
(any_float:V2SF (match_operand:V2DI 1 "nonimmediate_operand" "vm"))
@@ -5814,6 +5814,17 @@
(set_attr "prefix" "evex")
(set_attr "mode" "V4SF")])
 
+(define_expand "floatv2div2sf2"
+  [(set (match_operand:V2SF 0 "register_operand")
+   (any_float:V2SF (match_operand:V2DI 1 "nonimmediate_operand")))]
+  "TARGET_AVX512DQ && TARGET_AVX512VL"
+{
+  operands[0] = simplify_gen_subreg (V4SFmode, operands[0],

Re: [PATCH] Fix non-conforming expander [PR target/95211, PR target/95256]

2020-05-25 Thread Uros Bizjak via Gcc-patches

On Mon, May 25, 2020 at 8:13 AM Hongtao Liu  wrote:
> (define_insn "sse_storehps"
>   [(set (match_operand:V2SF 0 "nonimmediate_operand" "=m,v,v")
> (vec_select:V2SF
>   (match_operand:V4SF 1 "nonimmediate_operand" "v,v,o")
>   (parallel [(const_int 2) (const_int 3)])))]
>   "TARGET_SSE && !(MEM_P (operands[0]) && MEM_P (operands[1]))"
>   "@
>%vmovhps\t{%1, %0|%q0, %1}
>%vmovhlps\t{%1, %d0|%d0, %1}
>%vmovlps\t{%H1, %d0|%d0, %H1}"
>   [(set_attr "type" "ssemov")
>(set_attr "prefix" "maybe_vex")
>(set_attr "mode" "V2SF,V4SF,V2SF")])
>
> (define_insn "sse_storelps"
>   [(set (match_operand:V2SF 0 "nonimmediate_operand"   "=m,v,v")
> (vec_select:V2SF
>   (match_operand:V4SF 1 "nonimmediate_operand" " v,v,m")
>   (parallel [(const_int 0) (const_int 1)])))]
>   "TARGET_SSE && !(MEM_P (operands[0]) && MEM_P (operands[1]))"
>   "@
>%vmovlps\t{%1, %0|%q0, %1}
>%vmovaps\t{%1, %0|%0, %1}
>%vmovlps\t{%1, %d0|%d0, %q1}"
>   [(set_attr "type" "ssemov")
>(set_attr "prefix" "maybe_vex")
>(set_attr "mode" "V2SF,V4SF,V2SF")])
>
> Should they be restricted under TARGET_MMX_WITH_SSE or is there
> anything i missed?

You missed our sincere hope that reload won't spill register with MMX move ;) .

On a serious note, there are other safety nets in place that prevent
compiler from generating moves involving MMX registers for V2SFmode,
such as lying to the compiler that moves to/from MMX regset require
memory (see ix86_register_move_cost).

So, let sleeping dogs lie.

Uros.

Re: ChangeLog files - server and client scripts

2020-05-25 Thread Martin Liška


On 5/21/20 5:14 PM, Rainer Orth wrote:

* In changelog_location, you allow only (among others) "a/b/c/" and
   "\ta/b/c/".  Please also accept the "a/b/c:" and "\ta/b/c:" forms
   here: especially the second seems quite common.


Ok, I believe these formats are supported as well. Feel free to mention
some git revisions that are not recognized.

Thanks,
Martin

[Ada] Fix missing back-annotation for derived types

2020-05-25 Thread Eric Botcazou

Gigi fails to back-annotate the Present_Expr field of variants present in a 
type derived from a discriminated untagged record type, which is for example 
visible in the output -gnatRj.

Tested on x86-64/Linux, applied on the mainline.


2020-05-25  Eric Botcazou  

* gcc-interface/decl.c (gnat_to_gnu_entity) : Tidy up.
(build_variant_list): Add GNAT_VARIANT_PART parameter and annotate its
variants if it is present.  Adjust the recursive call by passing the
variant subpart of variants, if any.
(copy_and_substitute_in_layout): Rename GNU_SUBST_LIST to SUBST_LIST
and adjust throughout.  For a type, pass the variant part in the
call to build_variant_list.

-- 
Eric Botcazoudiff --git a/gcc/ada/gcc-interface/decl.c b/gcc/ada/gcc-interface/decl.c
index ab6e79ce3c1..bd69c3ab306 100644
--- a/gcc/ada/gcc-interface/decl.c
+++ b/gcc/ada/gcc-interface/decl.c
@@ -230,7 +230,7 @@ static Uint annotate_value (tree);
 static void annotate_rep (Entity_Id, tree);
 static tree build_position_list (tree, bool, tree, tree, unsigned int, tree);
 static vec build_subst_list (Entity_Id, Entity_Id, bool);
-static vec build_variant_list (tree, vec,
+static vec build_variant_list (tree, Node_Id, vec,
 	 vec);
 static tree maybe_saturate_size (tree);
 static tree validate_size (Uint, tree, Entity_Id, enum tree_code, bool, bool,
@@ -2992,15 +2992,6 @@ gnat_to_gnu_entity (Entity_Id gnat_entity, tree gnu_expr, bool definition)
 
 /* Record Types and Subtypes
 
-   The following fields are defined on record types:
-
-		Has_Discriminants	True if the record has discriminants
-		First_Discriminant  Points to head of list of discriminants
-		First_Entity		Points to head of list of fields
-		Is_Tagged_Type		True if the record is tagged
-
-   Implementation of Ada records and discriminated records:
-
A record type definition is transformed into the equivalent of a C
struct definition.  The fields that are the discriminants which are
found in the Full_Type_Declaration node and the elements of the
@@ -8886,20 +8877,29 @@ build_subst_list (Entity_Id gnat_subtype, Entity_Id gnat_type, bool definition)
   return gnu_list;
 }
 
-/* Scan all fields in QUAL_UNION_TYPE and return a list describing the
-   variants of QUAL_UNION_TYPE that are still relevant after applying
-   the substitutions described in SUBST_LIST.  GNU_LIST is a pre-existing
+/* Scan all fields in {GNU_QUAL_UNION_TYPE,GNAT_VARIANT_PART} and return a list
+   describing the variants of GNU_QUAL_UNION_TYPE that are still relevant after
+   applying the substitutions described in SUBST_LIST.  GNU_LIST is an existing
list to be prepended to the newly created entries.  */
 
 static vec
-build_variant_list (tree qual_union_type, vec subst_list,
-		vec gnu_list)
+build_variant_list (tree gnu_qual_union_type, Node_Id gnat_variant_part,
+		vec subst_list, vec gnu_list)
 {
+  Node_Id gnat_variant;
   tree gnu_field;
 
-  for (gnu_field = TYPE_FIELDS (qual_union_type);
+  for (gnu_field = TYPE_FIELDS (gnu_qual_union_type),
+   gnat_variant
+	= Present (gnat_variant_part)
+	  ? First_Non_Pragma (Variants (gnat_variant_part))
+	  : Empty;
gnu_field;
-   gnu_field = DECL_CHAIN (gnu_field))
+   gnu_field = DECL_CHAIN (gnu_field),
+   gnat_variant
+	= Present (gnat_variant_part)
+	  ? Next_Non_Pragma (gnat_variant)
+	  : Empty)
 {
   tree qual = DECL_QUALIFIER (gnu_field);
   unsigned int i;
@@ -8918,11 +8918,21 @@ build_variant_list (tree qual_union_type, vec subst_list,
 
 	  gnu_list.safe_push (v);
 
+	  /* Annotate the GNAT node if present.  */
+	  if (Present (gnat_variant))
+	Set_Present_Expr (gnat_variant, annotate_value (qual));
+
 	  /* Recurse on the variant subpart of the variant, if any.  */
 	  variant_subpart = get_variant_part (variant_type);
 	  if (variant_subpart)
-	gnu_list = build_variant_list (TREE_TYPE (variant_subpart),
-	   subst_list, gnu_list);
+	gnu_list
+	  = build_variant_list (TREE_TYPE (variant_subpart),
+Present (gnat_variant)
+? Variant_Part
+  (Component_List (gnat_variant))
+: Empty,
+subst_list,
+gnu_list);
 
 	  /* If the new qualifier is unconditionally true, the subsequent
 	 variants cannot be accessed.  */
@@ -9806,7 +9816,7 @@ copy_and_substitute_in_layout (Entity_Id gnat_new_type,
 			   Entity_Id gnat_old_type,
 			   tree gnu_new_type,
 			   tree gnu_old_type,
-			   vec gnu_subst_list,
+			   vec subst_list,
 			   bool debug_info_p)
 {
   const bool is_subtype = (Ekind (gnat_new_type) == E_Record_Subtype);
@@ -9825,11 +9835,18 @@ copy_and_substitute_in_layout (Entity_Id gnat_new_type,
  build a new qualified union for the variants that are still relevant.  */
   if (gnu_variant_part)
 {
+  const Node_Id gnat_decl = Declaration_Node (gnat_new_type);
   variant_desc *v;
   unsigned int i;

[Ada] Fix incorrect handling of Component_Size

2020-05-25 Thread Eric Botcazou

The compiler can mishandle a Component_Size clause on an array type specifying 
a size multiple of the storage unit, when this size is not a multiple of the 
alignment of the component type.

Tested on x86-64/Linux, applied on the mainline.


2020-05-25  Eric Botcazou  

* gcc-interface/decl.c (gnat_to_gnu_component_type): Cap the alignment
of the component type according to the component size.


2020-05-25  Eric Botcazou  

* gnat.dg/array40.adb: New test.
* gnat.dg/array40_pkg.ads: New helper.

-- 
Eric Botcazoudiff --git a/gcc/ada/gcc-interface/decl.c b/gcc/ada/gcc-interface/decl.c
index a36b1298bed..ab6e79ce3c1 100644
--- a/gcc/ada/gcc-interface/decl.c
+++ b/gcc/ada/gcc-interface/decl.c
@@ -5153,7 +5153,7 @@ gnat_to_gnu_component_type (Entity_Id gnat_array, bool definition,
   unsigned int max_align;
 
   /* If an alignment is specified, use it as a cap on the component type
- so that it can be honored for the whole type.  But ignore it for the
+ so that it can be honored for the whole type, but ignore it for the
  original type of packed array types.  */
   if (No (Packed_Array_Impl_Type (gnat_array))
   && Known_Alignment (gnat_array))
@@ -5200,6 +5200,7 @@ gnat_to_gnu_component_type (Entity_Id gnat_array, bool definition,
   if (gnu_comp_size && !is_bit_packed)
 {
   tree orig_type = gnu_type;
+  unsigned int gnu_comp_align;
 
   gnu_type = make_type_from_size (gnu_type, gnu_comp_size, false);
   if (max_align > 0 && TYPE_ALIGN (gnu_type) > max_align)
@@ -5207,8 +5208,22 @@ gnat_to_gnu_component_type (Entity_Id gnat_array, bool definition,
   else
 	orig_type = gnu_type;
 
-  gnu_type = maybe_pad_type (gnu_type, gnu_comp_size, 0, gnat_array,
- true, definition, true);
+  /* We need to make sure that the size is a multiple of the alignment.
+	 But we do not misalign the component type because of the alignment
+	 of the array type here; this either must have been done earlier in
+	 the packed case or should be rejected in the non-packed case.  */
+  if (TREE_CODE (gnu_comp_size) == INTEGER_CST)
+	{
+	  const unsigned HOST_WIDE_INT int_size = tree_to_uhwi (gnu_comp_size);
+	  gnu_comp_align = int_size & -int_size;
+	  if (gnu_comp_align > TYPE_ALIGN (gnu_type))
+	gnu_comp_align = 0;
+	}
+   else
+	 gnu_comp_align = 0;
+
+  gnu_type = maybe_pad_type (gnu_type, gnu_comp_size, gnu_comp_align,
+ gnat_array, true, definition, true);
 
   /* If a padding record was made, declare it now since it will never be
 	 declared otherwise.  This is necessary to ensure that its subtrees
-- { dg-do run }
-- { dg-options "-gnatws" }

with System.Storage_Elements;
with Array40_Pkg; use Array40_Pkg;

procedure Array40 is

  use System;
  use System.Storage_Elements;

begin
  if A(1)'Size /= 40 then
raise Program_Error;
  end if;

  if (A(2)'Address - A(1)'Address) * System.Storage_Unit /= 40 then
raise Program_Error;
  end if;

end;
package Array40_Pkg is

  type Rec is record
I : Integer;
  end record;

  type Arr is array (1 .. 4) of Rec;
  for Arr'Component_Size use 40;

  A : Arr;

end Array40_Pkg;

Re: [PATCH] Let numeric_limits::is_iec559 reflect -ffast-math

2020-05-25 Thread Matthias Kretz

On Freitag, 22. Mai 2020 18:39:42 CEST Jonathan Wakely wrote:
> On 22/05/20 09:49 +0200, Matthias Kretz wrote:
> >On Donnerstag, 21. Mai 2020 17:46:01 CEST Marc Glisse wrote:
> >> On Thu, 21 May 2020, Jonathan Wakely wrote:
> >> > On 27/04/20 17:09 +0200, Matthias Kretz wrote:
> >> >> From: Matthias Kretz 
> >> >> 
> >> >>PR libstdc++/84949
> >> >>* include/std/limits: Let is_iec559 reflect whether
> >> >>__GCC_IEC_559 says float and double support IEEE 754-2008.
> >> >>* testsuite/18_support/numeric_limits/is_iec559.cc: Test IEC559
> >> >>mandated behavior if is_iec559 is true.
> >> >>* testsuite/18_support/numeric_limits/infinity.cc: Only test
> >> >>inf
> >> >>behavior if is_iec559 is true, otherwise there is no guarantee
> >> >>how arithmetic on inf behaves.
> >> >>* testsuite/18_support/numeric_limits/quiet_NaN.cc: ditto for
> >> >>NaN.
> >> >>* testsuite/18_support/numeric_limits/denorm_min-1.cc: Compile
> >> >>with -ffast-math.
> >> >>* testsuite/18_support/numeric_limits/epsilon-1.cc: ditto.
> >> >>* testsuite/18_support/numeric_limits/infinity-1.cc: ditto.
> >> >>* testsuite/18_support/numeric_limits/is_iec559-1.cc: ditto.
> >> >>* testsuite/18_support/numeric_limits/quiet_NaN-1.cc: ditto.
> >> > 
> >> > I'm inclined to go ahead and commit this (to master only, obviously).
> >> > It certainly seems more correct to me, and we'll probably never find
> >> > out if it's "safe" to do unless we actually change it and see what
> >> > happens.
> >> > 
> >> > Marc, do you have an opinion?
> >> 
> >> I don't have a strong opinion on this. I thought we were refraining from
> >> changing numeric_limits based on flags (like -fwrapv for modulo) because
> >> that would lead to ODR violations when people link objects compiled with
> >> different flags. There is a value in libstdc++.so, which may have been
> >> compiled with different flags than the application.
> >
> >But these ODR violations happen in any case: The floating-point types are
> >different types with or without -ffast-math (and related) flags. They
> >behave differently. Compiling a function in multiple TUs with different
> >flags produces observably different results. Choosing a single one of them
> >is obviously fragile and broken. That's the spirit of an ODR violation...
> >
> >It would sometimes be useful to have different types:
> >float, float_no_nan, float_no_nan_no_signed_zero, ...
> 
> Sure. There are ODR violations like that, and then there are ones
> like:
> 
>template
>struct X
>{
>  conditional_t::is_iec559, T, BigNum> val;
>};

Nice. ;-) If only the mangling of a struct could include the type of its 
members (recursively)... But at least val has a different type now. And 
correctly so. Yes, the ABI breaks possible via this change is real, though 
I'd guess there are zero or close-to-zero ABI dependencies on is_iec559 out in 
the wild (at this point - because it didn't work anyway).

> I'm generally not concerned about ODR violations where one TU behaves
> as requested by the flags used to compile that TU and another behaves
> as requested by the flats used to compile that second TU. That happens
> all the time with -fno-exceptions and -fno-rtti and such like. That
> causes ODR violations too, but of the kind where each definition does
> what was requested.

I am concerned. Showcase: https://godbolt.org/z/KzM3si. If you link those TUs, 
you get one of the two behaviors for both TUs. This can result in very hard to 
find Heisenbugs.

> Constants defined by the library changing value is a bit more
> concerning. But I don't know if it's really a problem in this case.

template ::is_iec559>
struct Float
{
  T val
};

Finally, the standard mechanism that can help resolve those silent ODR 
violations works. I.e. one can build float_559 and float_non559 types 
(overloading all operators is still rather tedious)

-- 
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 std::experimental::simd  https://github.com/VcDevel/std-simd
──

Re: New mklog script

2020-05-25 Thread Martin Liška


On 5/22/20 6:43 PM, Martin Sebor wrote:

On 5/21/20 2:16 AM, Martin Liška wrote:

Hello Martin.

Can you please compare the current mklog.py. Is there anything
you miss compared to your current script?


Nope, it matches the format I get with my script and even works
better and runs faster.  Very nice!  I'll be happy to switch to
using it instead.


Great, good to hear!



Thanks!
Martin

PS A couple of ideas for future enhancements are to have the script
print "New function." or "New type." for newly added functions and
types, and to print "Adjust comments." for changes to comments alone.


Feel free to send patches for the new script ;)

Martin

Re: ChangeLog files - server and client scripts

2020-05-25 Thread Martin Liška


On 5/23/20 12:14 AM, Ian Lance Taylor wrote:

Sure, I can wait.  Thanks for looking at it.


Hello.

Thank you for patience. There's a patch that fixes that,
I'm going to install it.

Martin
>From 76e18b91250f265a37d85063860fb38aa8f6aac3 Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Mon, 25 May 2020 09:40:50 +0200
Subject: [PATCH] Allow only ignored files in ChangeLog entries.

contrib/ChangeLog:

2020-05-25  Martin Liska  

	* gcc-changelog/git_commit.py: Add trailing '/'
	for libdruntime.  Allow empty changelog for
	only ignored files.
	* gcc-changelog/test_email.py: New test for go
	patch in ignored location.
	* gcc-changelog/test_patches.txt: Add test.
---
 contrib/gcc-changelog/git_commit.py|  5 +--
 contrib/gcc-changelog/test_email.py|  4 +++
 contrib/gcc-changelog/test_patches.txt | 43 ++
 3 files changed, 50 insertions(+), 2 deletions(-)

diff --git a/contrib/gcc-changelog/git_commit.py b/contrib/gcc-changelog/git_commit.py
index 8c5fa2c0fc9..2cfdbc83d09 100755
--- a/contrib/gcc-changelog/git_commit.py
+++ b/contrib/gcc-changelog/git_commit.py
@@ -130,7 +130,7 @@ ignored_prefixes = [
 'gcc/go/gofrontend/',
 'gcc/testsuite/go.test/test/',
 'libgo/',
-'libphobos/libdruntime',
+'libphobos/libdruntime/',
 'libphobos/src/',
 'libsanitizer/',
 ]
@@ -233,7 +233,8 @@ class GitCommit:
 
 project_files = [f for f in self.modified_files
  if self.is_changelog_filename(f[0])
- or f[0] in misc_files]
+ or f[0] in misc_files
+ or self.in_ignored_location(f[0])]
 if len(project_files) == len(self.modified_files):
 # All modified files are only MISC files
 return
diff --git a/contrib/gcc-changelog/test_email.py b/contrib/gcc-changelog/test_email.py
index d522e6ef7e3..aa516c6e6d1 100755
--- a/contrib/gcc-changelog/test_email.py
+++ b/contrib/gcc-changelog/test_email.py
@@ -276,3 +276,7 @@ class TestGccChangelog(unittest.TestCase):
 def test_dr_entry(self):
 email = self.from_patch_glob('0001-c-C-20-DR-2237.patch')
 assert email.changelog_entries[0].prs == ['DR 2237']
+
+def test_changes_only_in_ignored_location(self):
+email = self.from_patch_glob('0001-go-in-ignored-location.patch')
+assert not email.errors
diff --git a/contrib/gcc-changelog/test_patches.txt b/contrib/gcc-changelog/test_patches.txt
index 3445c3d9f11..58fd81c85c9 100644
--- a/contrib/gcc-changelog/test_patches.txt
+++ b/contrib/gcc-changelog/test_patches.txt
@@ -2568,3 +2568,46 @@ index a6a5d975af3..a8082d39aca 100644
 @@ -1 +1,2 @@
 
 +
+
+=== 0001-go-in-ignored-location.patch ===
+From 81994eab700da7fea6644541c163aa0f0f3b8cf1 Mon Sep 17 00:00:00 2001
+From: =?UTF-8?q?Cl=C3=A9ment=20Chigot?= 
+Date: Tue, 19 May 2020 16:03:54 +0200
+Subject: libgo: update x/sys/cpu after gccgo support added
+
+Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/234597
+---
+ gcc/go/gofrontend/MERGE   |  2 +-
+ .../sys/cpu/{cpu_aix_ppc64.go => cpu_aix.go}  |  2 +-
+ .../golang.org/x/sys/cpu/syscall_aix_gccgo.go | 27 +++
+ 3 files changed, 29 insertions(+), 2 deletions(-)
+ rename libgo/go/golang.org/x/sys/cpu/{cpu_aix_ppc64.go => cpu_aix.go} (96%)
+ create mode 100644 libgo/go/golang.org/x/sys/cpu/syscall_aix_gccgo.go
+
+diff --git a/gcc/go/gofrontend/MERGE b/gcc/go/gofrontend/MERGE
+index bc9c1f07eda..284374820b0 100644
+--- a/gcc/go/gofrontend/MERGE
 b/gcc/go/gofrontend/MERGE
+@@ -1 +1,2 @@
+
++
+diff --git a/libgo/go/golang.org/x/sys/cpu/cpu_aix_ppc64.go b/libgo/go/golang.org/x/sys/cpu/cpu_aix.go
+similarity index 96%
+rename from libgo/go/golang.org/x/sys/cpu/cpu_aix_ppc64.go
+rename to libgo/go/golang.org/x/sys/cpu/cpu_aix.go
+index b0ede112d4e..02d03129e50 100644
+--- a/libgo/go/golang.org/x/sys/cpu/cpu_aix_ppc64.go
 b/libgo/go/golang.org/x/sys/cpu/cpu_aix.go
+@@ -1 +1,2 @@
+
++
+diff --git a/libgo/go/golang.org/x/sys/cpu/syscall_aix_gccgo.go b/libgo/go/golang.org/x/sys/cpu/syscall_aix_gccgo.go
+new file mode 100644
+index 000..2609cc49ae7
+--- /dev/null
 b/libgo/go/golang.org/x/sys/cpu/syscall_aix_gccgo.go
+@@ -0,0 +1 @@
++
+
+-- 
+2.27.0.rc0.183.gde8f92d652-goog
-- 
2.26.2

[Ada] Change description of fat pointertype with -fgnat-encodings=minimal

2020-05-25 Thread Eric Botcazou

This makes a step back in the representation of fat pointer types in the debug 
info with -fgnat-encodings=minimal so as to avoid hiding the data indirection 
and making it easier for GDB to synthetize the construct.

Tested on x86-64/Linux, applied on the mainline.


2020-05-25  Eric Botcazou  

* gcc-interface/decl.c (gnat_to_gnu_entity) : Add a
description of the various types associated with the unconstrained
type.  Declare the fat pointer earlier.  Set the current function
as context on the template type, and the fat pointer type on the
array type.  Always mark the fat pointer type as artificial and set
it as the context for the pointer type to the array.  Also reuse
GNU_ENTITY_NAME.  Finish up the unconstrained type at the very end.
* gcc-interface/misc.c (gnat_get_array_descr_info): Do not handle
fat pointer types and tidy up accordingly.
* gcc-interface/utils.c (build_unc_object_type): Do not set the
context on the template type.

-- 
Eric Botcazoudiff --git a/gcc/ada/gcc-interface/decl.c b/gcc/ada/gcc-interface/decl.c
index d87a82ce8fa..c2a2e5fcd09 100644
--- a/gcc/ada/gcc-interface/decl.c
+++ b/gcc/ada/gcc-interface/decl.c
@@ -2099,16 +2099,28 @@ gnat_to_gnu_entity (Entity_Id gnat_entity, tree gnu_expr, bool definition)
 
   /* Array Types and Subtypes
 
-	 Unconstrained array types are represented by E_Array_Type and
-	 constrained array types are represented by E_Array_Subtype.  There
-	 are no actual objects of an unconstrained array type; all we have
-	 are pointers to that type.
+	 In GNAT unconstrained array types are represented by E_Array_Type and
+	 constrained array types are represented by E_Array_Subtype.  They are
+	 translated into UNCONSTRAINED_ARRAY_TYPE and ARRAY_TYPE respectively.
+	 But there are no actual objects of an unconstrained array type; all we
+	 have are pointers to that type.  In addition to the type node itself,
+	 4 other types associated with it are built in the process:
 
-	 The following fields are defined on array types and subtypes:
+	   1. the array type (suffix XUA) containing the actual data,
 
-		Component_Type Component type of the array.
-		Number_Dimensions  Number of dimensions (an int).
-		First_Index	   Type of first index.  */
+	   2. the template type (suffix XUB) containng the bounds,
+
+	   3. the fat pointer type (suffix XUP) representing a pointer or a
+	  reference to the unconstrained array type:
+	XUP = struct { XUA *, XUB * }
+
+	   4. the object record type (suffix XUT) containing bounds and data:
+	XUT = struct { XUB, XUA }
+
+	 The bounds of the array type XUA (de)reference the XUB * field of a
+	 PLACEHOLDER_EXPR for the fat pointer type XUP, so the array type XUA
+	 is to be interpreted in the context of the fat pointer type XUB for
+	 debug info purposes.  */
 
 case E_Array_Type:
   {
@@ -2120,7 +2132,7 @@ gnat_to_gnu_entity (Entity_Id gnat_entity, tree gnu_expr, bool definition)
 	tree gnu_template_reference, gnu_template_fields, gnu_fat_type;
 	tree *gnu_index_types = XALLOCAVEC (tree, ndim);
 	tree *gnu_temp_fields = XALLOCAVEC (tree, ndim);
-	tree gnu_max_size = size_one_node, tem, t;
+	tree gnu_max_size = size_one_node, tem, obj;
 	Entity_Id gnat_index;
 	int index;
 	tree comp_type;
@@ -2195,7 +2207,7 @@ gnat_to_gnu_entity (Entity_Id gnat_entity, tree gnu_expr, bool definition)
 	TREE_TYPE (tem) = ptr_type_node;
 	TREE_TYPE (DECL_CHAIN (tem)) = gnu_ptr_template;
 	TYPE_DECL_SUPPRESS_DEBUG (TYPE_STUB_DECL (gnu_fat_type)) = 0;
-	for (t = gnu_fat_type; t; t = TYPE_NEXT_VARIANT (t))
+	for (tree t = gnu_fat_type; t; t = TYPE_NEXT_VARIANT (t))
 	  SET_TYPE_UNCONSTRAINED_ARRAY (t, gnu_type);
 	  }
 	else
@@ -2212,6 +2224,22 @@ gnat_to_gnu_entity (Entity_Id gnat_entity, tree gnu_expr, bool definition)
 	SET_TYPE_UNCONSTRAINED_ARRAY (gnu_fat_type, gnu_type);
 	  }
 
+	/* If the GNAT encodings are used, give the fat pointer type a name.
+	   If this is a packed array, tell the debugger how to interpret the
+	   underlying bits by fetching that of the implementation type.  But
+	   in any case, mark it as artificial so the debugger can skip it.  */
+	const Entity_Id gnat_name
+	  = (Present (Packed_Array_Impl_Type (gnat_entity))
+	 && gnat_encodings != DWARF_GNAT_ENCODINGS_MINIMAL)
+	? Packed_Array_Impl_Type (gnat_entity)
+	: gnat_entity;
+	tree xup_name
+	  = (gnat_encodings != DWARF_GNAT_ENCODINGS_MINIMAL)
+	? create_concat_name (gnat_name, "XUP")
+	: gnu_entity_name;
+	create_type_decl (xup_name, gnu_fat_type, true, debug_info_p,
+			  gnat_entity);
+
 	/* Build a reference to the template from a PLACEHOLDER_EXPR that
 	   is the fat pointer.  This will be used to access the individual
 	   fields once we build them.  */
@@ -2313,6 +2341,7 @@ gnat_to_gnu_entity (Entity_Id gnat_entity, tree gnu_expr, bool definition)
 	= chainon (gnu_template_fields,

[Ada] Fix wrong assignment to mutable Out parameter of task entry

2020-05-25 Thread Eric Botcazou

Under very specific circumstances the compiler can generate a wrong assignment 
to a mutable record object which contains an array component, because it does 
not correctly handle the update of the discriminant.

Tested on x86-64/Linux, applied on the mainline.


2020-05-25  Eric Botcazou  

* gcc-interface/gigi.h (operand_type): New static inline function.
* gcc-interface/trans.c (gnat_to_gnu): Do not suppress conversion
to the resulty type at the end for array types.
* gcc-interface/utils2.c (build_binary_op) : Do not
remove conversions between array types on the LHS.


2020-05-25  Eric Botcazou  

* gnat.dg/array39.adb: New test.
* gnat.dg/array39_pkg.ads: New helper.
* gnat.dg/array39_pkg.adb: Likewise.

-- 
Eric Botcazoudiff --git a/gcc/ada/gcc-interface/gigi.h b/gcc/ada/gcc-interface/gigi.h
index fcdea320c3a..e43b3db59a9 100644
--- a/gcc/ada/gcc-interface/gigi.h
+++ b/gcc/ada/gcc-interface/gigi.h
@@ -1209,3 +1209,11 @@ maybe_padded_object (tree expr)
 
   return expr;
 }
+
+/* Return the type of operand #0 of EXPR.  */
+
+static inline tree
+operand_type (tree expr)
+{
+  return TREE_TYPE (TREE_OPERAND (expr, 0));
+}
diff --git a/gcc/ada/gcc-interface/trans.c b/gcc/ada/gcc-interface/trans.c
index b7a4cadb7e6..969a480c3da 100644
--- a/gcc/ada/gcc-interface/trans.c
+++ b/gcc/ada/gcc-interface/trans.c
@@ -8821,7 +8821,8 @@ gnat_to_gnu (Node_Id gnat_node)
1. If this is the LHS of an assignment or an actual parameter of a
 	  call, return the result almost unmodified since the RHS will have
 	  to be converted to our type in that case, unless the result type
-	  has a simpler size.  Likewise if there is just a no-op unchecked
+	  has a simpler size or for array types because this size might be
+	  changed in-between. Likewise if there is just a no-op unchecked
 	  conversion in-between.  Similarly, don't convert integral types
 	  that are the operands of an unchecked conversion since we need
 	  to ignore those conversions (for 'Valid).
@@ -8856,15 +8857,17 @@ gnat_to_gnu (Node_Id gnat_node)
 	  && !AGGREGATE_TYPE_P (TREE_TYPE (gnu_result
   && !(TYPE_SIZE (gnu_result_type)
 	   && TYPE_SIZE (TREE_TYPE (gnu_result))
-	   && (AGGREGATE_TYPE_P (gnu_result_type)
-	   == AGGREGATE_TYPE_P (TREE_TYPE (gnu_result)))
+	   && AGGREGATE_TYPE_P (gnu_result_type)
+	  == AGGREGATE_TYPE_P (TREE_TYPE (gnu_result))
 	   && ((TREE_CODE (TYPE_SIZE (gnu_result_type)) == INTEGER_CST
 		&& (TREE_CODE (TYPE_SIZE (TREE_TYPE (gnu_result)))
 		!= INTEGER_CST))
 	   || (TREE_CODE (TYPE_SIZE (gnu_result_type)) != INTEGER_CST
 		   && !CONTAINS_PLACEHOLDER_P (TYPE_SIZE (gnu_result_type))
 		   && (CONTAINS_PLACEHOLDER_P
-		   (TYPE_SIZE (TREE_TYPE (gnu_result))
+		   (TYPE_SIZE (TREE_TYPE (gnu_result)
+	   || (TREE_CODE (gnu_result_type) == ARRAY_TYPE
+		   && TREE_CODE (TREE_TYPE (gnu_result)) == ARRAY_TYPE))
 	   && !(TREE_CODE (gnu_result_type) == RECORD_TYPE
 		&& TYPE_JUSTIFIED_MODULAR_P (gnu_result_type
 {
diff --git a/gcc/ada/gcc-interface/utils2.c b/gcc/ada/gcc-interface/utils2.c
index 7799776e1db..a56a4f45adc 100644
--- a/gcc/ada/gcc-interface/utils2.c
+++ b/gcc/ada/gcc-interface/utils2.c
@@ -875,31 +875,21 @@ build_binary_op (enum tree_code op_code, tree result_type,
 
   /* If there were integral or pointer conversions on the LHS, remove
 	 them; we'll be putting them back below if needed.  Likewise for
-	 conversions between array and record types, except for justified
-	 modular types.  But don't do this if the right operand is not
-	 BLKmode (for packed arrays) unless we are not changing the mode.  */
+	 conversions between record types, except for justified modular types.
+	 But don't do this if the right operand is not BLKmode (for packed
+	 arrays) unless we are not changing the mode.  */
   while ((CONVERT_EXPR_P (left_operand)
 	  || TREE_CODE (left_operand) == VIEW_CONVERT_EXPR)
 	 && (((INTEGRAL_TYPE_P (left_type)
 		   || POINTER_TYPE_P (left_type))
-		  && (INTEGRAL_TYPE_P (TREE_TYPE
-   (TREE_OPERAND (left_operand, 0)))
-		  || POINTER_TYPE_P (TREE_TYPE
-	 (TREE_OPERAND (left_operand, 0)
-		 || (((TREE_CODE (left_type) == RECORD_TYPE
-		   && !TYPE_JUSTIFIED_MODULAR_P (left_type))
-		  || TREE_CODE (left_type) == ARRAY_TYPE)
-		 && ((TREE_CODE (TREE_TYPE
- (TREE_OPERAND (left_operand, 0)))
-			  == RECORD_TYPE)
-			 || (TREE_CODE (TREE_TYPE
-	(TREE_OPERAND (left_operand, 0)))
-			 == ARRAY_TYPE))
+		  && (INTEGRAL_TYPE_P (operand_type (left_operand))
+		  || POINTER_TYPE_P (operand_type (left_operand
+		 || (TREE_CODE (left_type) == RECORD_TYPE
+		 && !TYPE_JUSTIFIED_MODULAR_P (left_type)
+		 && TREE_CODE (operand_type (left_operand)) == RECORD_TYPE
 		 && (TYPE_MODE (right_type) == BLKmode
-			 || (TYPE_MODE (left_type)
-			 == TYPE_MODE (TREE_TYPE
-	   (TREE_OPERAND
-

Re: [PATCH v1 1/2][PPC64] [PR88877]

2020-05-25 Thread kamlesh kumar via Gcc-patches

> OTOH, you don't need to name Tuple at all...  It should not *have* a
> constructor, since you declared it as class...  But you can just use
> std::tuple here?

I thought of using std::tuple  but it requires c++11 support.
I am not sure we always build gcc with c++11?

>
> > (emit_library_call): Added default arg unsigned_p.
> > (emit_library_call_value): Added default arg unsigned_p.
>
> Yeah, eww.  Default arguments have all the problems you had before,
> except now it is hidden and much more surprising.
>
> Those functions really should take rtx_mode_t arguments?

I was thinking the same. will incorporate this.

./kamlesh

Re: [PATCH] Fix non-conforming expander [PR target/95211, PR target/95256]

2020-05-25 Thread Richard Biener

On May 25, 2020 8:12:12 AM GMT+02:00, Uros Bizjak  wrote:
>On Mon, May 25, 2020 at 7:53 AM Hongtao Liu  wrote:
>
>> > We have to introduce a new expander, that will have conforming mode
>of
>> > output operand (V2SF) and will produce RTX that will match
>> > *floatv2div2sf2. A paradoxical output subreg from
>> > V2SFmode V4SFmode is needed, generated by simplify_gen_subreg as is
>> > the case with paradoxical input subreg.
>>
>> Problem`is simplify_gen_subreg (V4SFmode, operands[0], V2SFmode, 0)
>> will return NULL since
>> 
>> 948  /* Subregs involving floating point modes are not allowed to
>> 949 change size.  Therefore (subreg:DI (reg:DF) 0) is fine, but
>> 950 (subreg:SI (reg:DF) 0) isn't.  */
>
>But, we are not changing size, we are still operating with SFmode. It
>looks to me that this limitation is too strict, the intention is to
>not expand scalar SFmode to DFmode.

I guess so. The test probably wants to tes the component mode. 

>Let's ask experts.
>
>Uros.

Re: [PATCH] Fix non-conforming expander [PR target/95211, PR target/95256]

2020-05-25 Thread Hongtao Liu via Gcc-patches

On Mon, May 25, 2020 at 1:55 AM Uros Bizjak  wrote:
>
> On Sun, May 24, 2020 at 9:26 AM Hongtao Liu  wrote:
> >
> > On Sat, May 23, 2020 at 6:11 PM Uros Bizjak  wrote:
> > >
> > > On Sat, May 23, 2020 at 9:25 AM Hongtao Liu  wrote:
> > > >
> > > > Hi:
> > > >   This patch fix non-conforming expander for
> > > > floatv2div2sf2,floatunsv2div2sf2,fix_truncv2sfv2di,fixuns_truncv2sfv2di,
> > > > refer to PR95211, PR95256.
> > > >   bootstrap ok, regression test on i386/x86-64 backend is ok.
> > > >
> > > > gcc/ChangeLog:
> > > > PR target/95211 PR target/95256
> > >
> > Changed.
> > > Please put every PR reference in a separate line.
> > >
> > > > * config/i386/sse.md v2div2sf2): New expander.
> > > > (fix_truncv2sfv2di2): Ditto.
> > > > (floatv2div2sf2_internal): Renaming from
> > > > floatv2div2sf2.
> > > > (fix_truncv2sfv2di2_internal):
> > >
> > > The convention throughout sse,md is to prefix a standard pattern that
> > > is used through builtins with avx512_ instead of suffixing
> > > the pattern name with _internal.
> > >
> > Changed.
> > > > Renaming from fix_truncv2sfv2di2.
> > > > (vec_pack_float_): Adjust icode name.
> > > > (vec_unpack_fix_trunc_lo_): Ditto.
> > > > * config/i386/i386-builtin.def: Ditto.
> > >
> > > Uros.
> >
> > Update patch.
>
> The patch is wrong, and the correct way to fix these patterns is more complex:
>
> a) the pattern should not access register in mode, narrower than 128
> bits, as this implies MMX register in non-TARGET-MMX-WITH-SSE targets.

It seems there are some patterns in sse.md not obey this rule.
i.e:
(define_insn "sse_storehps"
  [(set (match_operand:V2SF 0 "nonimmediate_operand" "=m,v,v")
(vec_select:V2SF
  (match_operand:V4SF 1 "nonimmediate_operand" "v,v,o")
  (parallel [(const_int 2) (const_int 3)])))]
  "TARGET_SSE && !(MEM_P (operands[0]) && MEM_P (operands[1]))"
  "@
   %vmovhps\t{%1, %0|%q0, %1}
   %vmovhlps\t{%1, %d0|%d0, %1}
   %vmovlps\t{%H1, %d0|%d0, %H1}"
  [(set_attr "type" "ssemov")
   (set_attr "prefix" "maybe_vex")
   (set_attr "mode" "V2SF,V4SF,V2SF")])

(define_insn "sse_storelps"
  [(set (match_operand:V2SF 0 "nonimmediate_operand"   "=m,v,v")
(vec_select:V2SF
  (match_operand:V4SF 1 "nonimmediate_operand" " v,v,m")
  (parallel [(const_int 0) (const_int 1)])))]
  "TARGET_SSE && !(MEM_P (operands[0]) && MEM_P (operands[1]))"
  "@
   %vmovlps\t{%1, %0|%q0, %1}
   %vmovaps\t{%1, %0|%0, %1}
   %vmovlps\t{%1, %d0|%d0, %q1}"
  [(set_attr "type" "ssemov")
   (set_attr "prefix" "maybe_vex")
   (set_attr "mode" "V2SF,V4SF,V2SF")])

Should they be restricted under TARGET_MMX_WITH_SSE or is there
anything i missed?

> So, the correct way to define insn with narrow mode is to use
> vec_select, something like:
>
> (define_insn "sse4_1_v8qiv8hi2"
>   [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
> (any_extend:V8HI
>   (vec_select:V8QI
> (match_operand:V16QI 1 "register_operand" "Yr,*x,v")
> (parallel [(const_int 0) (const_int 1)
>(const_int 2) (const_int 3)
>(const_int 4) (const_int 5)
>(const_int 6) (const_int 7)]]
>
> The instruction accesses the memory in the correct mode, so the memory
> operand is:
>
> (define_insn "*sse4_1_v8qiv8hi2_1"
>   [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
> (any_extend:V8HI
>   (match_operand:V8QI 1 "memory_operand" "m,m,m")))]
>
> and a pre-reload split has to be introduced to convert insn from
> register form to memory form, when memory gets propagated to the insn:
>
> (define_insn_and_split "*sse4_1_v8qiv8hi2_2"
>   [(set (match_operand:V8HI 0 "register_operand")
> (any_extend:V8HI
>   (vec_select:V8QI
> (subreg:V16QI
>   (vec_concat:V2DI
> (match_operand:DI 1 "memory_operand")
> (const_int 0)) 0)
> (parallel [(const_int 0) (const_int 1)
>(const_int 2) (const_int 3)
>(const_int 4) (const_int 5)
>(const_int 6) (const_int 7)]]
>
> For a middle end to use this insn, an expander is used:
>
> (define_expand "v8qiv8hi2"
>   [(set (match_operand:V8HI 0 "register_operand")
> (any_extend:V8HI
>   (match_operand:V8QI 1 "nonimmediate_operand")))]
>
> b) Similar approach is used when an output is narrower than 128 bits:
>
> (define_insn "*floatv2div2sf2"
>   [(set (match_operand:V4SF 0 "register_operand" "=v")
> (vec_concat:V4SF
> (any_float:V2SF (match_operand:V2DI 1 "nonimmediate_operand" "vm"))
> (match_operand:V2SF 2 "const0_operand" "C")))]
>
> In your concrete case,
>
> (define_insn "fix_truncv2sfv2di2"
>   [(set (match_operand:V2DI 0 "register_operand" "=v")
> (any_fix:V2DI
>   (vec_select:V2SF
> (match_operand:V4SF 1 "nonimmediate_operand" "vm")
> (parallel [(const_int 0) (const_int 1)]]
>
> is already _NOT_ defined in a correct way

Re: [PATCH] Fix non-conforming expander [PR target/95211, PR target/95256]

2020-05-25 Thread Uros Bizjak via Gcc-patches

On Mon, May 25, 2020 at 7:53 AM Hongtao Liu  wrote:

> > We have to introduce a new expander, that will have conforming mode of
> > output operand (V2SF) and will produce RTX that will match
> > *floatv2div2sf2. A paradoxical output subreg from
> > V2SFmode V4SFmode is needed, generated by simplify_gen_subreg as is
> > the case with paradoxical input subreg.
>
> Problem`is simplify_gen_subreg (V4SFmode, operands[0], V2SFmode, 0)
> will return NULL since
> 
> 948  /* Subregs involving floating point modes are not allowed to
> 949 change size.  Therefore (subreg:DI (reg:DF) 0) is fine, but
> 950 (subreg:SI (reg:DF) 0) isn't.  */

But, we are not changing size, we are still operating with SFmode. It
looks to me that this limitation is too strict, the intention is to
not expand scalar SFmode to DFmode.

Let's ask experts.

Uros.

90 matches

Mail list logo