[PATCH] More memset VN improvements

2018-05-23 Thread Richard Biener

We can handle arbitrary constants by using native_interpret_expr for
byte-aligned re-interpretations.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

For the testcase we now reflect at the GIMPLE level what we previously did
at the RTL level.

Richard.
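
As a scalar illustration (my own sketch, separate from the vector testcase in
the patch below), a load like the following should now be value-numbered to a
constant during FRE:

unsigned int
f (void)
{
  unsigned char x[16];
  __builtin_memset (x, 23, 16);
  return *(unsigned int *) x;   /* each byte is 0x17, so the value is 0x17171717 */
}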

From 25955accffa877695a233086e06d0a20ab46e753 Mon Sep 17 00:00:00 2001
From: Richard Guenther 
Date: Tue, 22 May 2018 13:46:01 +0200
Subject: [PATCH 3/3] memset with native-interpret


diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-65.c 
b/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-65.c
new file mode 100644
index 000..87ba6662041
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-65.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-fre1-details -fdump-tree-dse1-details" } */
+
+typedef unsigned char v16qi __attribute__((vector_size(16)));
+typedef unsigned short v8hi __attribute__((vector_size(16)));
+typedef unsigned int v4si __attribute__((vector_size(16)));
+void foo(char *dest)
+{
+  unsigned char x[256] __attribute__((aligned(16)));
+  __builtin_memset (x, 23, 256);
+  v16qi resqi = *(v16qi *)&x[16];
+  v8hi reshi = *(v8hi *)&x[16];
+  v4si ressi = *(v4si *)&x[16];
+  *(v16qi *)dest = resqi;
+  *(v8hi *)(dest + 16) = reshi;
+  *(v4si *)(dest + 32) = ressi;
+}
+
+/* { dg-final { scan-tree-dump-times "Replaced MEM" 3 "fre1" } } */
+/* { dg-final { scan-tree-dump-times "Deleted dead call" 1 "dse1" } } */
diff --git a/gcc/tree-ssa-sccvn.c b/gcc/tree-ssa-sccvn.c
index 884cce12bb3..96e80c7b5a3 100644
--- a/gcc/tree-ssa-sccvn.c
+++ b/gcc/tree-ssa-sccvn.c
@@ -1959,9 +1959,9 @@ vn_reference_lookup_3 (ao_ref *ref, tree vuse, void *vr_,
   if (is_gimple_reg_type (vr->type)
   && gimple_call_builtin_p (def_stmt, BUILT_IN_MEMSET)
   && (integer_zerop (gimple_call_arg (def_stmt, 1))
- || (INTEGRAL_TYPE_P (vr->type)
+ || ((TREE_CODE (gimple_call_arg (def_stmt, 1)) == INTEGER_CST
+  || (INTEGRAL_TYPE_P (vr->type) && known_eq (ref->size, 8)))
  && CHAR_BIT == 8 && BITS_PER_UNIT == 8
- && known_eq (ref->size, 8)
  && known_eq (ref->size, maxsize)
  && offset.is_constant (&offseti)
  && offseti % BITS_PER_UNIT == 0))
@@ -2030,7 +2030,8 @@ vn_reference_lookup_3 (ao_ref *ref, tree vuse, void *vr_,
  tree val;
  if (integer_zerop (gimple_call_arg (def_stmt, 1)))
val = build_zero_cst (vr->type);
- else
+ else if (INTEGRAL_TYPE_P (vr->type)
+  && known_eq (ref->size, 8))
{
  code_helper rcode = NOP_EXPR;
  tree ops[3] = {};
@@ -2041,6 +2042,16 @@ vn_reference_lookup_3 (ao_ref *ref, tree vuse, void *vr_,
  && SSA_NAME_OCCURS_IN_ABNORMAL_PHI (val)))
return (void *)-1;
}
+ else
+   {
+ unsigned len = TREE_INT_CST_LOW (TYPE_SIZE_UNIT (vr->type));
+ unsigned char *buf = XALLOCAVEC (unsigned char, len);
+ memset (buf, TREE_INT_CST_LOW (gimple_call_arg (def_stmt, 1)),
+ len);
+ val = native_interpret_expr (vr->type, buf, len);
+ if (!val)
+   return (void *)-1;
+   }
  return vn_reference_lookup_or_insert_for_pieces
   (vuse, vr->set, vr->type, vr->operands, val);
}
-- 
2.12.3
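
For readers not familiar with the routine: conceptually, the new branch fills a
buffer with the memset byte and asks native_interpret_expr to read a constant of
the looked-up type out of it.  A standalone sketch of the same idea (not GCC
internals, just an illustration):

#include <string.h>

/* Reconstruct the value a 32-bit unsigned load would see from a region
   that was memset with BYTE: fill a buffer and reinterpret the bytes.  */
unsigned int
value_after_memset_u32 (unsigned char byte)
{
  unsigned char buf[sizeof (unsigned int)];
  memset (buf, byte, sizeof buf);
  unsigned int val;
  memcpy (&val, buf, sizeof val);
  return val;
}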



RE: [RFT PATCH, AVX512]: Implement scalar float->unsigned int truncations with AVX512F

2018-05-23 Thread Peryt, Sebastian
> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches- 
> ow...@gcc.gnu.org] On Behalf Of Uros Bizjak
> Sent: Monday, May 21, 2018 9:55 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Jakub Jelinek ; Kirill Yukhin 
> 
> Subject: Re: [RFT PATCH, AVX512]: Implement scalar float->unsigned int 
> truncations with AVX512F
> 
> On Mon, May 21, 2018 at 4:53 PM, Uros Bizjak  wrote:
> > Hello!
> >
> > Attached patch implements scalar float->unsigned int truncations 
> > with
> AVX512F.
> >
> > 2018-05-21  Uros Bizjak  
> >
> > * config/i386/i386.md (fixuns_truncdi2): New insn pattern.
> > (fixuns_truncsi2_avx512f): Ditto.
> > (*fixuns_truncsi2_avx512f_zext): Ditto.
> > (fixuns_truncsi2): Also enable for AVX512F and TARGET_SSE_MATH.
> > Emit fixuns_truncsi2_avx512f for AVX512F targets.
> >
> > testsuite/ChangeLog:
> >
> > 2018-05-21  Uros Bizjak  
> >
> > * gcc.target/i386/cvt-2.c: New test.
> >
> > Patch was bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.
> >
> > Unfortunately, I have no means to test the patch on an AVX512 target, 
> > so to avoid some hidden issue, I'd like to ask someone to test it on 
> > a live target.

I've bootstrapped and regression tested your patch on x86_64-linux-gnu {,-m32} 
on an SKX machine and I don't see any stability regressions.

Sebastian

> 
> Oops, the ssemodesuffix handling was missing in the insn mnemonic. Fixed in 
> the attached v-2 patch.
> 
> Uros.


Re: [RFC] [aarch64] Add HiSilicon tsv110 CPU support.

2018-05-23 Thread Kyrill Tkachov


On 23/05/18 05:54, Zhangshaokun wrote:

Hi Kyrill,

On 2018/5/22 18:52, Kyrill Tkachov wrote:

Hi Shaokun,

On 22/05/18 09:40, Shaokun Zhang wrote:

This patch adds a new HiSilicon mcpu: tsv110.

---
  gcc/ChangeLog|   9 +++
  gcc/config/aarch64/aarch64-cores.def |   5 ++
  gcc/config/aarch64/aarch64-cost-tables.h | 103 +++
  gcc/config/aarch64/aarch64-tune.md   |   2 +-
  gcc/config/aarch64/aarch64.c |  79 
  gcc/doc/invoke.texi  |   2 +-
  6 files changed, 198 insertions(+), 2 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index cec2892..5d44966 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,12 @@
+2018-05-22  Shaokun Zhang 
+Bo Zhou  
+
+   * config/aarch64/aarch64-cores.def (tsv110): New CPU.
+   * config/aarch64/aarch64-tune.md: Regenerated.
+   * doc/invoke.texi (AArch61 Options/-mtune): Add "tsv110".

typo: AArch64.


Good catch, my mistake.


+   * gcc/config/aarch64/aarch64.c (tsv110_tunings): New tuning table.
+   * gcc/config/aarch64/aarch64-cost-tables.h: Add "tsv110" extra costs.

Please start the path with config/.


Sure, will remove gcc/ in the next version.


+
  2018-05-21  Michael Meissner 

  PR target/85657
diff --git a/gcc/config/aarch64/aarch64-cores.def 
b/gcc/config/aarch64/aarch64-cores.def
index 33b96ca..db7a412 100644
--- a/gcc/config/aarch64/aarch64-cores.def
+++ b/gcc/config/aarch64/aarch64-cores.def
@@ -91,6 +91,11 @@ AARCH64_CORE("cortex-a75",  cortexa75, cortexa57, 8_2A,  
AARCH64_FL_FOR_ARCH8_2
  /* Qualcomm ('Q') cores. */
  AARCH64_CORE("saphira", saphira,falkor,8_3A, 
AARCH64_FL_FOR_ARCH8_3 | AARCH64_FL_CRYPTO | AARCH64_FL_RCPC, saphira,   0x51, 0xC01, -1)

+/* ARMv8.4-A Architecture Processors.  */
+
+/* HiSilicon ('H') cores. */
+AARCH64_CORE("tsv110", tsv110,tsv110,8_4A, AARCH64_FL_FOR_ARCH8_4 
| AARCH64_FL_CRYPTO | AARCH64_FL_F16 | AARCH64_FL_AES | AARCH64_FL_SHA2, tsv110,   0x48, 
0xd01, -1)
+

The third field is the scheduler model to use when optimising.
Since there is no tsv110 scheduling model, using the name "tsv110"
in the third field will generally give pretty poor schedules.
I recommend you specify a scheduling model that most closely matches your core
for the time being. But I don't think it's required and I wouldn't let it hold

I checked it again; cortexa57 most closely matches tsv110, thanks for your
suggestion.
If I choose cortexa57, can I add tsv110_tunings, which will use tsv110's
pipeline features as in the rest of the patch, or only use the generic features?


If you use cortexa57 for the scheduling model (the 3rd field) you should still
use tsv110_tunings in the 6th field as this will specify other important 
parameters
like instruction selection costs, fusion capabilities, alignment requirements 
etc.
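
As a sketch of that suggestion applied to the entry quoted above (hypothetical,
not the final patch): keep tsv110 as the identifier and the cost/tuning name,
but point the scheduling field at cortexa57:

AARCH64_CORE("tsv110", tsv110, cortexa57, 8_4A,
             AARCH64_FL_FOR_ARCH8_4 | AARCH64_FL_CRYPTO | AARCH64_FL_F16
             | AARCH64_FL_AES | AARCH64_FL_SHA2,
             tsv110, 0x48, 0xd01, -1)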

Thanks,
Kyrill




up the patch.

You'll need approval from an aarch64 maintainer (cc'ed some for you).


Good, thanks for your nice guidance.

Thanks,
Shaokun


Thanks,
Kyrill


  /* ARMv8-A big.LITTLE implementations.  */

  AARCH64_CORE("cortex-a57.cortex-a53",  cortexa57cortexa53, cortexa53, 8A,  
AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa57, 0x41, AARCH64_BIG_LITTLE (0xd07, 
0xd03), -1)
diff --git a/gcc/config/aarch64/aarch64-cost-tables.h 
b/gcc/config/aarch64/aarch64-cost-tables.h
index a455c62..b6890d6 100644
--- a/gcc/config/aarch64/aarch64-cost-tables.h
+++ b/gcc/config/aarch64/aarch64-cost-tables.h
@@ -334,4 +334,107 @@ const struct cpu_cost_table thunderx2t99_extra_costs =
}
  };

+const struct cpu_cost_table tsv110_extra_costs =
+{
+  /* ALU */
+  {
+0, /* arith.  */
+0, /* logical.  */
+0, /* shift.  */
+0, /* shift_reg.  */
+COSTS_N_INSNS (1), /* arith_shift.  */
+COSTS_N_INSNS (1), /* arith_shift_reg.  */
+COSTS_N_INSNS (1), /* log_shift.  */
+COSTS_N_INSNS (1), /* log_shift_reg.  */
+0, /* extend.  */
+COSTS_N_INSNS (1), /* extend_arith.  */
+0, /* bfi.  */
+0, /* bfx.  */
+0, /* clz.  */
+0,/* rev.  */
+0, /* non_exec.  */
+true   /* non_exec_costs_exec.  */
+  },
+  {
+/* MULT SImode */
+{
+  COSTS_N_INSNS (2),   /* simple.  */
+  COSTS_N_INSNS (2),   /* flag_setting.  */
+  COSTS_N_INSNS (2),   /* extend.  */
+  COSTS_N_INSNS (2),   /* add.  */
+  COSTS_N_INSNS (2),   /* extend_add.  */
+  COSTS_N_INSNS (11)   /* idiv.  */
+},
+/* MULT DImode */
+{
+  COSTS_N_INSNS (3),   /* simple.  */
+  0,   /* flag_setting (N/A).  */
+  COSTS_N_INSNS (3),   /* extend.  */
+  COSTS_N_INSNS (3),   /* add.  */
+  COSTS_N_INSNS (3),   /* extend_add.  */
+  COSTS_N_INSNS (19)   /

Re: Replace FMA_EXPR with one internal fn per optab

2018-05-23 Thread Richard Sandiford
"H.J. Lu"  writes:
> On Thu, May 17, 2018 at 1:56 AM, Richard Sandiford
>  wrote:
>> Richard Biener  writes:
 @@ -2698,23 +2703,26 @@ convert_mult_to_fma_1 (tree mul_result,
  }
>>>
 if (negate_p)
 -   mulop1 = force_gimple_operand_gsi (&gsi,
 -  build1 (NEGATE_EXPR,
 -  type, mulop1),
 -  true, NULL_TREE, true,
 -  GSI_SAME_STMT);
 +   mulop1 = gimple_build (&seq, NEGATE_EXPR, type, mulop1);
>>>
 -  fma_stmt = gimple_build_assign (gimple_assign_lhs (use_stmt),
 - FMA_EXPR, mulop1, op2, addop);
 +  if (seq)
 +   gsi_insert_seq_before (&gsi, seq, GSI_SAME_STMT);
 +  fma_stmt = gimple_build_call_internal (IFN_FMA, 3, mulop1, op2,
>>> addop);
 +  gimple_call_set_lhs (fma_stmt, gimple_assign_lhs (use_stmt));
 +  gimple_call_set_nothrow (fma_stmt, !stmt_can_throw_internal
>>> (use_stmt));
 +  gsi_replace (&gsi, fma_stmt, true);
 +  /* Valueize aggressively so that we generate FMS, FNMA and FNMS
 +regardless of where the negation occurs.  */
 +  if (fold_stmt (&gsi, aggressive_valueize))
 +   update_stmt (gsi_stmt (gsi));
>>>
>>> I think it would be nice to be able to use gimple_build () with IFNs so you
>>> can
>>> gimple_build () the IFN and then use gsi_replace_with_seq () on it.  You
>>> only need to fold with generated negates, not with negates already in the
>>> IL?
>>> The the folding implied with gimple_build will take care of it.
>>
>> The idea was to pick up existing negates that feed the multiplication
>> as well as any added by the pass itself.
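
For concreteness, a sketch of my own (not from the patch) of the four source
shapes the combined valueization is meant to map onto the new internal
functions, when FMA generation applies:

double f_fma  (double a, double b, double c) { return   a * b  + c; }  /* FMA  */
double f_fms  (double a, double b, double c) { return   a * b  - c; }  /* FMS  */
double f_fnma (double a, double b, double c) { return -(a * b) + c; }  /* FNMA */
double f_fnms (double a, double b, double c) { return -(a * b) - c; }  /* FNMS */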
>>
>> On IRC yesterday we talked about how this should handle the ECF_NOTHROW
>> flag, and whether things like IFN_SQRT and IFN_FMA should always be
>> nothrow (like the built-in functions are).  But in the end I thought
>> it'd be better to keep things as they are.  We already handle
>> -fnon-call-exceptions for unfused a * b + c and before the patch also
>> handled it for FMA_EXPR.  It'd seem like a step backwards if the new
>> internal functions didn't handle it too.  If anything it seems like the
>> built-in functions should change to be closer to the tree_code and
>> internal_fn way of doing things, if we want to support -fnon-call-exceptions
>> properly.
>>
>> This also surprised me when doing the if-conversion patch I sent yesterday.
>> We're happy to vectorise:
>>
>>   for (int i = 0; i < 100; ++i)
>> x[i] = ... ? sqrt (x[i]) : 0;
>>
>> by doing the sqrt unconditionally and selecting on the result, even with
>> the default maths flags, but refuse to vectorise the simpler:
>>
>>   for (int i = 0; i < 100; ++i)
>> x[i] = ... ? x[i] + 1 : 0;
>>
>> in the same way.
>>
>>> Otherwise can you please move aggressive_valueize to gimple-fold.[ch]
>>> alongside no_follow_ssa_edges / follow_single_use_edges and maybe
>>> rename it as follow_all_ssa_edges?
>>
>> Ah, yeah, that's definitely a better name.
>>
>> I also renamed all_scalar_fma to scalar_all_fma, since I realised
>> after Andrew's reply that the old name made it sound like it was
>> "all scalars", whereas it was meant to mean "all fmas".
>>
>> Tested as before.
>>
>> Thanks,
>> Richard
>>
>> 2018-05-17  Richard Sandiford  
>>
>> gcc/
>> * doc/sourcebuild.texi (scalar_all_fma): Document.
>> * tree.def (FMA_EXPR): Delete.
>> * internal-fn.def (FMA, FMS, FNMA, FNMS): New internal functions.
>> * internal-fn.c (ternary_direct): New macro.
>> (expand_ternary_optab_fn): Likewise.
>> (direct_ternary_optab_supported_p): Likewise.
>> * Makefile.in (build/genmatch.o): Depend on case-fn-macros.h.
>> * builtins.c (fold_builtin_fma): Delete.
>> (fold_builtin_3): Don't call it.
>> * cfgexpand.c (expand_debug_expr): Remove FMA_EXPR handling.
>> * expr.c (expand_expr_real_2): Likewise.
>> * fold-const.c (operand_equal_p): Likewise.
>> (fold_ternary_loc): Likewise.
>> * gimple-pretty-print.c (dump_ternary_rhs): Likewise.
>> * gimple.c (DEFTREECODE): Likewise.
>> * gimplify.c (gimplify_expr): Likewise.
>> * optabs-tree.c (optab_for_tree_code): Likewise.
>> * tree-cfg.c (verify_gimple_assign_ternary): Likewise.
>> * tree-eh.c (operation_could_trap_p): Likewise.
>> (stmt_could_throw_1_p): Likewise.
>> * tree-inline.c (estimate_operator_cost): Likewise.
>> * tree-pretty-print.c (dump_generic_node): Likewise.
>> (op_code_prio): Likewise.
>> * tree-ssa-loop-im.c (stmt_cost): Likewise.
>> * tree-ssa-operands.c (get_expr_operands): Likewise.
>> * tree.c (commutative_ternary_tree_code, add_expr): Likewise.
>> * fold-const-call.h (fold_fma): Delete.
>> 

GCC 8 backports

2018-05-23 Thread Martin Liška
Hi.

I'm going to backport the following 2 revisions.
The patches bootstrap on x86_64-linux-gnu and survive regression tests.

Martin
From 3308817aa11be9d43cd564d249dae1c28bf41015 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Fri, 11 May 2018 07:37:35 +
Subject: Backport r260154

gcc/ChangeLog:

2018-05-11  Martin Liska  

PR sanitizer/85556
	* doc/extend.texi: Document LLVM style format for no_sanitize
	attribute.

gcc/c-family/ChangeLog:

2018-05-11  Martin Liska  

PR sanitizer/85556
	* c-attribs.c (handle_no_sanitize_attribute): Iterate all
	TREE_LIST values.

gcc/testsuite/ChangeLog:

2018-05-11  Martin Liska  

PR sanitizer/85556
	* c-c++-common/ubsan/attrib-6.c: New test.
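
For reference, a minimal usage sketch (mine, not part of the backport) of the two
spellings the change accepts, matching the documentation hunk below:

/* A list of strings...  */
void __attribute__ ((no_sanitize ("alignment", "object-size")))
f (void) { /* ... */ }

/* ...or a single comma-separated string (LLVM style).  */
void __attribute__ ((no_sanitize ("alignment,object-size")))
g (void) { /* ... */ }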

---
diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c
index 9bddc1aad4f..d302b4f22c7 100644
--- a/gcc/c-family/c-attribs.c
+++ b/gcc/c-family/c-attribs.c
@@ -403,7 +403,7 @@ const struct attribute_spec c_common_attribute_table[] =
 			  0, 0, true, false, false, false,
 			  handle_no_address_safety_analysis_attribute,
 			  NULL },
-  { "no_sanitize",	  1, 1, true, false, false, false,
+  { "no_sanitize",	  1, -1, true, false, false, false,
 			  handle_no_sanitize_attribute, NULL },
   { "no_sanitize_address",0, 0, true, false, false, false,
 			  handle_no_sanitize_address_attribute, NULL },
@@ -683,22 +683,26 @@ static tree
 handle_no_sanitize_attribute (tree *node, tree name, tree args, int,
 			  bool *no_add_attrs)
 {
+  unsigned int flags = 0;
   *no_add_attrs = true;
-  tree id = TREE_VALUE (args);
   if (TREE_CODE (*node) != FUNCTION_DECL)
 {
   warning (OPT_Wattributes, "%qE attribute ignored", name);
   return NULL_TREE;
 }

-  if (TREE_CODE (id) != STRING_CST)
+  for (; args; args = TREE_CHAIN (args))
 {
-  error ("no_sanitize argument not a string");
-  return NULL_TREE;
-}
+  tree id = TREE_VALUE (args);
+  if (TREE_CODE (id) != STRING_CST)
+	{
+	  error ("no_sanitize argument not a string");
+	  return NULL_TREE;
+	}

-  char *string = ASTRDUP (TREE_STRING_POINTER (id));
-  unsigned int flags = parse_no_sanitize_attribute (string);
+  char *string = ASTRDUP (TREE_STRING_POINTER (id));
+  flags |= parse_no_sanitize_attribute (string);
+}

   add_no_sanitize_value (*node, flags);

diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 9d085844cfd..a4664cad819 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -2977,6 +2977,8 @@ mentioned in @var{sanitize_option}.  A list of values acceptable by
 @smallexample
 void __attribute__ ((no_sanitize ("alignment", "object-size")))
 f () @{ /* @r{Do something.} */; @}
+void __attribute__ ((no_sanitize ("alignment,object-size")))
+g () @{ /* @r{Do something.} */; @}
 @end smallexample

 @item no_sanitize_address
diff --git a/gcc/testsuite/c-c++-common/ubsan/attrib-6.c b/gcc/testsuite/c-c++-common/ubsan/attrib-6.c
new file mode 100644
index 000..2af70c8c2cf
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/ubsan/attrib-6.c
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-options "-fsanitize=undefined" } */
+
+static void __attribute__((no_sanitize("foobar")))
+foo (void) { /* { dg-warning "attribute directive ignored" } */
+}
+
+static void __attribute__((no_sanitize("address,undefined")))
+foo2 (void) {
+}
+
+static void __attribute__((no_sanitize("address", "undefined")))
+foo3 (void) {
+}
+
+static void __attribute__((no_sanitize("address", "address", "")))
+foo4 (void) {
+}
+
+static void __attribute__((no_sanitize("address", "address", "address,address")))
+foo5 (void) {
+}
+
+static void __attribute__((no_sanitize("address", "address,kernel-address,thread,leak,undefined,vptr,shift,integer-divide-by-zero,unreachable,vla-bound,null,return,signed-integer-overflow,bounds,bounds-strict,alignment,object-size,float-divide-by-zero,float-cast-overflow,nonnull-attribute,returns-nonnull-attribute,bool,enum")))
+foo6 (void) {
+}
--
2.16.3
From 8203f7efd03cc82717ab0416a151e96d3a7b8f4b Mon Sep 17 00:00:00 2001
From: marxin 
Date: Wed, 23 May 2018 07:40:43 +
Subject: Backport r260566

gcc/ChangeLog:

2018-05-23  Yury Gribov  

	PR tree-optimization/85822
	* tree-vrp.c (is_masked_range_test): Fix handling of negative
	constants.

gcc/testsuite/ChangeLog:

2018-05-23  Yury Gribov  

	PR tree-optimization/85822
	* c-c++-common/pr85822.c: New test.

---
diff --git a/gcc/testsuite/c-c++-common/pr85822.c b/gcc/testsuite/c-c++-common/pr85822.c
new file mode 100644
index 000..3b09188ab47
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/pr85822.c
@@ -0,0 +1,27 @@
+/* { dg-options "-O2" } */
+/* { dg-do run } */
+
+static const long long int TagTypeNumber = 0xll;
+
+long long int x;
+
+void foo(void)
+{
+  x = TagTypeNumber + 1;
+}
+
+int main(int argc, char **argv)
+{
+  if (argc > 0)
+foo ();
+
+  if ((x & TagTypeNumber) == TagTypeNumber)
+  {
+unsigned y = (unsigned)x;
+__builtin_printf ("v: %u\n", y);
+   

[PATCH] Disable strict-overflow warnings from data-ref code

2018-05-23 Thread Richard Biener

This silences an unhelpful strict-overflow warning for the
Samba build.  The warning is emitted from
tree-data-ref.c:create_intersect_range_checks and ends up at
the default input_location, which is just the function scope.

Bootstrapped and tested on x86_64-unknown-linux-gnu, will apply to trunk 
and branch.

I guess we should eventually prune all warnings that use input_location
and make those strict-overflow warnings opt-in rather than opt-out?
Thus only emit them from frontend-triggered foldings?

Thanks,
Richard.

2018-05-23  Richard Biener  

* tree-data-ref.c (create_runtime_alias_checks): Defer
and ignore overflow warnings.

* gcc.dg/Wstrict-overflow-27.c: New testcase.

Index: gcc/tree-data-ref.c
===
--- gcc/tree-data-ref.c (revision 260306)
+++ gcc/tree-data-ref.c (working copy)
@@ -1918,6 +1918,7 @@ create_runtime_alias_checks (struct loop
 {
   tree part_cond_expr;
 
+  fold_defer_overflow_warnings ();
   for (size_t i = 0, s = alias_pairs->length (); i < s; ++i)
 {
   const dr_with_seg_len& dr_a = (*alias_pairs)[i].first;
@@ -1940,6 +1941,7 @@ create_runtime_alias_checks (struct loop
   else
*cond_expr = part_cond_expr;
 }
+  fold_undefer_and_ignore_overflow_warnings ();
 }
 
 /* Check if OFFSET1 and OFFSET2 (DR_OFFSETs of some data-refs) are identical
Index: gcc/testsuite/gcc.dg/Wstrict-overflow-27.c
===
--- gcc/testsuite/gcc.dg/Wstrict-overflow-27.c  (nonexistent)
+++ gcc/testsuite/gcc.dg/Wstrict-overflow-27.c  (working copy)
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -Wstrict-overflow=2 -Werror" } */
+
+typedef __SIZE_TYPE__ size_t;
+extern char *strtok_r (char *__restrict __s, const char *__restrict __delim,
+  char **__restrict __save_ptr)
+  __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (2, 
3)));
+extern const unsigned short int **__ctype_b_loc (void)
+  __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__const__));
+extern int *DEBUGLEVEL_CLASS;
+size_t debug_num_classes = 0;
+void debug_parse_param(char *param);
+void
+debug_parse_levels(const char *params_str, size_t str_len)
+{
+  char str[str_len+1];
+  char *tok, *saveptr;
+  size_t i;
+  tok = strtok_r(str, " \t,\n\r", &saveptr);
+  if (((*__ctype_b_loc ())[(int) ((tok[0]))]))
+tok = strtok_r(((void *)0), " \t,\n\r", &saveptr);
+  else
+DEBUGLEVEL_CLASS[0] = 0;
+  for (i = 0 +1; i < debug_num_classes; i++)
+DEBUGLEVEL_CLASS[i] = DEBUGLEVEL_CLASS[0];
+  while (tok != ((void *)0) )
+debug_parse_param(tok);
+}


Re: [PATCH] Fix PR85834

2018-05-23 Thread Eric Botcazou
> I'm not quite sure that ao_ref with offset % 8 == 0 but size != maxsize
> has any guarantee that all accesses that this covers are aligned to 8
> bits but I failed to create a testcase with a variable bit-alignment.
> But I'm quite sure Ada can do that, right?  There may be existing code
> that assumes that if offset is byte-aligned all covered accesses are.

The middle-end assumes that variable offsets are always byte-aligned, see 
DECL_FIELD_OFFSET and DECL_FIELD_BIT_OFFSET.  Does that answer the above?

-- 
Eric Botcazou


Re: [PATCH] Fix PR85834

2018-05-23 Thread Richard Biener
On Wed, 23 May 2018, Eric Botcazou wrote:

> > I'm not quite sure that ao_ref with offset % 8 == 0 but size != maxsize
> > has any guarantee that all accesses that this covers are aligned to 8
> > bits but I failed to create a testcase with a variable bit-alignment.
> > But I'm quite sure Ada can do that, right?  There may be existing code
> > that assumes that if offset is byte-aligned all covered accesses are.
> 
> The middle-end assumes that variable offsets are always byte-aligned, see 
> DECL_FIELD_OFFSET and DECL_FIELD_BIT_OFFSET.  Does that answer the above?

Ah, yes.

Thanks,
Richard.


Re: PING^2: [PATCH] Don't mark IFUNC resolver as only called directly

2018-05-23 Thread Jan Hubicka
> On Tue, May 22, 2018 at 9:21 AM, Jan Hubicka  wrote:
> >> > >  class ipa_opt_pass_d;
> >> > >  typedef ipa_opt_pass_d *ipa_opt_pass;
> >> > > @@ -2894,7 +2896,8 @@ 
> >> > > cgraph_node::only_called_directly_or_aliased_p (void)
> >> > >   && !DECL_STATIC_CONSTRUCTOR (decl)
> >> > >   && !DECL_STATIC_DESTRUCTOR (decl)
> >> > >   && !used_from_object_file_p ()
> >> > > - && !externally_visible);
> >> > > + && !externally_visible
> >> > > + && !lookup_attribute ("ifunc", DECL_ATTRIBUTES (decl)));
> >> >
> >> > How's it handled for our own generated resolver functions?  That is,
> >> > isn't there sth cheaper than doing a lookup_attribute here?  I see
> >> > that make_dispatcher_decl nor ix86_get_function_versions_dispatcher
> >> > adds the 'ifunc' attribute (though they are TREE_PUBLIC there).
> >> 
> >>  Is there any drawback of setting force_output flag?
> >>  Honza
> >> >>>
> >> >>> Setting force_output may prevent some optimizations.  Can we add a bit
> >> >>> for IFUNC resolver?
> >> >>>
> >> >>
> >> >> Here is the patch to add ifunc_resolver to cgraph_node. Tested on x86-64
> >> >> and i686.  Any comments?
> >> >>
> >> >
> >> > PING:
> >> >
> >> > https://gcc.gnu.org/ml/gcc-patches/2018-04/msg00647.html
> >> >
> >>
> >> PING.
> > OK, but please extend the verifier so that the ifunc_resolver flag is
> > equivalent to lookup_attribute ("ifunc", DECL_ATTRIBUTES (decl)),
> > so we are sure things stay in sync.
> >
> 
> Like this
> 
> diff --git a/gcc/symtab.c b/gcc/symtab.c
> index 80f6f910c3b..954920b6dff 100644
> --- a/gcc/symtab.c
> +++ b/gcc/symtab.c
> @@ -998,6 +998,13 @@ symtab_node::verify_base (void)
>error ("function symbol is not function");
>error_found = true;
>}
> +  else if ((lookup_attribute ("ifunc", DECL_ATTRIBUTES (decl))
> + != NULL)
> + != dyn_cast  (this)->ifunc_resolver)
> +  {
> +  error ("inconsistent `ifunc' attribute");
> +  error_found = true;
> +  }
>  }
>else if (is_a  (this))
>  {
> 
> 
> Thanks.
Yes, thanks!
Honza
> 
> -- 
> H.J.
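
For readers following along, a minimal source-level example (my sketch; it
assumes a toolchain/target with GNU ifunc support) of the kind of resolver the
new cgraph_node flag marks:

static int foo_impl (int x) { return x + 1; }

/* The resolver is run by the dynamic loader and returns the implementation
   to use for foo.  */
static int (*foo_resolver (void)) (int)
{
  return foo_impl;
}

int foo (int) __attribute__ ((ifunc ("foo_resolver")));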


Re: [RFC] [aarch64] Add HiSilicon tsv110 CPU support.

2018-05-23 Thread Zhangshaokun
Hi Kyrill,

On 2018/5/23 16:08, Kyrill Tkachov wrote:
> 
> On 23/05/18 05:54, Zhangshaokun wrote:
>> Hi Kyrill,
>>
>> On 2018/5/22 18:52, Kyrill Tkachov wrote:
>>> Hi Shaokun,
>>>
>>> On 22/05/18 09:40, Shaokun Zhang wrote:
 This patch adds a new HiSilicon mcpu: tsv110.

 ---
   gcc/ChangeLog|   9 +++
   gcc/config/aarch64/aarch64-cores.def |   5 ++
   gcc/config/aarch64/aarch64-cost-tables.h | 103 
 +++
   gcc/config/aarch64/aarch64-tune.md   |   2 +-
   gcc/config/aarch64/aarch64.c |  79 
   gcc/doc/invoke.texi  |   2 +-
   6 files changed, 198 insertions(+), 2 deletions(-)

 diff --git a/gcc/ChangeLog b/gcc/ChangeLog
 index cec2892..5d44966 100644
 --- a/gcc/ChangeLog
 +++ b/gcc/ChangeLog
 @@ -1,3 +1,12 @@
 +2018-05-22  Shaokun Zhang 
 +Bo Zhou  
 +
 +   * config/aarch64/aarch64-cores.def (tsv110): New CPU.
 +   * config/aarch64/aarch64-tune.md: Regenerated.
 +   * doc/invoke.texi (AArch61 Options/-mtune): Add "tsv110".
>>> typo: AArch64.
>>>
>> Good catch, my mistake.
>>
 +   * gcc/config/aarch64/aarch64.c (tsv110_tunings): New tuning table.
 +   * gcc/config/aarch64/aarch64-cost-tables.h: Add "tsv110" extra 
 costs.
>>> Please start the path with config/.
>>>
>> Sure, will remove gcc/ in the next version.
>>
 +
   2018-05-21  Michael Meissner 

   PR target/85657
 diff --git a/gcc/config/aarch64/aarch64-cores.def 
 b/gcc/config/aarch64/aarch64-cores.def
 index 33b96ca..db7a412 100644
 --- a/gcc/config/aarch64/aarch64-cores.def
 +++ b/gcc/config/aarch64/aarch64-cores.def
 @@ -91,6 +91,11 @@ AARCH64_CORE("cortex-a75",  cortexa75, cortexa57, 8_2A, 
  AARCH64_FL_FOR_ARCH8_2
   /* Qualcomm ('Q') cores. */
   AARCH64_CORE("saphira", saphira,falkor,8_3A, 
 AARCH64_FL_FOR_ARCH8_3 | AARCH64_FL_CRYPTO | AARCH64_FL_RCPC, saphira,   
 0x51, 0xC01, -1)

 +/* ARMv8.4-A Architecture Processors.  */
 +
 +/* HiSilicon ('H') cores. */
 +AARCH64_CORE("tsv110", tsv110,tsv110,8_4A, 
 AARCH64_FL_FOR_ARCH8_4 | AARCH64_FL_CRYPTO | AARCH64_FL_F16 | 
 AARCH64_FL_AES | AARCH64_FL_SHA2, tsv110,   0x48, 0xd01, -1)
 +
>>> The third field is the scheduler model to use when optimising.
>>> Since there is no tsv110 scheduling model, using the name "tsv110"
>>> in the third field will generally give pretty poor schedules.
>>> I recommend you specify a scheduling model that most closely matches your 
>>> core
>>> for the time being. But I don't think it's required and I wouldn't let it 
>>> hold
>> I checked it again; cortexa57 most closely matches tsv110, thanks for your
>> suggestion.
>> If I choose cortexa57, can I add tsv110_tunings, which will use tsv110's
>> pipeline features as in the rest of the patch, or only use the generic features?
> 
> If you use cortexa57 for the scheduling model (the 3rd field) you should still
> use tsv110_tunings in the 6th field as this will specify other important 
> parameters
> like instruction selection costs, fusion capabilities, alignment requirements 
> etc.
> 

Thanks for your comments, I will wait for other maintainers' comments and prepare
the next version.
One more question: any thoughts on the issue in my cover letter about skipping
DC CVAU for HiSilicon tsv110 when syncing the icache and dcache?

Thanks,
Shaokun

> Thanks,
> Kyrill
> 
>>
>>> up the patch.
>>>
>>> You'll need approval from an aarch64 maintainer (cc'ed some for you).
>>>
>> Good, thanks for your nice guidance.
>>
>> Thanks,
>> Shaokun
>>
>>> Thanks,
>>> Kyrill
>>>
   /* ARMv8-A big.LITTLE implementations.  */

   AARCH64_CORE("cortex-a57.cortex-a53",  cortexa57cortexa53, cortexa53, 
 8A,  AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa57, 0x41, 
 AARCH64_BIG_LITTLE (0xd07, 0xd03), -1)
 diff --git a/gcc/config/aarch64/aarch64-cost-tables.h 
 b/gcc/config/aarch64/aarch64-cost-tables.h
 index a455c62..b6890d6 100644
 --- a/gcc/config/aarch64/aarch64-cost-tables.h
 +++ b/gcc/config/aarch64/aarch64-cost-tables.h
 @@ -334,4 +334,107 @@ const struct cpu_cost_table thunderx2t99_extra_costs 
 =
 }
   };

 +const struct cpu_cost_table tsv110_extra_costs =
 +{
 +  /* ALU */
 +  {
 +0, /* arith.  */
 +0, /* logical.  */
 +0, /* shift.  */
 +0, /* shift_reg.  */
 +COSTS_N_INSNS (1), /* arith_shift.  */
 +COSTS_N_INSNS (1), /* arith_shift_reg.  */
 +COSTS_N_INSNS (1), /* log_shift.  */
 +COSTS_N_INSNS (1), /* log_shift_reg.  */
 +0, /* extend.  */
 +COSTS_N_INSNS (1), /* extend_arith.  */
 +0, /* bfi.  */
 +0, /* bfx.

Re: [Libgomp, Fortran] Fix canadian cross build

2018-05-23 Thread Maxim Kuvyrkov
> On Jun 23, 2017, at 4:44 PM, Yvan Roux  wrote:
> 
> Hello,
> 
> The Fortran parts of libgomp (omp_lib.mod, openacc.mod, etc.) are
> missing in a Canadian cross build, at least when the target gfortran
> compiler comes from PATH and not from GFORTRAN_FOR_TARGET.
> 
> Back in 2010, an executability test of GFORTRAN was added to fix the libgomp
> build on Cygwin, but when the executable doesn't contain the path,
> "test -x" fails and parts of the library are not built.
> 
> This patch fixes the issue by using the M4 macro AC_PATH_PROG (which
> returns the absolute name) instead of AC_CHECK_PROG in the function
> defined in config/acx.m4: NCN_STRICT_CHECK_TARGET_TOOLS.  I renamed it
> to NCN_STRICT_PATH_TARGET_TOOLS to keep the semantics used in M4.
> 
> Tested by building cross and Canadian cross toolchains (host:
> i686-w64-mingw32) for arm-linux-gnueabihf, with the issue fixed and with a
> complete libgomp.
> 
> ok for trunk ?

Hi Yvan,

The patch looks OK, but it is a pain to review.  Would you please split it into 
2 patches: one for the mechanical renames, and one for logical changes to 
acx.m4?  This should allow Paolo and DJ to approve your patch.

Thanks!

> 
> Thanks
> Yvan
> 
> config/ChangeLog
> 2017-06-23  Yvan Roux  
> 
>* acx.m4 (NCN_STRICT_CHECK_TARGET_TOOLS): Renamed to ...
>(NCN_STRICT_PATH_TARGET_TOOLS): ... this.  It reflects the replacement
>of AC_CHECK_PROG by AC_PATH_PROG to get the absolute name of the
>program.
>(ACX_CHECK_INSTALLED_TARGET_TOOL): Use renamed function.
> 
> ChangeLog
> 2017-06-23  Yvan Roux  
> 
>* configure.ac: Use NCN_STRICT_PATH_TARGET_TOOLS instead of
>NCN_STRICT_CHECK_TARGET_TOOLS.
>* configure: Regenerate.
> 

--
Maxim Kuvyrkov
www.linaro.org





Re: [PATCH] testsuite: Introduce be/le selectors

2018-05-23 Thread Richard Earnshaw (lists)
On 22/05/18 22:21, Jeff Law wrote:
> On 05/21/2018 03:46 PM, Segher Boessenkool wrote:
>> This patch creates "be" and "le" selectors, which can be used by all
>> architectures, similar to ilp32 and lp64.
>>
>> Is this okay for trunk?
>>
>>
>> Segher
>>
>>
>> 2017-05-21  Segher Boessenkool  
>>
>> gcc/testsuite/
>>  * lib/target-supports.exp (check_effective_target_be): New.
>>  (check_effective_target_le): New.
> I think this is fine.  "be" "le" are used all over the place in gcc and
> the kernel to denote big/little endian.

except when el and eb are used for perversity... :-)

R.
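
For context, hypothetical uses of the new selectors in a testcase would look
like the following (a sketch, assuming the usual effective-target syntax; the
scanned string is a placeholder):

/* { dg-do compile { target be } } */
/* { dg-final { scan-assembler "foo" { target le } } } */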

> 
> jeff
> 



Re: [PATCH] Fix PR85712 (SLSR cleanup of alternative interpretations)

2018-05-23 Thread Richard Biener
On Tue, May 22, 2018 at 11:37 PM Bill Schmidt 
wrote:

> Hi,

> PR85712 shows where an existing test case fails in the SLSR pass because
> the code that cleans up alternative interpretations (CAND_ADD versus
> CAND_MULT, for example) after a replacement is flawed.  This patch fixes the
> flaw by ensuring that we always visit all interpretations, not just
> subsequent ones in the next_interp chain.  I found six occurrences of
> this mistake in the code.
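
A standalone illustration (mine, not the GCC code) of the list-walking mistake
being fixed: starting the walk at the current candidate only visits later
interpretations, while keeping a link to the head of the chain lets the fix-up
visit them all:

struct interp
{
  int id;
  struct interp *next_interp;   /* later interpretations of the same stmt */
  struct interp *first_interp;  /* head of the chain (what the patch adds) */
};

static void
visit_all_interps (struct interp *cand, void (*fn) (struct interp *))
{
  /* Buggy form: for (struct interp *c = cand; c; c = c->next_interp)
     misses interpretations recorded before CAND.  */
  for (struct interp *c = cand->first_interp; c; c = c->next_interp)
    fn (c);
}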

> Bootstrapped and tested on powerpc64le-linux-gnu with no regressions.
> No new test case is added since the failure occurs on an existing test
> in the test suite.  Is this okay for trunk, and for backports to all
> supported branches after some burn-in time?

OK and Yes for the backports.

Thanks,
Richard.

> Thanks,
> Bill


> 2018-05-22  Bill Schmidt  

>  * gimple-ssa-strength-reduction.c (struct slsr_cand_d): Add
>  first_interp field.
>  (alloc_cand_and_find_basis): Initialize first_interp field.
>  (slsr_process_mul): Modify first_interp field.
>  (slsr_process_add): Likewise.
>  (slsr_process_cast): Modify first_interp field for each new
>  interpretation.
>  (slsr_process_copy): Likewise.
>  (dump_candidate): Dump first_interp field.
>  (replace_mult_candidate): Process all interpretations, not just
>  subsequent ones.
>  (replace_rhs_if_not_dup): Likewise.
>  (replace_one_candidate): Likewise.

> Index: gcc/gimple-ssa-strength-reduction.c
> ===
> --- gcc/gimple-ssa-strength-reduction.c (revision 260484)
> +++ gcc/gimple-ssa-strength-reduction.c (working copy)
> @@ -266,6 +266,10 @@ struct slsr_cand_d
>of a statement.  */
> cand_idx next_interp;

> +  /* Index of the first candidate record in a chain for the same
> + statement.  */
> +  cand_idx first_interp;
> +
> /* Index of the basis statement S0, if any, in the candidate vector.
  */
> cand_idx basis;

> @@ -686,6 +690,7 @@ alloc_cand_and_find_basis (enum cand_kind kind, gi
> c->kind = kind;
> c->cand_num = cand_vec.length () + 1;
> c->next_interp = 0;
> +  c->first_interp = c->cand_num;
> c->dependent = 0;
> c->sibling = 0;
> c->def_phi = kind == CAND_MULT ? find_phi_def (base) : 0;
> @@ -1261,6 +1266,7 @@ slsr_process_mul (gimple *gs, tree rhs1, tree rhs2
>   is the stride and RHS2 is the base expression.  */
> c2 = create_mul_ssa_cand (gs, rhs2, rhs1, speed);
> c->next_interp = c2->cand_num;
> +  c2->first_interp = c->cand_num;
>   }
> else if (TREE_CODE (rhs2) == INTEGER_CST)
>   {
> @@ -1498,7 +1504,10 @@ slsr_process_add (gimple *gs, tree rhs1, tree rhs2
>  {
>c2 = create_add_ssa_cand (gs, rhs2, rhs1, false, speed);
>if (c)
> -   c->next_interp = c2->cand_num;
> +   {
> + c->next_interp = c2->cand_num;
> + c2->first_interp = c->cand_num;
> +   }
>else
>  add_cand_for_stmt (gs, c2);
>  }
> @@ -1621,6 +1630,8 @@ slsr_process_cast (gimple *gs, tree rhs1, bool spe

> if (base_cand && base_cand->kind != CAND_PHI)
>   {
> +  slsr_cand_t first_cand = NULL;
> +
> while (base_cand)
>  {
>/* Propagate all data from the base candidate except the type,
> @@ -1635,6 +1646,12 @@ slsr_process_cast (gimple *gs, tree rhs1, bool spe
>   base_cand->index,
base_cand->stride,
>   ctype, base_cand->stride_type,
>   savings);
> + if (!first_cand)
> +   first_cand = c;
> +
> + if (first_cand != c)
> +   c->first_interp = first_cand->cand_num;
> +
>if (base_cand->next_interp)
>  base_cand = lookup_cand (base_cand->next_interp);
>else
> @@ -1657,6 +1674,7 @@ slsr_process_cast (gimple *gs, tree rhs1, bool spe
> c2 = alloc_cand_and_find_basis (CAND_MULT, gs, rhs1, 0,
>integer_one_node, ctype, sizetype,
0);
> c->next_interp = c2->cand_num;
> +  c2->first_interp = c->cand_num;
>   }

> /* Add the first (or only) interpretation to the statement-candidate
> @@ -1681,6 +1699,8 @@ slsr_process_copy (gimple *gs, tree rhs1, bool spe

> if (base_cand && base_cand->kind != CAND_PHI)
>   {
> +  slsr_cand_t first_cand = NULL;
> +
> while (base_cand)
>  {
>/* Propagate all data from the base candidate.  */
> @@ -1693,6 +1713,12 @@ slsr_process_copy (gimple *gs, tree rhs1, bool spe
>   base_cand->index,
base_cand->stride,
>   base_cand->cand_type,
>   base_cand->stride_type, savings);
> + if (!first_cand)
> +   

RE: [RFT PATCH, AVX512]: Implement scalar unsigned int->float conversions with AVX512F

2018-05-23 Thread Peryt, Sebastian
> From: Uros Bizjak [mailto:ubiz...@gmail.com]
> Sent: Tuesday, May 22, 2018 8:43 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Peryt, Sebastian ; Jakub Jelinek
> 
> Subject: [RFT PATCH, AVX512]: Implement scalar unsigned int->float conversions
> with AVX512F
> 
> Hello!
> 
> Attached patch implements scalar unsigned int->float conversions with
> AVX512F.
> 
> 2018-05-22  Uros Bizjak  
> 
> * config/i386/i386.md (*floatuns2_avx512):
> New insn pattern.
> (floatunssi2): Also enable for AVX512F and TARGET_SSE_MATH.
> Rewrite expander pattern.  Emit gen_floatunssi2_i387_with_xmm
> for non-SSE modes.
> (floatunsdisf2): Rewrite expander pattern.  Hanlde TARGET_AVX512F.
> (floatunsdidf2): Ditto.
> 
> testsuite/ChangeLog:
> 
> 2018-05-22  Uros Bizjak  
> 
> * gcc.target/i386/cvt-3.c: New test.
> 
> Patch was bootstrapped and regression tested on x86_64-linux-gnu {,-m32}., but
> not tested on AVX512 target.

I have checked it on x86_64-linux-gnu {,-m32} on SKX and don't see any 
stability regressions.

Sebastian

> 
> Uros.


Re: [PATCH] PR target/85358: Add target hook to prevent default widening

2018-05-23 Thread Richard Biener
On Tue, 22 May 2018, Michael Meissner wrote:

> I posted this patch at the end of GCC 8 as an RFC.  Now that we are in GCC 9, I
> would like to repost it.  Sorry to spam some of you.  It is unclear who the
> reviewers for things like target hooks and basic mode handling are.
> 
> Here is the original patch.
> https://gcc.gnu.org/ml/gcc-patches/2018-04/msg00764.html
> 
> PowerPC has 3 different 128-bit floating point types (KFmode, IFmode, and
> TFmode).  We are in the process of migrating long double from IBM extended
> double to IEEE 128-bit floating point.
> 
> * IFmode is IBM extended double (__ibm128)
> * KFmode is IEEE 128-bit floating point (__float128, _Float128N)
> * TFmode is whatever long double maps to
> 
> If we are compiling for a power8 system (which does not have hardware IEEE
> 128-bit floating point), the current system works because each of the 3 modes
> do not have hardware support.
> 
> If we are compiling for a power9 system and long double is IBM extended 
> double,
> again things are fine.
> 
> However, if we are compiling for power9 and we've flipped the default for long
> double to be IEEE 128-bit floating point, then the code to support __ibm128
> breaks.  The machine-independent portions of the mode handling say there is
> hardware to support TFmode operations, so let's widen the type to TFmode and
> do those operations.  However, converting IFmode to TFmode is not cheap; it
> has to be done via a function call.
> 
> This patch adds a new target hook; if it is overridden, the backend can say
> not to automatically widen this type to that type.  The PowerPC port defines
> the target hook so that it doesn't automatically convert IBM extended double
> to IEEE 128-bit and vice versa.
> 
> This patch goes through all of the places that call GET_MODE_WIDER_MODE and
> then calls the target hook.  Now, the PowerPC only needs to block certain
> floating point widenings.  Several of the changes are to integer widenings, 
> and
> if desired, we could restrict the changes to just floating point types.
> However, there might be other ports that need the flexibility for other types.
> 
> I have tried various other approaches to fix this problem, and so far, I have
> not been able to come up with a PowerPC-back-end-only solution that works.
> 
> Alternatively, Segher has suggested that the call to the target hook be in
> GET_MODE_WIDER_MODE and GET_MODE_2XWIDER_MODE (plus any places where we access
> the mode_wider array directly).
> 
> I have built little endian PowerPC builds with this patch, and I have verified
> that it does work.  I have tested the same patch in April on a big endian
> PowerPC system and x86_64 and it worked there also.
> 
> Can I check in this patch as-is (I will verify that x86 and big-endian PowerPC
> still work before check-in), or would people prefer modifications to the patch?

Just a question for clarification - _is_ KFmode strictly a wider mode
than IFmode?  That is, can it represent all values that IFmode can?

On another note I question the expanders considering wider FP modes
somewhat in general.  So maybe the hook shouldn't be named
default_widening_p but rather mode_covers_p ()?  And we can avoid
calling the hook for integer modes.

That said, I wonder if the construction of mode_wider and friends
should be (optionally) made more explicit in the modes .def file
so powerpc could avoid any wider relation for IFmode.

Thanks,
Richard.
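
A hypothetical source-level illustration (mine, not from the patch) of the
unwanted widening: with long double configured as IEEE 128-bit on a power9
target, the __ibm128 addition below should stay in IFmode rather than being
widened to TFmode, since the IFmode/KFmode conversion is a library call:

__ibm128
add_ibm128 (__ibm128 a, __ibm128 b)
{
  return a + b;   /* must not be silently promoted to TFmode */
}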

> [gcc]
> 2018-05-22  Michael Meissner  
> 
>   PR target/85358
>   * target.def (default_widening_p): New target hook to say whether
>   default widening between modes should be done.
>   * targhooks.h (hook_bool_mode_mode_bool_true): New declaration.
>   * targhooks.c (hook_bool_mode_mode_bool_true): New default target
>   hook.
>   * optabs.c (expand_binop): Before doing default widening, check
>   whether the backend allows the widening.
>   (expand_twoval_unop): Likewise.
>   (expand_twoval_binop): Likewise.
>   (widen_leading): Likewise.
>   (widen_bswap): Likewise.
>   (expand_unop): Likewise.
>   * cse.c (cse_insn): Likewise.
>   * combine.c (simplify_comparison): Likewise.
>   * var-tracking.c (prepare_call_arguments): Likewise.
>   * config/rs6000/rs6000.c (TARGET_DEFAULT_WIDENING_P): Define
>   target hook to prevent IBM extended double and IEEE 128-bit
>   floating point from being converted to each by default.
>   (rs6000_default_widening_p): Likewise.
>   * doc/tm.texi (TARGET_DEFAULT_WIDENING_P): Document the new
>   default widening hook.
>   * doc/tm.texi.in (TARGET_DEFAULT_WIDENING_P): Likewise.
> 
> [gcc/testsuite]
> 2018-05-22  Michael Meissner  
> 
>   PR target/85358
>   * gcc.target/powerpc/pr85358.c: New test to make sure __ibm128
>   does not widen to __float128 on ISA 3.0 systems.
> 
> In order to start the transition of PowerPC long double to IEEE 128-bit, we
> will need this patch or a sim

Re: Fix SLP def type when computing masks (PR85853)

2018-05-23 Thread Richard Biener
On Wed, May 23, 2018 at 8:41 AM Richard Sandiford <
richard.sandif...@linaro.org> wrote:

> In this PR, SLP failed to include a comparison node in the SLP
> tree and so marked the node as external.  It then went on to call
> vect_is_simple_use on the comparison with its STMT_VINFO_DEF_TYPE
> still claiming that it was an internal definition.

> We already avoid that for vect_analyze_stmt by temporarily copying
> the node's definition type to each STMT_VINFO_DEF_TYPE.  This patch
> extends that to the vector type calculation.  The easiest thing
> seemed to be to split the analysis of the root node out into
> a subroutine, so that it's possible to return false early without
> awkward control flow.

> Tested on aarch64-linux-gnu (with and without SLP), aarch64_be-elf
> and x86_64-linux-gnu.  OK to install?

OK.

Richard.

> Richard


> 2018-05-23  Richard Sandiford  

> gcc/
>  PR tree-optimization/85853
>  * tree-vect-slp.c (vect_slp_analyze_node_operations): Split out
>  handling of the root of the node to...
>  (vect_slp_analyze_node_operations_1): ...this new function,
>  and run the whole thing with the child nodes' def types
>  set according to their SLP node's def type.

> gcc/testsuite/
>  PR tree-optimization/85853
>  * gfortran.dg/vect/pr85853.f90: New test.

> Index: gcc/tree-vect-slp.c
> ===
> --- gcc/tree-vect-slp.c 2018-05-17 11:50:31.609158213 +0100
> +++ gcc/tree-vect-slp.c 2018-05-23 07:37:12.480578116 +0100
> @@ -2476,49 +2476,16 @@ _bb_vec_info::~_bb_vec_info ()
> bb->aux = NULL;
>   }

> -
> -/* Analyze statements contained in SLP tree NODE after recursively
analyzing
> -   the subtree.  NODE_INSTANCE contains NODE and VINFO contains INSTANCE.
> -
> -   Return true if the operations are supported.  */
> +/* Subroutine of vect_slp_analyze_node_operations.  Handle the root of
NODE,
> +   given then that child nodes have already been processed, and that
> +   their def types currently match their SLP node's def type.  */

>   static bool
> -vect_slp_analyze_node_operations (vec_info *vinfo, slp_tree node,
> - slp_instance node_instance,
> - scalar_stmts_to_slp_tree_map_t *visited,
> - scalar_stmts_to_slp_tree_map_t
*lvisited,
> - stmt_vector_for_cost *cost_vec)
> +vect_slp_analyze_node_operations_1 (vec_info *vinfo, slp_tree node,
> +   slp_instance node_instance,
> +   stmt_vector_for_cost *cost_vec)
>   {
> -  bool dummy;
> -  int i, j;
> -  gimple *stmt;
> -  slp_tree child;
> -
> -  if (SLP_TREE_DEF_TYPE (node) != vect_internal_def)
> -return true;
> -
> -  /* If we already analyzed the exact same set of scalar stmts we're
done.
> - We share the generated vector stmts for those.  */
> -  slp_tree *leader;
> -  if ((leader = visited->get (SLP_TREE_SCALAR_STMTS (node)))
> -  || (leader = lvisited->get (SLP_TREE_SCALAR_STMTS (node
> -{
> -  SLP_TREE_NUMBER_OF_VEC_STMTS (node)
> -   = SLP_TREE_NUMBER_OF_VEC_STMTS (*leader);
> -  return true;
> -}
> -
> -  /* The SLP graph is acyclic so not caching whether we failed or
succeeded
> - doesn't result in any issue since we throw away the lvisited set
> - when we fail.  */
> -  lvisited->put (SLP_TREE_SCALAR_STMTS (node).copy (), node);
> -
> -  FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (node), i, child)
> -if (!vect_slp_analyze_node_operations (vinfo, child, node_instance,
> -  visited, lvisited, cost_vec))
> -  return false;
> -
> -  stmt = SLP_TREE_SCALAR_STMTS (node)[0];
> +  gimple *stmt = SLP_TREE_SCALAR_STMTS (node)[0];
> stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
> gcc_assert (stmt_info);
> gcc_assert (STMT_SLP_TYPE (stmt_info) != loop_vect);
> @@ -2545,6 +2512,7 @@ vect_slp_analyze_node_operations (vec_in
>  }

> gimple *sstmt;
> +  unsigned int i;
> FOR_EACH_VEC_ELT (SLP_TREE_SCALAR_STMTS (node), i, sstmt)
>  STMT_VINFO_VECTYPE (vinfo_for_stmt (sstmt)) = vectype;
>   }
> @@ -2572,12 +2540,56 @@ vect_slp_analyze_node_operations (vec_in
>  = vect_get_num_vectors (vf * group_size, vectype);
>   }

> +  bool dummy;
> +  return vect_analyze_stmt (stmt, &dummy, node, node_instance, cost_vec);
> +}
> +
> +/* Analyze statements contained in SLP tree NODE after recursively
analyzing
> +   the subtree.  NODE_INSTANCE contains NODE and VINFO contains INSTANCE.
> +
> +   Return true if the operations are supported.  */
> +
> +static bool
> +vect_slp_analyze_node_operations (vec_info *vinfo, slp_tree node,
> + slp_instance node_instance,
> + scalar_stmts_to_slp_tree_map_t *visited,
> + 

[PATCH, AVX512]: Fix cvtusi264 insn mnemonic

2018-05-23 Thread Uros Bizjak
Hello!

With the current insn mnemonic and the AT&T assembler dialect, there is no way
for the assembler to distinguish between the DImode and SImode instructions
when a memory input operand is used. The dump for a 32-bit memory access reads:

   0:   62 f1 7e 08 7b 05 00vcvtusi2ss 0x0(%rip),%xmm0,%xmm0
   7:   00 00 00
  10:   62 f1 7f 08 7b 05 00vcvtusi2sd 0x0(%rip),%xmm0,%xmm0
  17:   00 00 00

And for 64bit access:

  20:   62 f1 fe 08 7b 05 00vcvtusi2ss 0x0(%rip),%xmm0,%xmm0
  27:   00 00 00
  30:   62 f1 ff 08 7b 05 00vcvtusi2sd 0x0(%rip),%xmm0,%xmm0
  37:   00 00 00

 (Note the difference in the 3rd byte. On a related note, binutils
should also be fixed to dump vcvtsi2sdq in the 64bit case to avoid
ambiguity)

We should use the "q" suffix with the AT&T dialect in the DImode insn mnemonic.
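
A hedged sketch of a source that exercises the 64-bit form via the AVX512F
intrinsic (mine, distinct from the testsuite changes below; compiled with
-mavx512f on x86-64).  With the fix, the memory-operand form should
disassemble as vcvtusi2ssq rather than the ambiguous vcvtusi2ss:

#include <immintrin.h>

__m128
cvt_u64 (__m128 a, unsigned long long *p)
{
  return _mm_cvtu64_ss (a, *p);   /* 64-bit unsigned source */
}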

2018-05-23  Uros Bizjak  

* config/i386/sse.md (cvtusi264):
Add {q} suffix to insn mnemonic.

testsuite/Changelog:

2018-05-23  Uros Bizjak  

* gcc.target/i386/avx512f-vcvtusi2sd64-1.c: Update scan string.
* gcc.target/i386/avx512f-vcvtusi2ss64-1.c: Ditto.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

OK for mainline and backports?

Uros.
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 8a80fa35067..30411b15493 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -4463,7 +4463,7 @@
  (match_operand:VF_128 1 "register_operand" "v")
  (const_int 1)))]
   "TARGET_AVX512F && TARGET_64BIT"
-  "vcvtusi2\t{%2, %1, %0|%0, %1, 
%2}"
+  "vcvtusi2{q}\t{%2, %1, %0|%0, %1, 
%2}"
   [(set_attr "type" "sseicvt")
(set_attr "prefix" "evex")
(set_attr "mode" "")])
diff --git a/gcc/testsuite/gcc.target/i386/avx512f-vcvtusi2sd64-1.c 
b/gcc/testsuite/gcc.target/i386/avx512f-vcvtusi2sd64-1.c
index 8675450f0c4..66476c3013f 100644
--- a/gcc/testsuite/gcc.target/i386/avx512f-vcvtusi2sd64-1.c
+++ b/gcc/testsuite/gcc.target/i386/avx512f-vcvtusi2sd64-1.c
@@ -1,7 +1,7 @@
 /* { dg-do compile { target { ! ia32 } } } */
 /* { dg-options "-mavx512f -O2" } */
-/* { dg-final { scan-assembler-times "vcvtusi2sd\[ 
\\t\]+\[^\{\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */
-/* { dg-final { scan-assembler-times "vcvtusi2sd\[ 
\\t\]+\[^%\n\]*%r\[^\{\n\]*\{ru-sae\}\[^\{\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 
} } */
+/* { dg-final { scan-assembler-times "vcvtusi2sdq\[ 
\\t\]+\[^\{\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vcvtusi2sdq\[ 
\\t\]+\[^%\n\]*%r\[^\{\n\]*\{ru-sae\}\[^\{\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 
} } */
 
 #include 
 
diff --git a/gcc/testsuite/gcc.target/i386/avx512f-vcvtusi2ss64-1.c 
b/gcc/testsuite/gcc.target/i386/avx512f-vcvtusi2ss64-1.c
index 38ecf39ad65..f4dae536873 100644
--- a/gcc/testsuite/gcc.target/i386/avx512f-vcvtusi2ss64-1.c
+++ b/gcc/testsuite/gcc.target/i386/avx512f-vcvtusi2ss64-1.c
@@ -1,7 +1,7 @@
 /* { dg-do compile { target { ! ia32 } } } */
 /* { dg-options "-mavx512f -O2" } */
-/* { dg-final { scan-assembler-times "vcvtusi2ss\[ 
\\t\]+\[^\{\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */
-/* { dg-final { scan-assembler-times "vcvtusi2ss\[ 
\\t\]+\[^%\n\]*%r\[^\{\n\]*\{rz-sae\}\[^\{\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 
} } */
+/* { dg-final { scan-assembler-times "vcvtusi2ssq\[ 
\\t\]+\[^\{\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vcvtusi2ssq\[ 
\\t\]+\[^%\n\]*%r\[^\{\n\]*\{rz-sae\}\[^\{\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 
} } */
 
 #include 
 


Re: [PATCH PR85804]Fix wrong code by correcting bump step computation in vector(1) load of single-element group access

2018-05-23 Thread Richard Biener
On Tue, May 22, 2018 at 2:11 PM Richard Sandiford <
richard.sandif...@linaro.org> wrote:

> Richard Biener  writes:
> > On Mon, May 21, 2018 at 3:14 PM Bin Cheng  wrote:
> >
> >> Hi,
> >> As reported in PR85804, bump step is wrongly computed for vector(1)
load
> > of
> >> single-element group access.  This patch fixes the issue by correcting
> > bump
> >> step computation for the specific VMAT_CONTIGUOUS case.
> >
> >> Bootstrap and test on x86_64 and AArch64 ongoing, is it OK?
> >
> > To me it looks like the classification as VMAT_CONTIGUOUS is bogus.
> > We'd fall into the grouped_load case otherwise which should handle
> > the situation correctly?
> >
> > Richard?

> Yeah, I agree.  I mentioned to Bin privately that that was probably
> a misstep and that we should instead continue to treat them as
> VMAT_CONTIGUOUS_PERMUTE, but simply select the required vector
> from the array of loaded vectors, instead of doing an actual permute.

I'd classify them as VMAT_ELEMENTWISE instead.  CONTIGUOUS
should be only for no-gap vectors.  How do we classify single-element
interleaving?  That would be another classification choice.
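
A hedged sketch of the shape under discussion (my own, not the PR testcase): a
strided load forms a single-element interleaving group with a gap, which may
end up as a vector(1) load:

void
f (double *restrict out, const double *restrict in, int n)
{
  for (int i = 0; i < n; i++)
    out[i] = in[2 * i];   /* stride-2 load: single-element group, gap of 1 */
}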

> (Note that VMAT_CONTIGUOUS is OK for stores, since we don't allow
> gaps there.  But it might be easiest to handle both loads and stores
> in the same way.)

> Although it still seems weird to "vectorise" stuff to one element.
> Why not leave the original scalar code in place, and put the onus on
> whatever wants to produce or consume a V1 to do the appropriate
> conversion?

Yeah, V1 is somewhat awkward to deal with but I think the way we're
doing it right now is OK.  Note if we leave the original scalar code in
place we still have to deal with larger unrolling factors thus we'd have
to duplicate the scalar code and have different IVs anyways.

Richard.

> Thanks,
> Richard


[Ada] Initialize_Scalars optimization causes spurious runtime check failure

2018-05-23 Thread Pierre-Marie de Rodat
This patch suppresses the optimization of scalar arrays when pragma
Initialize_Scalars is in effect if the component type is subject to
predicates. Since the scalar array is initialized with invalid values,
these values may violate the predicate or a validity check within the
predicate.


-- Source --


--  gnat.adc

pragma Initialize_Scalars;

--  types.ads

with System; use System;

package Types is
   type Byte is mod System.Storage_Unit;

   subtype Inter_Byte is Byte;

   function Always_OK (B : Inter_Byte) return Boolean is (True);
   function Is_OK (B : Inter_Byte) return Boolean is (Always_OK (B));

   subtype Final_Byte is Byte with Predicate => Is_OK (Final_Byte);

   type Bytes is array (1 .. 5) of Final_Byte;

   Obj : Bytes;
end Types;

--  main.adb

with Types; use Types;

procedure Main is begin null; end Main;

-
-- Compilation --
-

$ gnatmake -q -gnata -gnatVa main.adb
$ ./main

Tested on x86_64-pc-linux-gnu, committed on trunk

2018-05-23  Hristian Kirtchev  

gcc/ada/

* exp_ch3.adb (Default_Initialize_Object): Do not optimize scalar array
initialization when the component type has predicates.
* exp_ch4.adb (Expand_N_Allocator): Do not optimize scalar array
allocation when the component type has predicates.--- gcc/ada/exp_ch3.adb
+++ gcc/ada/exp_ch3.adb
@@ -6069,29 +6069,43 @@ package body Exp_Ch3 is
   null;
 
--  Optimize the default initialization of an array object when
-   --  the following conditions are met:
-   --
-   --* Pragma Initialize_Scalars or Normalize_Scalars is in
-   --  effect.
-   --
-   --* The bounds of the array type are static and lack empty
-   --  ranges.
-   --
-   --* The array type does not contain atomic components or is
-   --  treated as packed.
-   --
-   --* The component is of a scalar type which requires simple
-   --  initialization.
-   --
+   --  pragma Initialize_Scalars or Normalize_Scalars is in effect.
--  Construct an in-place initialization aggregate which may be
--  convert into a fast memset by the backend.
 
elsif Init_Or_Norm_Scalars
  and then Is_Array_Type (Typ)
+
+ --  The array must lack atomic components because they are
+ --  treated as non-static, and as a result the backend will
+ --  not initialize the memory in one go.
+
  and then not Has_Atomic_Components (Typ)
+
+ --  The array must not be packed because the invalid values
+ --  in System.Scalar_Values are multiples of Storage_Unit.
+
  and then not Is_Packed (Typ)
+
+ --  The array must have static non-empty ranges, otherwise
+ --  the backend cannot initialize the memory in one go.
+
  and then Has_Static_Non_Empty_Array_Bounds (Typ)
+
+ --  The optimization is only relevant for arrays of scalar
+ --  types.
+
  and then Is_Scalar_Type (Component_Type (Typ))
+
+ --  Similar to regular array initialization using a type
+ --  init proc, predicate checks are not performed because the
+ --  initialization values are intentionally invalid, and may
+ --  violate the predicate.
+
+ and then not Has_Predicates (Component_Type (Typ))
+
+ --  The component type must have a single initialization value
+
  and then Simple_Initialization_OK (Component_Type (Typ))
then
   Set_No_Initialization (N, False);

--- gcc/ada/exp_ch4.adb
+++ gcc/ada/exp_ch4.adb
@@ -4618,28 +4618,42 @@ package body Exp_Ch4 is
   Is_Allocate => True);
 end if;
 
- --  Optimize the default allocation of an array object when the
- --  following conditions are met:
- --
- --* Pragma Initialize_Scalars or Normalize_Scalars is in effect
- --
- --* The bounds of the array type are static and lack empty ranges
- --
- --* The array type does not contain atomic components or is
- --  treated as packed.
- --
- --* The component is of a scalar type which requires simple
- --  initialization.
- --
- --  Construct an in-place initialization aggregate which may be
- --  convert into a fast memset by the backend.
+ --  Optimize the default allocation of an array object when pragma
+ --  Initialize_Scalars or Normalize_Scalars is in effect. Construct an
+ --  in-place initialization aggregate which 

[Ada] Suppression of elaboration-related warnings

2018-05-23 Thread Pierre-Marie de Rodat
This patch modifies the effects of pragma Warnings (Off, ...) to suppress
elaboration warnings related to an entity.

Tested on x86_64-pc-linux-gnu, committed on trunk

2018-05-23  Hristian Kirtchev  

gcc/ada/

* einfo.adb (Is_Elaboration_Checks_OK_Id): Use predicate
Is_Elaboration_Target.
(Is_Elaboration_Target): New routine.
(Is_Elaboration_Warnings_OK_Id): Use predicate Is_Elaboration_Target.
(Set_Is_Elaboration_Checks_OK_Id): Use predicate Is_Elaboration_Target.
(Set_Is_Elaboration_Warnings_OK_Id): Use predicate
Is_Elaboration_Target.
* einfo.ads: Add new synthesized attribute Is_Elaboration_Target along
with occurrences in nodes.
(Is_Elaboration_Target): New routine.
* sem_prag.adb (Analyze_Pragma): Suppress elaboration warnings when an
elaboration target is subject to pragma Warnings (Off, ...).

gcc/testsuite/

* gnat.dg/elab5.adb, gnat.dg/elab5_pkg.adb, gnat.dg/elab5_pkg.ads: New
testcase.--- gcc/ada/einfo.adb
+++ gcc/ada/einfo.adb
@@ -2253,23 +2253,13 @@ package body Einfo is
 
function Is_Elaboration_Checks_OK_Id (Id : E) return B is
begin
-  pragma Assert
-(Ekind_In (Id, E_Constant, E_Variable)
-  or else Is_Entry (Id)
-  or else Is_Generic_Unit (Id)
-  or else Is_Subprogram (Id)
-  or else Is_Task_Type (Id));
+  pragma Assert (Is_Elaboration_Target (Id));
   return Flag148 (Id);
end Is_Elaboration_Checks_OK_Id;
 
function Is_Elaboration_Warnings_OK_Id (Id : E) return B is
begin
-  pragma Assert
-(Ekind_In (Id, E_Constant, E_Variable, E_Void)
-  or else Is_Entry (Id)
-  or else Is_Generic_Unit (Id)
-  or else Is_Subprogram (Id)
-  or else Is_Task_Type (Id));
+  pragma Assert (Is_Elaboration_Target (Id) or else Ekind (Id) = E_Void);
   return Flag304 (Id);
end Is_Elaboration_Warnings_OK_Id;
 
@@ -5478,23 +5468,13 @@ package body Einfo is
 
procedure Set_Is_Elaboration_Checks_OK_Id (Id : E; V : B := True) is
begin
-  pragma Assert
-(Ekind_In (Id, E_Constant, E_Variable)
-  or else Is_Entry (Id)
-  or else Is_Generic_Unit (Id)
-  or else Is_Subprogram (Id)
-  or else Is_Task_Type (Id));
+  pragma Assert (Is_Elaboration_Target (Id));
   Set_Flag148 (Id, V);
end Set_Is_Elaboration_Checks_OK_Id;
 
procedure Set_Is_Elaboration_Warnings_OK_Id (Id : E; V : B := True) is
begin
-  pragma Assert
-(Ekind_In (Id, E_Constant, E_Variable)
-  or else Is_Entry (Id)
-  or else Is_Generic_Unit (Id)
-  or else Is_Subprogram (Id)
-  or else Is_Task_Type (Id));
+  pragma Assert (Is_Elaboration_Target (Id));
   Set_Flag304 (Id, V);
end Set_Is_Elaboration_Warnings_OK_Id;
 
@@ -8112,6 +8092,20 @@ package body Einfo is
   and then Is_Entity_Attribute_Name (Attribute_Name (N)));
end Is_Entity_Name;
 
+   ---
+   -- Is_Elaboration_Target --
+   ---
+
+   function Is_Elaboration_Target (Id : Entity_Id) return Boolean is
+   begin
+  return
+Ekind_In (Id, E_Constant, E_Variable)
+  or else Is_Entry(Id)
+  or else Is_Generic_Unit (Id)
+  or else Is_Subprogram   (Id)
+  or else Is_Task_Type(Id);
+   end Is_Elaboration_Target;
+
---
-- Is_External_State --
---

--- gcc/ada/einfo.ads
+++ gcc/ada/einfo.ads
@@ -2522,12 +2522,16 @@ package Einfo is
 --   checks. Such targets are allowed to generate run-time conditional ABE
 --   checks or guaranteed ABE failures.
 
+--Is_Elaboration_Target (synthesized)
+--   Applies to all entities, True only for elaboration targets (see the
+--   terminology in Sem_Elab).
+
 --Is_Elaboration_Warnings_OK_Id (Flag304)
 --   Defined in elaboration targets (see terminology in Sem_Elab). Set when
 --   the target appears in a region with elaboration warnings enabled.
 
 --Is_Elementary_Type (synthesized)
---   Applies to all entities, true for all elementary types and subtypes.
+--   Applies to all entities, True for all elementary types and subtypes.
 --   Either Is_Composite_Type or Is_Elementary_Type (but not both) is true
 --   of any type.
 
@@ -5971,6 +5975,7 @@ package Einfo is
--Address_Clause  (synth)
--Alignment_Clause(synth)
--Is_Atomic_Or_VFA(synth)
+   --Is_Elaboration_Target   (synth)
--Size_Clause (synth)
 
--  E_Decimal_Fixed_Point_Type
@@ -6041,6 +6046,7 @@ package Einfo is
--Entry_Index_Type(synth)
--First_Formal(synth)
--First_Formal_With_Extras(synth)
+   --Is_Elaboration_Target   

[Ada] Compiler fails to reject illegal store of anonymous_access_to_subprogram

2018-05-23 Thread Pierre-Marie de Rodat
GNAT properly rejects an attempt to assign an access_to_subprogram formal
to a local variable, according to accessibility rules. This patch forces the
same behavior on the use of such a formal in an object declaration.

Compiling store_anon.adb must yield:

  store_anon.adb:7:35: illegal attempt to store anonymous access to subprogram
  store_anon.adb:7:35: value has deeper accessibility than any master
   (RM 3.10.2 (13))
 store_anon.adb:7:35: use named access type for "P" instead of access parameter


package Store_Anon is
   procedure Store (P : not null access procedure);

   procedure Invoke;
end Store_Anon;

package body Store_Anon is
   type P_Ptr is access procedure;

   Stored : P_Ptr;

   procedure Store (P : not null access procedure) is
  Illegal : constant P_Ptr := P;
   begin -- Store
  Stored := Illegal;
   end Store;

   procedure Invoke is
  -- Empty
   begin -- Invoke
  Stored.all;
   end Invoke;
end Store_Anon;

Tested on x86_64-pc-linux-gnu, committed on trunk

2018-05-23  Ed Schonberg  

gcc/ada/

* sem_ch3.adb (Analyze_Object_Declaration): If expression is an
anonymous_access_to_subprogram formal, apply a conversion to force an
accessibility check that will fail statically, enforcing 3.10.2 (13).
--- gcc/ada/sem_ch3.adb
+++ gcc/ada/sem_ch3.adb
@@ -4268,6 +4268,23 @@ package body Sem_Ch3 is
 Set_Etype (E, T);
 
  else
+
+--  If the expression is a formal that is a "subprogram pointer"
+--  this is illegal in accessibility terms. Add an explicit
+--  conversion to force the corresponding check, as is done for
+--  assignments.
+
+if Comes_From_Source (N)
+  and then Is_Entity_Name (E)
+  and then Present (Entity (E))
+  and then Is_Formal (Entity (E))
+  and then
+Ekind (Etype (Entity (E))) = E_Anonymous_Access_Subprogram_Type
+  and then Ekind (T) /= E_Anonymous_Access_Subprogram_Type
+then
+   Rewrite (E, Convert_To (T, Relocate_Node (E)));
+end if;
+
 Resolve (E, T);
  end if;
 



[Ada] Crash on predicate involving qualified expression in instance

2018-05-23 Thread Pierre-Marie de Rodat
This patch inhibits the generation of freeze nodes when pre-analyzing the
domain of iteration of an Ada2012 loop that appears as a quantified
expression in a predicate for an array type. This prevents a back-end
abort on an invisible freeze node that would otherwise appear in an
unexpanded code sequence.

The following must compile quietly:


with Id_Manager;

package My_Id_Manager is new Id_Manager (Max_Id_Type   => 100_000,
 Max_Key_Count => 100);

generic
   Max_Id_Type   : Positive;
   Max_Key_Count : Positive;

package Id_Manager is
   type Unique_Id_Type is new Integer range 0 .. Max_Id_Type;

   Undefined_Id : constant Unique_Id_Type := 0;

   type Key_Count is new Integer range 0 .. Max_Key_Count;
   subtype Key_Index is Key_Count range 1 .. Key_Count'Last;

   type Key_Array is array (Key_Index range <>) of Unique_Id_Type
 with Predicate => Key_Array'First = 1;

   type Id_Manager_State (Capacity : Key_Count) is private;

   procedure Display_Objects (TheObject : Id_Manager_State);

private
   type Id_Manager_State (Capacity : Key_Count) is record
  Id_Key   : Key_Array (1 .. Capacity) := (others => Undefined_Id);
  Key_Size : Key_Count := 0;
   end record;
end Id_Manager;

package body Id_Manager is
   procedure Display_Objects (TheObject : Id_Manager_State) is
   begin
  for Item of TheObject.Id_Key loop
 null;
  end loop;
   end Display_Objects;
end Id_Manager;

Tested on x86_64-pc-linux-gnu, committed on trunk

2018-05-23  Ed Schonberg  

gcc/ada/

* sem_ch5.adb (Preanalyze_Range): The pre-analysis of the domain of
iteration of an Ada2012 loop is performed to determine the type of the
domain, but full analysis is performed once the loop is rewritten as a
while-loop during expansion. The pre-analysis suppresses expansion; it
must also suppress the generation of freeze nodes, which may otherwise
appear in the wrong scope before rewriting.
--- gcc/ada/sem_ch5.adb
+++ gcc/ada/sem_ch5.adb
@@ -4076,6 +4076,17 @@ package body Sem_Ch5 is
   Full_Analysis := False;
   Expander_Mode_Save_And_Set (False);
 
+  --  In addition to the above we must ecplicity suppress the
+  --  generation of freeze nodes which might otherwise be generated
+  --  during resolution of the range (e.g. if given by an attribute
+  --  that will freeze its prefix).
+
+  Set_Must_Not_Freeze (R_Copy);
+
+  if Nkind (R_Copy) = N_Attribute_Reference then
+ Set_Must_Not_Freeze (Prefix (R_Copy));
+  end if;
+
   Analyze (R_Copy);
 
   if Nkind (R_Copy) in N_Subexpr and then Is_Overloaded (R_Copy) then



[Ada] Suppression of elaboration-related warnings

2018-05-23 Thread Pierre-Marie de Rodat
This patch updates the documentation section on suppressing elaboration
warnings. No change in behavior, no need for a test.

Tested on x86_64-pc-linux-gnu, committed on trunk

2018-05-23  Hristian Kirtchev  

gcc/ada/

* sem_elab.adb: Update the section on suppressing elaboration warnings.
--- gcc/ada/sem_elab.adb
+++ gcc/ada/sem_elab.adb
@@ -394,33 +394,38 @@ package body Sem_Elab is
--  suppressed.
--
--  In addition to switch -gnatwL, pragma Warnings may be used to suppress
-   --  elaboration-related warnings by wrapping a construct in the following
-   --  manner:
+   --  elaboration-related warnings when used in the following manner:
--
--pragma Warnings ("L");
-   --
-   --pragma Warnings ("l");
+   --
+   --
+   --
+   --pragma Warnings (Off, target);
+   --
+   --pragma Warnings (Off);
+   --
--
--  * To suppress elaboration warnings for '[Unrestricted_]Access of
--entries, operators, and subprograms, either:
--
-   --  - Wrap the entry, operator, or subprogram, or
-   --  - Wrap the attribute, or
+   --  - Suppress the entry, operator, or subprogram, or
+   --  - Suppress the attribute, or
--  - Use switch -gnatw.f
--
--  * To suppress elaboration warnings for calls to entries, operators,
--and subprograms, either:
--
-   --  - Wrap the entry, operator, or subprogram, or
-   --  - Wrap the call
+   --  - Suppress the entry, operator, or subprogram, or
+   --  - Suppress the call
--
-   --  * To suppress elaboration warnings for instantiations, wrap the
+   --  * To suppress elaboration warnings for instantiations, suppress the
--instantiation.
--
--  * To suppress elaboration warnings for task activations, either:
--
-   --  - Wrap the task object, or
-   --  - Wrap the task type
+   --  - Suppress the task object, or
+   --  - Suppress the task type, or
+   --  - Suppress the activation call
 
--
-- Switches --



[Ada] Suppression of elaboration-related warnings

2018-05-23 Thread Pierre-Marie de Rodat
This patch changes the behavior of elaboration-related warnings as follows:

   * If a scenario or a target has [elaboration] warnings suppressed, then
 any further elaboration-related warnings along the paths rooted at the
 scenario are also suppressed.

   * Elaboration-related warnings related to task activation can now be
 suppressed when either the task object, task type, or the activation
 call have [elaboration] warnings suppressed.

   * Elaboration-related warnings related to calls can now be suppressed when
 either the target or the call have [elaboration] warnings suppressed.

   * Elaboration-related warnings related to instantiations can now be
 suppressed when the instantiation has [elaboration] warnings suppressed.
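
As an illustration only (a minimal sketch, not the committed elab5 testcase),
suppressing warnings on the target now silences the whole chain:

package Pack is
   procedure Proc;
   pragma Warnings (Off, Proc);
   --  Elaboration warnings for calls to Proc, and any further warnings
   --  along the paths rooted at those calls, are now suppressed as well.

   function F return Boolean;
end Pack;

package body Pack is
   function F return Boolean is
   begin
      Proc;  --  call before the body of Proc has been seen
      return True;
   end F;

   Flag : constant Boolean := F;  --  elaboration-time call chain into Proc

   procedure Proc is
   begin
      null;
   end Proc;
end Pack;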

The patch also cleans up the way the state of the Processing phase is updated
with each new node along a path. It is now preferable to update the state in
routines

   Process_Conditional_ABE_Activation_Impl
   Process_Conditional_ABE_Call
   Process_Conditional_ABE_Instantiation

rather than within their language-specific versions.

Tested on x86_64-pc-linux-gnu, committed on trunk

2018-05-23  Hristian Kirtchev  

gcc/ada/

* einfo.adb: Flag304 is now Is_Elaboration_Warnings_OK_Id.
(Is_Elaboration_Warnings_OK_Id): New routine.
(Set_Is_Elaboration_Warnings_OK_Id): New routine.
(Write_Entity_Flags): Output Flag304.
* einfo.ads: Add new attribute Is_Elaboration_Warnings_OK_Id along with
occurrences in entities.
(Is_Elaboration_Warnings_OK_Id): New routine along with pragma Inline.
(Set_Is_Elaboration_Warnings_OK_Id): New routine along with pragma
Inline.
* sem_attr.adb (Analyze_Access_Attribute): Capture the state of
elaboration warnings.
* sem_ch3.adb (Analyze_Object_Declaration): Capture the state of
elaboration warnings.
* sem_ch6.adb (Analyze_Abstract_Subprogram_Declaration): Capture the
state of elaboration warnings.
(Analyze_Subprogram_Body_Helper): Capture the state of elaboration
warnings.
(Analyze_Subprogram_Declaration): Capture the state of elaboration
warnings.
* sem_ch9.adb (Analyze_Entry_Declaration): Capture the state of
elaboration warnings.
(Analyze_Single_Task_Declaration): Capture the state of elaboration
warnings.
(Analyze_Task_Type_Declaration): Capture the state of elaboration
warnings.
* sem_ch12.adb (Analyze_Generic_Package_Declaration): Capture the state
of elaboration warnings.
(Analyze_Generic_Subprogram_Declaration): Capture the state of
elaboration warnings.
* sem_elab.adb: Add a section on suppressing elaboration warnings.
Type Processing_Attributes includes component Suppress_Warnings
intended to suppress any elaboration warnings along a path in the
graph.  Update Initial_State to include a value for this component.
Types Target_Attributes and Task_Attributes include component
Elab_Warnings_OK to indicate whether the target or task has elaboration
warnings enabled.
(Build_Access_Marker): Propagate attribute
Is_Elaboration_Warnings_OK_Node from the attribute to the generated
call marker.
(Extract_Instantiation_Attributes): Set the value for Elab_Warnings_OK.
(Extract_Target_Attributes): Set the value for Elab_Warnings_OK.
(Extract_Task_Attributes): Set the value for Elab_Warnings_OK.
(Process_Conditional_ABE_Access): Suppress further elaboration warnings
when already in this mode or when the attribute or target have warnings
suppressed.
(Process_Conditional_ABE_Activation_Impl): Do not emit any diagnostics
if warnings are suppressed.
(Process_Conditional_ABE_Call): Suppress further elaboration warnings
when already in this mode, or the target or call have warnings
suppressed.
(Process_Conditional_ABE_Call_Ada): Do not emit any diagnostics if
warnings are suppressed.
(Process_Conditional_ABE_Call_SPARK): Do not emit any diagnostics if
warnings are suppressed.
(Process_Conditional_ABE_Instantiation): Suppress further elaboration
warnings when already in this mode or when the instantiation has
warnings suppressed.
(Process_Conditional_ABE_Instantiation_Ada): Do not emit any
diagnostics if warnings are suppressed.
(Process_Conditional_ABE_Variable_Assignment_Ada): Use the more
specific Is_Elaboration_Warnings_OK_Id rather than Warnings_Off.
(Process_Conditional_ABE_Variable_Assignment_SPARK): Use the more
specific Is_Elaboration_Warnings_OK_Id rather than Warnings_Off.
(Process_Task_Object): Suppress further elaboration warnings when
already in this mode, or when the object, activation call, or 

[Ada] Build-in-place aggregates and Address clauses

2018-05-23 Thread Pierre-Marie de Rodat
This patch fixes a bug in which if a limited volatile variable with
an Address aspect is initialized with a build-in-place aggregate
containing build-in-place function calls, the compiler can crash.

Tested on x86_64-pc-linux-gnu, committed on trunk

2018-05-23  Bob Duff  

gcc/ada/

* freeze.adb: (Check_Address_Clause): Deal with build-in-place
aggregates in addition to build-in-place calls.

gcc/testsuite/

* gnat.dg/addr10.adb: New testcase.
--- gcc/ada/freeze.adb
+++ gcc/ada/freeze.adb
@@ -710,13 +710,12 @@ package body Freeze is
 end;
  end if;
 
- --  Remove side effects from initial expression, except in the case
- --  of a build-in-place call, which has its own later expansion.
+ --  Remove side effects from initial expression, except in the case of
+ --  limited build-in-place calls and aggregates, which have their own
+ --  expansion elsewhere. This exception is necessary to avoid copying
+ --  limited objects.
 
- if Present (Init)
-   and then (Nkind (Init) /= N_Function_Call
-  or else not Is_Expanded_Build_In_Place_Call (Init))
- then
+ if Present (Init) and then not Is_Limited_View (Typ) then
 --  Capture initialization value at point of declaration, and make
 --  explicit assignment legal, because object may be a constant.
 
@@ -735,7 +734,7 @@ package body Freeze is
 
 Set_No_Initialization (Decl);
 
---  If the objet is tagged, check whether the tag must be
+--  If the object is tagged, check whether the tag must be
 --  reassigned explicitly.
 
 Tag_Assign := Make_Tag_Assignment (Decl);

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/addr10.adb
@@ -0,0 +1,24 @@
+--  { dg-do compile }
+
+with System;
+
+procedure Addr10 is
+   type Limited_Type is limited record
+  Element : Integer;
+   end record;
+
+   function Initial_State return Limited_Type is ((Element => 0));
+
+   type Double_Limited_Type is
+  record
+ A : Limited_Type;
+  end record;
+
+   Double_Limited : Double_Limited_Type :=
+  (A => Initial_State)
+   with
+  Volatile,
+  Address => System'To_Address (16#1234_5678#);
+begin
+   null;
+end Addr10;



[Ada] Suspension and elaboration warnings/checks

2018-05-23 Thread Pierre-Marie de Rodat
This patch modifies the static elaboration model to stop the inspection of
a task body when it contains a synchronous suspension call and restriction
No_Entry_Calls_In_Elaboration_Code or switch -gnatd_s is in effect.


-- Source --


--  suspension.ads

package Suspension is
   procedure ABE;

   task type Barrier_Task_1;
   task type Barrier_Task_2;
   task type Object_Task_1;
   task type Object_Task_2;
end Suspension;

--  suspension.adb

with Ada.Synchronous_Barriers; use Ada.Synchronous_Barriers;
with Ada.Synchronous_Task_Control; use Ada.Synchronous_Task_Control;

package body Suspension is
   Bar : Synchronous_Barrier (Barrier_Limit'Last);
   Obj : Suspension_Object;

   task body Barrier_Task_1 is
  OK : Boolean;
   begin
  Wait_For_Release (Bar, OK);
  ABE;
   end Barrier_Task_1;

   task body Barrier_Task_2 is
  procedure Block is
 OK : Boolean;
  begin
 Wait_For_Release (Bar, OK);
  end Block;
   begin
  Block;
  ABE;
   end Barrier_Task_2;

   task body Object_Task_1 is
   begin
  Suspend_Until_True (Obj);
  ABE;
   end Object_Task_1;

   task body Object_Task_2 is
  procedure Block is
  begin
 Suspend_Until_True (Obj);
  end Block;
   begin
  Block;
  ABE;
   end Object_Task_2;

   function Elaborator return Boolean is
  BT_1 : Barrier_Task_1;
  BT_2 : Barrier_Task_2;
  OT_1 : Object_Task_1;
  OT_2 : Object_Task_2;
   begin
  return True;
   end Elaborator;

   Elab : constant Boolean := Elaborator;

   procedure ABE is begin null; end ABE;
end Suspension;

--  main.adb

with Suspension;

procedure Main is begin null; end Main;


-- Compilation and output --


$ gnatmake -q -gnatd_s main.adb
suspension.adb:23:07: warning: cannot call "ABE" before body seen
suspension.adb:23:07: warning: Program_Error may be raised at run time
suspension.adb:23:07: warning:   body of unit "Suspension" elaborated
suspension.adb:23:07: warning:   function "Elaborator" called at line 51
suspension.adb:23:07: warning:   local tasks of "Elaborator" activated
suspension.adb:23:07: warning:   procedure "ABE" called at line 23
suspension.adb:39:07: warning: cannot call "ABE" before body seen
suspension.adb:39:07: warning: Program_Error may be raised at run time
suspension.adb:39:07: warning:   body of unit "Suspension" elaborated
suspension.adb:39:07: warning:   function "Elaborator" called at line 51
suspension.adb:39:07: warning:   local tasks of "Elaborator" activated
suspension.adb:39:07: warning:   procedure "ABE" called at line 39

Tested on x86_64-pc-linux-gnu, committed on trunk

2018-05-23  Hristian Kirtchev  

gcc/ada/

* debug.adb: Switch -gnatd_s is now used to stop elaboration checks on
synchronized suspension.
* rtsfind.ads: Add entries for units Ada.Synchronous_Barriers and
Ada.Synchronous_Task_Control and routines Suspend_Until_True and
Wait_For_Release.
* sem_elab.adb: Document switch -gnatd_s.
(In_Task_Body): New routine.
(Is_Potential_Scenario): Code cleanup. Stop the traversal of a task
body when the current construct denotes a synchronous suspension call,
and restriction No_Entry_Calls_In_Elaboration_Code or switch -gnatd_s
is in effect.
(Is_Synchronous_Suspension_Call): New routine.
* switch-c.adb (Scan_Front_End_Switches): Switch -gnatJ now sets switch
-gnatd_s.
--- gcc/ada/debug.adb
+++ gcc/ada/debug.adb
@@ -163,7 +163,7 @@ package body Debug is
--  d_p  Ignore assertion pragmas for elaboration
--  d_q
--  d_r
-   --  d_s
+   --  d_s  Stop elaboration checks on synchronous suspension
--  d_t
--  d_u
--  d_v
@@ -839,6 +839,10 @@ package body Debug is
--   semantics of invariants and postconditions in both the static and
--   dynamic elaboration models.
 
+   --  d_s  The compiler stops the examination of a task body once it reaches
+   --   a call to routine Ada.Synchronous_Task_Control.Suspend_Until_True
+   --   or Ada.Synchronous_Barriers.Wait_For_Release.
+
--  d_L  Output trace information on elaboration checking. This debug switch
--   causes output to be generated showing each call or instantiation as
--   it is checked, and the progress of the recursive trace through

--- gcc/ada/rtsfind.ads
+++ gcc/ada/rtsfind.ads
@@ -131,6 +131,8 @@ package Rtsfind is
   Ada_Real_Time,
   Ada_Streams,
   Ada_Strings,
+  Ada_Synchronous_Barriers,
+  Ada_Synchronous_Task_Control,
   Ada_Tags,
   Ada_Task_Identification,
   Ada_Task_Termination,
@@ -609,6 +611,10 @@ package Rtsfind is
 
  RE_Unbounded_String,-- Ada.Strings.Unbounded
 
+ RE_Wait_For_Release,-- Ada.Synchronous_Barriers
+
+ RE_Suspend_Until_True,  -- Ada.Synchronous_Task_Control
+
  RE_Acces

[Ada] Fix of some permission rules of pointers in SPARK

2018-05-23 Thread Pierre-Marie de Rodat
This commit fixes bugs in the code that implements the rules for safe pointers
in SPARK. This only affects SPARK tools, not compilation.

  * Global variables should be handled differently from parameters. The
whole tree of an in global variable has the permission Read-Only. In
contrast, an in parameter has the permission Read-Only for the first
level and Read-Write permission for suffixes.
  * The argument X of Integer'Image (X) was not analyzed correctly.
  * The attribute reference X'Img was not handled.
  * Shallow aliased types which are not initialized are now allowed
and analyzed.

Function inlining is not yet handled correctly.
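
For reference, the two attribute forms mentioned above are (a minimal
sketch; the names are illustrative):

procedure Show (X : Integer) is
   S1 : constant String := Integer'Image (X);  --  X is the attribute argument
   S2 : constant String := X'Img;              --  X is the attribute prefix
   pragma Unreferenced (S1, S2);
begin
   null;
end Show;

Both forms read X, so the permission analysis now visits X in each case.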

Tested on x86_64-pc-linux-gnu, committed on trunk

2018-05-23  Maroua Maalej  

gcc/ada/

* sem_spark.adb: Fix of some permission rules of pointers in SPARK.
--- gcc/ada/sem_spark.adb
+++ gcc/ada/sem_spark.adb
@@ -554,9 +554,10 @@ package body Sem_SPARK is
 
   Super_Move,
   --  Enhanced moving semantics (under 'Access). Checks that paths have
-  --  Read_Write permission. After moving a path, its permission is set
-  --  to No_Access, as well as the permission of its extensions and the
-  --  permission of its prefixes up to the first Reference node.
+  --  Read_Write permission (shallow types may have only Write permission).
+  --  After moving a path, its permission is set to No_Access, as well as
+  --  the permission of its extensions and the permission of its prefixes
+  --  up to the first Reference node.
 
   Borrow_Out,
   --  Used for actual OUT parameters. Checks that paths have Write_Perm
@@ -750,9 +751,10 @@ package body Sem_SPARK is
--  execution.
 
procedure Return_Parameter_Or_Global
- (Id   : Entity_Id;
-  Mode : Formal_Kind;
-  Subp : Entity_Id);
+ (Id : Entity_Id;
+  Mode   : Formal_Kind;
+  Subp   : Entity_Id;
+  Global_Var : Boolean);
--  Auxiliary procedure to Return_Parameters and Return_Globals
 
procedure Return_Parameters (Subp : Entity_Id);
@@ -813,8 +815,9 @@ package body Sem_SPARK is
--  global items with appropriate permissions.
 
procedure Setup_Parameter_Or_Global
- (Id   : Entity_Id;
-  Mode : Formal_Kind);
+ (Id : Entity_Id;
+  Mode   : Formal_Kind;
+  Global_Var : Boolean);
--  Auxiliary procedure to Setup_Parameters and Setup_Globals
 
procedure Setup_Parameters (Subp : Entity_Id);
@@ -1049,23 +1052,27 @@ package body Sem_SPARK is
 
 declare
Elem : Perm_Tree_Access;
-
+   Deep : constant Boolean :=
+ Is_Deep (Etype (Defining_Identifier (Decl)));
 begin
Elem := new Perm_Tree_Wrapper'
  (Tree =>
 (Kind=> Entire_Object,
- Is_Node_Deep=>
-   Is_Deep (Etype (Defining_Identifier (Decl))),
+ Is_Node_Deep=> Deep,
  Permission  => Read_Write,
  Children_Permission => Read_Write));
 
--  If unitialized declaration, then set to Write_Only. If a
--  pointer declaration, it has a null default initialization.
-   if Nkind (Expression (Decl)) = N_Empty
+   if No (Expression (Decl))
  and then not Has_Full_Default_Initialization
(Etype (Defining_Identifier (Decl)))
  and then not Is_Access_Type
(Etype (Defining_Identifier (Decl)))
+ --  Objects of shallow types are considered as always
+ --  initialized, leaving the checking of initialization to
+ --  flow analysis.
+ and then Deep
then
   Elem.all.Tree.Permission := Write_Only;
   Elem.all.Tree.Children_Permission := Write_Only;
@@ -1209,6 +1216,9 @@ package body Sem_SPARK is
   Check_Node (Prefix (Expr));
 
when Name_Image =>
+  Check_List (Expressions (Expr));
+
+   when Name_Img =>
   Check_Node (Prefix (Expr));
 
when Name_SPARK_Mode =>
@@ -2350,7 +2360,7 @@ package body Sem_SPARK is
 | N_Use_Type_Clause
 | N_Validate_Unchecked_Conversion
 | N_Variable_Reference_Marker
- =>
+=>
 null;
 
  --  The following nodes are rewritten by semantic analysis
@@ -3528,10 +3538,10 @@ package body Sem_SPARK is
  when N_Identifier
 | N_Expanded_Name
  =>
-return Has_Alias_Deep (Etype (N));
+return Is_Aliased (Entity (N)) or else Has_Alias_Deep (Etype (N));
 
  when N_Defining_Identifier =>
-return Has_Alias_Deep (Etype (N));
+return Is_Aliased (N) or else Has_Alias_Deep (Etype (N));
 

[Ada] Fix implementation of utility for finding enclosing declaration

2018-05-23 Thread Pierre-Marie de Rodat
This utility is used in GNATprove to find when a node is inside a named
number declaration, and this case was not properly handled. Now fixed.
There is no impact on compilation.
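
For reference, a named number declaration is a constant without a subtype
mark, for example (names are illustrative):

package Constants is
   Pi       : constant := 3.14159_26535;  --  named real number
   Max_Size : constant := 2 ** 16;        --  named integer number
end Constants;

Enclosing_Declaration now also stops at such N_Number_Declaration nodes.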

Tested on x86_64-pc-linux-gnu, committed on trunk

2018-05-23  Yannick Moy  

gcc/ada/

* sem_util.adb (Enclosing_Declaration): Fix the case of a named number
declaration, which was not taken into account.
--- gcc/ada/sem_util.adb
+++ gcc/ada/sem_util.adb
@@ -6635,7 +6635,9 @@ package body Sem_Util is
   while Present (Decl)
 and then not (Nkind (Decl) in N_Declaration
 or else
-  Nkind (Decl) in N_Later_Decl_Item)
+  Nkind (Decl) in N_Later_Decl_Item
+or else
+  Nkind (Decl) = N_Number_Declaration)
   loop
  Decl := Parent (Decl);
   end loop;



[Ada] Missing legality check on iterator over formal container

2018-05-23 Thread Pierre-Marie de Rodat
This patch adds a check on an iterator over a GNAT-specific formal container,
when the iterator specification includes a subtype indication that must be
compatible with the element type of the container.
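
A sketch of the kind of iterator now checked; the instantiation below is
illustrative only and may not match the exact generic profile of the GNAT
formal containers:

with Ada.Containers.Formal_Vectors;

procedure Iter_Sketch is
   package Int_Vectors is new Ada.Containers.Formal_Vectors
     (Index_Type => Positive, Element_Type => Integer);

   subtype Small is Integer range 1 .. 10;

   V : Int_Vectors.Vector (Capacity => 5);
begin
   for E : Small of V loop
      --  "Small" does not statically match the element type Integer, so
      --  this is now rejected, as it already is for arrays and the
      --  predefined containers.
      null;
   end loop;
end Iter_Sketch;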

Tested on x86_64-pc-linux-gnu, committed on trunk

2018-05-23  Ed Schonberg  

gcc/ada/

* sem_ch5.adb (Analyze_Iterator_Specification): If a subtype indication
is present, verify its legality when the domain of iteration is a
GNAT-specific formal container, as is already done for arrays and
predefined containers.

gcc/testsuite/

* gnat.dg/iter1.adb, gnat.dg/iter1.ads: New testcase.
--- gcc/ada/sem_ch5.adb
+++ gcc/ada/sem_ch5.adb
@@ -2063,11 +2063,25 @@ package body Sem_Ch5 is
   --  indicator, verify that the container type has an Iterate aspect that
   --  implements the reversible iterator interface.
 
+  procedure Check_Subtype_Indication (Comp_Type : Entity_Id);
+  --  If a subtype indication is present, verify that it is consistent
+  --  with the component type of the array or container name.
+
   function Get_Cursor_Type (Typ : Entity_Id) return Entity_Id;
   --  For containers with Iterator and related aspects, the cursor is
   --  obtained by locating an entity with the proper name in the scope
   --  of the type.
 
+  --  Local variables
+
+  Def_Id: constant Node_Id:= Defining_Identifier (N);
+  Iter_Name : constant Node_Id:= Name (N);
+  Loc   : constant Source_Ptr := Sloc (N);
+  Subt  : constant Node_Id:= Subtype_Indication (N);
+
+  Bas   : Entity_Id := Empty;  -- initialize to prevent warning
+  Typ   : Entity_Id;
+
   -
   -- Check_Reverse_Iteration --
   -
@@ -2091,6 +2105,26 @@ package body Sem_Ch5 is
  end if;
   end Check_Reverse_Iteration;
 
+  ---
+  --  Check_Subtype_Indication --
+  ---
+
+  procedure Check_Subtype_Indication (Comp_Type : Entity_Id) is
+  begin
+ if Present (Subt)
+   and then (not Covers (Base_Type ((Bas)), Comp_Type)
+  or else not Subtypes_Statically_Match (Bas, Comp_Type))
+ then
+if Is_Array_Type (Typ) then
+   Error_Msg_N
+ ("subtype indication does not match component type", Subt);
+else
+   Error_Msg_N
+ ("subtype indication does not match element type", Subt);
+end if;
+ end if;
+  end Check_Subtype_Indication;
+
   -
   -- Get_Cursor_Type --
   -
@@ -2127,16 +2161,6 @@ package body Sem_Ch5 is
  return Etype (Ent);
   end Get_Cursor_Type;
 
-  --  Local variables
-
-  Def_Id: constant Node_Id:= Defining_Identifier (N);
-  Iter_Name : constant Node_Id:= Name (N);
-  Loc   : constant Source_Ptr := Sloc (N);
-  Subt  : constant Node_Id:= Subtype_Indication (N);
-
-  Bas : Entity_Id := Empty;  -- initialize to prevent warning
-  Typ : Entity_Id;
-
--   Start of processing for Analyze_Iterator_Specification
 
begin
@@ -2394,15 +2418,7 @@ package body Sem_Ch5 is
   & "component of a mutable object", N);
 end if;
 
-if Present (Subt)
-  and then
-(Base_Type (Bas) /= Base_Type (Component_Type (Typ))
-  or else
-not Subtypes_Statically_Match (Bas, Component_Type (Typ)))
-then
-   Error_Msg_N
- ("subtype indication does not match component type", Subt);
-end if;
+Check_Subtype_Indication (Component_Type (Typ));
 
  --  Here we have a missing Range attribute
 
@@ -2452,6 +2468,8 @@ package body Sem_Ch5 is
   end if;
end;
 
+   Check_Subtype_Indication (Etype (Def_Id));
+
 --  For a predefined container, The type of the loop variable is
 --  the Iterator_Element aspect of the container type.
 
@@ -2477,18 +2495,7 @@ package body Sem_Ch5 is
  Cursor_Type := Get_Cursor_Type (Typ);
  pragma Assert (Present (Cursor_Type));
 
- --  If subtype indication was given, verify that it covers
- --  the element type of the container.
-
- if Present (Subt)
-   and then (not Covers (Bas, Etype (Def_Id))
-  or else not Subtypes_Statically_Match
-(Bas, Etype (Def_Id)))
- then
-Error_Msg_N
-  ("subtype indication does not match element type",
-   Subt);
- end if;
+ 

[Ada] Add a Is_Foreign_Exception predicate to GNAT.Exception_Actions

2018-05-23 Thread Pierre-Marie de Rodat
Useful to check if an occurrence caught by a "when others" choice originates
from a foreign language, e.g. C++.
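
A small usage sketch (the wrapper below is illustrative, not part of the
patch):

with Ada.Exceptions;         use Ada.Exceptions;
with GNAT.Exception_Actions; use GNAT.Exception_Actions;

procedure Guard (Callback : not null access procedure) is
   --  Callback is assumed to possibly run foreign (e.g. C++) code
begin
   Callback.all;
exception
   when E : others =>
      if Is_Foreign_Exception (E) then
         null;  --  handle or log the foreign occurrence here
      else
         Reraise_Occurrence (E);
      end if;
end Guard;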

Tested on x86_64-pc-linux-gnu, committed on trunk

2018-05-23  Olivier Hainque  

gcc/ada/

* libgnat/g-excact.ads (Is_Foreign_Exception): New predicate.
* libgnat/g-excact.adb: Implement.
--- gcc/ada/libgnat/g-excact.adb
+++ gcc/ada/libgnat/g-excact.adb
@@ -91,6 +91,19 @@ package body GNAT.Exception_Actions is
 
procedure Core_Dump (Occurrence : Exception_Occurrence) is separate;
 
+   --
+   -- Is_Foreign_Exception --
+   --
+
+   function Is_Foreign_Exception (E : Exception_Occurrence) return Boolean is
+  Foreign_Exception : aliased Exception_Data;
+  pragma Import
+(Ada, Foreign_Exception, "system__exceptions__foreign_exception");
+   begin
+  return (To_Data (Exception_Identity (E))
+= Foreign_Exception'Unchecked_Access);
+   end Is_Foreign_Exception;
+

-- Name_To_Id --


--- gcc/ada/libgnat/g-excact.ads
+++ gcc/ada/libgnat/g-excact.ads
@@ -29,9 +29,11 @@
 --  --
 --
 
---  This package provides support for callbacks on exceptions
+--  This package provides support for callbacks on exceptions as well as
+--  exception-related utility subprograms of possible interest together with
+--  exception actions or more generally.
 
---  These callbacks are called immediately when either a specific exception,
+--  The callbacks are called immediately when either a specific exception,
 --  or any exception, is raised, before any other actions taken by raise, in
 --  particular before any unwinding of the stack occurs.
 
@@ -85,6 +87,10 @@ package GNAT.Exception_Actions is
--  Note: All non-predefined exceptions will return Null_Id for programs
--  compiled with pragma Restriction (No_Exception_Registration)
 
+   function Is_Foreign_Exception (E : Exception_Occurrence) return Boolean;
+   --  Tell whether the exception occurrence E represents a foreign exception,
+   --  such as one raised in C++ and caught by a when others choice in Ada.
+
function Registered_Exceptions_Count return Natural;
--  Return the number of exceptions that have been registered so far.
--  Exceptions declared locally will not appear in this list until their



[Ada] Clarify meaning of local pragma Warnings Off without On

2018-05-23 Thread Pierre-Marie de Rodat
A local use of pragma Warnings Off to suppress specific messages, when
not followed by a matching pragma Warnings On, extends until the end of
the file.
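
For instance (the exact warning message text is illustrative):

procedure Warnings_Demo is

   procedure P is
      pragma Warnings (Off, "variable ""Tmp"" is not referenced");
      Tmp : Integer;  --  message suppressed here ...
   begin
      null;
   end P;

   procedure Q is
      Tmp : Integer;  --  ... and also here: the pragma is purely textual,
                      --  so without a matching On it remains in effect
                      --  until the end of the file, not just to the end
                      --  of P.
   begin
      null;
   end Q;

begin
   P;
   Q;
end Warnings_Demo;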

Tested on x86_64-pc-linux-gnu, committed on trunk

2018-05-23  Yannick Moy  

gcc/ada/

* doc/gnat_rm/implementation_defined_pragmas.rst: Clarify meaning of
local pragma Warnings Off without On.
* gnat_rm.texi: Regenerate.
--- gcc/ada/doc/gnat_rm/implementation_defined_pragmas.rst
+++ gcc/ada/doc/gnat_rm/implementation_defined_pragmas.rst
@@ -7456,6 +7456,10 @@ In this usage, the pattern string must match in the Off and On
 pragmas, and (if *-gnatw.w* is given) at least one matching
 warning must be suppressed.
 
+Note: if the ON form is not found, then the effect of the OFF form extends
+until the end of the file (pragma Warnings is purely textual, so its effect
+does not stop at the end of the enclosing scope).
+
 Note: to write a string that will match any warning, use the string
 ``"***"``. It will not work to use a single asterisk or two
 asterisks since this looks like an operator name. This form with three

--- gcc/ada/gnat_rm.texi
+++ gcc/ada/gnat_rm.texi
@@ -8893,6 +8893,10 @@ In this usage, the pattern string must match in the Off and On
 pragmas, and (if @emph{-gnatw.w} is given) at least one matching
 warning must be suppressed.
 
+Note: if the ON form is not found, then the effect of the OFF form extends
+until the end of the file (pragma Warnings is purely textual, so its effect
+does not stop at the end of the enclosing scope).
+
 Note: to write a string that will match any warning, use the string
 @code{"***"}. It will not work to use a single asterisk or two
 asterisks since this looks like an operator name. This form with three



[Ada] Implementation of AI12-0131: legality of class-wide precondition

2018-05-23 Thread Pierre-Marie de Rodat
This patch refines the legality check on a class-wide precondition on a type
extension when the ancestor does not have a class-wide precondition. Previously the
compiler accepted such a precondition when the ancestor had a class-wide
postcondition.

Compiling pck.ads must yield:

  pck.ads:7:04: illegal class-wide precondition on overriding operation


package Pck is
   type Parent is tagged null record;
   procedure Init (P : Parent) with Post'Class => True;

   type Child is new Parent with null record;
   overriding procedure Init (C : Child) with
   Pre'Class => True;
end Pck;

Tested on x86_64-pc-linux-gnu, committed on trunk

2018-05-23  Ed Schonberg  

gcc/ada/

* sem_prag.adb (Inherit_Class_Wide_Pre): Refine legality check on
class-wide precondition on a type extension when the ancestor does not
have a class-wide precondition.  Previously the compiler accepted such a
precondition when the ancestor had a class-wide postcondition.
--- gcc/ada/sem_prag.adb
+++ gcc/ada/sem_prag.adb
@@ -,7 +,9 @@ package body Sem_Prag is
if Present (Cont) then
   Prag := Pre_Post_Conditions (Cont);
   while Present (Prag) loop
- if Class_Present (Prag) then
+ if Pragma_Name (Prag) = Name_Precondition
+   and then Class_Present (Prag)
+ then
 return True;
  end if;
 



[Ada] gnatbind: do not list No_Implementation_Restrictions

2018-05-23 Thread Pierre-Marie de Rodat
When the gnatbind -r switch is used, do not list
No_Implementation_Restrictions: after the listed restrictions are applied,
the program will likely use implementation-defined restrictions, so
No_Implementation_Restrictions itself would then cause an error.
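
With the header now commented, the -r output can be dropped into a gnat.adc
as is. The listing has roughly the following shape (the exact set of
restrictions and their form depend on the partition):

--  The following additional restrictions may be applied to this partition:
pragma Restrictions (No_Abort_Statements);
pragma Restrictions (No_Delay);
pragma Restrictions (Max_Tasks => 0);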

Tested on x86_64-pc-linux-gnu, committed on trunk

2018-05-23  Bob Duff  

gcc/ada/

* gnatbind.adb (List_Applicable_Restrictions): Add
No_Implementation_Restrictions to the list of restrictions not to list.
Remove double negative "not No_Restriction_List".  Comment the
commentary that is output, so it won't cause errors if used directly in
a gnat.adc.
--- gcc/ada/gnatbind.adb
+++ gcc/ada/gnatbind.adb
@@ -167,55 +167,61 @@ procedure Gnatbind is
   --  -r switch is used. Not all restrictions are output for the reasons
   --  given below in the list, and this array is used to test whether
   --  the corresponding pragma should be listed. True means that it
-  --  should not be listed.
+  --  should be listed.
 
-  No_Restriction_List : constant array (All_Restrictions) of Boolean :=
-(No_Standard_Allocators_After_Elaboration => True,
+  Restrictions_To_List : constant array (All_Restrictions) of Boolean :=
+(No_Standard_Allocators_After_Elaboration => False,
  --  This involves run-time conditions not checkable at compile time
 
- No_Anonymous_Allocators => True,
+ No_Anonymous_Allocators => False,
  --  Premature, since we have not implemented this yet
 
- No_Exception_Propagation=> True,
+ No_Exception_Propagation=> False,
  --  Modifies code resulting in different exception semantics
 
- No_Exceptions   => True,
+ No_Exceptions   => False,
  --  Has unexpected Suppress (All_Checks) effect
 
- No_Implicit_Conditionals=> True,
+ No_Implicit_Conditionals=> False,
  --  This could modify and pessimize generated code
 
- No_Implicit_Dynamic_Code=> True,
+ No_Implicit_Dynamic_Code=> False,
  --  This could modify and pessimize generated code
 
- No_Implicit_Loops   => True,
+ No_Implicit_Loops   => False,
  --  This could modify and pessimize generated code
 
- No_Recursion=> True,
+ No_Recursion=> False,
  --  Not checkable at compile time
 
- No_Reentrancy   => True,
+ No_Reentrancy   => False,
  --  Not checkable at compile time
 
- Max_Entry_Queue_Length   => True,
+ Max_Entry_Queue_Length  => False,
  --  Not checkable at compile time
 
- Max_Storage_At_Blocking => True,
+ Max_Storage_At_Blocking => False,
  --  Not checkable at compile time
 
+ No_Implementation_Restrictions  => False,
+ --  Listing this one would cause a chicken&egg problem; the program
+ --  doesn't use implementation-defined restrictions, but after
+ --  applying the listed restrictions, it probably WILL use them,
+ --  so No_Implementation_Restrictions will cause an error.
+
  --  The following three should not be partition-wide, so the
  --  following tests are junk to be removed eventually ???
 
- No_Specification_Of_Aspect  => True,
+ No_Specification_Of_Aspect  => False,
  --  Requires a parameter value, not a count
 
- No_Use_Of_Attribute => True,
+ No_Use_Of_Attribute => False,
  --  Requires a parameter value, not a count
 
- No_Use_Of_Pragma=> True,
+ No_Use_Of_Pragma=> False,
  --  Requires a parameter value, not a count
 
- others  => False);
+ others  => True);
 
   Additional_Restrictions_Listed : Boolean := False;
   --  Set True if we have listed header for restrictions
@@ -279,14 +285,14 @@ procedure Gnatbind is
   --  Loop through restrictions
 
   for R in All_Restrictions loop
- if not No_Restriction_List (R)
+ if Restrictions_To_List (R)
and then Restriction_Could_Be_Set (R)
  then
 if not Additional_Restrictions_Listed then
Write_Eol;
Write_Line
- ("The following additional restrictions may be applied to "
-  & "this partition:");
+ ("--  The following additional restrictions may be applied "
+  & "to this partition:");
Additional_Restrictions_Listed := True;
 end if;
 



[Ada] Vectors: spurious error in -gnatwE mode

2018-05-23 Thread Pierre-Marie de Rodat
This patch fixes a bug in which if Ada.Containers.Vectors is instantiated with
an Index_Type such that Index_Type'Base'Last is less than Count_Type'Last, and
the -gnatwE switch is used, the compiler gives spurious error messages.

The following test should compile quietly with -gnatwE:

gnatmake short_vectors.ads -gnatwa -gnatwE -gnatf

with Ada.Containers.Vectors;
package Short_Vectors is

   type Index_Type is range 1 .. 256;

   package Map_Pkg is new Ada.Containers.Vectors
 (Index_Type => Index_Type,
  Element_Type => Integer);

end Short_Vectors;

Tested on x86_64-pc-linux-gnu, committed on trunk

2018-05-23  Bob Duff  

gcc/ada/

* libgnat/a-convec.adb: (Insert, Insert_Space): Suppress warnings. The
code in question is not reachable in the case where Count_Type'Last is
out of range.
--- gcc/ada/libgnat/a-convec.adb
+++ gcc/ada/libgnat/a-convec.adb
@@ -999,9 +999,12 @@ package body Ada.Containers.Vectors is
 
 --  We know that No_Index (the same as Index_Type'First - 1) is
 --  less than 0, so it is safe to compute the following sum without
---  fear of overflow.
+--  fear of overflow. We need to suppress warnings, because
+--  otherwise we get an error in -gnatwE mode.
 
+pragma Warnings (Off);
 Index := No_Index + Index_Type'Base (Count_Type'Last);
+pragma Warnings (On);
 
 if Index <= Index_Type'Last then
 
@@ -1657,9 +1660,12 @@ package body Ada.Containers.Vectors is
 
 --  We know that No_Index (the same as Index_Type'First - 1) is
 --  less than 0, so it is safe to compute the following sum without
---  fear of overflow.
+--  fear of overflow. We need to suppress warnings, because
+--  otherwise we get an error in -gnatwE mode.
 
+pragma Warnings (Off);
 Index := No_Index + Index_Type'Base (Count_Type'Last);
+pragma Warnings (On);
 
 if Index <= Index_Type'Last then
 



[Ada] Crash processing Valid_Scalars whose evaluation is always true

2018-05-23 Thread Pierre-Marie de Rodat
The compiler crashes when generating code for occurrences of attribute
Valid_Scalars whose evaluation is always True. After this patch the following
test compiles fine.

Tested on x86_64-pc-linux-gnu, committed on trunk

2018-05-23  Javier Miranda  

gcc/ada/

* sem_attr.adb (Valid_Scalars): Do not invoke Error_Attr_P to report
the warning on occurrences of this attribute whose evaluation is always
true (since that subprogram aborts processing the attribute). In
addition, replace the node by its boolean result 'True' (required
because the backend has no knowledge of this attribute).

gcc/testsuite/

* gnat.dg/valid_scalars1.adb: New testcase.
--- gcc/ada/sem_attr.adb
+++ gcc/ada/sem_attr.adb
@@ -6929,8 +6929,10 @@ package body Sem_Attr is
 
 else
if not Scalar_Part_Present (P_Type) then
-  Error_Attr_P
-("??attribute % always True, no scalars to check");
+  Error_Msg_Name_1 := Aname;
+  Error_Msg_F
+("??attribute % always True, no scalars to check", P);
+  Set_Boolean_Result (N, True);
end if;
 
--  Attribute 'Valid_Scalars is illegal on unchecked union types

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/valid_scalars1.adb
@@ -0,0 +1,11 @@
+--  { dg-do compile }
+--  { dg-options "-gnata -gnatws" }
+
+procedure Valid_Scalars1 is
+   type Ptr is access Integer;
+   V1 : Ptr;
+
+   Check : Boolean := V1'Valid_Scalars;
+begin
+   pragma Assert (Check);
+end;



[Ada] Spurious Storage_Error on imported array

2018-05-23 Thread Pierre-Marie de Rodat
This patch moves the check which verifies that a large modular array can be
created from expansion to freezing, in order to take interfacing pragmas into
account. The check is no longer performed on imported objects because no
object is created in that case.
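
A sketch of the situation being fixed (illustrative only, not the committed
import2.adb; the external name is arbitrary):

with Interfaces; use Interfaces;

package Import_Sketch is
   type Index is mod 2 ** 64;
   type Byte_Array is array (Index range <>) of Unsigned_8;

   Buffer : Byte_Array (0 .. 2 ** 34);
   pragma Import (C, Buffer, "external_buffer");
   --  The object is imported, so nothing is allocated here and the
   --  Storage_Error guard for large modular-indexed arrays must not be
   --  emitted. Checking at freezing ensures the Import pragma has been
   --  seen when the decision is made.
end Import_Sketch;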

Tested on x86_64-pc-linux-gnu, committed on trunk

2018-05-23  Hristian Kirtchev  

gcc/ada/

* exp_ch3.adb (Check_Large_Modular_Array): Moved to Freeze.
(Expand_N_Object_Declaration): Do not check for a large modular array
here.
* freeze.adb (Check_Large_Modular_Array): Moved from Exp_Ch3.
(Freeze_Object_Declaration): Code cleanup. Check for a large modular
array.
* sem_ch3.adb: Minor reformatting.

gcc/testsuite/

* gnat.dg/import2.adb: New testcase.
--- gcc/ada/exp_ch3.adb
+++ gcc/ada/exp_ch3.adb
@@ -5606,13 +5606,6 @@ package body Exp_Ch3 is
   --  value, it may be possible to build an equivalent aggregate instead,
   --  and prevent an actual call to the initialization procedure.
 
-  procedure Check_Large_Modular_Array;
-  --  Check that the size of the array can be computed without overflow,
-  --  and generate a Storage_Error otherwise. This is only relevant for
-  --  array types whose index in a (mod 2**64) type, where wrap-around
-  --  arithmetic might yield a meaningless value for the length of the
-  --  array, or its corresponding attribute.
-
   procedure Count_Default_Sized_Task_Stacks
 (Typ : Entity_Id;
  Pri_Stacks  : out Int;
@@ -5759,61 +5752,6 @@ package body Exp_Ch3 is
  end if;
   end Build_Equivalent_Aggregate;
 
-  ---
-  -- Check_Large_Modular_Array --
-  ---
-
-  procedure Check_Large_Modular_Array is
- Index_Typ : Entity_Id;
-
-  begin
- if Is_Array_Type (Typ)
-   and then Is_Modular_Integer_Type (Etype (First_Index (Typ)))
- then
---  To prevent arithmetic overflow with large values, we raise
---  Storage_Error under the following guard:
-
---(Arr'Last / 2 - Arr'First / 2) > (2 ** 30)
-
---  This takes care of the boundary case, but it is preferable to
---  use a smaller limit, because even on 64-bit architectures an
---  array of more than 2 ** 30 bytes is likely to raise
---  Storage_Error.
-
-Index_Typ := Etype (First_Index (Typ));
-
-if RM_Size (Index_Typ) = RM_Size (Standard_Long_Long_Integer) then
-   Insert_Action (N,
- Make_Raise_Storage_Error (Loc,
-   Condition =>
- Make_Op_Ge (Loc,
-   Left_Opnd  =>
- Make_Op_Subtract (Loc,
-   Left_Opnd  =>
- Make_Op_Divide (Loc,
-   Left_Opnd  =>
- Make_Attribute_Reference (Loc,
-   Prefix =>
- New_Occurrence_Of (Typ, Loc),
-   Attribute_Name => Name_Last),
-   Right_Opnd =>
- Make_Integer_Literal (Loc, Uint_2)),
-   Right_Opnd =>
- Make_Op_Divide (Loc,
-   Left_Opnd =>
- Make_Attribute_Reference (Loc,
-   Prefix =>
- New_Occurrence_Of (Typ, Loc),
-   Attribute_Name => Name_First),
-   Right_Opnd =>
- Make_Integer_Literal (Loc, Uint_2))),
-   Right_Opnd =>
- Make_Integer_Literal (Loc, (Uint_2 ** 30))),
-   Reason=> SE_Object_Too_Large));
-end if;
- end if;
-  end Check_Large_Modular_Array;
-
   -
   -- Count_Default_Sized_Task_Stacks --
   -
@@ -6434,8 +6372,6 @@ package body Exp_Ch3 is
  Build_Master_Entity (Def_Id);
   end if;
 
-  Check_Large_Modular_Array;
-
   --  If No_Implicit_Heap_Allocations or No_Implicit_Task_Allocations
   --  restrictions are active then default-sized secondary stacks are
   --  generated by the binder and allocated by SS_Init. To provide the

--- gcc/ada/freeze.adb
+++ gcc/ada/freeze.adb
@@ -3187,6 +3187,100 @@ package body Freeze is
   ---
 
   procedure Freeze_Object_Declaration (E : Entity_Id) is
+
+ procedure Check_Large_Modular_Array (Typ : Entity_Id);
+ --  Check that the size of array type Typ can be computed without
+ --  overflow, and generates a Storage_Error otherwise. This is only
+

[Ada] Spurious error on instantiation with type with unknown discriminants

2018-05-23 Thread Pierre-Marie de Rodat
This patch fixes a spurious error when instantiating an indefinite container
with a private type with unknown discriminants, when its full view is an
unconstrained array type. It also cleans up the inheritance of dynamic
predicates by anonymous subtypes of array types.
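
A sketch of the kind of instantiation involved (illustrative only, not the
committed discr51.adb):

package Discr_Sketch is
   type Hidden (<>) is private;
private
   type Hidden is array (Positive range <>) of Integer;  --  unconstrained
end Discr_Sketch;

with Ada.Containers.Indefinite_Vectors;
with Discr_Sketch;

package Use_Sketch is
   --  Instantiating with the private view, whose full view is an
   --  unconstrained array, used to trigger a spurious error.
   package Hidden_Vectors is new Ada.Containers.Indefinite_Vectors
     (Index_Type   => Positive,
      Element_Type => Discr_Sketch.Hidden,
      "="          => Discr_Sketch."=");
end Use_Sketch;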

Tested on x86_64-pc-linux-gnu, committed on trunk

2018-05-23  Ed Schonberg  

gcc/ada/

* einfo.ads: New attribute on types: Predicated_Parent, to simplify the
retrieval of the applicable predicate function to an itype created for
a constrained array component.
* einfo.adb: Subprograms for Predicated_Parent.
(Predicate_Function): Use new attribute.
* exp_util.adb (Make_Predicate_Call): If the predicate function is not
available for a subtype, retrieve it from the base type, which may have
been frozen after the subtype declaration and not captured by the
subtype declaration.
* sem_aggr.adb (Resolve_Array_Aggregate): An Others association is
legal within a generated initialization procedure, as may happen with a
predicate check on a component, when the predicate function applies to
the base type of the component.
* sem_ch3.adb (Analyze_Subtype_Declaration): Clean up inheritance of
predicates for subtype declarations and for subtype indications in
other contexts.
(Process_Subtype): Likewise. Handle properly the case of a private type
with unknown discriminants whose full view is an unconstrained array.
Use Predicated_Parent to indicate source of predicate function on an
itype whose parent is itself an itype.
(Complete_Private_Subtype): If the private view has unknown
discriminants and the full view is an unconstrained array, set base
type of completion to the full view of parent.
(Inherit_Predicate_Flags): Prevent double assignment of predicate
function and flags.
(Build_Subtype): For a constrained array component, propagate predicate
information from original component type declaration.

gcc/testsuite/

* gnat.dg/discr51.adb: New testcase.
--- gcc/ada/einfo.adb
+++ gcc/ada/einfo.adb
@@ -276,6 +276,7 @@ package body Einfo is
 
--Nested_ScenariosElist36
--Validated_ObjectNode36
+   --Predicated_Parent   Node36
 
--Class_Wide_CloneNode38
 
@@ -3082,6 +3083,12 @@ package body Einfo is
   return Node14 (Id);
end Postconditions_Proc;
 
+   function Predicated_Parent (Id : E) return E is
+   begin
+  pragma Assert (Is_Type (Id));
+  return Node36 (Id);
+   end Predicated_Parent;
+
function Predicates_Ignored (Id : E) return B is
begin
   pragma Assert (Is_Type (Id));
@@ -6311,6 +6318,12 @@ package body Einfo is
   Set_Node14 (Id, V);
end Set_Postconditions_Proc;
 
+   procedure Set_Predicated_Parent (Id : E; V : E) is
+   begin
+  pragma Assert (Is_Type (Id));
+  Set_Node36 (Id, V);
+   end Set_Predicated_Parent;
+
procedure Set_Predicates_Ignored (Id : E; V : B) is
begin
   pragma Assert (Is_Type (Id));
@@ -8829,6 +8842,9 @@ package body Einfo is
   then
  Typ := Full_View (Id);
 
+  elsif Is_Itype (Id) and then Present (Predicated_Parent (Id)) then
+ Typ := Predicated_Parent (Id);
+
   else
  Typ := Id;
   end if;
@@ -11200,6 +11216,11 @@ package body Einfo is
  when E_Variable =>
 Write_Str ("Validated_Object");
 
+ when E_Array_Subtype
+| E_Record_Subtype
+ =>
+Write_Str ("predicated parent");
+
  when others =>
 Write_Str ("Field36??");
   end case;

--- gcc/ada/einfo.ads
+++ gcc/ada/einfo.ads
@@ -3932,6 +3932,14 @@ package Einfo is
 --   is the special version created for membership tests, where if one of
 --   these raise expressions is executed, the result is to return False.
 
+--Predicated_Parent (Node36)
+--   Defined on itypes created by subtype indications, when the parent
+--   subtype has predicates. The itype shares the Predicate_Function
+--   of the predicated parent, but this function may not have been built
+--   at the point the Itype is constructed, so this attribute allows its
+--   retrieval at the point a predicate check needs to be generated.
+--   The utility Predicate_Function takes this link into account.
+
 --Predicates_Ignored (Flag288)
 --   Defined on all types. Indicates whether the subtype declaration is in
 --   a context where Assertion_Policy is Ignore, in which case no checks
@@ -7427,6 +7435,7 @@ package Einfo is
function Partial_View_Has_Unknown_Discr  (Id : E) return B;
function Pending_Access_Types(Id : E) return L;
function Postconditions_Proc (Id : E) return E;
+   function Predicated_Parent   (Id : E) 

[Ada] Fix computation of handle/pid lists in win32_wait

2018-05-23 Thread Pierre-Marie de Rodat
Because of missing parentheses, the size of the handle and pid lists passed
to WaitForMultipleObjects() was not computed properly. This resulted in
memory corruption.

Tested on x86_64-pc-linux-gnu, committed on trunk

2018-05-23  Pascal Obry  

gcc/ada/

* adaint.c (win32_wait): Add missing parentheses.
--- gcc/ada/adaint.c
+++ gcc/ada/adaint.c
@@ -2591,10 +2591,10 @@ win32_wait (int *status)
 #else
   /* Note that index 0 contains the event handle that is signaled when the
  process list has changed */
-  hl = (HANDLE *) xmalloc (sizeof (HANDLE) * hl_len + 1);
+  hl = (HANDLE *) xmalloc (sizeof (HANDLE) * (hl_len + 1));
   hl[0] = ProcListEvt;
   memmove (&hl[1], HANDLES_LIST, sizeof (HANDLE) * hl_len);
-  pidl = (int *) xmalloc (sizeof (int) * hl_len + 1);
+  pidl = (int *) xmalloc (sizeof (int) * (hl_len + 1));
   memmove (&pidl[1], PID_LIST, sizeof (int) * hl_len);
   hl_len++;
 #endif



Re: [RFC] [aarch64] Add HiSilicon tsv110 CPU support

2018-05-23 Thread Ramana Radhakrishnan



On 23/05/2018 03:50, Zhangshaokun wrote:

Hi Ramana,

On 2018/5/22 18:28, Ramana Radhakrishnan wrote:

On Tue, May 22, 2018 at 9:40 AM, Shaokun Zhang
 wrote:

tsv110 is designed by HiSilicon and supports v8_4A. It also optimizes the
L1 Icache, which can access the L1 Dcache.
Therefore, DC CVAU is not necessary in __aarch64_sync_cache_range for
tsv110; is there any good way to skip the DC CVAU operation for tsv110?


A solution would be to use an ifunc but on a cpu variant.



Regarding ifunc, can you give a further explanation?
As for a CPU variant: for HiSilicon tsv110 we have two versions, with CPU
variants 0 and 1. Both are expected to skip the DC CVAU operation when
syncing the icache and dcache.



Since DC CVAU is not necessary for syncing the icache and dcache, it is
beneficial for performance to skip the redundant DC CVAU and do only IC IVAU.
For the JVM, __clear_cache is called many times.



Thanks for the extra detail as to where you think you want to use this.
Have you investigated whether the jvm can actually elide such a call,
rather than trying to fix this in the toolchain?


If you really need to think about solutions in the toolchain -

The simplest first step would be to implement the changes hinted at by 
the comment in aarch64.h.


If you read the comment above CLEAR_INSN_CACHE in aarch64.h you will
see that:


/* This definition should be relocated to aarch64-elf-raw.h.  This macro
   should be undefined in aarch64-linux.h and a clear_cache pattern
   implmented to emit either the call to __aarch64_sync_cache_range()
   directly or preferably the appropriate sycall or cache clear
   instructions inline.  */
#define CLEAR_INSN_CACHE(beg, end)  \
  extern void  __aarch64_sync_cache_range (void *, void *); \
  __aarch64_sync_cache_range (beg, end)

Thus I would expect that implementing the clear_cache pattern, and deciding
whether or not to emit the call to the __aarch64_sync_cache_range function
depending on whether tsv110 was chosen on the command line, would give you
an idea of what the performance gain actually is when compiling the jvm with
-mcpu=tsv110 vs -march=armv8-a. You probably also want to clean up the
trampoline_init code while you are here.


I do think that's something that should be easy enough to do and the 
subject of a patch series in its own right. If your users can rebuild 
the world for tsv110 then this is sufficient.


If you want to have a single jvm binary without any run time checks, then
you need to investigate the use of ifuncs, which are a mechanism in the GNU
toolchain for this kind of thing. We tend not to use ifuncs on a per-CPU
basis unless there is a very good reason and the performance improvement is
worth it (but rather on a per-architecture basis), and you will need to make
the case for it, including what sort of performance benefits it gives. Some
introduction to this feature can be found here:
https://sourceware.org/glibc/wiki/GNU_IFUNC


regards
Ramana



Hi ARM folks,
are you happy to share your ideas about this?


Is this really that important for performance, and on what workloads?



Since DC CVAU is not necessary for syncing the icache and dcache, it is
beneficial for performance to skip the redundant DC CVAU and do only IC IVAU.
For the JVM, __clear_cache is called many times.

Thanks,
Shaokun


regards
Ramana



Any thoughts and ideas are welcome.

Shaokun Zhang (1):
   [aarch64] Add HiSilicon tsv110 CPU support.

  gcc/ChangeLog|   9 +++
  gcc/config/aarch64/aarch64-cores.def |   5 ++
  gcc/config/aarch64/aarch64-cost-tables.h | 103 +++
  gcc/config/aarch64/aarch64-tune.md   |   2 +-
  gcc/config/aarch64/aarch64.c |  79 
  gcc/doc/invoke.texi  |   2 +-
  6 files changed, 198 insertions(+), 2 deletions(-)

--
2.7.4








Re: [PATCH PR85804]Fix wrong code by correcting bump step computation in vector(1) load of single-element group access

2018-05-23 Thread Bin.Cheng
On Wed, May 23, 2018 at 11:19 AM, Richard Biener
 wrote:
> On Tue, May 22, 2018 at 2:11 PM Richard Sandiford <
> richard.sandif...@linaro.org> wrote:
>
>> Richard Biener  writes:
>> > On Mon, May 21, 2018 at 3:14 PM Bin Cheng  wrote:
>> >
>> >> Hi,
>> >> As reported in PR85804, bump step is wrongly computed for vector(1)
> load
>> > of
>> >> single-element group access.  This patch fixes the issue by correcting
>> > bump
>> >> step computation for the specific VMAT_CONTIGUOUS case.
>> >
>> >> Bootstrap and test on x86_64 and AArch64 ongoing, is it OK?
>> >
>> > To me it looks like the classification as VMAT_CONTIGUOUS is bogus.
>> > We'd fall into the grouped_load case otherwise which should handle
>> > the situation correctly?
>> >
>> > Richard?
>
>> Yeah, I agree.  I mentioned to Bin privately that that was probably
>> a misstep and that we should instead continue to treat them as
>> VMAT_CONTIGUOUS_PERMUTE, but simply select the required vector
>> from the array of loaded vectors, instead of doing an actual permute.
>
> I'd classify them as VMAT_ELEMENTWISE instead.  CONTIGUOUS
> should be only for no-gap vectors.  How do we classify single-element
> interleaving?  That would be another classification choice.
Yes, I suggested this offline too, but Richard may have more to say about this.
One thing worth noting is classifying it as VMAT_ELEMENTWISE would
disable vectorization in this case because of cost model issue as
commented at the end of get_load_store_type.

Thanks,
bin
>
>> (Note that VMAT_CONTIGUOUS is OK for stores, since we don't allow
>> gaps there.  But it might be easiest to handle both loads and stores
>> in the same way.)
>
>> Although it still seems weird to "vectorise" stuff to one element.
>> Why not leave the original scalar code in place, and put the onus on
>> whatever wants to produce or consume a V1 to do the appropriate
>> conversion?
>
> Yeah, V1 is somewhat awkward to deal with but I think the way we're
> doing it right now is OK.  Note if we leave the original scalar code in
> place we still have to deal with larger unrolling factors thus we'd have
> to duplicate the scalar code and have different IVs anyways.
>
> Richard.
>
>> Thanks,
>> Richard
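(As a purely illustrative aside -- this is not the PR85804 testcase -- a
"single-element group access" is a grouped access in which only one element
of each group is read, i.e. a strided access with a gap, along the lines of
the hypothetical loop below.)

  void
  f (long *restrict out, const long *restrict in, int n)
  {
    for (int i = 0; i < n; i++)
      out[i] = in[2 * i];   /* group size 2, only element 0 accessed */
  }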


Re: [PATCH PR85804]Fix wrong code by correcting bump step computation in vector(1) load of single-element group access

2018-05-23 Thread Richard Sandiford
"Bin.Cheng"  writes:
> On Wed, May 23, 2018 at 11:19 AM, Richard Biener
>  wrote:
>> On Tue, May 22, 2018 at 2:11 PM Richard Sandiford <
>> richard.sandif...@linaro.org> wrote:
>>
>>> Richard Biener  writes:
>>> > On Mon, May 21, 2018 at 3:14 PM Bin Cheng  wrote:
>>> >
>>> >> Hi,
>>> >> As reported in PR85804, bump step is wrongly computed for vector(1)
>> load
>>> > of
>>> >> single-element group access.  This patch fixes the issue by correcting
>>> > bump
>>> >> step computation for the specific VMAT_CONTIGUOUS case.
>>> >
>>> >> Bootstrap and test on x86_64 and AArch64 ongoing, is it OK?
>>> >
>>> > To me it looks like the classification as VMAT_CONTIGUOUS is bogus.
>>> > We'd fall into the grouped_load case otherwise which should handle
>>> > the situation correctly?
>>> >
>>> > Richard?
>>
>>> Yeah, I agree.  I mentioned to Bin privately that that was probably
>>> a misstep and that we should instead continue to treat them as
>>> VMAT_CONTIGUOUS_PERMUTE, but simply select the required vector
>>> from the array of loaded vectors, instead of doing an actual permute.
>>
>> I'd classify them as VMAT_ELEMENTWISE instead.  CONTIGUOUS
>> should be only for no-gap vectors.  How do we classify single-element
>> interleaving?  That would be another classification choice.
> Yes, I suggested this offline too, but Richard may have more to say about 
> this.
> One thing worth noting is classifying it as VMAT_ELEMENTWISE would
> disable vectorization in this case because of cost model issue as
> commented at the end of get_load_store_type.

Yeah, that's the problem.  Using VMAT_ELEMENTWISE also means that we
use a scalar load and then insert it into a vector, whereas all we want
(and all we currently generate) is a single vector load.

So if we classify them as VMAT_ELEMENTWISE, they'll be a special case
for both costing (vector load rather than scalar load and vector
construct) and code-generation.

Thanks,
Richard


Re: [PATCH][RFC] Add dynamic edge/bb flag allocation

2018-05-23 Thread Richard Biener
On Tue, 22 May 2018, Richard Biener wrote:

> On May 22, 2018 6:53:57 PM GMT+02:00, Joseph Myers  
> wrote:
> >On Tue, 22 May 2018, Richard Biener wrote:
> >
> >> +  if (*sptr & (1 << (CHAR_BIT * sizeof (T) - 1)))
> >> +  gcc_unreachable ();
> >> +  m_flag = 1 << ((CHAR_BIT * sizeof (T)) - clz_hwi (*sptr));
> >
> >I don't see how the use of clz_hwi works with a type T that may be 
> >narrower than HOST_WIDE_INT.  Surely this logic requires a count of 
> >leading zeros in something of type T, not a possibly larger number of 
> >leading zeros after conversion to HOST_WIDE_INT?  Also, if T is wider
> >than 
> >int, shifting plain 1 won't work here.
> 
> I messed up the conversion to a template. The bitnum should be subtracted 
> from HOST_BITS_PER_WIDE_INT and yes, 1 in unsigned hwi should be shifted. 

So this is the final patch; I've changed the flag computation to use ffs,
which is better suited to finding a "random" unset bit.  For types
smaller than HOST_WIDE_INT we'll find bits outside of the range, but
the truncated mask will be zero.  I guess the
ffsl ((unsigned long)~intvar) cannot easily be pattern-matched to
ffs, so a way to do that unsigned conversion would be nice (or
mass-change all signed flag ints to unsigned...).
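(A standalone illustration of the allocation scheme -- not GCC code, all
names below are made up: the "pool" is just an integer whose set bits are
the flags already in use; the constructor grabs the lowest unset bit and
the destructor hands it back.)

  #include <cassert>

  struct flag_pool { unsigned allocated; };

  class auto_flag
  {
  public:
    explicit auto_flag (flag_pool *pool) : m_pool (pool)
    {
      unsigned free_bits = ~pool->allocated;
      assert (free_bits != 0);           /* all 32 flags taken */
      m_flag = free_bits & -free_bits;   /* lowest unset bit, a la ffs */
      pool->allocated |= m_flag;
    }
    ~auto_flag () { m_pool->allocated &= ~m_flag; }
    operator unsigned () const { return m_flag; }
  private:
    flag_pool *m_pool;
    unsigned m_flag;
  };

  int
  main ()
  {
    flag_pool bb_flags = { 0x3ff };      /* pretend ten static flags exist */
    {
      auto_flag visited (&bb_flags);     /* takes bit 10 */
      auto_flag on_stack (&bb_flags);    /* takes bit 11 */
      assert (visited == 1u << 10 && on_stack == 1u << 11);
    }                                    /* both bits released here */
    assert (bb_flags.allocated == 0x3ff);
    return 0;
  }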

I took the opportunity to change dfs_enumerate_from to use
an auto_bb_flag rather than a static sbitmap.  That should be
profitable given that we currently have the cache line with the BB flags
loaded anyway because we access the BB index, so using a BB flag
will avoid pulling in the sbitmap cache lines.  The trade-off is
possibly more memory stores in cases where the sbitmap modifications
would have been adjacent.  OTOH applying a mask should be cheaper than
the variable shifts involved in sbitmap bit access.  It's definitely
less code ;)

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

OK for trunk?

Thanks,
Richard.

>From 0091e95a133454da62973ad570c97e7b61bfd0ec Mon Sep 17 00:00:00 2001
From: Richard Guenther 
Date: Fri, 18 May 2018 13:01:36 +0200
Subject: [PATCH] add dynamic cfg flag allocation

* cfg.h (struct control_flow_graph): Add edge_flags_allocated and
bb_flags_allocated members.
(auto_flag): New RAII class for allocating flags.
(auto_edge_flag): New RAII class for allocating edge flags.
(auto_bb_flag): New RAII class for allocating bb flags.
* cfgloop.c (verify_loop_structure): Allocate temporary edge
flag dynamically.
* cfganal.c (dfs_enumerate_from): Remove use of visited sbitmap
in favor of temporarily allocated BB flag.
* hsa-brig.c: Re-order includes.
* hsa-dump.c: Likewise.
* hsa-regalloc.c: Likewise.
* print-rtl.c: Likewise.
* profile-count.c: Likewise.

diff --git a/gcc/cfg.c b/gcc/cfg.c
index 11026e7209a..f8b217d39ca 100644
--- a/gcc/cfg.c
+++ b/gcc/cfg.c
@@ -79,6 +79,8 @@ init_flow (struct function *the_fun)
 = EXIT_BLOCK_PTR_FOR_FN (the_fun);
   EXIT_BLOCK_PTR_FOR_FN (the_fun)->prev_bb
 = ENTRY_BLOCK_PTR_FOR_FN (the_fun);
+  the_fun->cfg->edge_flags_allocated = EDGE_ALL_FLAGS;
+  the_fun->cfg->bb_flags_allocated = BB_ALL_FLAGS;
 }
 
 /* Helper function for remove_edge and clear_edges.  Frees edge structure
diff --git a/gcc/cfg.h b/gcc/cfg.h
index 0953456782b..9fff135d11f 100644
--- a/gcc/cfg.h
+++ b/gcc/cfg.h
@@ -74,6 +74,10 @@ struct GTY(()) control_flow_graph {
 
   /* Maximal count of BB in function.  */
   profile_count count_max;
+
+  /* Dynamically allocated edge/bb flags.  */
+  int edge_flags_allocated;
+  int bb_flags_allocated;
 };
 
 
@@ -121,4 +125,60 @@ extern basic_block get_bb_copy (basic_block);
 void set_loop_copy (struct loop *, struct loop *);
 struct loop *get_loop_copy (struct loop *);
 
+/* Generic RAII class to allocate a bit from storage of integer type T.
+   The allocated bit is accessible as mask with the single bit set
+   via the conversion operator to T.  */
+
+template 
+class auto_flag
+{
+public:
+  /* static assert T is integer type of max HOST_WIDE_INT precision.  */
+  auto_flag (T *sptr)
+{
+  m_sptr = sptr;
+  int free_bit = ffs_hwi (~*sptr);
+  /* If there are no unset bits... */
+  if (free_bit == 0)
+   gcc_unreachable ();
+  m_flag = HOST_WIDE_INT_1U << (free_bit - 1);
+  /* ...or if T is signed and thus the complement is sign-extended,
+ check if we ran out of bits.  We could spare us this bit
+if we could use C++11 std::make_unsigned::type to pass
+~*sptr to ffs_hwi.  */
+  if (m_flag == 0)
+   gcc_unreachable ();
+  gcc_checking_assert ((*sptr & m_flag) == 0);
+  *sptr |= m_flag;
+}
+  ~auto_flag ()
+{
+  gcc_checking_assert ((*m_sptr & m_flag) == m_flag);
+  *m_sptr &= ~m_flag;
+}
+  operator T () const { return m_flag; }
+private:
+  T *m_sptr;
+  T m_flag;
+};
+
+/* RAII class to allocate an edge flag for temporary use.  You have
+   to clear the flag from all edges when you are finished using it.  */
+
+class auto_ed

Re: Add a class to represent a gimple match result

2018-05-23 Thread Richard Biener
On Tue, May 22, 2018 at 9:25 AM Richard Sandiford <
richard.sandif...@linaro.org> wrote:

> Gimple match results are represented by a code_helper for the operation,
> a tree for the type, and an array of three trees for the operands.
> This patch wraps them up in a class so that they don't need to be
> passed around individually.

> The main reason for doing this is to make it easier to increase the
> number of operands (for calls) or to support more complicated kinds
> of operation.  But passing around fewer operands also helps to reduce
> the size of gimple-match.o (about 7% for development builds and 4% for
> release builds).

Looks great!

Thanks and OK.
Richard.

> 2018-05-21  Richard Sandiford  

> gcc/
>  * gimple-match.h (gimple_match_op): New class.
>  (mprts_hook): Replace parameters with a gimple_match_op *.
>  (maybe_build_generic_op): Likewise.
>  (gimple_simplified_result_is_gimple_val): Replace parameters with
>  a const gimple_match_op *.
>  (gimple_simplify): Replace code_helper * and tree * parameters
with
>  a gimple_match_op * parameter.
>  (gimple_resimplify1): Replace code_helper *, tree and tree *
>  parameters with a gimple_match_op * parameter.
>  (gimple_resimplify2): Likewise.
>  (gimple_resimplify3): Likewise.
>  (maybe_push_res_to_seq): Replace code_helper, tree and tree *
>  parameters with a gimple_match_op * parameter.
>  * gimple-match-head.c (gimple_simplify): Change prototypes of
>  auto-generated functions to take a gimple_match_op * instead of
>  separate code_helper * and tree * parameters.  Make the same
>  change in the top-level overload and update calls to the
>  gimple_resimplify routines.  Update calls to the auto-generated
>  functions and to maybe_push_res_to_seq in the publicly-facing
>  operation-specific gimple_simplify overloads.
>  (gimple_match_op::MAX_NUM_OPS): Define.
>  (gimple_resimplify1): Replace rcode and ops with a single res_op
>  parameter.  Update call to gimple_simplify.
>  (gimple_resimplify2): Likewise.
>  (gimple_resimplify3): Likewise.
>  (mprts_hook): Replace parameters with a gimple_match_op *.
>  (maybe_build_generic_op): Likewise.
>  (build_call_internal): Replace type, nargs and ops with
>  a gimple_match_op *.
>  (maybe_push_res_to_seq): Replace res_code, type and ops parameters
>  with a single gimple_match_op *.  Update calls to mprts_hook,
>  build_call_internal and gimple_simplified_result_is_gimple_val.
>  Factor out code that is common to the tree_code and combined_fn
cases.
>  * genmatch.c (expr::gen_transform): Replace tem_code and
>  tem_ops with a gimple_match_op called tem_op.  Update calls
>  to the gimple_resimplify functions and maybe_push_res_to_seq.
>  (dt_simplify::gen_1): Manipulate res_op instead of res_code and
>  res_ops.  Update call to the gimple_resimplify functions.
>  (dt_simplify::gen): Pass res_op instead of res_code and res_ops.
>  (decision_tree::gen): Make the functions take a gimple_match_op *
>  called res_op instead of separate res_code and res_ops parameters.
>  Update call accordingly.
>  * gimple-fold.c (replace_stmt_with_simplification): Replace rcode
>  and ops with a single res_op parameter.  Update calls to
>  maybe_build_generic_op and maybe_push_res_to_seq.
>  (fold_stmt_1): Update calls to gimple_simplify and
>  replace_stmt_with_simplification.
>  (gimple_fold_stmt_to_constant_1): Update calls to gimple_simplify
>  and gimple_simplified_result_is_gimple_val.
>  * tree-cfgcleanup.c (cleanup_control_expr_graph): Update call to
>  gimple_simplify.
>  * tree-ssa-sccvn.c (vn_lookup_simplify_result): Replace parameters
>  with a gimple_match_op *.
>  (vn_nary_build_or_lookup): Likewise.  Update call to
>  vn_nary_build_or_lookup_1.
>  (vn_nary_build_or_lookup_1): Replace rcode, type and ops with a
>  gimple_match_op *.  Update calls to the gimple_resimplify routines
>  and to gimple_simplified_result_is_gimple_val.
>  (vn_nary_simplify): Update call to vn_nary_build_or_lookup_1.
>  Use gimple_match_op::MAX_NUM_OPS instead of a hard-coded 3.
>  (vn_reference_lookup_3): Update call to vn_nary_build_or_lookup.
>  (visit_nary_op): Likewise.
>  (visit_reference_op_load): Likewise.

> Index: gcc/gimple-match.h
> ===
> --- gcc/gimple-match.h  2018-05-22 08:22:40.094593327 +0100
> +++ gcc/gimple-match.h  2018-05-22 08:22:40.324588555 +0100
> @@ -40,31 +40,165 @@ #define GCC_GIMPLE_MATCH_H
> int rep;
>   };

> -/* Return whether OPS[0] with CO

Re: [PATCH PR85804]Fix wrong code by correcting bump step computation in vector(1) load of single-element group access

2018-05-23 Thread Bin.Cheng
On Wed, May 23, 2018 at 12:01 PM, Richard Sandiford
 wrote:
> "Bin.Cheng"  writes:
>> On Wed, May 23, 2018 at 11:19 AM, Richard Biener
>>  wrote:
>>> On Tue, May 22, 2018 at 2:11 PM Richard Sandiford <
>>> richard.sandif...@linaro.org> wrote:
>>>
 Richard Biener  writes:
 > On Mon, May 21, 2018 at 3:14 PM Bin Cheng  wrote:
 >
 >> Hi,
 >> As reported in PR85804, bump step is wrongly computed for vector(1)
>>> load
 > of
 >> single-element group access.  This patch fixes the issue by correcting
 > bump
 >> step computation for the specific VMAT_CONTIGUOUS case.
 >
 >> Bootstrap and test on x86_64 and AArch64 ongoing, is it OK?
 >
 > To me it looks like the classification as VMAT_CONTIGUOUS is bogus.
 > We'd fall into the grouped_load case otherwise which should handle
 > the situation correctly?
 >
 > Richard?
>>>
 Yeah, I agree.  I mentioned to Bin privately that that was probably
 a misstep and that we should instead continue to treat them as
 VMAT_CONTIGUOUS_PERMUTE, but simply select the required vector
 from the array of loaded vectors, instead of doing an actual permute.
>>>
>>> I'd classify them as VMAT_ELEMENTWISE instead.  CONTIGUOUS
>>> should be only for no-gap vectors.  How do we classify single-element
>>> interleaving?  That would be another classification choice.
>> Yes, I suggested this offline too, but Richard may have more to say about 
>> this.
>> One thing worth noting is classifying it as VMAT_ELEMENTWISE would
>> disable vectorization in this case because of cost model issue as
>> commented at the end of get_load_store_type.
>
> Yeah, that's the problem.  Using VMAT_ELEMENTWISE also means that we
> use a scalar load and then insert it into a vector, whereas all we want
> (and all we currently generate) is a single vector load.
>
> So if we classify them as VMAT_ELEMENTWISE, they'll be a special case
> for both costing (vector load rather than scalar load and vector
> construct) and code-generation.
Looks to me it will be a special case for VMAT_ELEMENTWISE or
VMAT_CONTIGUOUS* anyway, probably VMAT_CONTIGUOUS is the easiest one?

Thanks,
bin
>
> Thanks,
> Richard


Re: [PATCH PR85804]Fix wrong code by correcting bump step computation in vector(1) load of single-element group access

2018-05-23 Thread Richard Biener
On Wed, May 23, 2018 at 1:02 PM Richard Sandiford <
richard.sandif...@linaro.org> wrote:

> "Bin.Cheng"  writes:
> > On Wed, May 23, 2018 at 11:19 AM, Richard Biener
> >  wrote:
> >> On Tue, May 22, 2018 at 2:11 PM Richard Sandiford <
> >> richard.sandif...@linaro.org> wrote:
> >>
> >>> Richard Biener  writes:
> >>> > On Mon, May 21, 2018 at 3:14 PM Bin Cheng  wrote:
> >>> >
> >>> >> Hi,
> >>> >> As reported in PR85804, bump step is wrongly computed for vector(1)
> >> load
> >>> > of
> >>> >> single-element group access.  This patch fixes the issue by
correcting
> >>> > bump
> >>> >> step computation for the specific VMAT_CONTIGUOUS case.
> >>> >
> >>> >> Bootstrap and test on x86_64 and AArch64 ongoing, is it OK?
> >>> >
> >>> > To me it looks like the classification as VMAT_CONTIGUOUS is bogus.
> >>> > We'd fall into the grouped_load case otherwise which should handle
> >>> > the situation correctly?
> >>> >
> >>> > Richard?
> >>
> >>> Yeah, I agree.  I mentioned to Bin privately that that was probably
> >>> a misstep and that we should instead continue to treat them as
> >>> VMAT_CONTIGUOUS_PERMUTE, but simply select the required vector
> >>> from the array of loaded vectors, instead of doing an actual permute.
> >>
> >> I'd classify them as VMAT_ELEMENTWISE instead.  CONTIGUOUS
> >> should be only for no-gap vectors.  How do we classify single-element
> >> interleaving?  That would be another classification choice.
> > Yes, I suggested this offline too, but Richard may have more to say
about this.
> > One thing worth noting is classifying it as VMAT_ELEMENTWISE would
> > disable vectorization in this case because of cost model issue as
> > commented at the end of get_load_store_type.

> Yeah, that's the problem.  Using VMAT_ELEMENTWISE also means that we
> use a scalar load and then insert it into a vector, whereas all we want
> (and all we currently generate) is a single vector load.

> So if we classify them as VMAT_ELEMENTWISE, they'll be a special case
> for both costing (vector load rather than scalar load and vector
> construct) and code-generation.

But V1 elementwise loads already work by loading with the V1 vector type
(Bin fixed that recently).  But yes, the costmodel thing should be fixed
anyways.

Richard.

> Thanks,
> Richard


Re: [PATCH PR85804]Fix wrong code by correcting bump step computation in vector(1) load of single-element group access

2018-05-23 Thread Richard Biener
On Wed, May 23, 2018 at 1:10 PM Bin.Cheng  wrote:

> On Wed, May 23, 2018 at 12:01 PM, Richard Sandiford
>  wrote:
> > "Bin.Cheng"  writes:
> >> On Wed, May 23, 2018 at 11:19 AM, Richard Biener
> >>  wrote:
> >>> On Tue, May 22, 2018 at 2:11 PM Richard Sandiford <
> >>> richard.sandif...@linaro.org> wrote:
> >>>
>  Richard Biener  writes:
>  > On Mon, May 21, 2018 at 3:14 PM Bin Cheng 
wrote:
>  >
>  >> Hi,
>  >> As reported in PR85804, bump step is wrongly computed for
vector(1)
> >>> load
>  > of
>  >> single-element group access.  This patch fixes the issue by
correcting
>  > bump
>  >> step computation for the specific VMAT_CONTIGUOUS case.
>  >
>  >> Bootstrap and test on x86_64 and AArch64 ongoing, is it OK?
>  >
>  > To me it looks like the classification as VMAT_CONTIGUOUS is bogus.
>  > We'd fall into the grouped_load case otherwise which should handle
>  > the situation correctly?
>  >
>  > Richard?
> >>>
>  Yeah, I agree.  I mentioned to Bin privately that that was probably
>  a misstep and that we should instead continue to treat them as
>  VMAT_CONTIGUOUS_PERMUTE, but simply select the required vector
>  from the array of loaded vectors, instead of doing an actual permute.
> >>>
> >>> I'd classify them as VMAT_ELEMENTWISE instead.  CONTIGUOUS
> >>> should be only for no-gap vectors.  How do we classify single-element
> >>> interleaving?  That would be another classification choice.
> >> Yes, I suggested this offline too, but Richard may have more to say
about this.
> >> One thing worth noting is classifying it as VMAT_ELEMENTWISE would
> >> disable vectorization in this case because of cost model issue as
> >> commented at the end of get_load_store_type.
> >
> > Yeah, that's the problem.  Using VMAT_ELEMENTWISE also means that we
> > use a scalar load and then insert it into a vector, whereas all we want
> > (and all we currently generate) is a single vector load.
> >
> > So if we classify them as VMAT_ELEMENTWISE, they'll be a special case
> > for both costing (vector load rather than scalar load and vector
> > construct) and code-generation.
> Looks to me it will be a special case for VMAT_ELEMENTWISE or
> VMAT_CONTIGUOUS* anyway, probably VMAT_CONTIGUOUS is the easiest one?

The question is why we do

   if (memory_access_type == VMAT_GATHER_SCATTER
   || (!slp && memory_access_type == VMAT_CONTIGUOUS))
 grouped_load = false;

I think for the particular testcase removing this would fix the issue as
well?

Richard.

> Thanks,
> bin
> >
> > Thanks,
> > Richard


Re: [PATCH] PR libgcc/60790: Avoid IFUNC resolver access to uninitialized data

2018-05-23 Thread Florian Weimer

On 05/23/2018 12:48 AM, Jeff Law wrote:

OK.


Thanks.  I will let it sit in trunk for a while and then consider 
backporting it to release branches.



Do you have write access to the GCC repository?  If not I can commit for
you.


I have write access.  I take this as a hint to contribute more 
regularly. 8-)


Florian



Re: [PATCH][AArch64] Fix aarch64_ira_change_pseudo_allocno_class

2018-05-23 Thread Wilco Dijkstra
Richard Sandiford wrote:

> -  if (allocno_class != ALL_REGS)
> +  if (allocno_class != POINTER_AND_FP_REGS)
>  return allocno_class;
>  
> -  if (best_class != ALL_REGS)
> +  if (best_class != POINTER_AND_FP_REGS)
>  return best_class;
>  
>    mode = PSEUDO_REGNO_MODE (regno);

> I think it'd be better to use !reg_class_subset_p (POINTER_AND_FP_REGS, ...)
> instead of ... != POINTER_AND_FP_REGS, since this in principle still applies
> to ALL_REGS too.
> 
> FWIW, the patch looks good to me with that change.

How does reg_class_subset_p help?  In my testing I didn't see ALL_REGS ever
used (and I don't believe it's possible to get it with SVE either).  And it's
not obvious without looking at the implementation whether subset here means
strict subset or not, so it would obfuscate the clear meaning of the existing
patch.

Wilco

Re: [PATCH PR85720/partial]Support runtime loop versioning if loop can be distributed into builtin functions

2018-05-23 Thread Richard Biener
On Tue, May 22, 2018 at 6:38 PM Bin Cheng  wrote:

> Hi,
> This patch partially improves loop distribution for PR85720.  It now
supports runtime
> loop versioning if the loop can be distributed into builtin functions.
Note for this
> moment only coarse-grain runtime alias is checked, while different
overlapping cases
> for different dependence relations are not supported yet.
> Note changes in break_alias_scc_partitions and
version_loop_by_alias_check do not
> strictly match each other, with the latter more restricted.  Because it's
hard to pass
> information around.  Hopefully this will be resolved when classifying
distributor.

> Bootstrap and test on x86_64.  Is it OK?

OK.

Thanks,
Richard.

> Thanks,
> bin

> 2018-05-22  Bin Cheng  

>  * tree-loop-distribution.c (break_alias_scc_partitions): Don't
merge
>  SCC if all partitions are builtins.
>  (version_loop_by_alias_check): New parameter.  Generate cancelable
>  runtime alias check if all partitions are builtins.
>  (distribute_loop): Update call to above function.

> gcc/testsuite
> 2018-05-22  Bin Cheng  

>  * gcc.dg/tree-ssa/pr85720.c: New test.
>  * gcc.target/i386/avx256-unaligned-store-2.c: Disable loop pattern
>  distribution.


Re: [PATCH][AARCH64][PR target/84882] Add mno-strict-align

2018-05-23 Thread Sudakshina Das

Hi Richard

On 18/05/18 15:48, Richard Earnshaw (lists) wrote:

On 27/03/18 13:58, Sudakshina Das wrote:

Hi

This patch adds the no variant to -mstrict-align and the corresponding
function attribute. To enable the function attribute, I have modified
aarch64_can_inline_p () to allow checks even when the callee function
has no attribute. The need for this is shown by the new test
target_attr_18.c.

Testing: Bootstrapped, regtested and added new tests that are copies
of earlier tests checking -mstrict-align with opposite scan directives.

Is this ok for trunk?

Sudi


*** gcc/ChangeLog ***

2018-03-27  Sudakshina Das  

 * common/config/aarch64/aarch64-common.c (aarch64_handle_option):
 Check val before adding MASK_STRICT_ALIGN to opts->x_target_flags.
 * config/aarch64/aarch64.opt (mstrict-align): Remove RejectNegative.
 * config/aarch64/aarch64.c (aarch64_attributes): Mark allow_neg
 as true for strict-align.
 (aarch64_can_inline_p): Perform checks even when callee has no
 attributes to check for strict alignment.
 * doc/extend.texi (AArch64 Function Attributes): Document
 no-strict-align.
 * doc/invoke.texi: (AArch64 Options): Likewise.

*** gcc/testsuite/ChangeLog ***

2018-03-27  Sudakshina Das  

 * gcc.target/aarch64/pr84882.c: New test.
 * gcc.target/aarch64/target_attr_18.c: Likewise.

strict-align.diff


diff --git a/gcc/common/config/aarch64/aarch64-common.c 
b/gcc/common/config/aarch64/aarch64-common.c
index 7fd9305..d5655a0 100644
--- a/gcc/common/config/aarch64/aarch64-common.c
+++ b/gcc/common/config/aarch64/aarch64-common.c
@@ -97,7 +97,10 @@ aarch64_handle_option (struct gcc_options *opts,
return true;
  
  case OPT_mstrict_align:

-  opts->x_target_flags |= MASK_STRICT_ALIGN;
+  if (val)
+   opts->x_target_flags |= MASK_STRICT_ALIGN;
+  else
+   opts->x_target_flags &= ~MASK_STRICT_ALIGN;
return true;
  
  case OPT_momit_leaf_frame_pointer:

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 4b5183b..4f35a6c 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -11277,7 +11277,7 @@ static const struct aarch64_attribute_info 
aarch64_attributes[] =
{ "fix-cortex-a53-843419", aarch64_attr_bool, true, NULL,
   OPT_mfix_cortex_a53_843419 },
{ "cmodel", aarch64_attr_enum, false, NULL, OPT_mcmodel_ },
-  { "strict-align", aarch64_attr_mask, false, NULL, OPT_mstrict_align },
+  { "strict-align", aarch64_attr_mask, true, NULL, OPT_mstrict_align },
{ "omit-leaf-frame-pointer", aarch64_attr_bool, true, NULL,
   OPT_momit_leaf_frame_pointer },
{ "tls-dialect", aarch64_attr_enum, false, NULL, OPT_mtls_dialect_ },
@@ -11640,16 +11640,13 @@ aarch64_can_inline_p (tree caller, tree callee)
tree caller_tree = DECL_FUNCTION_SPECIFIC_TARGET (caller);
tree callee_tree = DECL_FUNCTION_SPECIFIC_TARGET (callee);
  
-  /* If callee has no option attributes, then it is ok to inline.  */

-  if (!callee_tree)
-return true;


I think it's still useful to spot the case where both callee_tree and
caller_tree are NULL.  In that case both options will pick up
target_option_default_node and will always be compatible; so you can
short-circuit that case, which is the most likely scenario.


-
struct cl_target_option *caller_opts
= TREE_TARGET_OPTION (caller_tree ? caller_tree
   : target_option_default_node);
  
-  struct cl_target_option *callee_opts = TREE_TARGET_OPTION (callee_tree);

-
+  struct cl_target_option *callee_opts
+   = TREE_TARGET_OPTION (callee_tree ? callee_tree
+  : target_option_default_node);
  
/* Callee's ISA flags should be a subset of the caller's.  */

if ((caller_opts->x_aarch64_isa_flags & callee_opts->x_aarch64_isa_flags)
diff --git a/gcc/config/aarch64/aarch64.opt b/gcc/config/aarch64/aarch64.opt
index 52eaf8c..1426b45 100644
--- a/gcc/config/aarch64/aarch64.opt
+++ b/gcc/config/aarch64/aarch64.opt
@@ -85,7 +85,7 @@ Target RejectNegative Joined Enum(cmodel) 
Var(aarch64_cmodel_var) Init(AARCH64_C
  Specify the code model.
  
  mstrict-align

-Target Report RejectNegative Mask(STRICT_ALIGN) Save
+Target Report Mask(STRICT_ALIGN) Save
  Don't assume that unaligned accesses are handled by the system.
  
  momit-leaf-frame-pointer

diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 93a0ebc..dcda216 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -3605,8 +3605,10 @@ for the command line option @option{-mcmodel=}.
  @item strict-align


Other targets add an @itemx for the no-variant.


  @cindex @code{strict-align} function attribute, AArch64
  Indicates that the compiler should not assume that unaligned memory references
-are handled by the system.  The behavior is the same as for the command-line
-option @option{-mstrict-align}.
+are handled by the system.  To allow the compiler to assume that

Re: [PATCH][AArch64] Fix aarch64_ira_change_pseudo_allocno_class

2018-05-23 Thread Richard Sandiford
Wilco Dijkstra  writes:
> Richard Sandiford wrote:
>> -  if (allocno_class != ALL_REGS)
>> +  if (allocno_class != POINTER_AND_FP_REGS)
>>  return allocno_class;
>>  
>> -  if (best_class != ALL_REGS)
>> +  if (best_class != POINTER_AND_FP_REGS)
>>  return best_class;
>>  
>>    mode = PSEUDO_REGNO_MODE (regno);
>
>> I think it'd be better to use !reg_class_subset_p (POINTER_AND_FP_REGS, ...)
>> instead of ... != POINTER_AND_FP_REGS, since this in principle still applies
>> to ALL_REGS too.
>> 
>> FWIW, the patch looks good to me with that change.
>
> How does reg_class_subset_p help?  In my testing I didn't see ALL_REGS ever
> used (and I don't believe it's possible to get it with SVE either).  And it's
> not obvious without looking at the implementation whether subset here means
> strict subset or not, so it would obfuscate the clear meaning of the existing
> patch.

But I think the fact that we need this patch shows why hard-coding the
names of union classes is dangerous.  IMO the question isn't whether we
see ALL_REGS used but whether there's a reason in principle why it
wouldn't be used.  E.g. ALL_REGS is the starting class for the
best_class calculation, and LRA uses ALL_REGS as the default choice
for scratch reload registers.

It's not like we can claim that the testsuite will flag up if this
goes wrong again, since AIUI there are no tests that show the reason
we need to make this change.  (I realise the patch includes an md
change to keep the testsuite happy, but that's not the same thing.
I mean more a test that shows why removing the '*'s made things
worse, through no fault of its own.)

Conceptually what we're saying here is that if the given classes
include both GENERAL_REGS and FP_REGS, we'll choose between them
based on the mode of the register.  And that makes sense for any
class that includes both GENERAL_REGS and FP_REGS.  We could write
it that way if it seems better, i.e.:

  if (!reg_class_subset_p (GENERAL_REGS, ...)
  || !reg_class_subset_p (FP_REGS, ...))
...

That way we don't mention any union classes, and I think the meaning
is clear in the context of eventually returning GENERAL_REGS or FP_REGS.

reg_class_subset_p tests for the normal inclusive subset relation
rather than "strict subset".

Thanks,
Richard


Re: [PATCH PR85804]Fix wrong code by correcting bump step computation in vector(1) load of single-element group access

2018-05-23 Thread Bin.Cheng
On Wed, May 23, 2018 at 12:12 PM, Richard Biener
 wrote:
> On Wed, May 23, 2018 at 1:10 PM Bin.Cheng  wrote:
>
>> On Wed, May 23, 2018 at 12:01 PM, Richard Sandiford
>>  wrote:
>> > "Bin.Cheng"  writes:
>> >> On Wed, May 23, 2018 at 11:19 AM, Richard Biener
>> >>  wrote:
>> >>> On Tue, May 22, 2018 at 2:11 PM Richard Sandiford <
>> >>> richard.sandif...@linaro.org> wrote:
>> >>>
>>  Richard Biener  writes:
>>  > On Mon, May 21, 2018 at 3:14 PM Bin Cheng 
> wrote:
>>  >
>>  >> Hi,
>>  >> As reported in PR85804, bump step is wrongly computed for
> vector(1)
>> >>> load
>>  > of
>>  >> single-element group access.  This patch fixes the issue by
> correcting
>>  > bump
>>  >> step computation for the specific VMAT_CONTIGUOUS case.
>>  >
>>  >> Bootstrap and test on x86_64 and AArch64 ongoing, is it OK?
>>  >
>>  > To me it looks like the classification as VMAT_CONTIGUOUS is bogus.
>>  > We'd fall into the grouped_load case otherwise which should handle
>>  > the situation correctly?
>>  >
>>  > Richard?
>> >>>
>>  Yeah, I agree.  I mentioned to Bin privately that that was probably
>>  a misstep and that we should instead continue to treat them as
>>  VMAT_CONTIGUOUS_PERMUTE, but simply select the required vector
>>  from the array of loaded vectors, instead of doing an actual permute.
>> >>>
>> >>> I'd classify them as VMAT_ELEMENTWISE instead.  CONTIGUOUS
>> >>> should be only for no-gap vectors.  How do we classify single-element
>> >>> interleaving?  That would be another classification choice.
>> >> Yes, I suggested this offline too, but Richard may have more to say
> about this.
>> >> One thing worth noting is classifying it as VMAT_ELEMENTWISE would
>> >> disable vectorization in this case because of cost model issue as
>> >> commented at the end of get_load_store_type.
>> >
>> > Yeah, that's the problem.  Using VMAT_ELEMENTWISE also means that we
>> > use a scalar load and then insert it into a vector, whereas all we want
>> > (and all we currently generate) is a single vector load.
>> >
>> > So if we classify them as VMAT_ELEMENTWISE, they'll be a special case
>> > for both costing (vector load rather than scalar load and vector
>> > construct) and code-generation.
>> Looks to me it will be a special case for VMAT_ELEMENTWISE or
>> VMAT_CONTIGUOUS* anyway, probably VMAT_CONTIGUOUS is the easiest one?
>
> The question is why we do
>
>if (memory_access_type == VMAT_GATHER_SCATTER
>|| (!slp && memory_access_type == VMAT_CONTIGUOUS))
>  grouped_load = false;
>
> I think for the particular testcase removing this would fix the issue as
No, simply relaxing this check would result in generating redundant
loads (for gap).  Also it leads to vect_transform_grouped_load as well
as vect_permute_load_chain, so we would need to do the special case
handling just like classifying as VMAT_CONTIGUOUS_PERMUTE.

Thanks,
bin
> well?
>
> Richard.
>
>> Thanks,
>> bin
>> >
>> > Thanks,
>> > Richard


Re: [PATCH][RFC] Add dynamic edge/bb flag allocation

2018-05-23 Thread Michael Matz
Hi,

On Wed, 23 May 2018, Richard Biener wrote:

> > I messed up the conversion to a template. The bitnum should be 
> > subtracted from HOST_BITS_PER_WIDE_INT and yes, 1 in unsigned hwi 
> > should be shifted.

Maybe you should convert the thing to a template when the need arises 
instead of before?  You have now added 54 lines of code for wrapping an 
int!


Ciao,
Michael.


Re: [PATCH][RFC] Add dynamic edge/bb flag allocation

2018-05-23 Thread Eric Botcazou
> Maybe you should convert the thing to a template when the need arises
> instead of before?  You have now added 54 lines of code for wrapping an
> int!

Yeah, it took me 5 minutes to understand what all this fluff is about!

-- 
Eric Botcazou


Re: [PATCH] Fix PR85712 (SLSR cleanup of alternative interpretations)

2018-05-23 Thread Bill Schmidt
On May 23, 2018, at 4:32 AM, Richard Biener  wrote:
> 
> On Tue, May 22, 2018 at 11:37 PM Bill Schmidt 
> wrote:
> 
>> Hi,
> 
>> PR85712 shows where an existing test case fails in the SLSR pass because
>> the code is flawed that cleans up alternative interpretations (CAND_ADD
>> versus CAND_MULT, for example) after a replacement.  This patch fixes the
>> flaw by ensuring that we always visit all interpretations, not just
>> subsequent ones in the next_interp chain.  I found six occurrences of
>> this mistake in the code.
> 
>> Bootstrapped and tested on powerpc64le-linux-gnu with no regressions.
>> No new test case is added since the failure occurs on an existing test
>> in the test suite.  Is this okay for trunk, and for backports to all
>> supported branches after some burn-in time?
> 
> OK and Yes for the backports.

Thanks, committed to trunk as r260608.  Will do backports next week.

Bill
> 
> Thanks,
> Richard.
> 
>> Thanks,
>> Bill
> 
> 
>> 2018-05-22  Bill Schmidt  
> 
>> * gimple-ssa-strength-reduction.c (struct slsr_cand_d): Add
>> first_interp field.
>> (alloc_cand_and_find_basis): Initialize first_interp field.
>> (slsr_process_mul): Modify first_interp field.
>> (slsr_process_add): Likewise.
>> (slsr_process_cast): Modify first_interp field for each new
>> interpretation.
>> (slsr_process_copy): Likewise.
>> (dump_candidate): Dump first_interp field.
>> (replace_mult_candidate): Process all interpretations, not just
>> subsequent ones.
>> (replace_rhs_if_not_dup): Likewise.
>> (replace_one_candidate): Likewise.
> 
>> Index: gcc/gimple-ssa-strength-reduction.c
>> ===
>> --- gcc/gimple-ssa-strength-reduction.c (revision 260484)
>> +++ gcc/gimple-ssa-strength-reduction.c (working copy)
>> @@ -266,6 +266,10 @@ struct slsr_cand_d
>>   of a statement.  */
>>cand_idx next_interp;
> 
>> +  /* Index of the first candidate record in a chain for the same
>> + statement.  */
>> +  cand_idx first_interp;
>> +
>>/* Index of the basis statement S0, if any, in the candidate vector.
>  */
>>cand_idx basis;
> 
>> @@ -686,6 +690,7 @@ alloc_cand_and_find_basis (enum cand_kind kind, gi
>>c->kind = kind;
>>c->cand_num = cand_vec.length () + 1;
>>c->next_interp = 0;
>> +  c->first_interp = c->cand_num;
>>c->dependent = 0;
>>c->sibling = 0;
>>c->def_phi = kind == CAND_MULT ? find_phi_def (base) : 0;
>> @@ -1261,6 +1266,7 @@ slsr_process_mul (gimple *gs, tree rhs1, tree rhs2
>>  is the stride and RHS2 is the base expression.  */
>>c2 = create_mul_ssa_cand (gs, rhs2, rhs1, speed);
>>c->next_interp = c2->cand_num;
>> +  c2->first_interp = c->cand_num;
>>  }
>>else if (TREE_CODE (rhs2) == INTEGER_CST)
>>  {
>> @@ -1498,7 +1504,10 @@ slsr_process_add (gimple *gs, tree rhs1, tree rhs2
>> {
>>   c2 = create_add_ssa_cand (gs, rhs2, rhs1, false, speed);
>>   if (c)
>> -   c->next_interp = c2->cand_num;
>> +   {
>> + c->next_interp = c2->cand_num;
>> + c2->first_interp = c->cand_num;
>> +   }
>>   else
>> add_cand_for_stmt (gs, c2);
>> }
>> @@ -1621,6 +1630,8 @@ slsr_process_cast (gimple *gs, tree rhs1, bool spe
> 
>>if (base_cand && base_cand->kind != CAND_PHI)
>>  {
>> +  slsr_cand_t first_cand = NULL;
>> +
>>while (base_cand)
>> {
>>   /* Propagate all data from the base candidate except the type,
>> @@ -1635,6 +1646,12 @@ slsr_process_cast (gimple *gs, tree rhs1, bool spe
>>  base_cand->index,
> base_cand->stride,
>>  ctype, base_cand->stride_type,
>>  savings);
>> + if (!first_cand)
>> +   first_cand = c;
>> +
>> + if (first_cand != c)
>> +   c->first_interp = first_cand->cand_num;
>> +
>>   if (base_cand->next_interp)
>> base_cand = lookup_cand (base_cand->next_interp);
>>   else
>> @@ -1657,6 +1674,7 @@ slsr_process_cast (gimple *gs, tree rhs1, bool spe
>>c2 = alloc_cand_and_find_basis (CAND_MULT, gs, rhs1, 0,
>>   integer_one_node, ctype, sizetype,
> 0);
>>c->next_interp = c2->cand_num;
>> +  c2->first_interp = c->cand_num;
>>  }
> 
>>/* Add the first (or only) interpretation to the statement-candidate
>> @@ -1681,6 +1699,8 @@ slsr_process_copy (gimple *gs, tree rhs1, bool spe
> 
>>if (base_cand && base_cand->kind != CAND_PHI)
>>  {
>> +  slsr_cand_t first_cand = NULL;
>> +
>>while (base_cand)
>> {
>>   /* Propagate all data from the base candidate.  */
>> @@ -1693,6 +1713,12 @@ slsr_process_copy (gimple *gs, tree rhs1, bool spe
>> 

Re: [PATCH][RFC] Add dynamic edge/bb flag allocation

2018-05-23 Thread Michael Matz
Hi,

On Wed, 23 May 2018, Eric Botcazou wrote:

> > Maybe you should convert the thing to a template when the need arises
> > instead of before?  You have now added 54 lines of code for wrapping an
> > int!
> 
> Yeah, it took me 5 minutes to understand what all this fluff is about!

So, what I think this should look like: only one non-templated class for 
RAII purposes, which gets the pool to allocate from as a parameter in the 
ctor.

Use:

alloc_flags (&cfun->cfg->bb_flag_pool);
alloc_flags (&cfun->cfg->edge_flag_pool);

I don't see the sense in creating two classes for determining the pool 
(and then adding a third class when another pool is invented somewhere 
else) just for going from cfun to cfun->cfg->foopool.  Also Richi asked if 
the flag pools (sigh, a large word for an int) should be merged.  I think 
at this time they should be, but that the class ctor should still take the 
pool param (instead of the function), even if right now there'd only be 
one.

So much for bike shedding :)


Ciao,
Michael.


C++ PATCH for c++/85847, ICE with template_id_expr in new()

2018-05-23 Thread Marek Polacek
The diagnostic code in build_new{,_1} was using maybe_constant_value to fold
the array length, but that breaks while parsing a template, because we might
then leak template codes to the constexpr machinery.

Bootstrapped/regtested on x86_64-linux, ok for trunk/8?

2018-05-23  Marek Polacek  

PR c++/85847
* init.c (build_new_1): Use fold_non_dependent_expr.
(build_new): Likewise.

* g++.dg/cpp0x/new3.C: New test.

diff --git gcc/cp/init.c gcc/cp/init.c
index b558742abf6..d96fec46f65 100644
--- gcc/cp/init.c
+++ gcc/cp/init.c
@@ -2860,7 +2860,7 @@ build_new_1 (vec **placement, tree type, 
tree nelts,
   /* Lots of logic below. depends on whether we have a constant number of
  elements, so go ahead and fold it now.  */
   if (outer_nelts)
-outer_nelts = maybe_constant_value (outer_nelts);
+outer_nelts = fold_non_dependent_expr (outer_nelts);
 
   /* If our base type is an array, then make sure we know how many elements
  it has.  */
@@ -3639,7 +3639,7 @@ build_new (vec **placement, tree type, tree 
nelts,
   /* Try to determine the constant value only for the purposes
 of the diagnostic below but continue to use the original
 value and handle const folding later.  */
-  const_tree cst_nelts = maybe_constant_value (nelts);
+  const_tree cst_nelts = fold_non_dependent_expr (nelts);
 
   /* The expression in a noptr-new-declarator is erroneous if it's of
 non-class type and its value before converting to std::size_t is
diff --git gcc/testsuite/g++.dg/cpp0x/new3.C gcc/testsuite/g++.dg/cpp0x/new3.C
index e69de29bb2d..c388acf552e 100644
--- gcc/testsuite/g++.dg/cpp0x/new3.C
+++ gcc/testsuite/g++.dg/cpp0x/new3.C
@@ -0,0 +1,11 @@
+// PR c++/85847
+// { dg-do compile { target c++11 } }
+
+template 
+int f(int b) { return b; }
+
+template 
+void g()
+{
+  auto a = new int[f(2), 2];
+}


Re: [PATCH][RFC] Add dynamic edge/bb flag allocation

2018-05-23 Thread Richard Biener
On Wed, 23 May 2018, Michael Matz wrote:

> Hi,
> 
> On Wed, 23 May 2018, Eric Botcazou wrote:
> 
> > > Maybe you should convert the thing to a template when the need arises
> > > instead of before?  You have now added 54 lines of code for wrapping an
> > > int!
> > 
> > Yeah, it took me 5 minutes to understand what all this fluff is about!
> 
> So, what I think this should look like: only one non-templated class for 
> RAII purposes, which get's the pool to allocate from as a parameter in the 
> ctor.
> 
> Use:
> 
> alloc_flags (&cfun->cfg->bb_flag_pool);
> alloc_flags (&cfun->cfg->edge_flag_pool);

You'll end up with sth like

   alloc_flags flag (BB_FLAG_POOL_FOR_FN (cfun));

then, mixing C++ RAII and macros! (eh)  Note you missed to name the
variable you declare.  And yes, template deduction should make this
work w/o writing alloc_flags flag (...).

> I don't see the sense in creating two classes for determining the pool 
> (and then adding a third class when another pool is invented somewhere 
> else) just for going from cfun to cfun->cfg->foopool.  Also Richi asked if 
> the flag pools (sigh, a large word for an int) should be merged.  I think 
> at this time they should be, but that the class ctor should still take the 
> pool param (instead of the function), even if right now there'd only be 
> one.
> 
> So much for bike shedding :)

:/

Richard.


[PATCH] Remove memset VN restriction

2018-05-23 Thread Richard Biener

After Eric reminded me how things go with the middle-end and bit-offsets
the following adjusts VN accordingly.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2018-05-23  Richard Biener  

* tree-ssa-sccvn.c (vn_reference_lookup_3): Remove restriction
of fixed offset from memset VN.

* gcc.dg/tree-ssa/ssa-fre-66.c: New testcase.

On branch slpcost
Your branch is ahead of 'origin/trunk' by 1 commit.
  (use "git push" to publish your local commits)
Changes to be committed:
  (use "git reset HEAD ..." to unstage)

new file:   testsuite/gcc.dg/tree-ssa/ssa-fre-66.c
modified:   tree-ssa-sccvn.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-66.c 
b/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-66.c
new file mode 100644
index 000..1c86a1f3866
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-66.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-fre1" } */
+
+int foo (int i)
+{
+  int a[16];
+  __builtin_memset (a, 42, sizeof (a));
+  return a[i];
+}
+
+/* { dg-final { scan-tree-dump "return 707406378;" "fre1" { target { int32plus 
} } } } */
diff --git a/gcc/tree-ssa-sccvn.c b/gcc/tree-ssa-sccvn.c
index 96e80c7b5a3..73e8fc5df49 100644
--- a/gcc/tree-ssa-sccvn.c
+++ b/gcc/tree-ssa-sccvn.c
@@ -1962,7 +1962,6 @@ vn_reference_lookup_3 (ao_ref *ref, tree vuse, void *vr_,
  || ((TREE_CODE (gimple_call_arg (def_stmt, 1)) == INTEGER_CST
   || (INTEGRAL_TYPE_P (vr->type) && known_eq (ref->size, 8)))
  && CHAR_BIT == 8 && BITS_PER_UNIT == 8
- && known_eq (ref->size, maxsize)
  && offset.is_constant (&offseti)
  && offseti % BITS_PER_UNIT == 0))
   && poly_int_tree_p (gimple_call_arg (def_stmt, 2))
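(A quick cross-check of the constant the new scan-tree-dump expects: memset
fills every byte with 42, i.e. 0x2a, and reinterpreting four such bytes as a
32-bit int gives 0x2a2a2a2a == 707406378 regardless of endianness, since all
bytes are equal.  The snippet below is a standalone check, not part of the
patch.)

  #include <stdio.h>
  #include <string.h>

  int
  main (void)
  {
    unsigned char buf[sizeof (int)];
    int val;
    memset (buf, 42, sizeof buf);
    memcpy (&val, buf, sizeof val);
    printf ("%d\n", val);   /* prints 707406378 for 32-bit int */
    return 0;
  }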


Re: [PATCH][RFC] Add dynamic edge/bb flag allocation

2018-05-23 Thread Michael Matz
Hi,

On Wed, 23 May 2018, Richard Biener wrote:

> > alloc_flags (&cfun->cfg->bb_flag_pool);
> > alloc_flags (&cfun->cfg->edge_flag_pool);
> 
> You'll end up with sth like
> 
>alloc_flags flag (BB_FLAG_POOL_FOR_FN (cfun));
> 
> then, mixing C++ RAII and macros! (eh)

First: I don't see the problem.  Second: if the above needs to use a macro 
you should have used a macro as well in your two classes.

> Note you missed to name the variable you declare.

Sure.

> And yes, template deduction should make this work w/o writing 
> alloc_flags flag (...).

If you insist on a template, maybe.  But why would you?


Ciao,
Michael.


[PATCH][1/n] Dissect vectorizer GROUP_* info

2018-05-23 Thread Richard Biener

This patch splits GROUP_* into DR_GROUP_* and REDUC_GROUP_* as a step
towards sharing dataref and dependence analysis between vectorization
analysis of different vector sizes (and then comparing costs and choosing
the best one).

The next patch will move the DR_GROUP_* fields to the DR_AUX vectorizer
data structure which means that the DR group analysis could eventually
be used from outside of the vectorizer.

Bootstrap / regtest running on x86_64-unknown-linux-gnu.

I'm considering applying this independently, to make my life easier
given the churn it generates.

Richard - I remember you talking about patches going in the very
same direction, so please speak up if this disrupts your work and
let's coordinate.

Thanks,
Richard.

>From 340b53126600b77d46d1598132e484c5a99b9de9 Mon Sep 17 00:00:00 2001
From: Richard Guenther 
Date: Wed, 23 May 2018 15:40:19 +0200
Subject: [PATCH 3/3] Split GROUP_* into DR and REDUC variants

* tree-vectorizer.h (STMT_VINFO_GROUP_*, GROUP_*): Remove.
(DR_GROUP_*): New, assert we have non-NULL ->data_ref_info.
(REDUC_GROUP_*): New, assert we have NULL ->data_ref_info.
* ... adjust

diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
index 9608b769cf2..fe4c4a5a1be 100644
--- a/gcc/tree-vect-data-refs.c
+++ b/gcc/tree-vect-data-refs.c
@@ -307,8 +307,9 @@ vect_analyze_data_ref_dependence (struct 
data_dependence_relation *ddr,
   /* We do not have to consider dependences between accesses that belong
  to the same group, unless the stride could be smaller than the
  group size.  */
-  if (GROUP_FIRST_ELEMENT (stmtinfo_a)
-  && GROUP_FIRST_ELEMENT (stmtinfo_a) == GROUP_FIRST_ELEMENT (stmtinfo_b)
+  if (DR_GROUP_FIRST_ELEMENT (stmtinfo_a)
+  && (DR_GROUP_FIRST_ELEMENT (stmtinfo_a)
+ == DR_GROUP_FIRST_ELEMENT (stmtinfo_b))
   && !STMT_VINFO_STRIDED_P (stmtinfo_a))
 return false;
 
@@ -614,8 +615,8 @@ vect_slp_analyze_data_ref_dependence (struct 
data_dependence_relation *ddr)
   /* If dra and drb are part of the same interleaving chain consider
  them independent.  */
   if (STMT_VINFO_GROUPED_ACCESS (vinfo_for_stmt (DR_STMT (dra)))
-  && (GROUP_FIRST_ELEMENT (vinfo_for_stmt (DR_STMT (dra)))
- == GROUP_FIRST_ELEMENT (vinfo_for_stmt (DR_STMT (drb)
+  && (DR_GROUP_FIRST_ELEMENT (vinfo_for_stmt (DR_STMT (dra)))
+ == DR_GROUP_FIRST_ELEMENT (vinfo_for_stmt (DR_STMT (drb)
 return false;
 
   /* Unknown data dependence.  */
@@ -1056,9 +1057,9 @@ vect_update_misalignment_for_peel (struct data_reference 
*dr,
  /* For interleaved data accesses the step in the loop must be multiplied by
  the size of the interleaving group.  */
   if (STMT_VINFO_GROUPED_ACCESS (stmt_info))
-dr_size *= GROUP_SIZE (vinfo_for_stmt (GROUP_FIRST_ELEMENT (stmt_info)));
+dr_size *= DR_GROUP_SIZE (vinfo_for_stmt (DR_GROUP_FIRST_ELEMENT 
(stmt_info)));
   if (STMT_VINFO_GROUPED_ACCESS (peel_stmt_info))
-dr_peel_size *= GROUP_SIZE (peel_stmt_info);
+dr_peel_size *= DR_GROUP_SIZE (peel_stmt_info);
 
   /* It can be assumed that the data refs with the same alignment as dr_peel
  are aligned in the vector loop.  */
@@ -1151,7 +1152,7 @@ vect_verify_datarefs_alignment (loop_vec_info vinfo)
 
   /* For interleaving, only the alignment of the first access matters.   */
   if (STMT_VINFO_GROUPED_ACCESS (stmt_info)
- && GROUP_FIRST_ELEMENT (stmt_info) != stmt)
+ && DR_GROUP_FIRST_ELEMENT (stmt_info) != stmt)
continue;
 
   /* Strided accesses perform only component accesses, alignment is
@@ -1208,7 +1209,7 @@ vector_alignment_reachable_p (struct data_reference *dr)
   elem_size = vector_element_size (vector_size, nelements);
   mis_in_elements = DR_MISALIGNMENT (dr) / elem_size;
 
-  if (!multiple_p (nelements - mis_in_elements, GROUP_SIZE (stmt_info)))
+  if (!multiple_p (nelements - mis_in_elements, DR_GROUP_SIZE (stmt_info)))
return false;
 }
 
@@ -1396,7 +1397,7 @@ vect_get_peeling_costs_all_drs (vec 
datarefs,
   /* For interleaving, only the alignment of the first access
  matters.  */
   if (STMT_VINFO_GROUPED_ACCESS (stmt_info)
-  && GROUP_FIRST_ELEMENT (stmt_info) != stmt)
+  && DR_GROUP_FIRST_ELEMENT (stmt_info) != stmt)
 continue;
 
   /* Strided accesses perform only component accesses, alignment is
@@ -1530,7 +1531,7 @@ vect_peeling_supportable (loop_vec_info loop_vinfo, 
struct data_reference *dr0,
   /* For interleaving, only the alignment of the first access
 matters.  */
   if (STMT_VINFO_GROUPED_ACCESS (stmt_info)
- && GROUP_FIRST_ELEMENT (stmt_info) != stmt)
+ && DR_GROUP_FIRST_ELEMENT (stmt_info) != stmt)
continue;
 
   /* Strided accesses perform only component accesses, alignment is
@@ -1718,7 +1719,7 @@ vect_enhance_data_refs_alignment (loop_vec_info 
loop_vinfo)
   /* For interleaving, only the alignment of

Re: C++ PATCH to implement P0614R1, Range-based for statements with initializer (take 2)

2018-05-23 Thread Marek Polacek
On Tue, May 22, 2018 at 09:46:10PM -0400, Jason Merrill wrote:
> On Tue, May 22, 2018 at 7:25 PM, Marek Polacek  wrote:
> > On Mon, May 21, 2018 at 09:51:44PM -0400, Jason Merrill wrote:
> >> On Mon, May 21, 2018 at 7:34 PM, Marek Polacek  wrote:
> >> > The previous version of this patch got confused by
> >> >
> >> >   for (int i = 0; n > 0 ? true : false; i++)
> >> > // ...
> >> >
> >> > because even though we see a ; followed by a :, it's not a range-based 
> >> > for with
> >> > an initializer.  I find it very strange that this didn't show up during 
> >> > the
> >> > regtest.
> >> >
> >> > To fix this, I had to uglify range_based_for_with_init_p to also check 
> >> > for a ?.
> >> > Yuck.
> >>
> >> Perhaps cp_parser_skip_to_closing_parenthesis_1 should handle balanced
> >> ?: like ()/[]/{}.
> >
> > Good point.  Clearly there's a difference between ?: and e.g. () because : 
> > can
> > stand alone--e.g. in asm (: "whatever"), labels, goacc arrays like a[0:N], 
> > and
> > so on.  The following seems to work well, and is certainly less ugly than 
> > the
> > previous version.
> >
> > +   case CPP_QUERY:
> > + if (!brace_depth)
> > +   ++condop_depth;
> > + break;
> > +
> > +   case CPP_COLON:
> > + if (!brace_depth && condop_depth > 0)
> > +   condop_depth--;
> > + break;
> 
> Since, as you say, colons can appear in more places, maybe we only
> want to adjust condop_depth when all the other depths are 0, not just
> brace_depth.

Yeah, I meant to do it but apparently I didn't :(.  Fixed below.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2018-05-23  Marek Polacek  

Implement P0614R1, Range-based for statements with initializer.
* parser.c (cp_parser_range_based_for_with_init_p): New.
(cp_parser_init_statement): Use it.  Parse the optional init-statement
for a range-based for loop.
(cp_parser_skip_to_closing_parenthesis_1): Handle balancing ?:.

* g++.dg/cpp2a/range-for1.C: New test.
* g++.dg/cpp2a/range-for2.C: New test.
* g++.dg/cpp2a/range-for3.C: New test.
* g++.dg/cpp2a/range-for4.C: New test.
* g++.dg/cpp2a/range-for5.C: New test.
* g++.dg/cpp2a/range-for6.C: New test.
* g++.dg/cpp2a/range-for7.C: New test.
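
(For reference, the new tests exercise the C++20 form with an init-statement
before the range declaration; a minimal illustrative example -- not one of
the new tests -- is shown below.)

  #include <vector>

  int
  sum ()
  {
    int total = 0;
    for (std::vector<int> v = {1, 2, 3}; int x : v)   // P0614R1 syntax
      total += x;
    return total;   // 6
  }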

diff --git gcc/cp/parser.c gcc/cp/parser.c
index 6f51f03f47c..d3e73488e84 100644
--- gcc/cp/parser.c
+++ gcc/cp/parser.c
@@ -3493,6 +3493,7 @@ cp_parser_skip_to_closing_parenthesis_1 (cp_parser 
*parser,
   unsigned paren_depth = 0;
   unsigned brace_depth = 0;
   unsigned square_depth = 0;
+  unsigned condop_depth = 0;
 
   if (recovering && or_ttype == CPP_EOF
   && cp_parser_uncommitted_to_tentative_parse_p (parser))
@@ -3504,7 +3505,7 @@ cp_parser_skip_to_closing_parenthesis_1 (cp_parser 
*parser,
 
   /* Have we found what we're looking for before the closing paren?  */
   if (token->type == or_ttype && or_ttype != CPP_EOF
- && !brace_depth && !paren_depth && !square_depth)
+ && !brace_depth && !paren_depth && !square_depth && !condop_depth)
return -1;
 
   switch (token->type)
@@ -3551,6 +3552,16 @@ cp_parser_skip_to_closing_parenthesis_1 (cp_parser 
*parser,
}
  break;
 
+   case CPP_QUERY:
+ if (!brace_depth && !paren_depth && !square_depth)
+   ++condop_depth;
+ break;
+
+   case CPP_COLON:
+ if (!brace_depth && !paren_depth && !square_depth && condop_depth > 0)
+   condop_depth--;
+ break;
+
default:
  break;
}
@@ -11255,6 +11266,40 @@ cp_parser_statement_seq_opt (cp_parser* parser, tree 
in_statement_expr)
 }
 }
 
+/* Return true if this is the C++20 version of range-based-for with
+   init-statement.  */
+
+static bool
+cp_parser_range_based_for_with_init_p (cp_parser *parser)
+{
+  bool r = false;
+
+  /* Save tokens so that we can put them back.  */
+  cp_lexer_save_tokens (parser->lexer);
+
+  /* There has to be an unnested ; followed by an unnested :.  */
+  if (cp_parser_skip_to_closing_parenthesis_1 (parser,
+  /*recovering=*/false,
+  CPP_SEMICOLON,
+  /*consume_paren=*/false) != -1)
+goto out;
+
+  /* We found the semicolon, eat it now.  */
+  cp_lexer_consume_token (parser->lexer);
+
+  /* Now look for ':' that is not nested in () or {}.  */
+  r = (cp_parser_skip_to_closing_parenthesis_1 (parser,
+   /*recovering=*/false,
+   CPP_COLON,
+   /*consume_paren=*/false) == -1);
+
+out:
+  /* Roll back the tokens we skipped.  */
+  cp_lexer_rollback_tokens (parser->lexer);
+
+  return r;
+}
+
 /* Return true if we're looking at (init; cond), false otherwise.  */
 
 static bool
@@ -12299,7 +12344,7 @@ cp_parser_i

[gomp5] Implement omp_[sg]et_affinity_format, omp_{capture,display}_affinity, OMP_DISPLAY_AFFINITY and OMP_AFFINITY_FORMAT

2018-05-23 Thread Jakub Jelinek
Hi!

The following patch implements functions and env vars to query and display
affinity related information.

Tested on x86_64-linux, committed to gomp-5.0 branch.

2018-05-23  Jakub Jelinek  

* configure.ac (HAVE_UNAME, HAVE_GETHOSTNAME, HAVE_GETPID): Add
new tests.
* configure.tgt: Add -DUSING_INITIAL_EXEC_TLS to XCFLAGS for Linux.
* Makefile.am (libgomp_la_SOURCES): Add affinity-fmt.c.
* libgomp.map (OMP_5.0): Export omp_{capture,display}_affinity{,_},
and omp_[gs]et_affinity_format{,_}.
* libgomp.h (gomp_display_affinity_var, gomp_affinity_format_var,
gomp_affinity_format_len): Declare.
(GOMP_NEEDS_THREAD_HANDLE): Define if needed.
(struct gomp_thread): Add handle field if GOMP_NEEDS_THREAD_HANDLE is
defined.
(gomp_display_affinity_place): Declare.
(gomp_set_affinity_format, gomp_display_string): Likewise.
(gomp_thread_handle): New typedef.
(gomp_display_affinity, gomp_display_affinity_thread): Declare.
(gomp_thread_self, gomp_thread_to_pthread_t): New inline functions.
* affinity-fmt.c: New file.
* affinity.c (gomp_display_affinity_place): New function.
* config/linux/affinity.c (gomp_display_affinity_place): New function.
* env.c (gomp_display_affinity_var, gomp_affinity_format_var,
gomp_affinity_format_len): New variables.
(handle_omp_display_env): Print OMP_DISPLAY_AFFINITY and
OMP_AFFINITY_FORMAT.
(initialize_env): Handle OMP_DISPLAY_AFFINITY and OMP_AFFINITY_FORMAT
env vars.
* fortran.c: Include stdio.h and string.h.
(omp_set_affinity_format_, omp_get_affinity_format_,
omp_display_affinity_, omp_capture_affinity_): New functions.
* omp.h.in (omp_set_affinity_format, omp_get_affinity_format,
omp_display_affinity, omp_capture_affinity): Declare.
* omp_lib.f90.in (omp_set_affinity_format, omp_get_affinity_format,
omp_display_affinity, omp_capture_affinity): Add new interfaces.
* omp_lib.h.in (omp_set_affinity_format, omp_get_affinity_format,
omp_display_affinity, omp_capture_affinity): New externs.
* team.c (struct gomp_thread_start_data): Add handle field.
(gomp_team_start): Handle OMP_DISPLAY_AFFINITY env var.
* configure: Regenerated.
* config.h.in: Regenerated.
* Makefile.in: Regenerated.
* testsuite/libgomp.c-c++-common/display-affinity-1.c: New test.
* testsuite/libgomp.fortran/display-affinity-1.f90: New test.

--- libgomp/configure.ac.jj 2018-04-30 13:19:48.198834863 +0200
+++ libgomp/configure.ac2018-05-22 14:15:24.425935883 +0200
@@ -266,6 +266,41 @@ if test $ac_cv_func_clock_gettime = no;
   [Define to 1 if you have the `clock_gettime' function.])])
 fi
 
+# Check for uname.
+AC_COMPILE_IFELSE(
+ [AC_LANG_PROGRAM(
+  [#include 
+   #include 
+   #include ],
+  [struct utsname buf;
+   volatile size_t len = 0;
+   if (!uname (buf))
+ len = strlen (buf.nodename);])],
+  AC_DEFINE(HAVE_UNAME, 1,
+[  Define if uname is supported and struct utsname has nodename field.]))
+
+# Check for gethostname.
+AC_COMPILE_IFELSE(
+ [AC_LANG_PROGRAM(
+  [#include ],
+  [
+changequote(,)dnl
+   char buf[256];
+   if (gethostname (buf, sizeof (buf) - 1) == 0)
+ buf[255] = '\0';
+changequote([,])dnl
+  ])],
+  AC_DEFINE(HAVE_GETHOSTNAME, 1,
+[  Define if gethostname is supported.]))
+
+# Check for getpid.
+AC_COMPILE_IFELSE(
+ [AC_LANG_PROGRAM(
+  [#include ],
+  [int pid = getpid ();])],
+  AC_DEFINE(HAVE_GETPID, 1,
+[  Define if getpid is supported.]))
+
 # See if we support thread-local storage.
 GCC_CHECK_TLS
 
--- libgomp/configure.tgt.jj2017-05-04 15:04:53.677371383 +0200
+++ libgomp/configure.tgt   2018-05-23 14:34:01.875414884 +0200
@@ -18,7 +18,7 @@ if test $gcc_cv_have_tls = yes ; then
;;
 
 *-*-linux* | *-*-gnu*)
-   XCFLAGS="${XCFLAGS} -ftls-model=initial-exec"
+   XCFLAGS="${XCFLAGS} -ftls-model=initial-exec -DUSING_INITIAL_EXEC_TLS"
;;
 
 *-*-rtems*)
--- libgomp/Makefile.am.jj  2017-05-04 15:04:53.679371358 +0200
+++ libgomp/Makefile.am 2018-05-21 17:21:48.717963247 +0200
@@ -63,7 +63,8 @@ libgomp_la_SOURCES = alloc.c atomic.c ba
parallel.c sections.c single.c task.c team.c work.c lock.c mutex.c \
proc.c sem.c bar.c ptrlock.c time.c fortran.c affinity.c target.c \
splay-tree.c libgomp-plugin.c oacc-parallel.c oacc-host.c oacc-init.c \
-   oacc-mem.c oacc-async.c oacc-plugin.c oacc-cuda.c priority_queue.c
+   oacc-mem.c oacc-async.c oacc-plugin.c oacc-cuda.c priority_queue.c \
+   affinity-fmt.c
 
 include $(top_srcdir)/plugin/Makefrag.am
 
--- libgomp/libgomp.map.jj  2018-04-30 13:19:48.356834924 +0200
+++ libgomp/libgomp.map 2018-05-23 14:08:47.362850982 +0200
@@ -164,6 +164,18 @@ OMP_4.5 {
omp_target_disassociate_ptr;
 } OMP_4.0;
 
+OMP

Re: [PATCH] handle local aggregate initialization in strlen (PR 83821)

2018-05-23 Thread Jeff Law
On 05/10/2018 04:05 PM, Marc Glisse wrote:
> On Thu, 10 May 2018, Martin Sebor wrote:
> 
>> Can you please comment/respond to Jeff's question below and
>> confirm whether my understanding of the restriction (below)
>> is correct?
> 
> I don't remember it at all, I really should have expanded that comment...
> 
> The documentation of nonzero_chars seems to indicate that, unless
> full_string_p, it is only a lower bound on the length of the string, so
> not suitable for this kind of alias check. I don't know if we also have
> easy access to some upper bound.
> 
> (I noticed while looking at this pass that it could probably use
> POINTER_DIFF_EXPR more)
So ISTM that we'd need to guard the code that uses si->nonzero_chars in
maybe_invalidate to also check FULL_STRING_P since it appears we're
using si->nonzero_chars as a string length.
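
For intuition, a self-contained sketch of the situation in user code
(illustrative only; the comments use the pass's terminology from the
discussion above):

#include <cstring>

void f (char *buf)
{
  /* After this call the strlen pass knows nonzero_chars >= 4 for buf,
     but not full_string_p: no terminating NUL was written, so 4 is only
     a lower bound on the eventual string length and must not be used as
     the length for this kind of alias check.  */
  std::memcpy (buf, "abcd", 4);

  /* A later store like this may still land inside the final string.  */
  buf[7] = 'x';
}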

jeff



Re: PING^2: [PATCH] Don't mark IFUNC resolver as only called directly

2018-05-23 Thread H.J. Lu
On Wed, May 23, 2018 at 2:01 AM, Jan Hubicka  wrote:
>> On Tue, May 22, 2018 at 9:21 AM, Jan Hubicka  wrote:
>> >> > >  class ipa_opt_pass_d;
>> >> > >  typedef ipa_opt_pass_d *ipa_opt_pass;
>> >> > > @@ -2894,7 +2896,8 @@ 
>> >> > > cgraph_node::only_called_directly_or_aliased_p (void)
>> >> > >   && !DECL_STATIC_CONSTRUCTOR (decl)
>> >> > >   && !DECL_STATIC_DESTRUCTOR (decl)
>> >> > >   && !used_from_object_file_p ()
>> >> > > - && !externally_visible);
>> >> > > + && !externally_visible
>> >> > > + && !lookup_attribute ("ifunc", DECL_ATTRIBUTES (decl)));
>> >> >
>> >> > How's it handled for our own generated resolver functions?  That is,
>> >> > isn't there sth cheaper than doing a lookup_attribute here?  I see
>> >> > that make_dispatcher_decl nor ix86_get_function_versions_dispatcher
>> >> > adds the 'ifunc' attribute (though they are TREE_PUBLIC there).
>> >> 
>> >>  Is there any drawback of setting force_output flag?
>> >>  Honza
>> >> >>>
>> >> >>> Setting force_output may prevent some optimizations.  Can we add a bit
>> >> >>> for IFUNC resolver?
>> >> >>>
>> >> >>
>> >> >> Here is the patch to add ifunc_resolver to cgraph_node. Tested on 
>> >> >> x86-64
>> >> >> and i686.  Any comments?
>> >> >>
>> >> >
>> >> > PING:
>> >> >
>> >> > https://gcc.gnu.org/ml/gcc-patches/2018-04/msg00647.html
>> >> >
>> >>
>> >> PING.
>> > OK, but please extend the verifier that ifunc_resolver flag is equivalent 
>> > to
>> > lookup_attribute ("ifunc", DECL_ATTRIBUTES (decl))
>> > so we are sure things stays in sync.
>> >
>>
>> Like this
>>
>> diff --git a/gcc/symtab.c b/gcc/symtab.c
>> index 80f6f910c3b..954920b6dff 100644
>> --- a/gcc/symtab.c
>> +++ b/gcc/symtab.c
>> @@ -998,6 +998,13 @@ symtab_node::verify_base (void)
>>error ("function symbol is not function");
>>error_found = true;
>>}
>> +  else if ((lookup_attribute ("ifunc", DECL_ATTRIBUTES (decl))
>> + != NULL)
>> + != dyn_cast  (this)->ifunc_resolver)
>> +  {
>> +  error ("inconsistent `ifunc' attribute");
>> +  error_found = true;
>> +  }
>>  }
>>else if (is_a  (this))
>>  {
>>
>>
>> Thanks.
> Yes, thanks!
> Honza

I'd like to also fix it on GCC 8 branch for CET.  Should I backport my
patch to GCC 8 after a few days or use the simple patch for GCC 8:

https://gcc.gnu.org/ml/gcc-patches/2018-04/msg00588.html

Thanks.

-- 
H.J.


Re: PING^2: [PATCH] Don't mark IFUNC resolver as only called directly

2018-05-23 Thread Jan Hubicka
> On Wed, May 23, 2018 at 2:01 AM, Jan Hubicka  wrote:
> >> On Tue, May 22, 2018 at 9:21 AM, Jan Hubicka  wrote:
> >> >> > >  class ipa_opt_pass_d;
> >> >> > >  typedef ipa_opt_pass_d *ipa_opt_pass;
> >> >> > > @@ -2894,7 +2896,8 @@ 
> >> >> > > cgraph_node::only_called_directly_or_aliased_p (void)
> >> >> > >   && !DECL_STATIC_CONSTRUCTOR (decl)
> >> >> > >   && !DECL_STATIC_DESTRUCTOR (decl)
> >> >> > >   && !used_from_object_file_p ()
> >> >> > > - && !externally_visible);
> >> >> > > + && !externally_visible
> >> >> > > + && !lookup_attribute ("ifunc", DECL_ATTRIBUTES 
> >> >> > > (decl)));
> >> >> >
> >> >> > How's it handled for our own generated resolver functions?  That 
> >> >> > is,
> >> >> > isn't there sth cheaper than doing a lookup_attribute here?  I see
> >> >> > that make_dispatcher_decl nor 
> >> >> > ix86_get_function_versions_dispatcher
> >> >> > adds the 'ifunc' attribute (though they are TREE_PUBLIC there).
> >> >> 
> >> >>  Is there any drawback of setting force_output flag?
> >> >>  Honza
> >> >> >>>
> >> >> >>> Setting force_output may prevent some optimizations.  Can we add a 
> >> >> >>> bit
> >> >> >>> for IFUNC resolver?
> >> >> >>>
> >> >> >>
> >> >> >> Here is the patch to add ifunc_resolver to cgraph_node. Tested on 
> >> >> >> x86-64
> >> >> >> and i686.  Any comments?
> >> >> >>
> >> >> >
> >> >> > PING:
> >> >> >
> >> >> > https://gcc.gnu.org/ml/gcc-patches/2018-04/msg00647.html
> >> >> >
> >> >>
> >> >> PING.
> >> > OK, but please extend the verifier that ifunc_resolver flag is 
> >> > equivalent to
> >> > lookup_attribute ("ifunc", DECL_ATTRIBUTES (decl))
> >> > so we are sure things stays in sync.
> >> >
> >>
> >> Like this
> >>
> >> diff --git a/gcc/symtab.c b/gcc/symtab.c
> >> index 80f6f910c3b..954920b6dff 100644
> >> --- a/gcc/symtab.c
> >> +++ b/gcc/symtab.c
> >> @@ -998,6 +998,13 @@ symtab_node::verify_base (void)
> >>error ("function symbol is not function");
> >>error_found = true;
> >>}
> >> +  else if ((lookup_attribute ("ifunc", DECL_ATTRIBUTES (decl))
> >> + != NULL)
> >> + != dyn_cast  (this)->ifunc_resolver)
> >> +  {
> >> +  error ("inconsistent `ifunc' attribute");
> >> +  error_found = true;
> >> +  }
> >>  }
> >>else if (is_a  (this))
> >>  {
> >>
> >>
> >> Thanks.
> > Yes, thanks!
> > Honza
> 
> I'd like to also fix it on GCC 8 branch for CET.  Should I backport my
> patch to GCC 8 after a few days or use the simple patch for GCC 8:
> 
> https://gcc.gnu.org/ml/gcc-patches/2018-04/msg00588.html

I would backport this one so we don't unnecessarily diverge.
Thanks!
Honza
> 
> Thanks.
> 
> -- 
> H.J.


Re: [RFT PATCH, AVX512]: Implement scalar float->unsigned int truncations with AVX512F

2018-05-23 Thread Uros Bizjak
On Wed, May 23, 2018 at 9:55 AM, Peryt, Sebastian
 wrote:
>> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
>> ow...@gcc.gnu.org] On Behalf Of Uros Bizjak
>> Sent: Monday, May 21, 2018 9:55 PM
>> To: gcc-patches@gcc.gnu.org
>> Cc: Jakub Jelinek ; Kirill Yukhin
>> 
>> Subject: Re: [RFT PATCH, AVX512]: Implement scalar float->unsigned int
>> truncations with AVX512F
>>
>> On Mon, May 21, 2018 at 4:53 PM, Uros Bizjak  wrote:
>> > Hello!
>> >
>> > Attached patch implements scalar float->unsigned int truncations
>> > with
>> AVX512F.
>> >
>> > 2018-05-21  Uros Bizjak  
>> >
>> > * config/i386/i386.md (fixuns_truncdi2): New insn pattern.
>> > (fixuns_truncsi2_avx512f): Ditto.
>> > (*fixuns_truncsi2_avx512f_zext): Ditto.
>> > (fixuns_truncsi2): Also enable for AVX512F and TARGET_SSE_MATH.
>> > Emit fixuns_truncsi2_avx512f for AVX512F targets.
>> >
>> > testsuite/ChangeLog:
>> >
>> > 2018-05-21  Uros Bizjak  
>> >
>> > * gcc.target/i386/cvt-2.c: New test.
>> >
>> > Patch was bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.
>> >
>> > Unfortunately, I have to means to test the patch on AVX512 target,
>> > so to avoid some hidden issue, I'd like to ask someone to test it on
>> > live target.
>
> I've bootstrapped and regression tested your patch on x86_64-linux-gnu 
> {,-m32} on SKX machine and I don't see any stability regression.

Thanks!

Committed to mainline SVN.

Uros.


Re: [RFT PATCH, AVX512]: Implement scalar unsigned int->float conversions with AVX512F

2018-05-23 Thread Uros Bizjak
On Wed, May 23, 2018 at 11:39 AM, Peryt, Sebastian
 wrote:
>> From: Uros Bizjak [mailto:ubiz...@gmail.com]
>> Sent: Tuesday, May 22, 2018 8:43 PM
>> To: gcc-patches@gcc.gnu.org
>> Cc: Peryt, Sebastian ; Jakub Jelinek
>> 
>> Subject: [RFT PATCH, AVX512]: Implement scalar unsigned int->float 
>> conversions
>> with AVX512F
>>
>> Hello!
>>
>> Attached patch implements scalar unsigned int->float conversions with
>> AVX512F.
>>
>> 2018-05-22  Uros Bizjak  
>>
>> * config/i386/i386.md (*floatuns2_avx512):
>> New insn pattern.
>> (floatunssi2): Also enable for AVX512F and TARGET_SSE_MATH.
>> Rewrite expander pattern.  Emit gen_floatunssi2_i387_with_xmm
>> for non-SSE modes.
>> (floatunsdisf2): Rewrite expander pattern.  Hanlde TARGET_AVX512F.
>> (floatunsdidf2): Ditto.
>>
>> testsuite/ChangeLog:
>>
>> 2018-05-22  Uros Bizjak  
>>
>> * gcc.target/i386/cvt-3.c: New test.
>>
>> Patch was bootstrapped and regression tested on x86_64-linux-gnu {,-m32}., 
>> but
>> not tested on AVX512 target.
>
> I have checked it on x86_64-linux-gnu {,-m32} on SKX and don't see any 
> stability regressions.

Thanks!

Committed to mainline SVN.

Uros.


Re: [PATCH] [AArch64, Falkor] Falkor address costs tuning

2018-05-23 Thread James Greenhalgh
On Tue, May 22, 2018 at 12:04:38PM -0500, Luis Machado wrote:
> Switch from using generic address costs to using Falkor-specific ones, which
> give Falkor better results overall.
> 
> OK for trunk?
> 
> Given this is a Falkor-specific adjustment, would this be an acceptable
> backport for GCC 8 as well?

OK for trunk.

It doesn't fix a regression, so it wouldn't really fit the definition of
a backport patch. That said, if it is important to you to have it in GCC 8,
it is sufficiently low-risk for non-Falkor targets that we can take it. So,
it is your call if you want to backport it or not.

Thanks,
James

> 
> gcc/ChangeLog:
> 
> 2018-05-22  Luis Machado  
> 
>   * config/aarch64/aarch64.c (qdf24xx_addrcost_table): New static
>   global.
>   (qdf24xx_tunings) : Set to qdf24xx_addrcost_table.
 


Re: [PATCH][arm][2/2] Remove support for -march=armv3 and older

2018-05-23 Thread Kyrill Tkachov


On 21/05/18 12:29, Kyrill Tkachov wrote:


On 18/05/18 11:33, Richard Earnshaw (lists) wrote:

On 17/05/18 11:26, Kyrill Tkachov wrote:

Hi all,

We deprecated architecture versions earlier than Armv4T in GCC 6 [1].
This patch removes support for architectures lower than Armv4.
That is the -march values armv2, armv2a, armv3, armv3m are removed
with this patch.  I did not remove armv4 because it's a bit more
involved code-wise and there has been some pushback on the implications
for -mcpu=strongarm support.

Removing armv3m and earlier though is pretty straightforward.
This allows us to get rid of the armv3m and mode32 feature bits
in arm-cpus.in as they can be assumed to be universally available.

Consequently the mcpu values arm2, arm250, arm3, arm6, arm60, arm600,
arm610, arm620, arm7, arm7d, arm7di, arm70, arm700, arm700i, arm710,
arm720, arm710c, arm7100, arm7500, arm7500fe, arm7m, arm7dm, arm7dm are
now also removed.

Bootstrapped and tested on arm-none-linux-gnueabihf and on arm-none-eabi
with an aprofile multilib configuration (which builds quite a lot of
library
configurations).

Ramana, Richard, I'd appreciate an ok from either of you that you're
happy for this to go ahead.

OK.


Thanks, I've committed the two patches.



It seems slightly strange that we remove the mode32 feature, but not the
mode26 feature, though I can see how it occurs as a consequence of this
cleanup.

However, we're no longer interested in supporting ARMv4 running in
mode26 (and I think in reality mode26 support was dropped several
releases back), so we should just drop that feature bit as well.
Perhaps you could do a follow-up to remove that too?


Yeah, that can be cleaned up separately.



And here is the patch doing that.
Committed to trunk after a bootstrap and test on arm-none-linux-gnueabihf.

Thanks,
Kyrill

2018-05-23  Kyrylo Tkachov  

* config/arm/arm-cpus.in (mode26): Delete.
(armv4): Delete mode26 reference.
* config/arm/arm.c (arm_configure_build_target): Delete use of
isa_bit_mode26.


Thanks,
Kyrill


Thanks,

R.


Thanks,
Kyrill

[1] https://gcc.gnu.org/gcc-6/changes.html#arm

2018-05-17  Kyrylo Tkachov  

 * config/arm/arm-cpus.in (armv3m, mode32): Delete features.
 (ARMv4): Update.
 (ARMv2, ARMv3, ARMv3m): Delete fgroups.
 (ARMv6m): Update.
 (armv2, armv2a, armv3, armv3m): Delete architectures.
 (arm2, arm250, arm3, arm6, arm60, arm600, arm610, arm620,
 arm7, arm7d, arm7di, arm70, arm700, arm700i, arm710, arm720,
 arm710c, arm7100, arm7500, arm7500fe, arm7m, arm7dm, arm7dmi):
 Delete cpus.
 * config/arm/arm.md (maddsidi4): Remove check for arm_arch3m.
 (*mulsidi3adddi): Likewise.
 (mulsidi3): Likewise.
 (*mulsidi3_nov6): Likewise.
 (umulsidi3): Likewise.
 (umulsidi3_nov6): Likewise.
 (umaddsidi4): Likewise.
 (*umulsidi3adddi): Likewise.
 (smulsi3_highpart): Likewise.
 (*smulsi3_highpart_nov6): Likewise.
 (umulsi3_highpart): Likewise.
 (*umulsi3_highpart_nov6): Likewise.
 * config/arm/arm.h (arm_arch3m): Delete.
 * config/arm/arm.c (arm_arch3m): Delete.
 (arm_option_override_internal): Update armv3-related comment.
 (arm_configure_build_target): Delete use of isa_bit_mode32.
 (arm_option_reconfigure_globals): Delete set of arm_ach3m.
 (arm_rtx_costs_internal): Delete check of arm_arch3m.
 * config/arm/arm-fixed.md (mulsq3): Delete check for arm_arch3m.
 (mulsa3): Likewise.
 (mulusa3): Likewise.
 * config/arm/arm-protos.h (arm_arch3m): Delete.
 * config/arm/arm-tables.opt: Regenerate.
 * config/arm/arm-tune.md: Likewise.
 * config/arm/t-arm-elf (all_early_nofp): Delete mentions of
 deleted architectures.

2018-05-17  Kyrylo Tkachov  

 * gcc.target/arm/pr62554.c: Delete.
 * gcc.target/arm/pr69610-1.c: Likewise.
 * gcc.target/arm/pr69610-2.c: Likewise.

armv3.patch


diff --git a/gcc/config/arm/arm-cpus.in b/gcc/config/arm/arm-cpus.in
index 
0a318877f10394e2c045d2a03a8f0757557136cf..16a381c86b6a7947e424b29fe67812990519ada9
 100644
--- a/gcc/config/arm/arm-cpus.in
+++ b/gcc/config/arm/arm-cpus.in
@@ -48,15 +48,9 @@
# Features - general convention: all lower case.
  -# Extended multiply
-define feature armv3m
-
  # 26-bit mode support
  define feature mode26
  -# 32-bit mode support
-define feature mode32
-
  # Architecture rel 4
  define feature armv4
  @@ -215,10 +209,7 @@ define fgroup ALL_FPU_INTERNALvfpv2 vfpv3 vfpv4 fpv5 
fp16conv fp_dbl ALL_SIMD_I
  # -mfpu support.
  define fgroup ALL_FPfp16 ALL_FPU_INTERNAL
  -define fgroup ARMv2   notm
-define fgroup ARMv3   ARMv2 mode32
-define fgroup ARMv3m  ARMv3 armv3m
-define fgroup ARMv4   ARMv3m armv4
+define fgroup ARMv4   armv4 notm
  define fgroup ARMv4t  ARMv4 thumb
  define fgroup ARMv5t  ARMv4t armv5t
  define fgroup ARMv5te ARMv5t armv5te
@@ -232,7 +223,7 @@ define fgroup ARMv6zk ARMv6k
  define fgroup ARMv6t2 ARMv6 thumb2
  # This is 

Re: Incremental LTO linking part 8: testsuite compensation

2018-05-23 Thread Jeff Law
On 05/08/2018 09:37 AM, Jan Hubicka wrote:
> 
> Hi,
> most testcases are written with assumption that -r will trigger code 
> generation.
> To make them still meaningful they need nolto-rel.  Bootstrapped/regtested 
> x86_64-linux
> with the rest of incremental link changes.
> 
> Honza
> 
> 2018-05-08  Jan Hubicka  
> 
>   * testsuite/g++.dg/lto/20081109-1_0.C: Add -flinker-output=nolto-rel.
>   * testsuite/g++.dg/lto/20081118_0.C: Add -flinker-output=nolto-rel.
>   * testsuite/g++.dg/lto/20081119-1_0.C: Add -flinker-output=nolto-rel.
>   * testsuite/g++.dg/lto/20081120-1_0.C: Add -flinker-output=nolto-rel.
>   * testsuite/g++.dg/lto/20081120-2_0.C: Add -flinker-output=nolto-rel.
>   * testsuite/g++.dg/lto/20081123_0.C: Add -flinker-output=nolto-rel.
>   * testsuite/g++.dg/lto/20081204-1_0.C: Add -flinker-output=nolto-rel.
>   * testsuite/g++.dg/lto/20081219_0.C: Add -flinker-output=nolto-rel.
>   * testsuite/g++.dg/lto/20090302_0.C: Add -flinker-output=nolto-rel.
>   * testsuite/g++.dg/lto/20090313_0.C: Add -flinker-output=nolto-rel.
>   * testsuite/g++.dg/lto/20091002-2_0.C: Add -flinker-output=nolto-rel.
>   * testsuite/g++.dg/lto/20091002-3_0.C: Add -flinker-output=nolto-rel.
>   * testsuite/g++.dg/lto/20091026-1_0.C: Add -flinker-output=nolto-rel.
>   * testsuite/g++.dg/lto/20100724-1_0.C: Add -flinker-output=nolto-rel.
>   * testsuite/g++.dg/lto/20101010-4_0.C: Add -flinker-output=nolto-rel.
>   * testsuite/g++.dg/lto/20101015-2_0.C: Add -flinker-output=nolto-rel.
>   * testsuite/g++.dg/lto/20110311-1_0.C: Add -flinker-output=nolto-rel.
>   * testsuite/g++.dg/lto/pr45621_0.C: Add -flinker-output=nolto-rel.
>   * testsuite/g++.dg/lto/pr48042_0.C: Add -flinker-output=nolto-rel.
>   * testsuite/g++.dg/lto/pr48354-1_0.C: Add -flinker-output=nolto-rel.
>   * testsuite/g++.dg/lto/pr54625-1_0.c: Add -flinker-output=nolto-rel.
>   * testsuite/g++.dg/lto/pr54625-2_0.c: Add -flinker-output=nolto-rel.
>   * testsuite/g++.dg/lto/pr68811_0.C: Add -flinker-output=nolto-rel.
>   * testsuite/g++.dg/torture/pr43760.C: New test. Add 
> -flinker-output=nolto-rel.
>   * testsuite/gcc.dg/lto/20081120-1_0.c: Add -flinker-output=nolto-rel.
>   * testsuite/gcc.dg/lto/20081120-2_0.c: Add -flinker-output=nolto-rel.
>   * testsuite/gcc.dg/lto/20081126_0.c: Add -flinker-output=nolto-rel.
>   * testsuite/gcc.dg/lto/20081204-1_0.c: Add -flinker-output=nolto-rel.
>   * testsuite/gcc.dg/lto/20081204-2_0.c: Add -flinker-output=nolto-rel.
>   * testsuite/gcc.dg/lto/20081212-1_0.c: Add -flinker-output=nolto-rel.
>   * testsuite/gcc.dg/lto/20081224_0.c: Add -flinker-output=nolto-rel.
>   * testsuite/gcc.dg/lto/20090116_0.c: Add -flinker-output=nolto-rel.
>   * testsuite/gcc.dg/lto/20090126-1_0.c: Add -flinker-output=nolto-rel.
>   * testsuite/gcc.dg/lto/20090126-2_0.c: Add -flinker-output=nolto-rel.
>   * testsuite/gcc.dg/lto/20090206-1_0.c: Add -flinker-output=nolto-rel.
>   * testsuite/gcc.dg/lto/20090219_0.c: Add -flinker-output=nolto-rel.
>   * testsuite/gcc.dg/lto/20091013-1_0.c: Add -flinker-output=nolto-rel.
>   * testsuite/gcc.dg/lto/20091014-1_0.c: Add -flinker-output=nolto-rel.
>   * testsuite/gcc.dg/lto/20091015-1_0.c: Add -flinker-output=nolto-rel.
>   * testsuite/gcc.dg/lto/20091016-1_0.c: Add -flinker-output=nolto-rel.
>   * testsuite/gcc.dg/lto/20091020-1_0.c: Add -flinker-output-nolto-rel.
>   * testsuite/gcc.dg/lto/20091020-2_0.c: Add -flinker-output-nolto-rel.
>   * testsuite/gcc.dg/lto/20091027-1_0.c: Add -flinker-output-nolto-rel.
>   * testsuite/gcc.dg/lto/20100426_0.c: Add -flinker-output-nolto-rel.
>   * testsuite/gcc.dg/lto/20100430-1_0.c: Add -flinker-output-nolto-rel.
>   * testsuite/gcc.dg/lto/20100603-1_0.c: Add -flinker-output-nolto-rel.
>   * testsuite/gcc.dg/lto/20100603-2_0.c: Add -flinker-output-nolto-rel.
>   * testsuite/gcc.dg/lto/20100603-3_0.c: Add -flinker-output-nolto-rel.
>   * testsuite/gcc.dg/lto/20111213-1_0.c: Add -flinker-output-nolto-rel.
>   * testsuite/gcc.dg/lto/pr45736_0.c: Add -flinker-output-nolto-rel.
>   * testsuite/gcc.dg/lto/pr52634_0.c: Add -flinker-output-nolto-rel.
>   * testsuite/gcc.dg/lto/pr54702_0.c: Add -flinker-output-nolto-rel.
>   * testsuite/gcc.dg/lto/pr59323-2_0.c: Add -flinker-output-nolto-rel.
>   * testsuite/gcc.dg/lto/pr59323_0.c: Add -flinker-output-nolto-rel.
>   * testsuite/gcc.dg/lto/pr60820_0.c: Add -flinker-output-nolto-rel.
>   * testsuite/gcc.dg/lto/pr81406_0.c: Add -flinker-output-nolto-rel.
>   * testsuite/gcc.dg/lto/pr83388_0.c: Add -flinker-output-nolto-rel.
>   * testsuite/gfortran.dg/lto/20091016-1_0.f90: Add 
> -flinker-output-nolto-rel.
>   * testsuite/gfortran.dg/lto/20091028-1_0.f90: Add 
> -flinker-output-nolto-rel.
>   * testsuite/gfortran.dg/lto/20091028-2_0.f90: Add 
> -flinker-output-nolto-rel.
>   * 

Re: Incremental LTO linking part 6: dwarf2out support

2018-05-23 Thread Jeff Law
On 05/08/2018 09:31 AM, Jan Hubicka wrote:
> Hi,
> this patch tells dwarf2out that it can have early debug not only in WPA mode
> but also when incrementally linking. This prevents ICE on almost every 
> testcase
> compiled with -g.
> 
> Bootstrapped/regtested x86_64-linux with rest of incremental linking patchet.
> Makes sense?
> 
> Honza
> 
>   * dwarf2out.c (dwarf2out_die_ref_for_decl,
>   darf2out_register_external_decl): Support incremental link.
OK.  I think the full series is ACK'd now.  Can you confirm?

Thanks,
Jeff


Re: [PATCH][1/n] Dissect vectorizer GROUP_* info

2018-05-23 Thread Richard Sandiford
Richard Biener  writes:
> This patch splits GROUP_* into DR_GROUP_* and REDUC_GROUP_* as a step
> towards sharing dataref and dependence analysis between vectorization
> analysis of different vector sizes (and then comparing costs and choosing
> the best one).
>
> The next patch will move the DR_GROUP_* fields to the DR_AUX vectorizer
> data structure which means that the DR group analysis could eventually
> be used from outside of the vectorizer.
>
> Bootstrap / regtest running on x86_64-unknown-linux-gnu.
>
> I'm considering applying this independently because of the churn it
> generates to make my life easier.
>
> Richard - I remember you talking about patches walking in the very
> same direction so please speak up if this disrupts your work and
> lets coordinate.

I don't think this patch clashes with what I was trying, so please
go ahead.  I was looking at patches to remove the globalness of
vinfo_for_stmt, mostly by avoiding it wherever possible and using
stmt_info instead of gimple *.  That's quite a lot of churn but leaves
very few vinfo_for_stmts left.

It also included an attempt to clean up how we handle getting the
vector versions of operands.

Thanks,
Richard


[PATCH , rs6000] Add builtin tests for vec_madds, vec_extract_fp32_from_shortl and vec_extract_fp32_from_shorth, vec_xst_be

2018-05-23 Thread Carl Love
GCC maintainers:

The following patch adds additional tests for the vec_madds,
vec_extract_fp32_from_shortl, vec_extract_fp32_from_shorth and
vec_xst_be builtin functions.

The patch was retested on:

powerpc64le-unknown-linux-gnu (Power 8 LE)   
powerpc64le-unknown-linux-gnu (Power 9 LE)
powerpc64-unknown-linux-gnu (Power 8 BE)

With no regressions.

Please let me know if the patch looks OK for GCC mainline.

 Carl Love
---

gcc/testsuite/ChangeLog:

2018-05-21  Carl Love  
* gcc.target/powerpc/altivec-35.c (foo): Add builtin test vec_madds.
* gcc.target/powerpc/builtins-3-p9-runnable.c (main): Add tests for
vec_extract_fp32_from_shortl and vec_extract_fp32_from_shorth.
* gcc.target/powerpc/builtins-6-runnable.c (main): Fix typo for output.
Add vec_xst_be for signed and unsigned arguments.
---
 gcc/testsuite/gcc.target/powerpc/altivec-35.c  |  4 ++
 .../gcc.target/powerpc/builtins-3-p9-runnable.c| 16 ++
 .../gcc.target/powerpc/builtins-6-runnable.c   | 62 +++---
 3 files changed, 74 insertions(+), 8 deletions(-)

diff --git a/gcc/testsuite/gcc.target/powerpc/altivec-35.c 
b/gcc/testsuite/gcc.target/powerpc/altivec-35.c
index 46e8eed..0836528 100644
--- a/gcc/testsuite/gcc.target/powerpc/altivec-35.c
+++ b/gcc/testsuite/gcc.target/powerpc/altivec-35.c
@@ -1,3 +1,4 @@
+
 /* { dg-do compile } */
 /* { dg-require-effective-target powerpc_altivec_ok } */
 /* { dg-options "-maltivec -mno-vsx -mno-power8-vector -O0" } */
@@ -19,7 +20,10 @@ void foo (vector signed int *vsir,
   *vssr++ = vec_madd (vssa, vusb, vusc);
   *vssr++ = vec_madd (vusa, vssb, vssc);
   *vusr++ = vec_madd (vusa, vusb, vusc);
+
+  *vssr++ = vec_madds (vssa, vssb, vssc);
 }
 
 /* { dg-final { scan-assembler-times "vaddcuw" 1 } } */
 /* { dg-final { scan-assembler-times "vmladduhm" 4 } } */
+/* { dg-final { scan-assembler-times "vmhaddshs" 1 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/builtins-3-p9-runnable.c 
b/gcc/testsuite/gcc.target/powerpc/builtins-3-p9-runnable.c
index 3b67e53..e29c97c 100644
--- a/gcc/testsuite/gcc.target/powerpc/builtins-3-p9-runnable.c
+++ b/gcc/testsuite/gcc.target/powerpc/builtins-3-p9-runnable.c
@@ -32,4 +32,20 @@ int main() {
   if (vfr[i] != vfexpt[i])
  abort();
}
+
+   vfexpt = (vector float){1.0, -2.0, 0.0, 8.5};
+   vfr = vec_extract_fp32_from_shorth(vusha);
+
+   for (i=0; i<4; i++) {
+  if (vfr[i] != vfexpt[i])
+abort();
+   }
+
+   vfexpt = (vector float){1.5, 0.5, 1.25, -0.25};
+   vfr = vec_extract_fp32_from_shortl(vusha);
+
+   for (i=0; i<4; i++) {
+  if (vfr[i] != vfexpt[i])
+abort();
+   } 
 }
diff --git a/gcc/testsuite/gcc.target/powerpc/builtins-6-runnable.c 
b/gcc/testsuite/gcc.target/powerpc/builtins-6-runnable.c
index 5d31312..380a11a 100644
--- a/gcc/testsuite/gcc.target/powerpc/builtins-6-runnable.c
+++ b/gcc/testsuite/gcc.target/powerpc/builtins-6-runnable.c
@@ -60,11 +60,11 @@ void print_uc (vector unsigned char vec_expected,
 {
   int i;
 
-  printf("expected signed char data\n");
+  printf("expected unsigned char data\n");
   for (i = 0; i < 16; i++)
 printf(" %d,", vec_expected[i]);
 
-  printf("\nactual signed char data\n");
+  printf("\nactual unsigned char data\n");
   for (i = 0; i < 16; i++)
 printf(" %d,", vec_actual[i]);
   printf("\n");
@@ -197,13 +197,11 @@ void print_ull (vector unsigned long long vec_expected,
 
   printf("expected unsigned long long data\n");
   for (i = 0; i < 2; i++)
- //printf(" %llu,", vec_expected[i]);
-printf(" 0x%llx,", vec_expected[i]);
+printf(" %llu,", vec_expected[i]);
 
   printf("\nactual unsigned long long data\n");
   for (i = 0; i < 2; i++)
- //printf(" %llu,", vec_actual[i]);
-printf("0x %llx,", vec_actual[i]);
+printf(" %llu,", vec_actual[i]);
   printf("\n");
 }
 
@@ -745,6 +743,56 @@ int main() {
 #endif
  }
 
+   disp = 8;
+#ifdef __BIG_ENDIAN__
+   vec_si_expected1 = (vector signed int){  0, 0, -8, -7 };
+#else
+   vec_si_expected1 = (vector signed int){  0, 0, -5, -6 };
+#endif
+   store_data_si = (vector signed int){ -8, -7, -6, -5 };
+
+   for (i=0; i<4; i++)
+ vec_si_result1[i] = 0;
+
+   address_si = &vec_si_result1[0];
+
+   vec_xst_be (store_data_si, disp, address_si);
+
+   if (result_wrong_si (vec_si_expected1, vec_si_result1))
+ {
+#ifdef DEBUG
+   printf("Error: vec_xst_be, si disp = %d, result does not match expected 
result\n", disp);
+   print_si (vec_si_expected1, vec_si_result1);
+#else
+   abort();
+#endif
+ }
+
+   disp = 0;
+#ifdef __BIG_ENDIAN__
+   vec_ui_expected1 = (vector unsigned int){ 0, 1, 2, 3 };
+#else
+   vec_ui_expected1 = (vector unsigned int){ 3, 2, 1, 0 };
+#endif
+   store_data_ui = (vector unsigned int){ 0, 1, 2, 3 };
+
+   for (i=0; i<4; i++)
+ vec_ui_result1[i] = 0;
+
+   address_ui = &vec_ui_result1[0];
+
+   vec_xst

Re: PING^2: [PATCH] Don't mark IFUNC resolver as only called directly

2018-05-23 Thread H.J. Lu
On Wed, May 23, 2018 at 8:11 AM, Jan Hubicka  wrote:
>> On Wed, May 23, 2018 at 2:01 AM, Jan Hubicka  wrote:
>> >> On Tue, May 22, 2018 at 9:21 AM, Jan Hubicka  wrote:
>> >> >> > >  class ipa_opt_pass_d;
>> >> >> > >  typedef ipa_opt_pass_d *ipa_opt_pass;
>> >> >> > > @@ -2894,7 +2896,8 @@ 
>> >> >> > > cgraph_node::only_called_directly_or_aliased_p (void)
>> >> >> > >   && !DECL_STATIC_CONSTRUCTOR (decl)
>> >> >> > >   && !DECL_STATIC_DESTRUCTOR (decl)
>> >> >> > >   && !used_from_object_file_p ()
>> >> >> > > - && !externally_visible);
>> >> >> > > + && !externally_visible
>> >> >> > > + && !lookup_attribute ("ifunc", DECL_ATTRIBUTES 
>> >> >> > > (decl)));
>> >> >> >
>> >> >> > How's it handled for our own generated resolver functions?  That 
>> >> >> > is,
>> >> >> > isn't there sth cheaper than doing a lookup_attribute here?  I 
>> >> >> > see
>> >> >> > that make_dispatcher_decl nor 
>> >> >> > ix86_get_function_versions_dispatcher
>> >> >> > adds the 'ifunc' attribute (though they are TREE_PUBLIC there).
>> >> >> 
>> >> >>  Is there any drawback of setting force_output flag?
>> >> >>  Honza
>> >> >> >>>
>> >> >> >>> Setting force_output may prevent some optimizations.  Can we add a 
>> >> >> >>> bit
>> >> >> >>> for IFUNC resolver?
>> >> >> >>>
>> >> >> >>
>> >> >> >> Here is the patch to add ifunc_resolver to cgraph_node. Tested on 
>> >> >> >> x86-64
>> >> >> >> and i686.  Any comments?
>> >> >> >>
>> >> >> >
>> >> >> > PING:
>> >> >> >
>> >> >> > https://gcc.gnu.org/ml/gcc-patches/2018-04/msg00647.html
>> >> >> >
>> >> >>
>> >> >> PING.
>> >> > OK, but please extend the verifier that ifunc_resolver flag is 
>> >> > equivalent to
>> >> > lookup_attribute ("ifunc", DECL_ATTRIBUTES (decl))
>> >> > so we are sure things stays in sync.
>> >> >
>> >>
>> >> Like this
>> >>
>> >> diff --git a/gcc/symtab.c b/gcc/symtab.c
>> >> index 80f6f910c3b..954920b6dff 100644
>> >> --- a/gcc/symtab.c
>> >> +++ b/gcc/symtab.c
>> >> @@ -998,6 +998,13 @@ symtab_node::verify_base (void)
>> >>error ("function symbol is not function");
>> >>error_found = true;
>> >>}
>> >> +  else if ((lookup_attribute ("ifunc", DECL_ATTRIBUTES (decl))
>> >> + != NULL)
>> >> + != dyn_cast  (this)->ifunc_resolver)
>> >> +  {
>> >> +  error ("inconsistent `ifunc' attribute");
>> >> +  error_found = true;
>> >> +  }
>> >>  }
>> >>else if (is_a  (this))
>> >>  {
>> >>
>> >>
>> >> Thanks.
>> > Yes, thanks!
>> > Honza
>>
>> I'd like to also fix it on GCC 8 branch for CET.  Should I backport my
>> patch to GCC 8 after a few days or use the simple patch for GCC 8:
>>
>> https://gcc.gnu.org/ml/gcc-patches/2018-04/msg00588.html
>
> I would backport this one so we don't unnecesarily diverge.
> Thanks!
> Honza

This is the backport which I will check into GCC 8 branch next week.

Thanks.

-- 
H.J.
From a938f6318261eb93cb02ede65b0966c2d8a78880 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Wed, 11 Apr 2018 12:31:21 -0700
Subject: [PATCH] Don't mark IFUNC resolver as only called directly

Since IFUNC resolver is called indirectly, don't mark IFUNC resolver as
only called directly.  This patch adds ifunc_resolver to cgraph_node,
sets ifunc_resolver for ifunc attribute and checks ifunc_resolver
instead of looking up ifunc attribute.
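
For context, a minimal sketch of an IFUNC (assumed names; an ELF target with
IFUNC support is required).  Nothing in the translation unit calls the
resolver; the dynamic loader does, which is why it must not be treated as
only called directly:

extern "C" {

typedef void *(*memcpy_fn) (void *, const void *, unsigned long);

static void *my_memcpy_generic (void *dst, const void *src, unsigned long n)
{
  unsigned char *d = static_cast<unsigned char *> (dst);
  const unsigned char *s = static_cast<const unsigned char *> (src);
  while (n--)
    *d++ = *s++;
  return dst;
}

/* The resolver: run once by the dynamic loader to pick the implementation.  */
memcpy_fn resolve_my_memcpy (void)
{
  return my_memcpy_generic;
}

/* my_memcpy resolves to whatever resolve_my_memcpy returns.  */
void *my_memcpy (void *, const void *, unsigned long)
  __attribute__ ((ifunc ("resolve_my_memcpy")));

} /* extern "C" */

int main ()
{
  char src[4] = "abc", dst[4];
  my_memcpy (dst, src, 4);
  return dst[0] != 'a';
}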

gcc/

	Backport from mainline
	2018-05-22  H.J. Lu  

	PR target/85345
	* cgraph.h (cgraph_node::create): Set ifunc_resolver for ifunc
	attribute.
	(cgraph_node::create_alias): Likewise.
	(cgraph_node::get_availability): Check ifunc_resolver instead
	of looking up ifunc attribute.
	* cgraphunit.c (maybe_diag_incompatible_alias): Likewise.
	* varasm.c (do_assemble_alias): Likewise.
	(assemble_alias): Likewise.
	(default_binds_local_p_3): Likewise.
	* cgraph.h (cgraph_node): Add ifunc_resolver.
	(cgraph_node::only_called_directly_or_aliased_p): Return false
	for IFUNC resolver.
	* lto-cgraph.c (input_node): Set ifunc_resolver for ifunc
	attribute.
	* symtab.c (symtab_node::verify_base): Verify that ifunc_resolver
	is equivalent to lookup_attribute ("ifunc", DECL_ATTRIBUTES (decl)).
	(symtab_node::binds_to_current_def_p): Check ifunc_resolver
	instead of looking up ifunc attribute.

gcc/testsuite/

	Backport from mainline
	2018-05-22  H.J. Lu  

	PR target/85345
	* gcc.target/i386/pr85345.c: New test.
---
 gcc/cgraph.c|  7 +++-
 gcc/cgraph.h|  4 +++
 gcc/cgraphunit.c|  2 +-
 gcc/lto-cgraph.c|  2 ++
 gcc/symtab.c| 11 +--
 gcc/testsuite/gcc.target/i386/pr85345.c | 44 +
 gcc/varasm.c|  8 +++--
 7 files changed, 71 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr85345.c

diff --git a/gcc/cgraph.c b/gcc/cgraph.

Re: Incremental LTO linking part 6: dwarf2out support

2018-05-23 Thread Jan Hubicka
> On 05/08/2018 09:31 AM, Jan Hubicka wrote:
> > Hi,
> > this patch tells dwarf2out that it can have early debug not only in WPA mode
> > but also when incrementally linking. This prevents ICE on almost every 
> > testcase
> > compiled with -g.
> > 
> > Bootstrapped/regtested x86_64-linux with rest of incremental linking 
> > patchet.
> > Makes sense?
> > 
> > Honza
> > 
> > * dwarf2out.c (dwarf2out_die_ref_for_decl,
> > darf2out_register_external_decl): Support incremental link.
> OK.  I think the full series is ACK'd now.  Can you confirm?

I hope so. I am not 100% sure if I need an ACK from the libiberty maintainer
for the simple-object change. But since Richard is happy with it and he is the
author of the code and gcc is the only user of this logic, I hope it is OK.

As discussed with Richard on IRC, I still plan to update the driver to accept
-fno-lto to imply codegen rather than requiring the user to pass
-flinker-output=nolto-rel, which is somewhat odd, so I will send an updated
patch for that part.

Honza
> 
> Thanks,
> Jeff


Re: Incremental LTO linking part 6: dwarf2out support

2018-05-23 Thread Jeff Law
On 05/23/2018 09:54 AM, Jan Hubicka wrote:
>> On 05/08/2018 09:31 AM, Jan Hubicka wrote:
>>> Hi,
>>> this patch tells dwarf2out that it can have early debug not only in WPA mode
>>> but also when incrementally linking. This prevents ICE on almost every 
>>> testcase
>>> compiled with -g.
>>>
>>> Bootstrapped/regtested x86_64-linux with rest of incremental linking 
>>> patchet.
>>> Makes sense?
>>>
>>> Honza
>>>
>>> * dwarf2out.c (dwarf2out_die_ref_for_decl,
>>> darf2out_register_external_decl): Support incremental link.
>> OK.  I think the full series is ACK'd now.  Can you confirm?
> 
> I hope so. I am not 100% sure if I need ACK from liberty maintainer for 
> simple-object change. But since Richard is happy with it and he is the author
> of the code and gcc is the only user of this logic, I hope it is OK.
I think Richi's ACK for the libiberty bits is sufficient.


> 
> As discussed with Richard on IRC, I still plan to update the driver to accept
> -fno-lto to imply codegen rather than requiring user to pass 
> -flinker-output=nolto-rel
> which is somewhat odd, so I will send updated patch for that part.
Understood.

jeff


[PATCH] Fix handling of an empty filename at end of a path

2018-05-23 Thread Jonathan Wakely

The C++17 std::filesystem::path grammar allows an empty filename as the
last component (to signify a trailing slash). The existing code does not
handle this consistently, sometimes an empty filename has type _Multi
and sometimes it has type _Filename. This can result in a non-empty
iterator range for an empty filename component.

This change ensures that empty paths always have type _Filename and will
yield an empty iterator range.

* include/bits/fs_path.h (path::_M_type): Change default member
initializer to _Filename.
(path::begin): Create past-the-end iterator for empty path.
* src/filesystem/std-path.cc (path::remove_filename()): Remove
debugging check.
(path::has_relative_path()): Return false for empty filenames.
(path::_M_split_cmpts): Set _M_type to _Filename for empty paths.
Fix offset of empty final component.
* testsuite/27_io/filesystem/path/itr/components.cc: New.
* testsuite/27_io/filesystem/path/itr/traversal.cc: Add new inputs.

Tested powerpc64le-linux, committed to trunk. I plan to backport this
to gcc-8-branch too.
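
As a self-contained illustration of the intended behaviour (not part of the
patch; with GCC 8 this needs -std=c++17 and -lstdc++fs):

#include <filesystem>
#include <iostream>

int main ()
{
  std::filesystem::path p{"dir/sub/"};
  for (const auto &c : p)
    std::cout << '[' << c.string() << "] ";   // prints: [dir] [sub] []
  std::cout << '\n';

  std::filesystem::path empty;
  std::cout << std::boolalpha
            << (empty.begin() == empty.end()) << '\n';   // prints: true
}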


commit ce947893c8e04ab3cde5f5dd3364a94a068b313c
Author: Jonathan Wakely 
Date:   Wed May 23 14:42:08 2018 +0100

Fix handling of an empty filename at end of a path

The C++17 std::filesystem::path grammar allows an empty filename as the
last component (to signify a trailing slash). The existing code does not
handle this consistently, sometimes an empty filename has type _Multi
and sometimes it has type _Filename. This can result in a non-empty
iterator range for an empty filename component.

This change ensures that empty paths always have type _Filename and will
yield an empty iterator range.

* include/bits/fs_path.h (path::_M_type): Change default member
initializer to _Filename.
(path::begin): Create past-the-end iterator for empty path.
* src/filesystem/std-path.cc (path::remove_filename()): Remove
debugging check.
(path::has_relative_path()): Return false for empty filenames.
(path::_M_split_cmpts): Set _M_type to _Filename for empty paths.
Fix offset of empty final component.
* testsuite/27_io/filesystem/path/itr/components.cc: New.
* testsuite/27_io/filesystem/path/itr/traversal.cc: Add new inputs.

diff --git a/libstdc++-v3/include/bits/fs_path.h 
b/libstdc++-v3/include/bits/fs_path.h
index 51af2891647..79a341830db 100644
--- a/libstdc++-v3/include/bits/fs_path.h
+++ b/libstdc++-v3/include/bits/fs_path.h
@@ -497,7 +497,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
 struct _Cmpt;
 using _List = _GLIBCXX_STD_C::vector<_Cmpt>;
 _List _M_cmpts; // empty unless _M_type == _Type::_Multi
-_Type _M_type = _Type::_Multi;
+_Type _M_type = _Type::_Filename;
   };
 
   template<>
@@ -1076,7 +1076,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
   {
 if (_M_type == _Type::_Multi)
   return iterator(this, _M_cmpts.begin());
-return iterator(this, false);
+return iterator(this, empty());
   }
 
   inline path::iterator
diff --git a/libstdc++-v3/src/filesystem/std-path.cc 
b/libstdc++-v3/src/filesystem/std-path.cc
index 6f594cec1d5..755cb7c883a 100644
--- a/libstdc++-v3/src/filesystem/std-path.cc
+++ b/libstdc++-v3/src/filesystem/std-path.cc
@@ -63,8 +63,6 @@ path::remove_filename()
 }
   else if (_M_type == _Type::_Filename)
 clear();
-  if (!empty() && _M_pathname.back() != '/')
-throw 1;
   return *this;
 }
 
@@ -292,7 +290,7 @@ path::has_root_path() const
 bool
 path::has_relative_path() const
 {
-  if (_M_type == _Type::_Filename)
+  if (_M_type == _Type::_Filename && !_M_pathname.empty())
 return true;
   if (!_M_cmpts.empty())
 {
@@ -301,7 +299,7 @@ path::has_relative_path() const
 ++__it;
   if (__it != _M_cmpts.end() && __it->_M_type == _Type::_Root_dir)
 ++__it;
-  if (__it != _M_cmpts.end())
+  if (__it != _M_cmpts.end() && !__it->_M_pathname.empty())
 return true;
 }
   return false;
@@ -514,11 +512,13 @@ path::_M_find_extension() const
 void
 path::_M_split_cmpts()
 {
-  _M_type = _Type::_Multi;
   _M_cmpts.clear();
-
   if (_M_pathname.empty())
-return;
+{
+  _M_type = _Type::_Filename;
+  return;
+}
+  _M_type = _Type::_Multi;
 
   size_t pos = 0;
   const size_t len = _M_pathname.size();
@@ -593,8 +593,7 @@ path::_M_split_cmpts()
   // An empty element, if trailing non-root directory-separator present.
   if (_M_cmpts.back()._M_type == _Type::_Filename)
{
- const auto& last = _M_cmpts.back();
- pos = last._M_pos + last._M_pathname.size();
+ pos = _M_pathname.size();
  _M_cmpts.emplace_back(string_type(), _Type::_Filename, pos);
}
 }
diff --git a/libstdc++-v3/testsuite/27_io/filesystem/path/itr/components.cc 
b/libstdc++-v3/testsuite/27_io/filesystem/path/itr/components.cc
n

Re: C++ PATCH to implement P0614R1, Range-based for statements with initializer (take 2)

2018-05-23 Thread Jason Merrill
OK.

On Wed, May 23, 2018 at 10:45 AM, Marek Polacek  wrote:
> On Tue, May 22, 2018 at 09:46:10PM -0400, Jason Merrill wrote:
>> On Tue, May 22, 2018 at 7:25 PM, Marek Polacek  wrote:
>> > On Mon, May 21, 2018 at 09:51:44PM -0400, Jason Merrill wrote:
>> >> On Mon, May 21, 2018 at 7:34 PM, Marek Polacek  wrote:
>> >> > The previous version of this patch got confused by
>> >> >
>> >> >   for (int i = 0; n > 0 ? true : false; i++)
>> >> > // ...
>> >> >
>> >> > because even though we see a ; followed by a :, it's not a range-based 
>> >> > for with
>> >> > an initializer.  I find it very strange that this didn't show up during 
>> >> > the
>> >> > regtest.
>> >> >
>> >> > To fix this, I had to uglify range_based_for_with_init_p to also check 
>> >> > for a ?.
>> >> > Yuck.
>> >>
>> >> Perhaps cp_parser_skip_to_closing_parenthesis_1 should handle balanced
>> >> ?: like ()/[]/{}.
>> >
>> > Good point.  Clearly there's a difference between ?: and e.g. () because : 
>> > can
>> > stand alone--e.g. in asm (: "whatever"), labels, goacc arrays like a[0:N], 
>> > and
>> > so on.  The following seems to work well, and is certainly less ugly than 
>> > the
>> > previous version.
>> >
>> > +   case CPP_QUERY:
>> > + if (!brace_depth)
>> > +   ++condop_depth;
>> > + break;
>> > +
>> > +   case CPP_COLON:
>> > + if (!brace_depth && condop_depth > 0)
>> > +   condop_depth--;
>> > + break;
>>
>> Since, as you say, colons can appear in more places, maybe we only
>> want to adjust condop_depth when all the other depths are 0, not just
>> brace_depth.
>
> Yeah, I meant to do it but apparently I didn't :(.  Fixed below.
>
> Bootstrapped/regtested on x86_64-linux, ok for trunk?
>
> 2018-05-23  Marek Polacek  
>
> Implement P0614R1, Range-based for statements with initializer.
> * parser.c (cp_parser_range_based_for_with_init_p): New.
> (cp_parser_init_statement): Use it.  Parse the optional init-statement
> for a range-based for loop.
> (cp_parser_skip_to_closing_parenthesis_1): Handle balancing ?:.
>
> * g++.dg/cpp2a/range-for1.C: New test.
> * g++.dg/cpp2a/range-for2.C: New test.
> * g++.dg/cpp2a/range-for3.C: New test.
> * g++.dg/cpp2a/range-for4.C: New test.
> * g++.dg/cpp2a/range-for5.C: New test.
> * g++.dg/cpp2a/range-for6.C: New test.
> * g++.dg/cpp2a/range-for7.C: New test.
>
> diff --git gcc/cp/parser.c gcc/cp/parser.c
> index 6f51f03f47c..d3e73488e84 100644
> --- gcc/cp/parser.c
> +++ gcc/cp/parser.c
> @@ -3493,6 +3493,7 @@ cp_parser_skip_to_closing_parenthesis_1 (cp_parser 
> *parser,
>unsigned paren_depth = 0;
>unsigned brace_depth = 0;
>unsigned square_depth = 0;
> +  unsigned condop_depth = 0;
>
>if (recovering && or_ttype == CPP_EOF
>&& cp_parser_uncommitted_to_tentative_parse_p (parser))
> @@ -3504,7 +3505,7 @@ cp_parser_skip_to_closing_parenthesis_1 (cp_parser 
> *parser,
>
>/* Have we found what we're looking for before the closing paren?  */
>if (token->type == or_ttype && or_ttype != CPP_EOF
> - && !brace_depth && !paren_depth && !square_depth)
> + && !brace_depth && !paren_depth && !square_depth && !condop_depth)
> return -1;
>
>switch (token->type)
> @@ -3551,6 +3552,16 @@ cp_parser_skip_to_closing_parenthesis_1 (cp_parser 
> *parser,
> }
>   break;
>
> +   case CPP_QUERY:
> + if (!brace_depth && !paren_depth && !square_depth)
> +   ++condop_depth;
> + break;
> +
> +   case CPP_COLON:
> + if (!brace_depth && !paren_depth && !square_depth && condop_depth > 
> 0)
> +   condop_depth--;
> + break;
> +
> default:
>   break;
> }
> @@ -11255,6 +11266,40 @@ cp_parser_statement_seq_opt (cp_parser* parser, tree 
> in_statement_expr)
>  }
>  }
>
> +/* Return true if this is the C++20 version of range-based-for with
> +   init-statement.  */
> +
> +static bool
> +cp_parser_range_based_for_with_init_p (cp_parser *parser)
> +{
> +  bool r = false;
> +
> +  /* Save tokens so that we can put them back.  */
> +  cp_lexer_save_tokens (parser->lexer);
> +
> +  /* There has to be an unnested ; followed by an unnested :.  */
> +  if (cp_parser_skip_to_closing_parenthesis_1 (parser,
> +  /*recovering=*/false,
> +  CPP_SEMICOLON,
> +  /*consume_paren=*/false) != -1)
> +goto out;
> +
> +  /* We found the semicolon, eat it now.  */
> +  cp_lexer_consume_token (parser->lexer);
> +
> +  /* Now look for ':' that is not nested in () or {}.  */
> +  r = (cp_parser_skip_to_closing_parenthesis_1 (parser,
> +   /*recovering=*/false,
> +   CPP_COLON,
> +

[wwwdocs] Buildstat update for 8.x

2018-05-23 Thread Tom G. Christensen
Latest results for 8.x

-tgc

Testresults for 8.1.0:
  i386-pc-solaris2.10
  i386-pc-solaris2.11
  sparc-sun-solaris2.10
  sparc-sun-solaris2.11
  x86_64-apple-darwin11.4.2
  x86_64-pc-linux-gnu (3)
  x86_64-w64-mingw32

--- /home/tgc/projects/gcc/wwwdocs/htdocs/gcc-8/buildstat.html  2018-04-25 
10:44:39.0 +0200
+++ /tmp/tmp.TT8mJUrfRi 2018-05-23 18:38:20.175098068 +0200
@@ -20,5 +20,67 @@
 http://gcc.gnu.org/install/finalinstall.html";>
 Installing GCC: Final Installation.
 
+
+
+
+i386-pc-solaris2.10
+ 
+Test results:
+https://gcc.gnu.org/ml/gcc-testresults/2018-05/msg00144.html";>8.1.0
+
+
+
+
+i386-pc-solaris2.11
+ 
+Test results:
+https://gcc.gnu.org/ml/gcc-testresults/2018-05/msg00146.html";>8.1.0
+
+
+
+
+sparc-sun-solaris2.10
+ 
+Test results:
+https://gcc.gnu.org/ml/gcc-testresults/2018-05/msg00143.html";>8.1.0
+
+
+
+
+sparc-sun-solaris2.11
+ 
+Test results:
+https://gcc.gnu.org/ml/gcc-testresults/2018-05/msg00139.html";>8.1.0
+
+
+
+
+x86_64-apple-darwin11.4.2
+ 
+Test results:
+https://gcc.gnu.org/ml/gcc-testresults/2018-05/msg00250.html";>8.1.0
+
+
+
+
+x86_64-pc-linux-gnu
+ 
+Test results:
+https://gcc.gnu.org/ml/gcc-testresults/2018-05/msg00374.html";>8.1.0,
+https://gcc.gnu.org/ml/gcc-testresults/2018-05/msg00132.html";>8.1.0,
+https://gcc.gnu.org/ml/gcc-testresults/2018-05/msg00127.html";>8.1.0
+
+
+
+
+x86_64-w64-mingw32
+ 
+Test results:
+https://gcc.gnu.org/ml/gcc-testresults/2018-05/msg00583.html";>8.1.0
+
+
+
+
+
 
 


Re: C++ PATCH for c++/85847, ICE with template_id_expr in new()

2018-05-23 Thread Jason Merrill
On Wed, May 23, 2018 at 9:46 AM, Marek Polacek  wrote:
> The diagnostic code in build_new{,_1} was using maybe_constant_value to fold
> the array length, but that breaks while parsing a template, because we might
> then leak template codes to the constexpr machinery.
>
> Bootstrapped/regtested on x86_64-linux, ok for trunk/8?
>
> 2018-05-23  Marek Polacek  
>
> PR c++/85847
> * init.c (build_new_1): Use fold_non_dependent_expr.
> (build_new): Likewise.
>
> * g++.dg/cpp0x/new3.C: New test.
>
> @@ -2860,7 +2860,7 @@ build_new_1 (vec **placement, tree type, 
> tree nelts,
>/* Lots of logic below. depends on whether we have a constant number of
>   elements, so go ahead and fold it now.  */
>if (outer_nelts)
> -outer_nelts = maybe_constant_value (outer_nelts);
> +outer_nelts = fold_non_dependent_expr (outer_nelts);

If outer_nelts is non-constant, this will mean that it ends up
instantiated but still non-constant, which can lead to problems when
the result is used in building up other expressions.

I think we want to put the result of folding in a separate variable
for use with things that want to know about a constant size, and keep
the original outer_nelts for use in building outer_nelts_check.

>/* Try to determine the constant value only for the purposes
>  of the diagnostic below but continue to use the original
>  value and handle const folding later.  */
> -  const_tree cst_nelts = maybe_constant_value (nelts);
> +  const_tree cst_nelts = fold_non_dependent_expr (nelts);

...like we do here.
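
For context, a rough sketch of the kind of construct the PR title refers to
(a template-id expression as the array bound of a new-expression, seen while
still parsing a template); the names are made up and this is not the PR's
actual testcase:

template <int N>
constexpr int len () { return N; }

template <int N>
int *make ()
{
  /* While parsing the template the array bound is still a dependent
     template-id expression, so folding it with maybe_constant_value
     could leak template codes into the constexpr machinery.  */
  return new int[len<N> ()];
}

int *p = make<3> ();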

Jason


Re: C++ PATCH to implement P0614R1, Range-based for statements with initializer (take 2)

2018-05-23 Thread Jakub Jelinek
On Wed, May 23, 2018 at 10:45:43AM -0400, Marek Polacek wrote:
> 2018-05-23  Marek Polacek  
> 
>   Implement P0614R1, Range-based for statements with initializer.
>   * parser.c (cp_parser_range_based_for_with_init_p): New.
>   (cp_parser_init_statement): Use it.  Parse the optional init-statement
>   for a range-based for loop.
>   (cp_parser_skip_to_closing_parenthesis_1): Handle balancing ?:.
> 
>   * g++.dg/cpp2a/range-for1.C: New test.
>   * g++.dg/cpp2a/range-for2.C: New test.
>   * g++.dg/cpp2a/range-for3.C: New test.
>   * g++.dg/cpp2a/range-for4.C: New test.
>   * g++.dg/cpp2a/range-for5.C: New test.
>   * g++.dg/cpp2a/range-for6.C: New test.
>   * g++.dg/cpp2a/range-for7.C: New test.

Could you please add some testcases that would test the handling of
structured bindings in these new forms of range for, like:
for (int i = 0; auto [ x, y ] : z)
but perhaps for completeness also in the init-stmt and perhaps both spots
too?
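
For reference, a sketch of what such testcases might look like (assumed type
and names; needs -std=c++2a):

struct P { int x, y; };
P pts[2] = { { 1, 2 }, { 3, 4 } };

int f ()
{
  int sum = 0;

  /* Structured binding as the range declaration, with an init-statement.  */
  for (int i = 0; auto [x, y] : pts)
    sum += i++ + x + y;

  /* Structured binding in the init-statement itself.  */
  for (auto [a, b] = P{ 10, 20 }; auto &e : pts)
    sum += a + b + e.x;

  return sum;
}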

Jakub


Re: [PATCH][1/n] Dissect vectorizer GROUP_* info

2018-05-23 Thread Richard Biener
On May 23, 2018 5:29:46 PM GMT+02:00, Richard Sandiford 
 wrote:
>Richard Biener  writes:
>> This patch splits GROUP_* into DR_GROUP_* and REDUC_GROUP_* as a step
>> towards sharing dataref and dependence analysis between vectorization
>> analysis of different vector sizes (and then comparing costs and
>choosing
>> the best one).
>>
>> The next patch will move the DR_GROUP_* fields to the DR_AUX
>vectorizer
>> data structure which means that the DR group analysis could
>eventually
>> be used from outside of the vectorizer.
>>
>> Bootstrap / regtest running on x86_64-unknown-linux-gnu.
>>
>> I'm considering applying this independently because of the churn it
>> generates to make my life easier.
>>
>> Richard - I remember you talking about patches walking in the very
>> same direction so please speak up if this disrupts your work and
>> lets coordinate.
>
>I don't think this patch clashes with what I was trying, so please
>go ahead.  I was looking at patches to remove the globalness of
>vinfo_for_stmt, mostly by avoiding it wherever possible and using
>stmt_info instead of gimple *.  That's quite a lot of churn but leaves
>very few vinfo_for_stmts left.

Ah, that's indeed nice. 

>It also included an attempt to clean up how we handle getting the
>vector versions of operands.

Ambitious ;) I've tried this many times... 

Richard. 

>Thanks,
>Richard



[PATCH][PR sanitizer/84250] Avoid global symbols collision when using both ASan and UBSan

2018-05-23 Thread Maxim Ostapenko
Hi,


as described in the PR, when using both ASan and UBSan
(-fsanitize=address,undefined), we have a symbol collision for global
functions like __sanitizer_set_report_path. This leads to fuzzy results
when printing reports into files, e.g. for this test case:

#include 
int main(int argc, char **argv) {
   __sanitizer_set_report_path("/tmp/sanitizer.txt");
   int i = 23;
   i <<= 32;
   int *array = new int[100];
   delete [] array;
   return array[argc];
}

only ASan's report gets written to file; UBSan output goes to stderr.

To resolve this issue we could use two approaches:

1) Use the same approach as is implemented in Clang (UBSan embedded
into ASan). The only caveat here is that we need to link the (unused) C++
part of UBSan even in C programs when linking the static ASan runtime. This
happens because GCC, as opposed to Clang, doesn't split the C and C++
runtimes for sanitizers.

2) Just add SANITIZER_INTERFACE_ATTRIBUTE to the report_file global
variable. In this case all __sanitizer_set_report_path calls will set
the same report_file variable. IMHO this is a hacky way to fix the
issue; it's better to use the first option if possible.


The attached patch fixes the symbol collision by embedding UBSan into
ASan (variant 1), just like we do for LSan.


Regtested/bootstrapped on x86_64-unknown-linux-gnu; does this look
reasonable enough for trunk?


-Maxim

gcc/ChangeLog:

2018-05-23  Maxim Ostapenko  

	* config/gnu-user.h (LIBASAN_EARLY_SPEC): Pass -lstdc++ for static
	libasan.
	* gcc.c: Do not pass LIBUBSAN_SPEC if ASan is enabled with UBSan.

libsanitizer/ChangeLog:

2018-05-23  Maxim Ostapenko  

	* Makefile.am: Reorder libs.
	* Makefile.in: Regenerate.
	* asan/Makefile.am: Define DCAN_SANITIZE_UB=1, add dependency on
	libsanitizer_ubsan.la.
	* asan/Makefile.in: Regenerate.
	* ubsan/Makefile.am: Define new libsanitizer_ubsan.la library.
	* ubsan/Makefile.in: Regenerate.

diff --git a/gcc/config/gnu-user.h b/gcc/config/gnu-user.h
index cba3c0b..ccae957 100644
--- a/gcc/config/gnu-user.h
+++ b/gcc/config/gnu-user.h
@@ -161,7 +161,7 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 #define LIBASAN_EARLY_SPEC "%{!shared:libasan_preinit%O%s} " \
   "%{static-libasan:%{!shared:" \
   LD_STATIC_OPTION " --whole-archive -lasan --no-whole-archive " \
-  LD_DYNAMIC_OPTION "}}%{!static-libasan:-lasan}"
+  LD_DYNAMIC_OPTION " -lstdc++ }}%{!static-libasan:-lasan}"
 #undef LIBTSAN_EARLY_SPEC
 #define LIBTSAN_EARLY_SPEC "%{!shared:libtsan_preinit%O%s} " \
   "%{static-libtsan:%{!shared:" \
diff --git a/gcc/gcc.c b/gcc/gcc.c
index a716f70..215e7a0 100644
--- a/gcc/gcc.c
+++ b/gcc/gcc.c
@@ -984,7 +984,7 @@ proper position among the other output files.  */
 %{static:%ecannot specify -static with -fsanitize=address}}\
 %{%:sanitize(thread):" LIBTSAN_SPEC "\
 %{static:%ecannot specify -static with -fsanitize=thread}}\
-%{%:sanitize(undefined):" LIBUBSAN_SPEC "}\
+%{!%:sanitize(address):%{%:sanitize(undefined):" LIBUBSAN_SPEC "}}\
 %{%:sanitize(leak):" LIBLSAN_SPEC "}}}"
 #endif
 
diff --git a/libsanitizer/Makefile.am b/libsanitizer/Makefile.am
index 018f0b0..08d952b 100644
--- a/libsanitizer/Makefile.am
+++ b/libsanitizer/Makefile.am
@@ -14,7 +14,7 @@ endif
 if LIBBACKTRACE_SUPPORTED
 SUBDIRS += libbacktrace
 endif
-SUBDIRS += lsan asan ubsan
+SUBDIRS += lsan ubsan asan
 nodist_saninclude_HEADERS += \
   include/sanitizer/lsan_interface.h \
   include/sanitizer/asan_interface.h \
diff --git a/libsanitizer/Makefile.in b/libsanitizer/Makefile.in
index a9fea21e..9074292 100644
--- a/libsanitizer/Makefile.in
+++ b/libsanitizer/Makefile.in
@@ -140,8 +140,8 @@ AM_RECURSIVE_TARGETS = $(RECURSIVE_TARGETS:-recursive=) \
 	$(RECURSIVE_CLEAN_TARGETS:-recursive=) tags TAGS ctags CTAGS
 ETAGS = etags
 CTAGS = ctags
-DIST_SUBDIRS = sanitizer_common interception libbacktrace lsan asan \
-	ubsan tsan
+DIST_SUBDIRS = sanitizer_common interception libbacktrace lsan ubsan \
+	asan tsan
 ACLOCAL = @ACLOCAL@
 ALLOC_FILE = @ALLOC_FILE@
 AMTAR = @AMTAR@
@@ -294,7 +294,7 @@ ACLOCAL_AMFLAGS = -I .. -I ../config
 sanincludedir = $(libdir)/gcc/$(target_alias)/$(gcc_version)/include/sanitizer
 nodist_saninclude_HEADERS = $(am__append_1)
 @SANITIZER_SUPPORTED_TRUE@SUBDIRS = sanitizer_common $(am__append_2) \
-@SANITIZER_SUPPORTED_TRUE@	$(am__append_3) lsan asan ubsan \
+@SANITIZER_SUPPORTED_TRUE@	$(am__append_3) lsan ubsan asan \
 @SANITIZER_SUPPORTED_TRUE@	$(am__append_4)
 gcc_version := $(shell @get_gcc_base_ver@ $(top_srcdir)/../gcc/BASE-VER)
 
diff --git a/libsanitizer/asan/Makefile.am b/libsanitizer/asan/Makefile.am
index f105b03..3ac49ee 100644
--- a/libsanitizer/asan/Makefile.am
+++ b/libsanitizer/asan/Makefile.am
@@ -3,7 +3,7 @@ AM_CPPFLAGS = -I $(top_srcdir)/include -I $(top_srcdir)
 # May be used by toolexeclibdir.
 gcc_version := $(shell @get_gcc_base_ver@ $(top_srcdir)/../gcc/BASE-VER)
 
-DEFS = -D_GNU_SOURCE -D_DEBUG -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -DASAN_

C++ PATCHes to xvalue handling

2018-05-23 Thread Jason Merrill
The first patch implements the adjustments from core issues 616 and
1213 to the value category of subobjects of class prvalues: they were
considered prvalues themselves, but that was kind of nonsensical.  Now
they are considered xvalues.  Along with this, I've removed the
diagnostic distinction between xvalues and prvalues when trying to use
one or the other as an lvalue; the important thing is that they are
rvalues.

The second patch corrects various issues with casts and xvalues/rvalue
references: we were treating an xvalue operand to dynamic_cast as an
lvalue, and we were objecting to casts from prvalue to rvalue
reference type.

Tested x86_64-pc-linux-gnu, applying to trunk.
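
A small example of the new classification (a sketch, not taken from the
testsuite): a member of a class prvalue is now an xvalue, so it binds to an
rvalue reference but not to an lvalue reference:

struct A { int m; };
A make () { return A{ 42 }; }

int main ()
{
  int &&r = make ().m;     // OK: make().m is an xvalue
  // int &l = make ().m;   // error: cannot bind an lvalue reference to it
  return r - 42;
}
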
commit e7b8c6f89b5e4c69bc2e74ade15a5364b9fac45e
Author: Jason Merrill 
Date:   Tue May 22 15:40:24 2018 -0400

CWG 616, 1213 - value category of subobject references.

* tree.c (lvalue_kind): A reference to a subobject of a prvalue is
an xvalue.
* typeck2.c (build_m_component_ref): Likewise.
* typeck.c (cp_build_addr_expr_1, lvalue_or_else): Remove diagnostic
distinction between temporary and xvalue.

diff --git a/gcc/cp/tree.c b/gcc/cp/tree.c
index 15b9697a63b..efb8c2bf926 100644
--- a/gcc/cp/tree.c
+++ b/gcc/cp/tree.c
@@ -87,6 +87,7 @@ lvalue_kind (const_tree ref)
 {
 case SAVE_EXPR:
   return clk_none;
+
   /* preincrements and predecrements are valid lvals, provided
 	 what they refer to are valid lvals.  */
 case PREINCREMENT_EXPR:
@@ -94,7 +95,14 @@ lvalue_kind (const_tree ref)
 case TRY_CATCH_EXPR:
 case REALPART_EXPR:
 case IMAGPART_EXPR:
-  return lvalue_kind (TREE_OPERAND (ref, 0));
+case ARRAY_REF:
+case VIEW_CONVERT_EXPR:
+  op1_lvalue_kind = lvalue_kind (TREE_OPERAND (ref, 0));
+  if (op1_lvalue_kind == clk_class)
+	/* in the case of an array operand, the result is an lvalue if that
+	   operand is an lvalue and an xvalue otherwise */
+	op1_lvalue_kind = clk_rvalueref;
+  return op1_lvalue_kind;
 
 case MEMBER_REF:
 case DOTSTAR_EXPR:
@@ -104,6 +112,11 @@ lvalue_kind (const_tree ref)
 	op1_lvalue_kind = lvalue_kind (TREE_OPERAND (ref, 0));
   if (TYPE_PTRMEMFUNC_P (TREE_TYPE (TREE_OPERAND (ref, 1
 	op1_lvalue_kind = clk_none;
+  else if (op1_lvalue_kind == clk_class)
+	/* The result of a .* expression whose second operand is a pointer to a
+	   data member is an lvalue if the first operand is an lvalue and an
+	   xvalue otherwise.  */
+	op1_lvalue_kind = clk_rvalueref;
   return op1_lvalue_kind;
 
 case COMPONENT_REF:
@@ -119,6 +132,11 @@ lvalue_kind (const_tree ref)
 	return lvalue_kind (TREE_OPERAND (ref, 1));
 	}
   op1_lvalue_kind = lvalue_kind (TREE_OPERAND (ref, 0));
+  if (op1_lvalue_kind == clk_class)
+	/* If E1 is an lvalue, then E1.E2 is an lvalue;
+	   otherwise E1.E2 is an xvalue.  */
+	op1_lvalue_kind = clk_rvalueref;
+
   /* Look at the member designator.  */
   if (!op1_lvalue_kind)
 	;
@@ -165,7 +183,6 @@ lvalue_kind (const_tree ref)
   /* FALLTHRU */
 case INDIRECT_REF:
 case ARROW_EXPR:
-case ARRAY_REF:
 case PARM_DECL:
 case RESULT_DECL:
 case PLACEHOLDER_EXPR:
@@ -203,11 +220,7 @@ lvalue_kind (const_tree ref)
 	 type-dependent expr, that is, but we shouldn't be testing
 	 lvalueness if we can't even tell the types yet!  */
 	  gcc_assert (!type_dependent_expression_p (CONST_CAST_TREE (ref)));
-	  if (CLASS_TYPE_P (TREE_TYPE (ref))
-	  || TREE_CODE (TREE_TYPE (ref)) == ARRAY_TYPE)
-	return clk_class;
-	  else
-	return clk_none;
+	  goto default_;
 	}
   op1_lvalue_kind = lvalue_kind (TREE_OPERAND (ref, 1)
 ? TREE_OPERAND (ref, 1)
@@ -257,18 +270,14 @@ lvalue_kind (const_tree ref)
 case PAREN_EXPR:
   return lvalue_kind (TREE_OPERAND (ref, 0));
 
-case VIEW_CONVERT_EXPR:
-  if (location_wrapper_p (ref))
-	return lvalue_kind (TREE_OPERAND (ref, 0));
-  /* Fallthrough.  */
-
 default:
+default_:
   if (!TREE_TYPE (ref))
 	return clk_none;
   if (CLASS_TYPE_P (TREE_TYPE (ref))
 	  || TREE_CODE (TREE_TYPE (ref)) == ARRAY_TYPE)
 	return clk_class;
-  break;
+  return clk_none;
 }
 
   /* If one operand is not an lvalue at all, then this expression is
diff --git a/gcc/cp/typeck.c b/gcc/cp/typeck.c
index ecb334d19d2..82089c45105 100644
--- a/gcc/cp/typeck.c
+++ b/gcc/cp/typeck.c
@@ -5860,11 +5860,8 @@ cp_build_addr_expr_1 (tree arg, bool strict_lvalue, tsubst_flags_t complain)
 	{
 	  if (!(complain & tf_error))
 	return error_mark_node;
-	  if (kind & clk_class)
-	/* Make this a permerror because we used to accept it.  */
-	permerror (input_location, "taking address of temporary");
-	  else
-	error ("taking address of xvalue (rvalue reference)");
+	  /* Make this a permerror because we used to accept it.  */
+	  permerror (input_location, "taking address of rvalue");
 	}
 }
 
@@ -9866,11 +9863,8 @@ lvalu

Re: libcpp PATCH to avoid deprecated copy assignment

2018-05-23 Thread Jason Merrill
On Wed, May 23, 2018 at 1:33 AM, Gerald Pfeifer  wrote:
> On Mon, 21 May 2018, Jason Merrill wrote:
>>> broke bootstrap on systems using libc++ instead of libstdc++
>
>>>   In file included from /usr/include/c++/v1/new:91:
>>>   /usr/include/c++/v1/exception:180:5: error: no member named 'fancy_abort' 
>>> in namespace 'std::__1'; did you mean simply 'fancy_abort'?
>>>   _VSTD::abort();
>
>>>   The problem appears to be the added #include <new>
>> Does moving the #include <new> up higher help?
>
> Yes, it does!
>
> (Tested both with a direct bootstrap and by adding this to the
> FreeBSD port of gcc9-devel; both succeeded now.)

Great, applied.

Jason


Re: [PATCH] testsuite: Introduce be/le selectors

2018-05-23 Thread Segher Boessenkool
On Tue, May 22, 2018 at 03:21:30PM -0600, Jeff Law wrote:
> On 05/21/2018 03:46 PM, Segher Boessenkool wrote:
> > This patch creates "be" and "le" selectors, which can be used by all
> > architectures, similar to ilp32 and lp64.
>
> I think this is fine.  "be" "le" are used all over the place in gcc and
> the kernel to denote big/little endian.

Thanks.  This is what I checked in (to trunk):


2018-05-23  Segher Boessenkool  

* doc/sourcebuild.texi (Endianness): New subsubsection.

gcc/testsuite/
* lib/target-supports.exp (check_effective_target_be): New.
(check_effective_target_le): New.


diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
index dfb0578..596007d 100644
--- a/gcc/doc/sourcebuild.texi
+++ b/gcc/doc/sourcebuild.texi
@@ -1313,6 +1313,16 @@ By convention, keywords ending in @code{_nocache} can 
also include options
 specified for the particular test in an earlier @code{dg-options} or
 @code{dg-add-options} directive.
 
+@subsubsection Endianness
+
+@table @code
+@item be
+Target uses big-endian memory order for multi-byte and multi-word data.
+
+@item le
+Target uses little-endian memory order for multi-byte and multi-word data.
+@end table
+
 @subsubsection Data type sizes
 
 @table @code
diff --git a/gcc/testsuite/lib/target-supports.exp 
b/gcc/testsuite/lib/target-supports.exp
index aa1296e6..0a53d7b 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -2523,6 +2523,22 @@ proc check_effective_target_next_runtime { } {
 }]
 }
 
+# Return 1 if we're generating code for big-endian memory order.
+
+proc check_effective_target_be { } {
+return [check_no_compiler_messages be object {
+   int dummy[__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__ ? 1 : -1];
+}]
+}
+
+# Return 1 if we're generating code for little-endian memory order.
+
+proc check_effective_target_le { } {
+return [check_no_compiler_messages le object {
+   int dummy[__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__ ? 1 : -1];
+}]
+}
+
 # Return 1 if we're generating 32-bit code using default options, 0
 # otherwise.
 
-- 
1.8.3.1
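
As a usage sketch (my example, not from the patch), a test can now key its
directives on endianness directly:

  /* In the dg-do selector:  */
  /* { dg-do compile { target { powerpc*-*-* && be } } } */

  /* Or on an individual scan (mnemonic and count are placeholders only):  */
  /* { dg-final { scan-assembler-times "some-mnemonic" 2 { target le } } } */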



Re: [PATCH] PR target/85358: Add target hook to prevent default widening

2018-05-23 Thread Michael Meissner
On Wed, May 23, 2018 at 11:41:49AM +0200, Richard Biener wrote:
> Just a question for clarification - _is_ KFmode strictly a wider mode
> than IFmode?  That is, can it represent all values that IFmode can?

No.  IFmode consists of a pair of doubles.  I'm sure there are corner cases
where IFmode can represent something that KFmode can't.

In addition, IFmode doesn't play too well with resetting rounding modes, etc.
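
For illustration only (my own example, not from the thread): IBM double-double
can hold 1.0 + 2**-200 exactly as the pair (1.0, 2**-200), while IEEE binary128,
with its 113-bit significand, rounds that value to 1.0.  A minimal sketch,
assuming a powerpc64le target where long double is double-double and __float128
is IEEE binary128 (the volatiles keep the compile-time model of IFmode from
folding the sum):

  int
  main (void)
  {
    volatile double hi = 1.0, lo = 0x1p-200;
    volatile long double dd = (long double) hi + lo; /* the pair (1.0, 2**-200) */
    volatile __float128 kf = dd;                     /* rounds to 1.0 */
    return (long double) kf != dd;                   /* 1: the conversion lost bits */
  }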

> On another note I question the expanders considering wider FP modes
> somewhat in general.  So maybe the hook shouldn't be named
> default_widening_p but rather mode_covers_p ()?  And we can avoid
> calling the hook for integer modes.
> 
> That said, I wonder if the construction of mode_wider and friends
> should be (optionally) made more explicit in the modes .def file
> so powerpc could avoid any wider relation for IFmode.

This morning on the way to work, I was thinking that maybe the solution is to
have an ALTERNATE_FLOAT_MODE, which doesn't list the other float modes of
the same size as candidates for widening.  That way we could express it in
the .def file directly.

One of the problems I've faced over the years is the assumption that there is
only one type for a given size, and that isn't true for IF/KF/TFmode.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797



Re: [PATCH , rs6000] Add missing builtin test cases, fix arguments to match specifications.

2018-05-23 Thread Segher Boessenkool
Hi Carl,

I committed the be/le selectors.

On Mon, May 21, 2018 at 08:15:30AM -0700, Carl Love wrote:
> --- a/gcc/testsuite/gcc.target/powerpc/builtins-1-be.c
> +++ b/gcc/testsuite/gcc.target/powerpc/builtins-1-be.c
> @@ -1,4 +1,4 @@
> -/* { dg-do compile { target { powerpc64-*-* } } } */
> +/* { dg-do compile { target { powerpc*-*-* && be } } } */

Does this (and other similar tests) work on 32-bit as well?

> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/builtins-3-le.c
> @@ -0,0 +1,77 @@
> +/* { dg-do compile { target powerpc64le-*-* } } */
> +/* { dg-require-effective-target powerpc_altivec_ok } */
> +/* { dg-options "-maltivec" } */

This now should be  powerpc*-*-* && le, possibly with && lp64 (but I don't
think we care about 32-bit LE in any of the rest of the testsuite; many
tests will fail there, so I wouldn't bother).

With the be/le selectors available, does it help to split the tests into
two still, or can things be better done with just one test, and be/le
selectors on each scan-assembler-times that needs one?


Segher


Re: [PATCH] PR target/85358: Add target hook to prevent default widening

2018-05-23 Thread Richard Biener
On May 23, 2018 7:53:01 PM GMT+02:00, Michael Meissner  
wrote:
>On Wed, May 23, 2018 at 11:41:49AM +0200, Richard Biener wrote:
>> Just a question for clarification - _is_ KFmode strictly a wider mode
>> than IFmode?  That is, can it represent all values that IFmode can?
>
>No.  IFmode consists of a pair of doubles.  I'm sure there are corner
>cases where
>IFmode can represent something that KFmode can't.
>
>In addition, IFmode doesn't play too well with resetting rounding modes,
>etc.
>
>> On another note I question the expanders considering wider FP modes
>> somewhat in general.  So maybe the hook shouldn't be named
>> default_widening_p but rather mode_covers_p ()?  And we can avoid
>> calling the hook for integer modes.
>> 
>> That said, I wonder if the construction of mode_wider and friends
>> should be (optionally) made more explicit in the modes .def file
>> so powerpc could avoid any wider relation for IFmode.
>
>This morning on the way to work, I was thinking that maybe the solution
>is to
>have an ALTERNATE_FLOAT_MODE, where it doesn't list the other float
>modes of
>the same size as candidates for widening.  That way we could express it
>in the .def
>file directly.

Yeah.  Not sure if we need something other than FLOAT_MODE or if avoiding the
chaining in mode_wider and friends is enough.  I guess one could also change
IFmode to be a non-scalar, composite mode - a different 'kind' of complex mode
maybe...

>One of the problems I've faced over the years is the assumption that
>there is
>only one type for a given size, and that isn't true for IF/KF/TFmode.



Re: C++ PATCH to implement P0614R1, Range-based for statements with initializer (take 2)

2018-05-23 Thread Marek Polacek
On Wed, May 23, 2018 at 06:45:41PM +0200, Jakub Jelinek wrote:
> On Wed, May 23, 2018 at 10:45:43AM -0400, Marek Polacek wrote:
> > 2018-05-23  Marek Polacek  
> > 
> > Implement P0614R1, Range-based for statements with initializer.
> > * parser.c (cp_parser_range_based_for_with_init_p): New.
> > (cp_parser_init_statement): Use it.  Parse the optional init-statement
> > for a range-based for loop.
> > (cp_parser_skip_to_closing_parenthesis_1): Handle balancing ?:.
> > 
> > * g++.dg/cpp2a/range-for1.C: New test.
> > * g++.dg/cpp2a/range-for2.C: New test.
> > * g++.dg/cpp2a/range-for3.C: New test.
> > * g++.dg/cpp2a/range-for4.C: New test.
> > * g++.dg/cpp2a/range-for5.C: New test.
> > * g++.dg/cpp2a/range-for6.C: New test.
> > * g++.dg/cpp2a/range-for7.C: New test.
> 
> Could you please add some testcases that would test the handling of
> structured bindings in these new forms of range for, like:
> for (int i = 0; auto [ x, y ] : z)
> but perhaps for completeness also in the init-stmt and perhaps both spots
> too?

Sure.

Tested on x86_64-linux, ok for trunk?

2018-05-23  Marek Polacek  

* g++.dg/cpp2a/range-for8.C: New test.
* g++.dg/cpp2a/range-for9.C: New test.
* g++.dg/cpp2a/range-for10.C: New test.

diff --git gcc/testsuite/g++.dg/cpp2a/range-for10.C 
gcc/testsuite/g++.dg/cpp2a/range-for10.C
index e69de29bb2d..a0d0e6d085e 100644
--- gcc/testsuite/g++.dg/cpp2a/range-for10.C
+++ gcc/testsuite/g++.dg/cpp2a/range-for10.C
@@ -0,0 +1,24 @@
+// P0614R1
+// { dg-do run }
+// { dg-options "-std=c++2a" }
+
+struct A { int i; long long j; } a[64];
+
+int
+main ()
+{
+  A b = { 1, 2 };
+  for (auto & [ u, v ] : a)
+{
+  u = 2;
+  v = 3;
+}
+
+  for (auto [x, y] = b; auto [ u, v ] : a)
+if (y + u != x + v)
+  __builtin_abort ();
+
+  for (auto [x, y] = b; auto & [ u, v ] : a)
+if (y + u != x + v)
+  __builtin_abort ();
+}
diff --git gcc/testsuite/g++.dg/cpp2a/range-for8.C 
gcc/testsuite/g++.dg/cpp2a/range-for8.C
index e69de29bb2d..204a63204ca 100644
--- gcc/testsuite/g++.dg/cpp2a/range-for8.C
+++ gcc/testsuite/g++.dg/cpp2a/range-for8.C
@@ -0,0 +1,37 @@
+// P0614R1
+// { dg-do run }
+// { dg-options "-std=c++2a" }
+
+struct A { int i; long long j; } a[64];
+
+int
+main ()
+{
+  for (int i = 0; auto &x : a)
+{
+  x.i = i;
+  x.j = 2 * i++;
+}
+  for (auto & [ x, y ] : a)
+{
+  x += 2;
+  y += 3;
+}
+  for (int i = 0; const auto [ u, v ] : a)
+{
+  if (u != i + 2 || v != 2 * i++ + 3)
+__builtin_abort ();
+}
+  for (int i = 0; auto [ x, y ] : a)
+{
+  x += 4;
+  y += 5;
+  if (x != i + 6 || y != 2 * i++ + 8)
+__builtin_abort ();
+}
+  for (int i = 0; const auto x : a)
+{
+  if (x.i != i + 2 || x.j != 2 * i++ + 3)
+__builtin_abort ();
+}
+}
diff --git gcc/testsuite/g++.dg/cpp2a/range-for9.C 
gcc/testsuite/g++.dg/cpp2a/range-for9.C
index e69de29bb2d..74d71b67213 100644
--- gcc/testsuite/g++.dg/cpp2a/range-for9.C
+++ gcc/testsuite/g++.dg/cpp2a/range-for9.C
@@ -0,0 +1,30 @@
+// P0614R1
+// { dg-do run }
+// { dg-options "-std=c++2a" }
+
+struct A { int i, j; };
+
+int
+main ()
+{
+  A a = { .i = 2, .j = 3 };
+  int arr[] = { 1, 1, 1 };
+
+  for (auto & [ x, y ] = a; auto z : arr)
+if (x + z != 3 || y + z != 4)
+  __builtin_abort ();
+
+  for (int d = 1; auto &z : arr)
+z += d;
+
+  for (const auto [ x, y ] = a; auto z : arr)
+if (x + z != 4 || y + z != 5)
+  __builtin_abort ();
+
+  for (int d = 1; auto &z : arr)
+z += d;
+
+  for (auto [ x, y ] = a; auto z : arr)
+if (x + z != 5 || y + z != 6)
+  __builtin_abort ();
+}


Re: C++ PATCH for c++/85847, ICE with template_id_expr in new()

2018-05-23 Thread Marek Polacek
On Wed, May 23, 2018 at 12:45:11PM -0400, Jason Merrill wrote:
> On Wed, May 23, 2018 at 9:46 AM, Marek Polacek  wrote:
> > The diagnostic code in build_new{,_1} was using maybe_constant_value to fold
> > the array length, but that breaks while parsing a template, because we might
> > then leak template codes to the constexpr machinery.
> >
> > Bootstrapped/regtested on x86_64-linux, ok for trunk/8?
> >
> > 2018-05-23  Marek Polacek  
> >
> > PR c++/85847
> > * init.c (build_new_1): Use fold_non_dependent_expr.
> > (build_new): Likewise.
> >
> > * g++.dg/cpp0x/new3.C: New test.
> >
> > @@ -2860,7 +2860,7 @@ build_new_1 (vec **placement, tree type, 
> > tree nelts,
> >/* Lots of logic below. depends on whether we have a constant number of
> >   elements, so go ahead and fold it now.  */
> >if (outer_nelts)
> > -outer_nelts = maybe_constant_value (outer_nelts);
> > +outer_nelts = fold_non_dependent_expr (outer_nelts);
> 
> If outer_nelts is non-constant, this will mean that it ends up
> instantiated but still non-constant, which can lead to problems when
> the result is used in building up other expressions.
> 
> I think we want to put the result of folding in a separate variable
> for use with things that want to know about a constant size, and keep
> the original outer_nelts for use in building outer_nelts_check.
> 
> >/* Try to determine the constant value only for the purposes
> >  of the diagnostic below but continue to use the original
> >  value and handle const folding later.  */
> > -  const_tree cst_nelts = maybe_constant_value (nelts);
> > +  const_tree cst_nelts = fold_non_dependent_expr (nelts);
> 
> ...like we do here.

Like this?

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2018-05-23  Marek Polacek  

PR c++/85847
* init.c (build_new_1): Use fold_non_dependent_expr.  Use a dedicated
variable for its result.  Fix a condition.
(build_new): Use fold_non_dependent_expr.  Tweak a condition.

* g++.dg/cpp0x/new3.C: New test.

diff --git gcc/cp/init.c gcc/cp/init.c
index b558742abf6..cd0110a1e19 100644
--- gcc/cp/init.c
+++ gcc/cp/init.c
@@ -2857,10 +2857,9 @@ build_new_1 (vec **placement, tree type, 
tree nelts,
   outer_nelts_from_type = true;
 }
 
-  /* Lots of logic below. depends on whether we have a constant number of
+  /* Lots of logic below depends on whether we have a constant number of
  elements, so go ahead and fold it now.  */
-  if (outer_nelts)
-outer_nelts = maybe_constant_value (outer_nelts);
+  const_tree cst_outer_nelts = fold_non_dependent_expr (outer_nelts);
 
   /* If our base type is an array, then make sure we know how many elements
  it has.  */
@@ -2912,11 +2911,12 @@ build_new_1 (vec **placement, tree type, 
tree nelts,
   /* Warn if we performed the (T[N]) to T[N] transformation and N is
  variable.  */
   if (outer_nelts_from_type
-  && !TREE_CONSTANT (outer_nelts))
+  && cst_outer_nelts != NULL_TREE
+  && !TREE_CONSTANT (cst_outer_nelts))
 {
   if (complain & tf_warning_or_error)
{
- pedwarn (EXPR_LOC_OR_LOC (outer_nelts, input_location), OPT_Wvla,
+ pedwarn (EXPR_LOC_OR_LOC (cst_outer_nelts, input_location), OPT_Wvla,
   typedef_variant_p (orig_type)
   ? G_("non-constant array new length must be specified "
"directly, not by typedef")
@@ -3011,9 +3011,10 @@ build_new_1 (vec **placement, tree type, 
tree nelts,
 
   size = size_binop (MULT_EXPR, size, fold_convert (sizetype, nelts));
 
-  if (INTEGER_CST == TREE_CODE (outer_nelts))
+  if (cst_outer_nelts != NULL_TREE
+ && TREE_CODE (cst_outer_nelts) == INTEGER_CST)
{
- if (tree_int_cst_lt (max_outer_nelts_tree, outer_nelts))
+ if (tree_int_cst_lt (max_outer_nelts_tree, cst_outer_nelts))
{
  /* When the array size is constant, check it at compile time
 to make sure it doesn't exceed the implementation-defined
@@ -3639,13 +3640,13 @@ build_new (vec **placement, tree type, 
tree nelts,
   /* Try to determine the constant value only for the purposes
 of the diagnostic below but continue to use the original
 value and handle const folding later.  */
-  const_tree cst_nelts = maybe_constant_value (nelts);
+  const_tree cst_nelts = fold_non_dependent_expr (nelts);
 
   /* The expression in a noptr-new-declarator is erroneous if it's of
 non-class type and its value before converting to std::size_t is
 less than zero. ... If the expression is a constant expression,
 the program is ill-fomed.  */
-  if (INTEGER_CST == TREE_CODE (cst_nelts)
+  if (TREE_CODE (cst_nelts) == INTEGER_CST
  && tree_int_cst_sgn (cst_nelts) == -1)
{
  if (complain & tf_error)
diff --git gcc/testsuite/g++.dg/cpp0x/new3.C gcc/test

Re: [PATCH , rs6000] Add missing builtin test cases, fix arguments to match specifications.

2018-05-23 Thread Carl Love
On Wed, 2018-05-23 at 13:26 -0500, Segher Boessenkool wrote:
> Hi Carl,
> 
> I committed the be/le selectors.
> 
> On Mon, May 21, 2018 at 08:15:30AM -0700, Carl Love wrote:
> > --- a/gcc/testsuite/gcc.target/powerpc/builtins-1-be.c
> > +++ b/gcc/testsuite/gcc.target/powerpc/builtins-1-be.c
> > @@ -1,4 +1,4 @@
> > -/* { dg-do compile { target { powerpc64-*-* } } } */
> > +/* { dg-do compile { target { powerpc*-*-* && be } } } */
> 
> Does this (and other similar tests) work on 32-bit as well?
> 
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/powerpc/builtins-3-le.c
> > @@ -0,0 +1,77 @@
> > +/* { dg-do compile { target powerpc64le-*-* } } */
> > +/* { dg-require-effective-target powerpc_altivec_ok } */
> > +/* { dg-options "-maltivec" } */
> 
> This now should be  powerpc*-*-* && le, possibly with && lp64 (but I
> don't
> think we care about 32-bit LE in any of the rest of the testsuite;
> many
> tests will fail there, so I wouldn't bother).

Yeah, I thought about doing powerpc*-*-* && le.  But since 32-bit LE isn't
supported, I figured we didn't want to try to test on that, so I left it
as powerpc64le-*-*.
> 
> With the be/le selectors available, does it help to split the tests
> into
> two still, or can things be better done with just one test, and be/le
> selectors on each scan-assembler-times that needs one?

The thing is, the counts for probably 75% of the instructions are the
same for be/le.  Trying to maintain separate be/le files and making sure
all the builtin tests are accounted for in each file is a pain.  So,
yeah, given we now have the le/be qualifier we should try to go with a
single file.  In the long run I think it is cleaner.  Let me take a shot
at combining the tests.
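
A sketch of what a combined test could look like (hypothetical; the mnemonics
and counts are placeholders, not verified output for any builtin):

  /* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
  /* { dg-require-effective-target powerpc_altivec_ok } */
  /* { dg-options "-maltivec" } */

  /* ... the builtin test functions ... */

  /* Counts that differ by endianness get a selector:  */
  /* { dg-final { scan-assembler-times {\mlvx\M} 2 { target be } } } */
  /* { dg-final { scan-assembler-times {\mlvx\M} 1 { target le } } } */
  /* Counts that are the same on both need none:  */
  /* { dg-final { scan-assembler-times {\mstvx\M} 1 } } */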

   Carl



Re: C++ PATCH to implement P0614R1, Range-based for statements with initializer (take 2)

2018-05-23 Thread Jakub Jelinek
On Wed, May 23, 2018 at 02:49:09PM -0400, Marek Polacek wrote:
> > Could you please add some testcases that would test the handling of
> > structured bindings in these new forms of range for, like:
> > for (int i = 0; auto [ x, y ] : z)
> > but perhaps for completeness also in the init-stmt and perhaps both spots
> > too?
> 
> Sure.
> 
> Tested on x86_64-linux, ok for trunk?
> 
> 2018-05-23  Marek Polacek  
> 
>   * g++.dg/cpp2a/range-for8.C: New test.
>   * g++.dg/cpp2a/range-for9.C: New test.
>   * g++.dg/cpp2a/range-for10.C: New test.

LGTM, thanks.

> diff --git gcc/testsuite/g++.dg/cpp2a/range-for10.C 
> gcc/testsuite/g++.dg/cpp2a/range-for10.C
> index e69de29bb2d..a0d0e6d085e 100644
> --- gcc/testsuite/g++.dg/cpp2a/range-for10.C
> +++ gcc/testsuite/g++.dg/cpp2a/range-for10.C
> @@ -0,0 +1,24 @@
> +// P0614R1
> +// { dg-do run }
> +// { dg-options "-std=c++2a" }
> +
> +struct A { int i; long long j; } a[64];
> +
> +int
> +main ()
> +{
> +  A b = { 1, 2 };
> +  for (auto & [ u, v ] : a)
> +{
> +  u = 2;
> +  v = 3;
> +}
> +
> +  for (auto [x, y] = b; auto [ u, v ] : a)
> +if (y + u != x + v)
> +  __builtin_abort ();
> +
> +  for (auto [x, y] = b; auto & [ u, v ] : a)
> +if (y + u != x + v)
> +  __builtin_abort ();
> +}
> diff --git gcc/testsuite/g++.dg/cpp2a/range-for8.C 
> gcc/testsuite/g++.dg/cpp2a/range-for8.C
> index e69de29bb2d..204a63204ca 100644
> --- gcc/testsuite/g++.dg/cpp2a/range-for8.C
> +++ gcc/testsuite/g++.dg/cpp2a/range-for8.C
> @@ -0,0 +1,37 @@
> +// P0614R1
> +// { dg-do run }
> +// { dg-options "-std=c++2a" }
> +
> +struct A { int i; long long j; } a[64];
> +
> +int
> +main ()
> +{
> +  for (int i = 0; auto &x : a)
> +{
> +  x.i = i;
> +  x.j = 2 * i++;
> +}
> +  for (auto & [ x, y ] : a)
> +{
> +  x += 2;
> +  y += 3;
> +}
> +  for (int i = 0; const auto [ u, v ] : a)
> +{
> +  if (u != i + 2 || v != 2 * i++ + 3)
> +__builtin_abort ();
> +}
> +  for (int i = 0; auto [ x, y ] : a)
> +{
> +  x += 4;
> +  y += 5;
> +  if (x != i + 6 || y != 2 * i++ + 8)
> +__builtin_abort ();
> +}
> +  for (int i = 0; const auto x : a)
> +{
> +  if (x.i != i + 2 || x.j != 2 * i++ + 3)
> +__builtin_abort ();
> +}
> +}
> diff --git gcc/testsuite/g++.dg/cpp2a/range-for9.C 
> gcc/testsuite/g++.dg/cpp2a/range-for9.C
> index e69de29bb2d..74d71b67213 100644
> --- gcc/testsuite/g++.dg/cpp2a/range-for9.C
> +++ gcc/testsuite/g++.dg/cpp2a/range-for9.C
> @@ -0,0 +1,30 @@
> +// P0614R1
> +// { dg-do run }
> +// { dg-options "-std=c++2a" }
> +
> +struct A { int i, j; };
> +
> +int
> +main ()
> +{
> +  A a = { .i = 2, .j = 3 };
> +  int arr[] = { 1, 1, 1 };
> +
> +  for (auto & [ x, y ] = a; auto z : arr)
> +if (x + z != 3 || y + z != 4)
> +  __builtin_abort ();
> +
> +  for (int d = 1; auto &z : arr)
> +z += d;
> +
> +  for (const auto [ x, y ] = a; auto z : arr)
> +if (x + z != 4 || y + z != 5)
> +  __builtin_abort ();
> +
> +  for (int d = 1; auto &z : arr)
> +z += d;
> +
> +  for (auto [ x, y ] = a; auto z : arr)
> +if (x + z != 5 || y + z != 6)
> +  __builtin_abort ();
> +}

Jakub


Update svn.html documentation

2018-05-23 Thread Michael Meissner
It was pointed out to me that the section on svn branches was out of date WRT
the ibm/* branches.  I have checked in this change:

Index: htdocs/svn.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/svn.html,v
retrieving revision 1.220
diff -p -c -r1.220 svn.html
*** htdocs/svn.html 23 May 2018 09:02:54 -  1.220
--- htdocs/svn.html 23 May 2018 19:04:49 -
*** be prefixed with the initials of the dis
*** 658,682 
mailto:cmt...@google.com";>cmt...@google.com.

  
!   ibm/gcc-4_1-branch
!   This branch provides decimal float support backported to GCC 4.1.x.
!   It is expected to be used primarily within IBM for PowerPC-64 GNU/Linux.
!   The branch is maintained by Janis Johnson
!   ja...@us.ibm.com>.
! 
!   ibm/gcc-4_3-branch
!   This branch is expected to be used primarily within IBM for
!   PowerPC-64 GNU/Linux.
! 
!   ibm/gcc-4_4-branch
!   This branch is expected to be used primarily within IBM for
!   PowerPC-64 GNU/Linux.
! 
!   ibm/power7-meissner
!   This branch is a private development branch for Power7 (PowerPC ISA 
2.06)
!   development, prior to the patches being submitted to the mainline.
!   The branch is maintained by Michael Meissner,
!   mailto:meiss...@linux.vnet.ibm.com";>meiss...@linux.vnet.ibm.com.
  
linaro/gcc-x_y-branch
Linaro compilers based on GCC x.y releases.  These branches
--- 658,667 
mailto:cmt...@google.com";>cmt...@google.com.

  
!   ibm/gcc-x-branch
!   Branches that track the GCC branches and are used to create the
!   IBM Advance Toolchain releases.  This family of branches is maintained by
!   personnel from IBM.
  
linaro/gcc-x_y-branch
Linaro compilers based on GCC x.y releases.  These branches


-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797



Re: C++ PATCH for c++/85847, ICE with template_id_expr in new()

2018-05-23 Thread Jason Merrill
On Wed, May 23, 2018 at 2:50 PM, Marek Polacek  wrote:
> On Wed, May 23, 2018 at 12:45:11PM -0400, Jason Merrill wrote:
>> On Wed, May 23, 2018 at 9:46 AM, Marek Polacek  wrote:
>> > The diagnostic code in build_new{,_1} was using maybe_constant_value to 
>> > fold
>> > the array length, but that breaks while parsing a template, because we 
>> > might
>> > then leak template codes to the constexpr machinery.
>> >
>> > Bootstrapped/regtested on x86_64-linux, ok for trunk/8?
>> >
>> > 2018-05-23  Marek Polacek  
>> >
>> > PR c++/85847
>> > * init.c (build_new_1): Use fold_non_dependent_expr.
>> > (build_new): Likewise.
>> >
>> > * g++.dg/cpp0x/new3.C: New test.
>> >
>> > @@ -2860,7 +2860,7 @@ build_new_1 (vec **placement, tree 
>> > type, tree nelts,
>> >/* Lots of logic below. depends on whether we have a constant number of
>> >   elements, so go ahead and fold it now.  */
>> >if (outer_nelts)
>> > -outer_nelts = maybe_constant_value (outer_nelts);
>> > +outer_nelts = fold_non_dependent_expr (outer_nelts);
>>
>> If outer_nelts is non-constant, this will mean that it ends up
>> instantiated but still non-constant, which can lead to problems when
>> the result is used in building up other expressions.
>>
>> I think we want to put the result of folding in a separate variable
>> for use with things that want to know about a constant size, and keep
>> the original outer_nelts for use in building outer_nelts_check.
>>
>> >/* Try to determine the constant value only for the purposes
>> >  of the diagnostic below but continue to use the original
>> >  value and handle const folding later.  */
>> > -  const_tree cst_nelts = maybe_constant_value (nelts);
>> > +  const_tree cst_nelts = fold_non_dependent_expr (nelts);
>>
>> ...like we do here.
>
> Like this?
>
> Bootstrapped/regtested on x86_64-linux, ok for trunk?
>
> 2018-05-23  Marek Polacek  
>
> PR c++/85847
> * init.c (build_new_1): Use fold_non_dependent_expr.  Use a dedicated
> variable for its result.  Fix a condition.
> (build_new): Use fold_non_dependent_expr.  Tweak a condition.
>
> * g++.dg/cpp0x/new3.C: New test.
>
> diff --git gcc/cp/init.c gcc/cp/init.c
> index b558742abf6..cd0110a1e19 100644
> --- gcc/cp/init.c
> +++ gcc/cp/init.c
> @@ -2857,10 +2857,9 @@ build_new_1 (vec **placement, tree type, 
> tree nelts,
>outer_nelts_from_type = true;
>  }
>
> -  /* Lots of logic below. depends on whether we have a constant number of
> +  /* Lots of logic below depends on whether we have a constant number of
>   elements, so go ahead and fold it now.  */
> -  if (outer_nelts)
> -outer_nelts = maybe_constant_value (outer_nelts);
> +  const_tree cst_outer_nelts = fold_non_dependent_expr (outer_nelts);
>
>/* If our base type is an array, then make sure we know how many elements
>   it has.  */
> @@ -2912,11 +2911,12 @@ build_new_1 (vec **placement, tree type, 
> tree nelts,
>/* Warn if we performed the (T[N]) to T[N] transformation and N is
>   variable.  */
>if (outer_nelts_from_type
> -  && !TREE_CONSTANT (outer_nelts))
> +  && cst_outer_nelts != NULL_TREE
> +  && !TREE_CONSTANT (cst_outer_nelts))

Why add the comparisons with NULL_TREE?  fold_non_dependent_expr only
returns null if its argument is null.

> - pedwarn (EXPR_LOC_OR_LOC (outer_nelts, input_location), OPT_Wvla,
> + pedwarn (EXPR_LOC_OR_LOC (cst_outer_nelts, input_location), 
> OPT_Wvla,

Let's drop this change, the original expression has the location we want.

Jason
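
A hedged sketch of the shape of the problem (a guess at the kind of code
involved, not the actual PR c++/85847 testcase): inside a template the
array-new length can be a template-id expression, and folding it with
maybe_constant_value during parsing leaked template codes to the constexpr
machinery.

  template <int N> constexpr int len () { return N; }

  template <typename T, int N>
  T *
  make ()
  {
    return new T[len<N> ()];   // the length is a template-id expression
  }

  int *p = make<int, 4> ();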


Re: [PATCH] handle local aggregate initialization in strlen (PR 83821)

2018-05-23 Thread Martin Sebor

On 05/23/2018 08:57 AM, Jeff Law wrote:

On 05/10/2018 04:05 PM, Marc Glisse wrote:

On Thu, 10 May 2018, Martin Sebor wrote:


Can you please comment/respond to Jeff's question below and
confirm whether my understanding of the restriction (below)
is correct?


I don't remember it at all, I really should have expanded that comment...

The documentation of nonzero_chars seems to indicate that, unless
full_string_p, it is only a lower bound on the length of the string, so
not suitable for this kind of alias check. I don't know if we also have
easy access to some upper bound.

(I noticed while looking at this pass that it could probably use
POINTER_DIFF_EXPR more)

So ISTM that we'd need to guard the code that uses si->nonzero_chars in
maybe_invalidate to also check FULL_STRING_P since it appears we're
using si->nonzero_chars as a string length.


I'm not sure I see why.  Can you explain?

Here's my explanation of the current approach.  si->nonzero_chars
is a lower bound on si's length.  maybe_invalidate() invalidates
the string length, which is only necessary when something overwrites
one of the first si->nonzero_chars bytes of the array.  It doesn't matter
whether si is nul-terminated at that point.

The difference can be seen on the following test case which gets
optimized as I would expect only if full_string_p is not considered,
else the (minimum) string length is invalidated by the assignment
to a.b because full_string_p is false.

struct A {
  char a[9];
  int b;
};

void f (void)
{
  struct A a;

  __builtin_memcpy (a.a, "1234", 4);
  a.b = 0;// <<< maybe_invalidate()
  a.a[4] = 0;

  if (__builtin_strlen (a.a) != 4)
__builtin_abort ();
}

Martin


Re: [PATCH 1/2] Introduce prefetch-minimum stride option

2018-05-23 Thread H.J. Lu
On Tue, May 22, 2018 at 11:55 AM, Luis Machado  wrote:
>
>
> On 05/16/2018 08:18 AM, Luis Machado wrote:
>>
>>
>>
>> On 05/16/2018 06:08 AM, Kyrill Tkachov wrote:
>>>
>>>
>>> On 15/05/18 12:12, Luis Machado wrote:

 Hi,

 On 05/15/2018 06:37 AM, Kyrill Tkachov wrote:
>
> Hi Luis,
>
> On 14/05/18 22:18, Luis Machado wrote:
>>
>> Hi,
>>
>> Here's an updated version of the patch (now reverted) that addresses
>> the previous bootstrap problem (signedness and long long/int conversion).
>>
>> I've checked that it bootstraps properly on both aarch64-linux and
>> x86_64-linux and that tests look sane.
>>
>> James, would you please give this one a try to see if you can still
>> reproduce PR85682? I couldn't reproduce it in multiple attempts.
>>
>
> The patch doesn't hit the regressions in PR85682 from what I can see.
> I have a comment on the patch below.
>

 Great. Thanks for checking Kyrill.

> --- a/gcc/tree-ssa-loop-prefetch.c
> +++ b/gcc/tree-ssa-loop-prefetch.c
> @@ -992,6 +992,23 @@ prune_by_reuse (struct mem_ref_group *groups)
>   static bool
>   should_issue_prefetch_p (struct mem_ref *ref)
>   {
> +  /* Some processors may have a hardware prefetcher that may conflict
> with
> + prefetch hints for a range of strides.  Make sure we don't issue
> + prefetches for such cases if the stride is within this particular
> + range.  */
> +  if (cst_and_fits_in_hwi (ref->group->step)
> +  && abs_hwi (int_cst_value (ref->group->step)) <
> +  (HOST_WIDE_INT) PREFETCH_MINIMUM_STRIDE)
> +{
>
> The '<' should go on the line below together with
> PREFETCH_MINIMUM_STRIDE.


 I've fixed this locally now.
>>>
>>>
>>> Thanks. I haven't followed the patch in detail, are you looking for
>>> midend changes approval since the last version?
>>> Or do you need aarch64 approval?
>>
>>
>> The changes are not substantial, but midend approval is what I was aiming
>> at.
>>
>> Also the confirmation that PR85682 is no longer happening.
>
>
> James confirmed PR85682 is no longer reproducible with the updated patch and
> the bootstrap issue is fixed now. So I take it this should be OK to push to
> mainline?
>
> Also, I'd like to discuss the possibility of having these couple of options
> backported to GCC 8. As is, the changes don't alter code generation by
> default, but they allow better tuning of the software prefetcher for targets
> that benefit from it.
>
> Maybe after letting the changes bake on mainline enough to be confirmed
> stable?

It breaks GCC bootstrap on i686:

../../src-trunk/gcc/tree-ssa-loop-prefetch.c: In function ‘bool
should_issue_prefetch_p(mem_ref*)’:
../../src-trunk/gcc/tree-ssa-loop-prefetch.c:1015:4: error: format
‘%ld’ expects argument of type ‘long int’, but argument 5 has type
‘long long int’ [-Werror=format=]
"Step for reference %u:%u (%ld) is less than the mininum "
^~
"required stride of %d\n",
~
ref->group->uid, ref->uid, int_cst_value (ref->group->step),
   

-- 
H.J.
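
One way to avoid the -Wformat break on 32-bit hosts (a sketch of the usual
idiom, not necessarily the fix that was applied) is to print the value with
GCC's HOST_WIDE_INT_PRINT_DEC macro instead of %ld, so the format always
matches the HOST_WIDE_INT returned by int_cst_value:

      fprintf (dump_file,
               "Step for reference %u:%u (" HOST_WIDE_INT_PRINT_DEC ") is less "
               "than the minimum required stride of %d\n",
               ref->group->uid, ref->uid, int_cst_value (ref->group->step),
               PREFETCH_MINIMUM_STRIDE);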


Re: C++ PATCH for c++/85847, ICE with template_id_expr in new()

2018-05-23 Thread Marek Polacek
On Wed, May 23, 2018 at 03:24:20PM -0400, Jason Merrill wrote:
> On Wed, May 23, 2018 at 2:50 PM, Marek Polacek  wrote:
> > On Wed, May 23, 2018 at 12:45:11PM -0400, Jason Merrill wrote:
> >> On Wed, May 23, 2018 at 9:46 AM, Marek Polacek  wrote:
> >> > The diagnostic code in build_new{,_1} was using maybe_constant_value to 
> >> > fold
> >> > the array length, but that breaks while parsing a template, because we 
> >> > might
> >> > then leak template codes to the constexpr machinery.
> >> >
> >> > Bootstrapped/regtested on x86_64-linux, ok for trunk/8?
> >> >
> >> > 2018-05-23  Marek Polacek  
> >> >
> >> > PR c++/85847
> >> > * init.c (build_new_1): Use fold_non_dependent_expr.
> >> > (build_new): Likewise.
> >> >
> >> > * g++.dg/cpp0x/new3.C: New test.
> >> >
> >> > @@ -2860,7 +2860,7 @@ build_new_1 (vec **placement, tree 
> >> > type, tree nelts,
> >> >/* Lots of logic below. depends on whether we have a constant number 
> >> > of
> >> >   elements, so go ahead and fold it now.  */
> >> >if (outer_nelts)
> >> > -outer_nelts = maybe_constant_value (outer_nelts);
> >> > +outer_nelts = fold_non_dependent_expr (outer_nelts);
> >>
> >> If outer_nelts is non-constant, this will mean that it ends up
> >> instantiated but still non-constant, which can lead to problems when
> >> the result is used in building up other expressions.
> >>
> >> I think we want to put the result of folding in a separate variable
> >> for use with things that want to know about a constant size, and keep
> >> the original outer_nelts for use in building outer_nelts_check.
> >>
> >> >/* Try to determine the constant value only for the purposes
> >> >  of the diagnostic below but continue to use the original
> >> >  value and handle const folding later.  */
> >> > -  const_tree cst_nelts = maybe_constant_value (nelts);
> >> > +  const_tree cst_nelts = fold_non_dependent_expr (nelts);
> >>
> >> ...like we do here.
> >
> > Like this?
> >
> > Bootstrapped/regtested on x86_64-linux, ok for trunk?
> >
> > 2018-05-23  Marek Polacek  
> >
> > PR c++/85847
> > * init.c (build_new_1): Use fold_non_dependent_expr.  Use a 
> > dedicated
> > variable for its result.  Fix a condition.
> > (build_new): Use fold_non_dependent_expr.  Tweak a condition.
> >
> > * g++.dg/cpp0x/new3.C: New test.
> >
> > diff --git gcc/cp/init.c gcc/cp/init.c
> > index b558742abf6..cd0110a1e19 100644
> > --- gcc/cp/init.c
> > +++ gcc/cp/init.c
> > @@ -2857,10 +2857,9 @@ build_new_1 (vec **placement, tree 
> > type, tree nelts,
> >outer_nelts_from_type = true;
> >  }
> >
> > -  /* Lots of logic below. depends on whether we have a constant number of
> > +  /* Lots of logic below depends on whether we have a constant number of
> >   elements, so go ahead and fold it now.  */
> > -  if (outer_nelts)
> > -outer_nelts = maybe_constant_value (outer_nelts);
> > +  const_tree cst_outer_nelts = fold_non_dependent_expr (outer_nelts);
> >
> >/* If our base type is an array, then make sure we know how many elements
> >   it has.  */
> > @@ -2912,11 +2911,12 @@ build_new_1 (vec **placement, tree 
> > type, tree nelts,
> >/* Warn if we performed the (T[N]) to T[N] transformation and N is
> >   variable.  */
> >if (outer_nelts_from_type
> > -  && !TREE_CONSTANT (outer_nelts))
> > +  && cst_outer_nelts != NULL_TREE
> > +  && !TREE_CONSTANT (cst_outer_nelts))
> 
> Why add the comparisons with NULL_TREE?  fold_non_dependent_expr only
> returns null if its argument is null.

True, and it seemed to me that the argument can be null when NELTS is null,
which, according to the comment for build_new_1, could happen.  So I was just
being cautious.  But I dropped the checks and nothing in the testsuite broke.

> > - pedwarn (EXPR_LOC_OR_LOC (outer_nelts, input_location), OPT_Wvla,
> > + pedwarn (EXPR_LOC_OR_LOC (cst_outer_nelts, input_location), 
> > OPT_Wvla,
> 
> Let's drop this change, the original expression has the location we want.

Okay.

Bootstrapped/regtested on x86_64-linux, ok for trunk/8?

2018-05-23  Marek Polacek  

PR c++/85847
* init.c (build_new_1): Use fold_non_dependent_expr.  Use a dedicated
variable for its result.  Fix a condition.
(build_new): Use fold_non_dependent_expr.  Tweak a condition.

* g++.dg/cpp0x/new3.C: New test.

diff --git gcc/cp/init.c gcc/cp/init.c
index b558742abf6..5bfd0848fc4 100644
--- gcc/cp/init.c
+++ gcc/cp/init.c
@@ -2857,10 +2857,9 @@ build_new_1 (vec **placement, tree type, 
tree nelts,
   outer_nelts_from_type = true;
 }
 
-  /* Lots of logic below. depends on whether we have a constant number of
+  /* Lots of logic below depends on whether we have a constant number of
  elements, so go ahead and fold it now.  */
-  if (outer_nelts)
-outer_nelts = maybe_constant_value (outer_nelts);
+  const_tree cst_outer_nelts = fold_non_dependent_expr (outer_nelts);

Re: [PATCH] PR target/85358: Add target hook to prevent default widening

2018-05-23 Thread Joseph Myers
On Wed, 23 May 2018, Michael Meissner wrote:

> One of the problems I've faced over the years is the assumption that there is
> only one type for a given size, and that isn't true for IF/KF/TFmode.

I think that's a different problem.

The problem here is that "wider" is only a partial ordering between 
floating-point modes; neither IFmode nor KFmode is wider than the other.  
(While TFmode has the same set of values as whichever of IFmode and KFmode 
it corresponds to.  And though IFmode is in some sense wider than DFmode, 
it's probably not a good idea to treat it as such given that it doesn't 
have the IEEE semantics of DFmode.)

As a separate issue, anything that tries to deduce a floating-point mode 
from a size should have had a mode in the first place (for example, the 
*_TYPE_SIZE target macros for floating-point types really ought to be some 
kind of target hook that returns the required mode rather than a size).

-- 
Joseph S. Myers
jos...@codesourcery.com

